Tag Archives: serverless

Python 3.13 runtime now available in AWS Lambda

2024-11-14 Julian Wood

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/python-3-13-runtime-now-available-in-aws-lambda/

This post is written by Julian Wood, Principal Developer Advocate, and Leandro Cavalcante Damascena, Senior Solutions Architect Engineer.

AWS Lambda now supports Python 3.13 as both a managed runtime and container base image. Python is a popular language for building serverless applications. The Python 3.13 release includes a number of changes to the language, the implementation, and the standard library. With this release, Python developers can now take advantage of these new features and enhancements when creating serverless applications on Lambda. Python 3.13 also includes experimental support for a number of features, which are not available in Lambda.

You can develop Lambda functions in Python 3.13 using the AWS Management Console, AWS Command Line Interface (AWS CLI), AWS SDK for Python (Boto3), AWS Serverless Application Model (AWS SAM), AWS Cloud Development Kit (AWS CDK), and other infrastructure as code tools.

The Python 3.13 runtime allows you to implement serverless best practices using Powertools for AWS Lambda (Python). This is a developer toolkit that includes observability, batch processing, AWS Systems Manager Parameter Store integration, idempotency, feature flags, Amazon CloudWatch Metrics, structured logging, and more.

Lambda@Edge allows you to use Python 3.13 to customize low-latency content delivered through Amazon CloudFront.

Lambda runtime changes

Amazon Linux 2023

As with the Python 3.12 runtime, the Python 3.13 runtime is based on the provided.al2023 runtime, which is based on the Amazon Linux 2023 minimal container image. The Amazon Linux 2023 minimal image uses microdnf as a package manager, symlinked as dnf. This replaces the yum package manager used in Python 3.11 and earlier AL2-based images. If you deploy your Lambda functions as container images, you must update your Dockerfiles to use dnf instead of yum when upgrading to the Python 3.13 base image from Python 3.11 or earlier base images.

Learn more about the provided.al2023 runtime in the blog post Introducing the Amazon Linux 2023 runtime for AWS Lambda and the Amazon Linux 2023 launch blog post.

New Python features

Data model improvements

There are improvements to the Python data model. __static_attributes__ stores the names of attributes accessed through self.X in any function in a class body.
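As a small illustration, the sketch below reads __static_attributes__ from a hypothetical Counter class (the class and its attribute names are invented for this example). Because the attribute only exists from Python 3.13 onwards, the lookup falls back to None on older interpreters.

```python
class Counter:
    """Hypothetical class used only to illustrate __static_attributes__."""

    def __init__(self):
        self.count = 0

    def reset(self):
        self.count = 0
        self.last_reset = "manual"


# On Python 3.13 this is a tuple of the attribute names assigned through self.X.
# On earlier versions the attribute is absent, so fall back to None.
static_names = getattr(Counter, "__static_attributes__", None)
print(static_names)
```

The order of names in the tuple is not something this sketch relies on, so compare it as a set if you need to check its contents.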

Typing changes

With the implementation of PEP 702, you can now use the new warnings.deprecated() decorator to mark deprecations in the type system and at runtime.

Python 3.13 also adds PEP 696, which introduces default values for type parameters. This enhancement allows developers to specify default types for TypeVar, ParamSpec, and TypeVarTuple when omitting type arguments.

Standard library

The standard library adds a new PythonFinalizationError exception, raised when an operation is blocked during finalization.

The new functions base64.z85encode() and base64.z85decode() support encoding and decoding Z85 data.
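A minimal sketch of a Z85 round trip follows. The helper name z85_roundtrip is hypothetical, and the hasattr guard is only there so the snippet also runs on interpreters older than Python 3.13, where these functions are missing.

```python
import base64


def z85_roundtrip(data: bytes):
    """Encode data as Z85 and decode it again; return None if Z85 is unavailable."""
    if not hasattr(base64, "z85encode"):  # functions were added in Python 3.13
        return None
    encoded = base64.z85encode(data)
    return base64.z85decode(encoded)


print(z85_roundtrip(b"12345678"))
```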

The copy module now has a copy.replace() function, with support for many built-in types and any class defining the __replace__() method.
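The sketch below shows one way copy.replace() might be used with a dataclass, which supports it on Python 3.13. FunctionConfig and with_runtime are hypothetical names for this example, and the dataclasses.replace() fallback is an assumption added so the code also runs on earlier versions.

```python
import copy
import dataclasses


@dataclasses.dataclass(frozen=True)
class FunctionConfig:
    """Hypothetical, immutable description of a function's settings."""

    runtime: str
    memory_mb: int


def with_runtime(config: FunctionConfig, runtime: str) -> FunctionConfig:
    """Return a copy of config with a different runtime value."""
    if hasattr(copy, "replace"):  # available from Python 3.13
        return copy.replace(config, runtime=runtime)
    # Fallback for older interpreters
    return dataclasses.replace(config, runtime=runtime)


old = FunctionConfig(runtime="python3.12", memory_mb=512)
new = with_runtime(old, "python3.13")
print(new)
```

The original object is left unchanged, which is useful when configuration objects are shared between invocations.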

The os module has a suite of new functions for working with Linux’s timer notification file descriptors.

There is a change to the defined mutation semantics for locals().

Experimental features that are unavailable

Python 3.13 includes a number of experimental features which are not enabled for the Lambda managed runtime or base images. These features must be enabled when the Python runtime is compiled. Since the Lambda-provided Python 3.13 runtime is intended for production workloads, these features are not enabled in the Lambda build of Python 3.13 and cannot be enabled via an execution-time flag. To use these features in Lambda, you can deploy your own Python runtime using a custom runtime or container image with these features enabled.

Free-threaded CPython

You cannot enable the experimental support for running Python in a free-threaded mode, with the global interpreter lock (GIL) disabled.
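If you want to confirm how a given interpreter was built, for example when comparing the managed runtime with your own custom runtime, a check along these lines may help. The function name gil_status is hypothetical; the sketch assumes that an interpreter without sys._is_gil_enabled() (older than Python 3.13) always has the GIL enabled.

```python
import sys
import sysconfig


def gil_status() -> dict:
    """Report whether this interpreter is a free-threaded build and if the GIL is on."""
    # Py_GIL_DISABLED is 1 only for free-threaded builds; it is 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() exists from Python 3.13; assume the GIL is on without it.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = check() if check is not None else True
    return {"free_threaded_build": free_threaded_build, "gil_enabled": gil_enabled}


print(gil_status())
```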

Just-in-time (JIT) compiler

You also cannot enable the experimental JIT compiler within the Lambda managed runtime or base image.

Performance considerations

At launch, new Lambda runtimes receive less usage than existing established runtimes. This can result in longer cold start times due to reduced cache residency within internal Lambda sub-systems. Cold start times typically improve in the weeks following launch as usage increases. As a result, AWS recommends not drawing conclusions from side-by-side performance comparisons with other Lambda runtimes until the performance has stabilized. Since performance is highly dependent on workload, customers with performance-sensitive workloads should conduct their own testing, instead of relying on generic test benchmarks.

Using Python 3.13 in Lambda

AWS Management Console

To use the Python 3.13 runtime to develop your Lambda functions, specify a runtime parameter value of Python 3.13 when creating or updating a function. The Python 3.13 version is available in the Runtime dropdown on the Create Function page:

Creating Python function in AWS Management Console

To update an existing Lambda function to Python 3.13, navigate to the function in the Lambda console and choose Edit in the Runtime settings panel. The new version of Python is available in the Runtime dropdown.

Changing a function to Python 3.13

You may need to check your code and dependencies for compatibility with Python 3.13, and update as necessary.
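The same update can be scripted with the AWS SDK for Python (Boto3). The following is a minimal sketch, assuming the client passed in is boto3.client("lambda"); the helper name update_runtime and the function name in the usage note are hypothetical.

```python
def update_runtime(lambda_client, function_name: str, runtime: str = "python3.13") -> str:
    """Switch an existing function to the given runtime.

    lambda_client is expected to be a Boto3 Lambda client. Returns the runtime
    reported back by the service.
    """
    response = lambda_client.update_function_configuration(
        FunctionName=function_name,
        Runtime=runtime,
    )
    return response["Runtime"]
```

For example, update_runtime(boto3.client("lambda"), "my-function") would request the change for a function called my-function. Test the function afterwards, since dependencies may behave differently on the new version.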

AWS Lambda container image

Change the Python base image version by modifying the FROM statement in your Dockerfile:

FROM public.ecr.aws/lambda/python:3.13
# Copy function code
COPY lambda_handler.py ${LAMBDA_TASK_ROOT}
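For reference, a lambda_handler.py that the Dockerfile above could copy might look like the following sketch. The function name handler and the response shape are assumptions for illustration; the image would also need a CMD instruction naming the handler, such as lambda_handler.handler.

```python
import json
import sys


def handler(event, context):
    """Return the interpreter version so you can confirm which runtime is in use."""
    return {
        "statusCode": 200,
        "body": json.dumps({"python_version": sys.version.split()[0]}),
    }
```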

AWS Serverless Application Model (AWS SAM)

In AWS SAM set the Runtime attribute to python3.13 to use this version.

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Simple Lambda Function
Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      Description: My Python Lambda Function
      CodeUri: my_function/
      Handler: lambda_function.lambda_handler
      Runtime: python3.13

AWS SAM supports generating this template with Python 3.13 for new serverless applications using the sam init command. Refer to the AWS SAM documentation.

AWS Cloud Development Kit (AWS CDK)

In AWS CDK, set the runtime attribute to Runtime.PYTHON_3_13 to use this version. In Python CDK:

from constructs import Construct
from aws_cdk import App, Stack, aws_lambda as _lambda


class SampleLambdaStack(Stack):
    def __init__(self, scope: Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)

        # The python3.13 enabled Lambda Function
        base_lambda = _lambda.Function(
            self,
            'python313LambdaFunction',
            handler='lambda_handler.handler',
            runtime=_lambda.Runtime.PYTHON_3_13,
            code=_lambda.Code.from_asset('lambda'),
        )

In TypeScript CDK:

import * as cdk from 'aws-cdk-lib';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as path from 'path';
import { Construct } from 'constructs';

export class SampleLambdaStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // The code that defines your stack goes here

    // The python3.13 enabled Lambda Function
    const lambdaFunction = new lambda.Function(this, 'python313LambdaFunction', {
      runtime: lambda.Runtime.PYTHON_3_13,
      memorySize: 512,
      code: lambda.Code.fromAsset(path.join(__dirname, '/../lambda')),
      handler: 'lambda_handler.handler'
    });
  }
}

Conclusion

Lambda now supports Python 3.13 as a managed language runtime. This release uses the Amazon Linux 2023 OS and includes Python 3.13 language additions including data model improvements, typing changes, and updates to the standard library. This release does not support the experimental option to disable the global interpreter lock or the experimental JIT compiler.

You can build and deploy functions using Python 3.13 using the AWS Management Console, AWS CLI, AWS SDK, AWS SAM, AWS CDK, or your choice of infrastructure as code tool. You can also use the Python 3.13 container base image if you prefer to build and deploy your functions using container images.

Python 3.13 runtime support helps developers to build more efficient, powerful, and scalable serverless applications. Try the Python 3.13 runtime in Lambda today and experience the benefits of this updated language version.

For more serverless learning resources, visit Serverless Land.

Introducing an enhanced local IDE experience for AWS Lambda developers

2024-10-31 Julian Wood

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/introducing-an-enhanced-local-ide-experience-for-aws-lambda-developers/

AWS Lambda is introducing an enhanced local IDE experience to simplify Lambda-based application development. The new features help developers to author, build, debug, test, and deploy Lambda applications more efficiently in their local IDE when using Visual Studio Code (VS Code).

Overview

The IDE experience is part of the AWS Toolkit for Visual Studio Code. A new guided walkthrough helps developers set up their local environment and install required tools. The toolkit includes a set of sample applications which show you how to iterate on your code locally and in the cloud. You can configure and save build settings to speed up application builds. Generate a configuration file to set up the debugging environment for VS Code to attach and launch the step-through debugger. Iterate faster by choosing to sync local application changes quickly to the cloud or perform a full application deploy. Test functions locally and in the cloud and create and share test events to speed up local and cloud testing. There are quick action buttons for build, deploy to cloud, and local or remote invoke. The toolkit integrates with AWS Infrastructure Composer, providing a visual application building experience directly from the IDE.

Using the new features

Installing the extension

To use the updated IDE experience, ensure you have the AWS Toolkit minimum version 3.31.0 installed as a VS Code Extension.

The AWS Toolkit now includes an additional section called Application Builder within the AWS extension side-bar. This allows you to view template resources and create, build, debug, and test serverless applications.

Using Application Builder for existing applications

You can open an existing local application template using Open Folder.

Lambda’s enhanced in-console editing experience allows you to download existing function code and an AWS Serverless Application Model (AWS SAM) template. This allows you to start in the console and more easily move to using infrastructure as code, which is a serverless best practice.

Using the guided walkthrough

The guided walkthrough helps you install dependencies, select an application template, and explains how to use Application Builder to iterate locally and deploy to the cloud.

Choose Open Walkthrough to open the walkthrough.
Complete installation takes you through a wizard to install required dependencies and select application templates.

The wizard provides download links to install the three dependencies:

If you have installed the dependencies, selecting the links recognizes the installations.

Select Choose your application template, which allows you to create example applications in VS Code.

The Iterate locally tile provides guidance on how to use Application Builder to build and invoke the function, and how to view the results.

Deploy to the cloud provides a link to Configure credentials and explains how to deploy your function to the cloud, remote invoke from your IDE, and view the results.

Creating an application from the samples

The following steps show how to create a function locally from an included template. You build the code artifact, locally test and debug, deploy, and remotely invoke and view results and logs, all without leaving your IDE.

Navigate back to Choose your application template.
New template with visual builder allows you to use Infrastructure Composer to create a new application using a visual canvas.
See more application examples provides additional sample applications across a number of managed runtimes.

There are also two specific example applications to explore Lambda functionality.

Rest API: Learn how to build a synchronous Lambda function behind an API.
File processing: Learn how to build an asynchronous Amazon S3 file processing application.

Building a synchronous Rest API application

Select Rest API and choose Initialize your project.
Select a language runtime. Select Python for this example.
Open the file explorer, create a folder to download the example application and choose Create Project.

Application Builder downloads the application. This includes the function code hello_world\app.py, with dependencies in requirements.txt, an AWS SAM template file, template.yaml, and an example event trigger, event.json. A README.md file explains the application structure and provides build and deploy instructions.

The Application Builder section populates with the template resources.

The icons provide shortcuts to view, build, and deploy the application.

You can also use the Command Palette to initiate the AWS SAM commands.

Selecting the Open Template File icon opens the AWS SAM template in Infrastructure Composer.

View the application resources and select Details to edit the template using the visual canvas.

Navigate to the function resource and select Open Function Handler to show the function code.

Building the application

The build step helps you build artifacts from the files in your application project directory.

Select the Build SAM template icon.
Specify build flags allows you to configure AWS SAM build settings.

Select build settings particular to your configuration. Cached and parallel are useful to speed up future builds. Use container builds your function in a Lambda-like container. This allows you to build applications without having the language runtime and build tools installed locally.

Save parameters adds the default build options in samconfig.toml.

version = 0.1
[default.build.parameters]
template_file = "c:\\Code\\lambda-dx\\Rest-API\\template.yaml"
cached = true
parallel = true
use_container = true

AWS SAM builds the application. It downloads the build container image, installs the dependencies, and copies the function code.

Press any key to close the additional terminal.

Iterate locally: invoke and debug

You can locally invoke and debug your serverless application before uploading it to the cloud. This helps you to test the logic of your function faster. Step-through debugging allows you to identify and fix issues in your application one instruction at a time in your local environment.

Local invoke

In the Application Builder section, navigate to the function and select Local Invoke and Debug Configuration.

Initiating Local Invoke and Debug Configuration

This brings up another window which allows you to configure how to invoke the function locally and set up a debug configuration.

Viewing Local Invoke and Debug Configuration Options

You can create sample event payloads to test your function. Select an event provides a list of common trigger event payloads you can use and customize.

Selecting an example event template

This example application has an included sample event.

Select Local file and choose the events\event.json file.
Select the Invoke button.

This builds the application and locally invokes the function within a Lambda emulation environment, using the event input file.
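To make the idea of invoking with an event file concrete, here is a much simpler stand-in written in plain Python. It is not how AWS SAM or the toolkit performs a local invoke (they use a Lambda emulation environment); it only loads a JSON event and calls a handler directly. The names invoke_with_event_file and lambda_handler, and the handler's response, are assumptions for illustration.

```python
import json
from pathlib import Path


def invoke_with_event_file(handler, event_path):
    """Load a JSON event from disk and pass it to a handler with no context object."""
    event = json.loads(Path(event_path).read_text(encoding="utf-8"))
    return handler(event, None)


def lambda_handler(event, context):
    # Hypothetical handler, similar in shape to a hello-world REST API function
    return {"statusCode": 200, "body": json.dumps({"message": "hello world"})}
```

A quick call such as invoke_with_event_file(lambda_handler, "events/event.json") can be handy for fast logic checks, while the emulated invoke described above remains closer to how the function behaves in the cloud.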

View the function output within the IDE Output pane.

Viewing function output

Local debugging

You can also debug the function locally using VS Code’s built-in debugger.

Add a breakpoint to the function code.

Adding a breakpoint to the function code

Select the Invoke button again.

This locally invokes the function and attaches a debugger to the Lambda emulation environment.

The debugger stops at the breakpoint and you can view the function variables and call stack.

Viewing step through debugging

Use the VS Code debugger icons to step through the code.

Using VS Code debugger icons to step through the code.

In the Local Invoke and Debug Configuration panel, choose Save Debug Config.
Choose Add Local Invoke and Debug Configuration.

Saving debug configuration

Enter a debug configuration name. This creates a launch.json file and adds the debug configuration.

Naming debug configuration

You can create and save multiple debug configurations for different scenarios. See the AWS SAM documentation for more launch.json configuration options.

Once you save the debug configuration, you can use VS Code’s Run and Debug panel and select which debug configuration to run.

Using the Run and Debug panel

Deploying the application

Navigate to the Application Builder section and choose the Deploy SAM Application icon.

Selecting Deploy SAM Application icon

AWS SAM provides two deployment options:

Sync uses AWS SAM sync to perform an initial CloudFormation deploy and then allows for quick syncing of your application code, which allows for rapid prototyping. Use this for development environments only, as it doesn’t do a full CloudFormation deploy on code changes.
Deploy does a full CloudFormation deploy, which is preferred for non-quick development environments.

Viewing AWS SAM deployment options

Select Sync.
Select Specify required parameters and save as defaults.

Specifying SAM sync parameters

Select a Region to deploy the stack and enter a stack name. It is good practice to specify that this is a dev stack to avoid confusion when using the Deploy option.

Entering dev stack name

Select an existing S3 bucket to store the artifacts, or create a new one.

Selecting S3 bucket

Specify the Sync parameters. Ensure you select Watch, as this automatically watches for code changes and quickly syncs them to the Lambda service.

Setting sync parameters

AWS SAM sync does an initial CloudFormation deploy to build the resources and then waits for code changes.
Make a change to the handler file code and save the file.

Amending code

This performs a quick sync which reduces the time to test in the cloud.

Quickly syncing code

You can use the Deploy option to deploy a non-quick sync test version, amending the stack name to differentiate it from the dev stack.

Naming test version stack

Remote invoke

You can invoke the function in the cloud from your IDE. This allows you to test functionality without having to mock security, external services, or other environment variables.

Once the application is deployed, Application Builder detects changes to samconfig.toml and template.yaml and updates the resources list with the cloud resources.

Viewing cloud resources

You can browse directly to the CloudFormation stack to view resources.

Browsing to CloudFormation stack

Selecting the function provides quick link functionality, which includes function details and a link directly to the Lambda console for the function.

Viewing function quick link options

Select Invoke in cloud.
Select the same local event file that you used for the local invoke.

Selecting local file for remote invoke

Choose Remote Invoke.

The function invokes in the cloud using the local test event and displays the remote invoke results in the local IDE Output pane.

Viewing remote invoke results
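Outside the IDE, a remote invoke with a local test event can be sketched with Boto3 as follows. This assumes the client passed in is boto3.client("lambda") and that the function returns JSON; the helper name remote_invoke is hypothetical.

```python
import json


def remote_invoke(lambda_client, function_name: str, event: dict) -> dict:
    """Invoke a deployed function synchronously and return its decoded JSON result."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the function's response
        Payload=json.dumps(event).encode("utf-8"),
    )
    # Payload is a streaming body; read it and decode the function's JSON output
    return json.loads(response["Payload"].read())
```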

Name and save the local event file as a remote event which becomes available in the Lambda console.

Saving remote test event

Viewing logs

You can fetch the Amazon CloudWatch Log streams generated by your Lambda function in the IDE.

Select the Search Logs icon.

Selecting Search Logs icon

You can optionally filter the results.

Optionally filtering log results
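A comparable log search can be sketched with Boto3. The helper below assumes the client is boto3.client("logs") and that the function writes to the default /aws/lambda/<function name> log group, which may not hold if a custom log group is configured. The name fetch_log_messages is hypothetical, and only the first page of results is read.

```python
def fetch_log_messages(logs_client, function_name: str, filter_pattern: str = "") -> list:
    """Return log messages for a function, optionally narrowed by a filter pattern."""
    # Assumption: the function uses the default Lambda log group name
    kwargs = {"logGroupName": f"/aws/lambda/{function_name}"}
    if filter_pattern:
        kwargs["filterPattern"] = filter_pattern
    response = logs_client.filter_log_events(**kwargs)
    return [event["message"] for event in response.get("events", [])]
```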

Conclusion

Lambda is introducing an enhanced local IDE experience to simplify the development of Lambda-based applications using the VS Code IDE and AWS Toolkit. This streamlines the code-test-deploy-debug cycle. A guided walkthrough helps set up your local development environment and provides sample applications to explore Lambda functionality. You can then build, debug, test, and deploy Lambda applications using icon shortcuts and the Command Palette. This allows you to more easily iterate on your Lambda-based applications without switching between multiple interfaces.

For more serverless learning resources, visit Serverless Land.

Introducing an enhanced in-console editing experience for AWS Lambda

2024-10-22 Julian Wood

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/introducing-an-enhanced-in-console-editing-experience-for-aws-lambda/

AWS Lambda is introducing a new code editing experience in the AWS console based on the popular Code-OSS, Visual Studio Code Open Source code editor. This brings the familiar Visual Studio Code interface and many of the features directly into the Lambda console, allowing developers to use their preferred coding environment and tools in the cloud. The Lambda Code Editor displays larger function package sizes and also integrates with Amazon Q Developer. This is an AI-powered coding assistant that provides real-time suggestions and insights to help you write, understand, and troubleshoot your Lambda functions more efficiently.

Overview

Visual Studio Code is the most popular IDE among developers according to the 2023 Stack Overflow Developer Survey. Integrating Code-OSS into the Lambda Console brings a familiar, accessible, and customizable interface to the in-browser code editing capabilities. This provides a coding experience that is substantially similar to working with function code locally. You can install selected extensions, apply preferred themes and settings, and use your familiar keyboard shortcuts and coding preferences.

The new editing experience is included as part of the standard Lambda service, at no extra cost.

Accessibility

The update also addresses important accessibility needs. With features like high color contrast, keyboard-only navigation, and screen reader support, the Code-OSS integration ensures an inclusive and accessible coding experience for all developers.

Differences from Visual Studio Code IDE

The Lambda console’s Code-OSS integration complements, rather than fully replaces, local development workflows. You can view and edit function code that uses an interpreted language, not compiled languages, which is consistent with the previous Lambda console. The terminal window is also unavailable in Code-OSS.

AWS Toolkit for Visual Studio Code extensions

Deeper integration with the AWS Toolkit for VS Code extension provides access to a subset of AWS specific functionality, including Q Developer. This ensures that the Lambda code editing experience benefits from additional developer tooling enhancements provided through the AWS Toolkit.

Larger package sizes

With Lambda, the total package size for ZIP-based functions, including code and libraries, cannot exceed 50 MB. Previously, Lambda imposed a 3 MB limit for editing code in the console. You can now view function packages of up to 50 MB in the console; however, there is still a single-file limit of 3 MB. This allows you to view function code even when you have larger dependencies.
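If you want to check a deployment package against these limits before uploading, a rough sketch follows. The name check_package is hypothetical, and two assumptions are made: that 1 MB means 1024 * 1024 bytes, and that the 3 MB single-file limit applies to each file's uncompressed size. Neither detail is stated here, so treat the result as an estimate.

```python
import io
import zipfile

MAX_PACKAGE_BYTES = 50 * 1024 * 1024        # 50 MB package limit
MAX_EDITABLE_FILE_BYTES = 3 * 1024 * 1024   # 3 MB single-file limit in the editor


def check_package(zip_file) -> dict:
    """Inspect a .zip package given as a path or a file-like object."""
    with zipfile.ZipFile(zip_file) as archive:
        infos = archive.infolist()
        total_compressed = sum(info.compress_size for info in infos)
        large_files = [
            info.filename for info in infos
            if info.file_size > MAX_EDITABLE_FILE_BYTES
        ]
    return {
        "within_package_limit": total_compressed <= MAX_PACKAGE_BYTES,
        "files_over_editor_limit": large_files,
    }


# Build a small in-memory package with one large, highly compressible file
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("lambda_function.py", "def lambda_handler(event, context):\n    return 'ok'\n")
    archive.writestr("big_dependency.bin", b"\0" * (4 * 1024 * 1024))
buffer.seek(0)
report = check_package(buffer)
print(report)
```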

Using the new features

Viewing code

To experience the new Lambda Code Editor, log into the AWS Management Console and navigate to the Lambda service. Create a new function or edit an existing one. The new Lambda Code Editor is ready to use, with no additional setup required.

This example shows editing an existing function, viewing the function code in the familiar Code-OSS editor.

Viewing function code in the Lambda Code Editor

Previously, the code was not viewable as the code package size was greater than 3 MB. The update allows you to view larger files. The following image shows a package size of 13.3 MB and the Code-OSS editor allows editing of the function handler.

Viewing larger package size

Environment variables

In the left pane, the environment variables are viewable for the function. Select the pencil icon to edit, add, and remove environment variables.

Viewing and editing environment variables

Creating test events

The new split-screen view allows you to test your function and see your code and test results side-by-side, simplifying test event configuration.

Select Create test event to open the panel.

Creating test event

You can create Private test events, or Shareable test events that other builders with access to the account can use.

Generate an event using an event template for the Amazon API Gateway HTTP API event trigger that the function uses. Save the test event.

Creating API Gateway test event
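For orientation, an abbreviated event in the API Gateway HTTP API (payload format 2.0) shape, together with a handler that reads it, might look like the sketch below. The values, the route, and the handler logic are made up for illustration, and the console's generated template contains more fields than shown here.

```python
import json

# Abbreviated HTTP API (payload format 2.0) event; all values are illustrative
sample_event = {
    "version": "2.0",
    "routeKey": "GET /hello",
    "rawPath": "/hello",
    "rawQueryString": "name=Lambda",
    "queryStringParameters": {"name": "Lambda"},
    "requestContext": {"http": {"method": "GET", "path": "/hello"}},
    "isBase64Encoded": False,
}


def lambda_handler(event, context):
    """Greet the caller using an optional 'name' query string parameter."""
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {"statusCode": 200, "body": json.dumps({"message": f"hello {name}"})}


print(lambda_handler(sample_event, None))
```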

Invoke function

Invoke the function by selecting the Invoke button.

The function results appear in the Output panel, consistent with the local VS Code IDE experience.

Function invoke result

The function logs appear below the output.

Viewing function logs

This view allows you to view and edit your code, generate and use test events, and invoke your function, all visible within the familiar Lambda Code Editor interface.

Live Tail Logs

Lambda now natively supports Amazon CloudWatch Logs Live Tail. This is an interactive log streaming and analytics capability, which allows you to view and analyze your Lambda function logs in real time.

Select the Run and Debug icon in the Activity Bar on the left-hand side of the code editor in the Code tab.
Select Open CloudWatch Live Tail. This opens the CloudWatch Logs Live Tail bottom drawer.
Select Start to start a Live Tail session and view your Lambda function logs stream in real time.
Alternatively, navigate to the Test tab and select CloudWatch Logs Live Tail to start a Live Tail session.

CloudWatch Logs Live Tail

Keyboard shortcuts

In the left pane Extensions dialog, you can see the keyboard shortcuts are installed by default.

Viewing installed extensions

Select the Manage gear icon which shows which aspects are configurable.

Viewing configuration options

The Keyboard shortcuts dialog allows you to view and change the shortcuts.

Amending keyboard shortcuts

Command Palette

Viewing the Command Palette shows available commands.

Viewing Command Palette

Configuration settings

The Settings panel allows you to configure the Lambda Code Editor to match your local IDE environment if required.

Viewing Settings panel

Navigate to Themes | Color Themes to customize the theme, including dark mode.

Lambda Console Editor dark mode

Downloading function code and template

It is now easier to download the function code and an AWS Serverless Application Model (AWS SAM) template which represents the CloudFormation resources required to set up the function, policies, and triggers. This allows you to start in the console and more easily move to using infrastructure as code, which is a serverless best practice.

Navigate to the Activity Bar Run and Debug section.
Select Download code and SAM template.
Extract the .zip file and open the folder in your local VS Code IDE.

You can continue to edit the function in your local IDE experience, which is consistent with the Lambda Console Editor.

Local VS Code IDE to continue working on function

Using your local IDE terminal or AWS Toolkit for VS Code, you can update the existing function. You can also use AWS SAM functionality to build and deploy the template as a CloudFormation stack to the cloud.

Using Amazon Q

The Amazon Q Developer AI assistant integrates directly into the code editor. This reduces the need to consult external documentation or tutorials, streamlining your development workflow.

Amazon Q provides inline suggestions as you type. You can also use keyboard shortcuts for common actions, such as initiating Amazon Q or accepting a recommendation.

The example below uses Amazon Q to add functionality to a new Lambda function so that it downloads an object from S3. Enter a comment explaining the functionality you need.

Asking Amazon Q a question

Press Tab to accept the suggestion.

Accepting an Amazon Q suggestion

You can continue to invoke Q manually to keep adding more code suggestions.

Continue adding functionality with Amazon Q
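The kind of code this walkthrough aims for, downloading an object from S3, could look roughly like the sketch below. It is a hand-written illustration, not the suggestion Amazon Q produces, and the helper name download_object is hypothetical. The client is passed in and is assumed to be boto3.client("s3").

```python
def download_object(s3_client, bucket: str, key: str) -> bytes:
    """Fetch an object from S3 and return its contents as bytes."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    # Body is a streaming object; read() loads the whole object into memory
    return response["Body"].read()
```

Reading the whole body into memory is fine for small objects. For large files, streaming the body in chunks is usually a better fit for a function's memory limits.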

Conclusion

Lambda is introducing a new AWS console code editing experience based on the popular Code-OSS, Visual Studio Code Open Source code editor. This brings the familiar VS Code IDE interface and features directly into the Lambda console so you can use your preferred coding environment and tools in the cloud. Invoke your function using a new split-screen view to see your code and test results side-by-side, simplifying test event configuration.

The code editor displays larger function package sizes, makes environment variables more visible, and also integrates with Amazon Q Developer. This provides real-time suggestions and insights to help you write, understand, and troubleshoot your Lambda functions more efficiently.

For more serverless learning resources, visit Serverless Land.

Simplifying Lambda function development using CloudWatch Logs Live Tail and Metrics Insights

2024-10-17 Chris McPeek

Post Syndicated from Chris McPeek original https://aws.amazon.com/blogs/compute/simplifying-lambda-function-development-using-cloudwatch-logs-live-tail-and-metrics-insights/

This post is written by Shridhar Pandey, Senior Product Manager, AWS Lambda

Today, AWS is announcing two new features which make it easier for developers and operators to build and operate serverless applications using AWS Lambda. First, the Lambda console now natively supports Amazon CloudWatch Logs Live Tail, which gives you real-time visibility into Lambda function logs, making it easier to develop and troubleshoot Lambda functions. Second, the Lambda console now offers an Amazon CloudWatch Metrics Insights dashboard for key Lambda function metrics, enabling you to easily identify and troubleshoot the source of errors or performance issues.

This blog post dives into the new capabilities enabled by these launches, how these capabilities simplify the developer and operator experience for building serverless applications with Lambda, and how you can get started with them.

Native CloudWatch Live Tail logs in Lambda console

Customers building serverless applications using Lambda want visibility into the behavior of their Lambda functions in real time, such as when an error occurs or a code change causes unexpected behavior. For example, developers want to instantly see the result of their code or configuration changes on the behavior of the function, and operators want to quickly troubleshoot any critical issues which would prevent the function from operating smoothly.

To help you monitor and troubleshoot the behavior of your function, the Lambda service automatically captures and sends logs to CloudWatch Logs. However, previously, you had to wait for Lambda function logs to be ingested, processed, and stored by CloudWatch Logs before you could view them. Then, you had to navigate to the CloudWatch console to view, search, and query logs using tools like CloudWatch Logs Insights. This caused frequent context switching between the Lambda and CloudWatch consoles in order to access logs, which introduced friction in the process of rapidly developing and troubleshooting Lambda functions.

The Lambda console now natively supports CloudWatch Logs Live Tail, an interactive log streaming and analytics capability which enables developers and operators to view and analyze their Lambda function logs in real time. This capability provides a built-in, real-time view of function logs as they become available in the Lambda console, as seen in the following image. Developers can now easily start a Live Tail session with one click and view the latest log entries as their function is executing. So, they can now edit the function code, deploy changes, invoke the function, and view the result of their code change in real time, without navigating away from the Lambda console. This streamlines and accelerates the author-test-deploy cycle (also known as the “inner dev loop”) when building serverless applications using Lambda.

Figure 1: CloudWatch Logs Live Tail in Lambda console

The Live Tail experience in the Lambda console also offers fine-grained log analysis capabilities to filter logs, making it easier for operators and DevOps teams to debug and troubleshoot issues and critical errors in their Lambda functions. For example, while investigating errors in a Lambda function, operators can apply filter patterns to display only log events containing keywords of interest, such as ERROR or exception. This helps narrow down the search to relevant log events and cut out the noise, reducing the mean time to recovery (MTTR) when troubleshooting Lambda function errors. Thus, Live Tail enables operators to proactively monitor the health of their serverless applications built using Lambda and react faster to resolve errors or unexpected behavior.

Live Tail in action

To use Live Tail capabilities on any CloudWatch log group, you must have the logs:StartLiveTail, logs:StopLiveTail, and logs:DescribeLogGroups AWS Identity and Access Management (IAM) permissions for that CloudWatch log group. Alternatively, you can add the CloudWatchLogsReadOnlyAccess managed IAM policy (which contains these IAM permissions) to your IAM role. See Overview of managing access permissions to your CloudWatch Logs resources to learn more.
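As a minimal sketch, the three permissions named above could be expressed as an IAM policy document like the following. The function name and the log group ARN are hypothetical placeholders, not values from this post.

```python
import json

# Hypothetical log group ARN; replace it with your function's log group.
LOG_GROUP_ARN = "arn:aws:logs:us-east-1:123456789012:log-group:/aws/lambda/my-function:*"


def live_tail_policy(log_group_arn: str) -> dict:
    """Build an IAM policy document granting the three Live Tail permissions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "logs:StartLiveTail",
                    "logs:StopLiveTail",
                    "logs:DescribeLogGroups",
                ],
                "Resource": log_group_arn,
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(live_tail_policy(LOG_GROUP_ARN), indent=2))
```

Scoping the statement to a single log group keeps the grant narrower than the managed read-only policy.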

To get started with Live Tail in the Lambda console:

Navigate to the Lambda console at https://console.aws.amazon.com/lambda/
In the Functions page, select the Lambda function for which you want to view Live Tail logs.
In the Code tab, select the Run and Debug icon in the Activity Bar on the left-hand side of the code editor. This opens the Run and Debug view.
Select Open CloudWatch Live Tail. This opens the CloudWatch Logs Live Tail bottom drawer.

Figure 2: Starting Live Tail from code editor in Lambda console

Select Start to start a Live Tail session and view your Lambda function logs stream in real time. Alternatively, visit the Test tab and select CloudWatch Logs Live Tail to start a Live Tail session.

Figure 3: Active Live Tail session

The CloudWatch Logs Live Tail bottom drawer features a Filter panel on the left-hand side, which contains useful controls such as CloudWatch log group selection, option to filter logs, and the Start and Stop buttons. You can collapse this panel if you want to utilize the entire width of your screen to view logs (without horizontal scrolling) in the Live Tail session, as shown in the following image.

Figure 4: Active Live Tail session with collapsed Filter Panel

The Filter panel has the CloudWatch log group corresponding to your Lambda function selected by default, but you can select other log groups. You can select up to 5 log groups at a time. You can also stop the Live Tail session at any time by selecting Stop in the Filter panel.

To filter logs based on specific terms or keywords, apply patterns using the “Add filter pattern” option in the Filter panel. The filters field is case sensitive. You can specify keywords, phrases, numeric values, or regular expressions in the filter pattern. See Filter pattern syntax for metric filters, subscription filters, filter log events, and Live Tail to learn more about how to use filter patterns to display only log events of interest. For example, you can filter log events containing specific keywords or phrases (e.g., ERROR, FATAL, exception, etc.) as shown in the following image.
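Outside the console, the same kind of filter can be passed to the CloudWatch Logs StartLiveTail API. The following is a sketch under assumptions: the helper names are hypothetical, the terms are plain keywords that need no quoting, and the streaming function requires boto3 and AWS credentials. In the filter pattern syntax, prefixing each term with ? matches events that contain any of the terms.

```python
def build_live_tail_request(log_group_arns, terms):
    """Build keyword arguments for the CloudWatch Logs StartLiveTail API.

    Each term gets a '?' prefix, so an event matches if it contains ANY term.
    Matching is case sensitive, as in the console.
    """
    pattern = " ".join(f"?{term}" for term in terms)
    return {
        "logGroupIdentifiers": list(log_group_arns),
        "logEventFilterPattern": pattern,
    }


def tail(log_group_arns, terms):
    """Print matching log events as they arrive (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builder above works without it

    client = boto3.client("logs")
    response = client.start_live_tail(**build_live_tail_request(log_group_arns, terms))
    for event in response["responseStream"]:
        # Session updates carry the batch of log events seen since the last update.
        for result in event.get("sessionUpdate", {}).get("sessionResults", []):
            print(result["message"])
```

For example, build_live_tail_request([arn], ["ERROR", "FATAL"]) produces the pattern ?ERROR ?FATAL.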

Figure 5: Active Live Tail session with multiple log groups selected and filter patterns applied

The Live Tail session automatically stops after 15 minutes (i.e., 900 seconds) of inactivity or when the Lambda console session times out. However, when you restart the Live Tail session, the previously applied filtering criteria are retained. This means you can pick up where you left off with just one click.

You get 1,800 minutes of Live Tail session usage per month for free with the AWS Free Tier, after which you pay $0.01 per minute of usage. See the CloudWatch pricing page for Live Tail pricing details.
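To make the arithmetic concrete, here is a small sketch that estimates a monthly Live Tail bill from the two figures quoted above. The function name is hypothetical, and actual billing may differ by Region or change over time, so treat the pricing page as the source of truth.

```python
FREE_MINUTES_PER_MONTH = 1_800  # free tier allowance quoted above
PRICE_PER_MINUTE_USD = 0.01     # per-minute rate quoted above


def live_tail_monthly_cost(minutes_used: int) -> float:
    """Estimate the monthly Live Tail charge in USD for the given usage."""
    billable_minutes = max(0, minutes_used - FREE_MINUTES_PER_MONTH)
    return round(billable_minutes * PRICE_PER_MINUTE_USD, 2)
```

Under these assumptions, 3,000 minutes in a month leaves 1,200 billable minutes, or about $12.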

The Live Tail experience in the Lambda console is available in all commercial AWS Regions where AWS Lambda and Amazon CloudWatch Logs are available. See the Lambda documentation and this introductory video to learn more about native Live Tail support for Lambda.

CloudWatch Metrics Insights dashboard in Lambda console

In order to effectively operate distributed applications, easily identifying the source of errors or performance issues is critical. For example, when you notice a spike in critical metrics like errors or invocation duration in your Lambda dashboard, you want to quickly find out which Lambda functions are causing these spikes. Previously, you had to navigate to the CloudWatch console and query metrics or create custom dashboards.

Now, the Lambda console features a new built-in CloudWatch Metrics Insights dashboard which provides you instant visibility into critical insights for Lambda functions in your account, such as “most invoked Lambda functions”, “functions with highest number of errors”, and “functions taking the longest to run”. The dashboard leverages CloudWatch Metrics Insights capability to enable you to easily identify functions driving the highest usage, errors, and performance issues. Thus, the Metrics Insights dashboard surfaces key insights right where you need them, reducing friction due to context switching and making it easy for your operator team(s) to identify and fix errors and performance anomalies for your serverless applications built using Lambda.

Metrics Insights dashboard in action

You can easily get started with the Metrics Insights dashboard without making any code changes or creating custom dashboards. Simply navigate to the Dashboard page in the Lambda console to start accessing the insights surfaced in the Metrics Insights dashboard. The following image shows the Metrics Insights dashboard in the Lambda console.

Figure 6: Lambda Dashboard page with Metrics Insights dashboard

The dashboard shows the top 10 Lambda functions in your AWS account with highest number of invocations, errors, and longest invocation duration. In the example shown in the following image, the Lambda function named 1-LambdaConsoleStack-er4D14B2288-VulWZHExuSFp shows the highest error rate among all functions experiencing errors. This could be a signal to the operator team to prioritize identifying the root cause behind the high error rate for this function.

Figure 7: Metrics Insights dashboard showing top 10 functions with highest errors, invocations, and concurrent executions

The Metrics Insights dashboard displays data for the most recent 3 hours. You can view and query metrics in the CloudWatch console if you require metrics for longer than 3 hours.
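If you want a similar view outside the console, a Metrics Insights query of the following kind can be run through the CloudWatch GetMetricData API. This is a sketch, not the query the Lambda console itself uses: the helper names are hypothetical and the 300-second period is an arbitrary choice.

```python
from datetime import datetime, timedelta, timezone


def top_functions_query(metric: str, stat: str = "SUM", limit: int = 10) -> str:
    """Build a Metrics Insights query ranking Lambda functions by a metric."""
    return (
        f'SELECT {stat}({metric}) FROM SCHEMA("AWS/Lambda", FunctionName) '
        f"GROUP BY FunctionName ORDER BY {stat}() DESC LIMIT {limit}"
    )


def build_metric_data_request(metric: str, stat: str = "SUM", now=None) -> dict:
    """Build GetMetricData keyword arguments covering the most recent 3 hours."""
    end = now or datetime.now(timezone.utc)
    return {
        "MetricDataQueries": [
            {"Id": "q1", "Expression": top_functions_query(metric, stat), "Period": 300}
        ],
        "StartTime": end - timedelta(hours=3),
        "EndTime": end,
    }
```

With boto3 installed and credentials configured, you could pass the result to boto3.client("cloudwatch").get_metric_data(**build_metric_data_request("Errors")).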

The Metrics Insights dashboard in the Lambda console is now available in all commercial AWS Regions where AWS Lambda and Amazon CloudWatch are available, including the AWS GovCloud (US) Regions, at no additional cost.

Conclusion

This post introduces and illustrates two new Lambda features — native support for CloudWatch Logs Live Tail and Metrics Insights dashboard in the Lambda console. These features simplify the developer and operator experience for serverless applications built using Lambda. Live Tail enables you to view and analyze Lambda logs in real time, which simplifies and accelerates the author-test-deploy cycle and makes it easy to troubleshoot errors in Lambda functions. On the other hand, Metrics Insights dashboard shows key Lambda metrics like errors, invocations, and duration to reduce the mean time to recover (MTTR) from errors and performance issues for Lambda functions.

For more serverless learning resources, visit Serverless Land.

Enhance Amazon EMR scaling capabilities with Application Master Placement

2024-10-14 Lorenzo Ripani

Post Syndicated from Lorenzo Ripani original https://aws.amazon.com/blogs/big-data/enhance-amazon-emr-scaling-capabilities-with-application-master-placement/

In today’s data-driven world, processing large datasets efficiently is crucial for businesses to gain insights and maintain a competitive edge. Amazon EMR is a managed big data service designed to handle these large-scale data processing needs across the cloud. It allows running applications built using open source frameworks on Amazon Elastic Compute Cloud (Amazon EC2), Amazon Elastic Kubernetes Service (Amazon EKS), or AWS Outposts, or completely serverless. One of the key features of Amazon EMR on EC2 is managed scaling, which dynamically adjusts computing capacity in response to application demands, providing optimal performance and cost-efficiency.

Although managed scaling aims to optimize EMR clusters for best price-performance and elasticity, some use cases require more granular resource allocation. For example, when multiple applications are submitted to the same clusters, resource contention may occur, potentially impacting both performance and cost-efficiency. Additionally, allocating the Application Master (AM) container to non-reliable nodes like Spot can potentially result in loss of the container and immediate shutdown of the entire YARN application, resulting in wasted resources and additional costs for rescheduling the entire YARN application. These use cases require more granular resource allocation and sophisticated scheduling policies to optimize resource utilization and maintain high performance.

Starting with the Amazon EMR 7.2 release, Amazon EMR on EC2 introduced a new feature called Application Master (AM) label awareness, which allows users to enable YARN node labels to allocate the AM containers within On-Demand nodes only. Because the AM container is responsible for orchestrating the overall job execution, it’s crucial to verify that it gets allocated to a reliable instance and isn’t subject to shutdown due to a Spot Instance interruption. Additionally, limiting AM containers to On-Demand helps maintain consistent application launch time, because the fulfillment of the On-Demand Instance isn’t prone to unavailable Spot capacity or bid price.

In this post, we explore the key features and use cases where this new functionality can provide significant benefits, enabling cluster administrators to achieve optimal resource utilization, improved application reliability, and cost-efficiency in their EMR on EC2 clusters.

Solution overview

The Application Master label awareness feature in Amazon EMR works in conjunction with YARN node labels, a functionality offered by Hadoop that lets you assign labels to nodes within a Hadoop cluster. You can use these labels to determine which nodes of the cluster should host specific YARN containers (such as mappers vs. reducers in a MapReduce job, or drivers vs. executors in Apache Spark).

This feature is enabled by default when a cluster is launched with Amazon EMR 7.2.0 and later using Amazon EMR managed scaling, and it has been configured to use YARN node labels. The following code is a basic configuration setup that enables this feature:

[
  {
    "Classification": "yarn-site",
    "Properties": {
      "yarn.node-labels.enabled": "true",
      "yarn.node-labels.am.default-node-label-expression": "ON_DEMAND"
    }
  }
]

Within this configuration snippet, we activate the Hadoop node label feature and define a value for the yarn.node-labels.am.default-node-label-expression property. This property defines the YARN node label that will be used to schedule the AM container of each YARN application submitted to the cluster. This specific container plays a key role in maintaining the lifecycle of the workflow, so verifying its placement on reliable nodes in production workloads is crucial, because the unexpected shutdown of this container can result in the shutdown and failure of the entire application.

Currently, the Application Master label awareness feature only supports two predefined node labels that can be specified to allocate the AM container of a YARN job: ON_DEMAND and CORE. When one of these labels is defined using Amazon EMR configurations (see the preceding example code), Amazon EMR automatically creates the corresponding node labels in YARN and labels the instances in the cluster accordingly.
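If you generate cluster configurations programmatically, a small helper can guard against unsupported labels before the cluster is launched. The following is a sketch; the function name is hypothetical, and the two accepted labels are the ones listed above.

```python
SUPPORTED_AM_LABELS = {"ON_DEMAND", "CORE"}  # labels the feature currently supports


def am_label_configuration(label: str) -> list:
    """Return the yarn-site classification that pins AM containers to a label."""
    if label not in SUPPORTED_AM_LABELS:
        raise ValueError(
            f"Unsupported AM label {label!r}; expected one of {sorted(SUPPORTED_AM_LABELS)}"
        )
    return [
        {
            "Classification": "yarn-site",
            "Properties": {
                "yarn.node-labels.enabled": "true",
                "yarn.node-labels.am.default-node-label-expression": label,
            },
        }
    ]
```

The returned list has the same shape as the JSON configuration, so it can be serialized with json.dumps or passed as the Configurations argument when creating a cluster with an AWS SDK.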

To demonstrate how this feature works, we launch a sample cluster and run some Spark jobs to see how Amazon EMR managed scaling integrates with YARN node labels.

Launch an EMR cluster with Application Master placement awareness

To perform some tests, you can launch the following AWS CloudFormation stack, which provisions an EMR cluster with managed scaling and the Application Master placement awareness feature enabled. If this is your first time launching an EMR cluster, make sure to create the Amazon EMR default roles using the following AWS Command Line Interface (AWS CLI) command:

aws emr create-default-roles

To create the cluster, choose Launch Stack:

Provide the following required parameters:

VPC – An existing virtual private cloud (VPC) in your account where the cluster will be provisioned
Subnet – The subnet in your VPC where you want to launch the cluster
SSH Key Name – An EC2 key pair that you use to connect to the EMR primary node

After the EMR cluster has been provisioned, establish a tunnel to the Hadoop Resource Manager web UI to review the cluster configurations. To access the Resource Manager web UI, complete the following steps:

Set up an SSH tunnel to the primary node using dynamic port forwarding.
Point your browser to the URL http://<primary-node-public-dns>:8088/, using the public DNS name of your cluster’s primary node.

This will open the Hadoop Resource Manager web UI, where you can see how the cluster has been configured.

YARN node labels

In the CloudFormation stack, you launched a cluster specifying to allocate the AM containers on nodes labeled as ON_DEMAND. If you explore the Resource Manager web UI, you can see that Amazon EMR created two labels in the cluster: ON_DEMAND and SPOT. To review the YARN node labels present in your cluster, you can inspect the Node Labels page, as shown in the following screenshot.

On this page, you can see how the YARN labels were created in Amazon EMR:

During initial cluster creation, default node labels such as ON_DEMAND and SPOT are automatically generated as non-exclusive partitions
The DEFAULT_PARTITION label stays vacant because every node gets labeled based on its market type—either being an On-Demand or Spot Instance

In our example, because we launched a single core node as On-Demand, you can observe a single node assigned to the ON_DEMAND partition, and the SPOT partition remains empty. Because the labels are created as non-exclusive, nodes with these labels can run both containers launched with a specific YARN label and also containers that don’t specify a YARN label. For additional details on YARN node labels, see YARN Node Labels in the Hadoop documentation.

Now that we have discussed how the cluster was configured, we can perform some tests to validate and review the behavior of this feature when using it in combination with managed scaling.

Concurrent application submission with Spot Instances

To test the managed scaling capabilities, we submit a simple SparkPi job configured to utilize all available memory on the single core node initially launched in our cluster:

spark-example \
  --deploy-mode cluster \
  --driver-memory 10g \
  --executor-memory 10g \
  --conf spark.dynamicAllocation.maxExecutors=1 \
  --conf spark.yarn.executor.nodeLabelExpression=SPOT \
  SparkPi 800000

In the preceding snippet, we tuned specific Spark configurations to utilize all the resources of the cluster nodes launched (you could also achieve this using the maximizeResourceAllocation configuration while launching an EMR cluster). Because the cluster has been launched using m5.xlarge instances, we can launch individual containers up to 12 GB in terms of memory requirements. With these assumptions, the snippet configures the following:

The Spark driver and executors were configured with 10 GB of memory to utilize most of the available memory on the node, in order to have a single container running on each node of our cluster and simplify this example.
The yarn.node-labels.am.default-node-label-expression parameter was set to ON_DEMAND, making sure the Spark driver is automatically allocated to the ON_DEMAND partition of our cluster. Because we specified this configuration while launching the cluster, the AM containers are automatically requested to be scheduled on ON_DEMAND labeled instances, so we don’t need to specify it at the job level.
The spark.yarn.executor.nodeLabelExpression=SPOT configuration makes sure the executors run exclusively on task nodes using Spot Instances. Removing this configuration allows the Spark executors to be scheduled both on SPOT and ON_DEMAND labeled nodes.
The spark.dynamicAllocation.maxExecutors setting was set to 1 to lengthen the processing time of the application and observe the scaling behavior when multiple YARN applications were submitted concurrently in the same cluster.
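The sizing reasoning above can be checked with a short calculation. This sketch assumes Spark's default memory overhead (10% of the requested memory, with a 384 MiB minimum) and ignores YARN's rounding to its minimum allocation increment; your cluster may override either, and the function names are hypothetical.

```python
OVERHEAD_PERCENT = 10    # Spark's default memory overhead factor (assumed not overridden)
MIN_OVERHEAD_MIB = 384   # Spark's default minimum memory overhead


def container_request_mib(heap_gib: int) -> int:
    """Approximate the YARN container size requested for a given heap size."""
    heap_mib = heap_gib * 1024
    overhead_mib = max(heap_mib * OVERHEAD_PERCENT // 100, MIN_OVERHEAD_MIB)
    return heap_mib + overhead_mib


def containers_per_node(node_mib: int, heap_gib: int) -> int:
    """Count how many such containers fit in a node's YARN memory."""
    return node_mib // container_request_mib(heap_gib)
```

With 10 GB of driver or executor memory, each container requests roughly 11 GB (11,264 MiB), so a node that offers 12 GB to YARN holds exactly one container.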

As the application transitioned to a RUNNING state, we can verify from the YARN Resource Manager UI that its driver placement was automatically assigned to the ON_DEMAND partition of our cluster (see the following screenshot).

Additionally, upon inspecting the YARN scheduler page, we can see that our SPOT partition doesn’t have a resource associated with it because the cluster was launched with just one On-Demand Instance.

Because the cluster didn’t have Spot Instances initially, you can observe from the Amazon EMR console that managed scaling generates a new Spot task group to accommodate the Spark executor requested to run on Spot nodes only (see the following screenshot). Before this integration, managed scaling didn’t take into account the YARN labels requested by an application, potentially leading to unpredictable scaling behaviors. With this release, managed scaling now considers the YARN labels specified by applications, enabling more predictable and accurate scaling decisions.

While waiting for the launch of the new Spot node, we submitted another SparkPi job with identical specifications. However, because the memory required to allocate the new Spark driver was 10 GB and such resources were currently unavailable in the ON_DEMAND partition, the application remained in a pending state until resources became available to schedule its container.

Upon detecting the lack of resources to allocate the new Spark driver, Amazon EMR managed scaling commenced scaling the core instance group (On-Demand Instances in our cluster) by launching a new core node. After the new core node was launched, YARN promptly allocated the pending container on the new node, enabling the application to start its processing. Subsequently, the application requested additional Spot nodes to allocate its own executors (see the following screenshot).
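The sequence described in this section can be replayed with a toy model. This is only an illustration of label-constrained placement, not how YARN or managed scaling is implemented: the node names, the 12 GB node size, and the roughly 11 GB container size are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    label: str         # "ON_DEMAND" or "SPOT"
    free_gb: int = 12  # assumed memory available to YARN on each node


def place(nodes, needed_gb, label):
    """Return the first node in the requested partition with room, else None (pending)."""
    for node in nodes:
        if node.label == label and node.free_gb >= needed_gb:
            node.free_gb -= needed_gb
            return node
    return None


cluster = [Node("core-1", "ON_DEMAND")]
driver_1 = place(cluster, 11, "ON_DEMAND")  # placed on core-1
executor_1 = place(cluster, 11, "SPOT")     # None: no Spot node yet, request stays pending
driver_2 = place(cluster, 11, "ON_DEMAND")  # None: core-1 is full, request stays pending

# Managed scaling reacts to the pending requests by adding nodes to the matching partitions.
cluster += [Node("task-1", "SPOT"), Node("core-2", "ON_DEMAND")]
executor_1 = place(cluster, 11, "SPOT")     # now placed on task-1
driver_2 = place(cluster, 11, "ON_DEMAND")  # now placed on core-2
```

The point of the model is that a pending request only unblocks when capacity appears in the partition it asked for, which is why scaling decisions need to take the requested label into account.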

This example demonstrates how managed scaling and YARN labels work together to improve the resiliency of YARN applications, while using cost-effective job executions over Spot Instances.

When to use Application Master placement awareness and managed scaling

You can use this placement awareness feature to improve cost-efficiency by using Spot Instances while protecting the Application Master from being shut down by Spot interruptions. It’s particularly useful when you want to take advantage of the cost savings offered by Spot Instances while preserving the stability and reliability of your jobs running on the cluster. When working with managed scaling and the placement awareness feature, consider the following best practices:

Maximum cost-efficiency for non-critical jobs – If you have jobs that don’t have strict service level agreement (SLA) requirements, you can force all Spark executors to run on Spot Instances for maximum cost savings. This can be achieved by setting the following Spark configuration:
```
spark.yarn.executor.nodeLabelExpression=SPOT
```
Resilient execution for production jobs – For production jobs where you require a more resilient execution, you might consider not setting the spark.yarn.executor.nodeLabelExpression parameter. When no label is specified, executors are dynamically allocated between both On-Demand and Spot nodes, providing a more reliable execution.
Limit dynamic allocation for concurrent applications – When working with managed scaling and clusters with multiple applications running concurrently (for example, an interactive cluster with concurrent user utilization), you should consider setting a maximum limit for Spark dynamic allocation using the spark.dynamicAllocation.maxExecutors setting. This can help manage resource over-provisioning and facilitate predictable scaling behavior across applications running on the same cluster. For more details, see Dynamic Allocation in the Spark documentation.
Managed scaling configurations – Make sure your managed scaling configurations are set up correctly to facilitate efficient scaling of Spot Instances based on your workload requirements. For example, set an appropriate value for Maximum On-Demand instances in managed scaling based on the number of concurrent applications you want to run on the cluster. Additionally, if you’re planning to use your On-Demand Instances for running solely AM containers, we recommend setting yarn.scheduler.capacity.maximum-am-resource-percent to 1 using the Amazon EMR capacity-scheduler classification.
Improve startup time of the nodes – If your cluster is subject to frequent scaling events (for example, you have a long-running cluster that can run multiple concurrent EMR steps), you might want to optimize the startup time of your cluster nodes. When trying to get an efficient node startup, consider only installing the minimum required set of application frameworks in the cluster and, whenever possible, avoid installing non-YARN frameworks such as HBase or Trino, which might delay the startup of processing nodes dynamically attached by Amazon EMR managed scaling. Finally, whenever possible, don’t use complex and time-consuming EMR bootstrap actions to avoid increasing the startup time of nodes launched with managed scaling.
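Two of the practices above translate directly into configuration. The following sketch builds the capacity-scheduler classification and a managed scaling policy in the shape accepted by the Amazon EMR PutManagedScalingPolicy API. The helper names and the numbers in the usage note are hypothetical, so size the limits for your own workload.

```python
def capacity_scheduler_configuration(max_am_percent: str = "1") -> dict:
    """Let AM containers use up to the given fraction of each queue's resources."""
    return {
        "Classification": "capacity-scheduler",
        "Properties": {
            "yarn.scheduler.capacity.maximum-am-resource-percent": max_am_percent
        },
    }


def managed_scaling_policy(min_units: int, max_units: int,
                           max_on_demand_units: int, max_core_units: int) -> dict:
    """Build a managed scaling policy with explicit On-Demand and core limits."""
    if not min_units <= max_units:
        raise ValueError("min_units must not exceed max_units")
    if max_on_demand_units > max_units or max_core_units > max_units:
        raise ValueError("On-Demand and core limits must not exceed max_units")
    return {
        "ComputeLimits": {
            "UnitType": "Instances",
            "MinimumCapacityUnits": min_units,
            "MaximumCapacityUnits": max_units,
            "MaximumOnDemandCapacityUnits": max_on_demand_units,
            "MaximumCoreCapacityUnits": max_core_units,
        }
    }
```

For example, managed_scaling_policy(1, 20, 5, 5) would cap the cluster at 20 instances while leaving room for up to five On-Demand core nodes to host AM containers. You could pass the result as the ManagedScalingPolicy argument of put_managed_scaling_policy on a boto3 EMR client.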

By following these best practices, you can take advantage of the cost savings of Spot Instances while maintaining the stability and reliability of your applications, particularly in scenarios where multiple applications are running concurrently on the same cluster.

Conclusion

In this post, we explored the benefits of the new integration between Amazon EMR managed scaling and YARN node labels, reviewed its implementation and usage, and defined a few best practices that can help you get started. Whether you’re running batch processing jobs, stream processing applications, or other YARN workloads on Amazon EMR, this feature can help you achieve substantial cost savings without compromising on performance or reliability.

As you embark on your journey to use Spot Instances in your EMR clusters, remember to follow the best practices outlined in this post, such as setting appropriate configurations for dynamic allocation, node label expressions, and managed scaling policies. By doing so, you can make sure that your applications run efficiently, reliably, and at the lowest possible cost.

About the authors

Lorenzo Ripani is a Big Data Solution Architect at AWS. He is passionate about distributed systems, open source technologies and security. He spends most of his time working with customers around the world to design, evaluate and optimize scalable and secure data pipelines with Amazon EMR.

Miranda Diaz is a Software Development Engineer for EMR at AWS. Miranda works to design and develop technologies that make it easy for customers across the world to automatically scale their computing resources to their needs, helping them achieve the best performance at the optimal cost.

Sajjan Bhattarai is a Senior Cloud Support Engineer at AWS, and specializes in BigData and Machine Learning workloads. He enjoys helping customers around the world to troubleshoot and optimize their data platforms.

Bezuayehu Wate is an Associate Big Data Specialist Solutions Architect at AWS. She works with customers to provide strategic and architectural guidance on designing, building, and modernizing their cloud-based analytics solutions using AWS.

How CyberArk is streamlining serverless governance by codifying architectural blueprints

2024-10-11 Anton Aleksandrov

Post Syndicated from Anton Aleksandrov original https://aws.amazon.com/blogs/architecture/how-cyberark-is-streamlining-serverless-governance-by-codifying-architectural-blueprints/

This post was co-written with Ran Isenberg, Principal Software Architect at CyberArk and an AWS Serverless Hero.

Serverless architectures enable agility and simplified cloud resource management. Organizations embracing serverless architectures build robust, distributed cloud applications. As organizations grow and the number of development teams increases, maintaining architectural consistency, standardization, and governance across projects becomes crucial.

In this post, you will discover how CyberArk, a leading identity security company, efficiently implements serverless architecture governance, reduces duplicative efforts, and saves months of development time by codifying architectural blueprints. This approach helps to prevent redundant efforts and promotes uniform architectural standards, facilitating the seamless adoption of organizational best practices and governance across diverse teams.

Overview

The risk of duplicative efforts and architectural inconsistencies is particularly pronounced in large organizations, especially for requirements unrelated to specific business domains owned by individual teams. Diverse approaches to Infrastructure-as-Code, CI/CD, observability, and security can lead to inconsistent implementations across teams. Application developers should focus on delivering business value efficiently, rather than navigating the complexities of building and operating distributed architectures while adhering to organizational best practices. To achieve this, you need an approach that empowers developers and provides guardrails to ensure vetted architectural patterns are consistently applied. This solution should enable accelerated delivery without sacrificing agility and innovation.

Some organizations implement an internal wiki consolidating architectural guidance. While well-intentioned, relying solely on documentation assumes development teams diligently follow the guidelines, which often requires manual validation and limits scalability. To overcome this limitation, organizations should adopt a scalable approach that codifies, automates, and promotes architectural best practices. This mechanism allows developers to focus on delivering business-domain value and drives standardized operational excellence, governance, and adherence to organizational policies.

Introducing serverless blueprints

CyberArk’s engineering team, with over 900 developers, was looking for ways to ensure its developers build their serverless services based on vetted architectural and security best practices, with fully automated enforcement of governance controls. The solution came in the form of codified architecture blueprints and automated tooling.

Serverless architectures are composed using loosely coupled services, integrated based on the application requirements. Application developers use IaC tools such as AWS CDK and HashiCorp Terraform to define their serverless architectures and integration patterns. CyberArk has augmented the IaC with governance tools, such as cdk-nag, AWS Config, and AWS Control Tower. With these complementary tools in place, they’ve built serverless blueprints which include architectural definitions based on organizational best practices, as well as automatically applied governance controls.

To illustrate this, consider a simple serverless architecture pattern. In this common pattern, an SQS queue serves as the event source for a Lambda function, which parses incoming messages and updates an Amazon S3 bucket.

Figure 1. A simple serverless architecture with SQS Queue, Lambda function, and S3 Bucket

While this pattern seems simple, turning it into an enterprise-ready service requires additional effort. You must consider aspects like resiliency, security, governance, observability, and coding best practices. Let’s examine several examples codified in architectural blueprints at CyberArk.

Error-handling best practices

Your services should be resilient. Retries can help to overcome occasional network hiccups, but you also need to handle scenarios when your function consistently fails to process particular messages (known as poison messages), for example, because of a code bug. This can lead to endless processing loops, data loss, and potential extra charges. To address this, a blueprint can implement a failure handling mechanism with a dead letter queue, alerting, and redrive. This pattern is straightforward to implement and adds extra resiliency to your architecture. It is also generic and does not contain any business domain code. This is a typical example of an architectural pattern that can be codified in a blueprint and reused across development teams.

Figure 2. The simple serverless architecture with added resiliency best practices
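The dead letter queue part of this pattern comes down to a redrive policy on the source queue. The following is a minimal sketch rather than CyberArk's blueprint code; the function name and the retry count of 3 are assumptions.

```python
import json


def queue_attributes_with_dlq(dlq_arn: str, max_receive_count: int = 3) -> dict:
    """Build SQS queue attributes that move repeatedly failing messages to a DLQ.

    After a message has been received max_receive_count times without being
    deleted, SQS moves it to the dead letter queue instead of redelivering it.
    """
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    redrive_policy = {
        "deadLetterTargetArn": dlq_arn,
        "maxReceiveCount": max_receive_count,
    }
    # SQS expects the redrive policy as a JSON-encoded string attribute.
    return {"RedrivePolicy": json.dumps(redrive_policy)}
```

These attributes can be passed as the Attributes argument when creating the source queue with an AWS SDK. An alarm on the dead letter queue depth would then cover the alerting part of the pattern.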

Security best practices

Another example is securing S3 buckets. Organizations must enforce S3 security best practices, such as enabling access logs, blocking public access, and enabling encryption at rest. Codifying these guardrails in architectural blueprints adds an extra layer that allows your developers to comply with organization standards without having to explicitly implement adherence to each best practice and policy on their own.

Figure 3. The simple serverless architecture with added security best practices

The following code snippet uses AWS CDK to create an S3 bucket with common best practices:

Enables bucket versioning on production environments only to save costs in non-production environments
Enforces data encryption with Amazon S3 managed keys (SSE-S3)
Blocks all public access
Enforces SSL to block all non-secure-transport access
Enables access logs

from aws_cdk import aws_s3 as s3  # CDK module used by the method below

def _create_bucket(self, server_access_logs_bucket: s3.Bucket, is_production_env: bool) -> s3.Bucket:
    # Create an S3 bucket encrypted with Amazon S3 managed keys (SSE-S3)
    bucket = s3.Bucket(
        self,
        constants.BUCKET_NAME,
        versioned=is_production_env,  # versioning in production only, to save costs elsewhere
        encryption=s3.BucketEncryption.S3_MANAGED,
        block_public_access=s3.BlockPublicAccess.BLOCK_ALL,
        enforce_ssl=True,  # reject requests that do not use secure transport
        server_access_logs_bucket=server_access_logs_bucket,
        # redacted
    )
    return bucket
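To make the SSL enforcement less of a black box, the following sketch shows the kind of bucket policy statement that denies non-secure-transport access. It approximates what such a setting results in rather than reproducing the exact policy CDK synthesizes, and the function name is hypothetical.

```python
def deny_insecure_transport_statement(bucket_name: str) -> dict:
    """Build a bucket policy statement that denies any request not sent over TLS."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        # Cover both bucket-level and object-level operations.
        "Resource": [bucket_arn, f"{bucket_arn}/*"],
        "Condition": {"Bool": {"aws:SecureTransport": "false"}},
    }
```

Because the statement is an explicit Deny, it applies even to principals that are otherwise allowed to access the bucket.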

Additional security best practices you can codify in your blueprints include the principle of least privilege access, VPC attachment and code signing for sensitive Lambda functions, and KMS keys for encryption.

Lambda best practices

Your Lambda functions are another example of where blueprints can help. By providing a function blueprint implementing the baseline for capabilities like observability, idempotency, and batch processing out-of-the-box, you enable developers to focus on their business domain code.

Figure 4. Layered view of a Lambda function in CyberArk’s serverless architecture blueprint

CyberArk embeds Powertools for AWS Lambda, a toolkit that implements serverless best practices to increase developer velocity, into their blueprints. The following code snippets embed Powertools for enabling enhanced observability and implementing batch processing.

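# CDK code
from aws_cdk import aws_lambda as _lambda  # "lambda" is a reserved word in Python

lambda_function = _lambda.Function(
    environment={
        constants.POWERTOOLS_SERVICE_NAME: constants.SERVICE_NAME,
        constants.POWER_TOOLS_LOG_LEVEL: 'INFO',
    },
    tracing=_lambda.Tracing.ACTIVE,  # enable AWS X-Ray active tracing
    layers=["powertools-layer"],  # placeholder for the Powertools layer
    log_format=_lambda.LogFormat.JSON.value,  # structured JSON logs
    system_log_level=_lambda.SystemLogLevel.INFO.value
    # redacted
)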

# Function handler code
processor = BatchProcessor(event_type=EventType.SQS, model=OrderSqsRecord)

@logger.inject_lambda_context
@metrics.log_metrics
@tracer.capture_lambda_handler(capture_response=False)
def lambda_handler(event, context: LambdaContext):
    return process_partial_response(
        event=event,
        record_handler=record_handler,
        processor=processor,
        context=context,
    )

Governance controls

Blueprints are not static; they evolve as you adopt new best practices and governance policies. Developers start with a vetted blueprint but can deviate as they evolve their serverless apps. To enable continuous adherence, it is important to use a combination of organizational governance tools, such as AWS Control Tower and Service Control Policies, and architecture blueprints that embed governance controls automatically enforced by CI/CD. This ensures that any architectural modification will be validated for adhering to organizational standards.

AWS defines proactive controls as mechanisms that prevent developers from deploying resources that violate governance policies. Detective controls are mechanisms that detect, log, and alert on resource or configuration changes that violate governance policies.

Figure 5. Applying governance controls at all stages of CI/CD

Depending on the IaC tool, you can leverage different types of governance tools for proactive control enforcement. The following screenshot shows a proactive control violation identified during CI/CD via the cdk-nag framework. You can see cdk-nag throwing an error for the stack deployment because the Lambda execution role was assigned wildcard permissions.

Figure 6. Exception thrown by cdk-nag for using wildcard permissions
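
To make the idea of a proactive control concrete, here is a minimal sketch in plain Python of the kind of check such a tool could run during CI/CD: it scans an IAM policy document for Allow statements whose action or resource is a wildcard. This is not how cdk-nag is implemented; the function name `find_wildcard_statements` and the sample policy are hypothetical and exist only for illustration.

```python
from typing import Any


def _as_list(value: Any) -> list:
    """IAM accepts a single value or a list for Statement/Action/Resource; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_wildcard_statements(policy: dict) -> list[dict]:
    """Return the Allow statements that grant a wildcard action or resource.

    Hypothetical helper for illustration; real governance tools apply
    many more rules than this single check.
    """
    findings = []
    for statement in _as_list(policy.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action"))
        resources = _as_list(statement.get("Resource"))
        # "*" or "service:*" grants every action; "*" grants every resource.
        wildcard_action = any(a == "*" or a.endswith(":*") for a in actions)
        wildcard_resource = any(r == "*" for r in resources)
        if wildcard_action or wildcard_resource:
            findings.append(statement)
    return findings


# Sample policy, made up for this example.
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:GetObject", "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Effect": "Allow", "Action": "dynamodb:*", "Resource": "*"},
    ],
}

violations = find_wildcard_statements(sample_policy)
if violations:
    print(f"{len(violations)} statement(s) use wildcard permissions")
```

In a pipeline, a non-empty result would fail the build before deployment, which is what makes the control proactive rather than detective.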

See the practical guide for implementing serverless governance.

Sample code

Ran Isenberg has open-sourced a sample Lambda Handler Cookbook blueprint illustrating some of the patterns CyberArk has adopted.

Additional serverless architecture patterns you might consider implementing in your blueprints are server-side encryption for an Amazon SNS topic with an encrypted Amazon SQS queue subscribed, auto-adjusting provisioned concurrency for Lambda functions, secure Serverless Aurora Cluster with bastion host, and more.

See more patterns implemented at serverlessland.com and cdkpatterns.com.

Conclusion

Translating architectural and security best practices into modular IaC definitions, such as CDK constructs or Terraform modules, is a scalable and reusable technique that allows CyberArk to reduce duplicative efforts and save months of development time. Using IaC tools like AWS CDK or Terraform, augmented with governance tools like cdk-nag or checkov, enabled CyberArk to share implementation best practices and encode governance policies into architectural blueprints. Development teams adopting these blueprints do not need to reinvent the wheel, each trying to solve the same problem on their own. Instead, they leverage the knowledge codified in the blueprint.

Improving platform resilience at Cloudflare through automation

2024-10-09 Opeyemi Onikute

Post Syndicated from Opeyemi Onikute original https://blog.cloudflare.com/improving-platform-resilience-at-cloudflare

Failure is an expected state in production systems, and no predictable failure of either software or hardware components should result in a negative experience for users. The exact failure mode may vary, but certain remediation steps must be taken after detection. A common example is when an error occurs on a server, rendering it unfit for production workloads, and requiring action to recover.

When operating at Cloudflare’s scale, it is important to ensure that our platform is able to recover from faults seamlessly. It can be tempting to rely on the expertise of world-class engineers to remediate these faults, but this would be manual, repetitive, unlikely to produce enduring value, and unable to scale. In one word: toil, which is not a viable solution at our scale and rate of growth.

In this post we discuss how we built the foundations to enable a more scalable future, and what problems it has immediately allowed us to solve.

Growing pains

The Cloudflare Site Reliability Engineering (SRE) team builds and manages the platform that helps product teams deliver our extensive suite of offerings to customers. One important component of this platform is the collection of servers that power critical products such as Durable Objects, Workers, and DDoS mitigation. We also build and maintain foundational software services that power our product offerings, such as configuration management, provisioning, and IP address allocation systems.

As part of tactical operations work, we are often required to respond to failures in any of these components to minimize impact to users. Impact can vary from lack of access to a specific product feature, to total unavailability. The level of response required is determined by the priority, which is usually a reflection of the severity of impact on users. Lower-priority failures are more common — a server may run too hot, or experience an unrecoverable hardware error. Higher-priority failures are rare and are typically resolved via a well-defined incident response process, requiring collaboration with multiple other teams.

The frequency of lower-priority failures makes it obvious when the response required, as defined in runbooks, is “toilsome”. To reduce this toil, we had previously implemented a plethora of solutions to automate runbook actions, such as manually invoked shell scripts, cron jobs, and ad-hoc software services. These had grown organically over time and provided solutions on a case-by-case basis, which led to duplication of work, tight coupling, and lack of context awareness across the solutions.

We also care about how long it takes to resolve any potential impact on users. A resolution process which involves the manual invocation of a script relies on human action, increasing the Mean-Time-To-Resolve (MTTR) and leaving room for human error. This risks increasing the number of errors we serve to users and degrading trust.

These problems proved that we needed a way to automatically heal these platform components. This especially applies to our servers, for which failure can cause impact across multiple product offerings. While we have mechanisms to automatically steer traffic away from these degraded servers, in some rare cases the breakage is sudden enough to be visible.

Solving the problem

To provide a more reliable platform, we needed a new component that provides a common ground for remediation efforts. This would remove duplication of work, provide unified context-awareness and increase development speed, which ultimately saves hours of engineering time and effort.

A good solution would not only allow the SRE team to auto-remediate; it would empower the entire company. The key to adding self-healing capability was a generic interface for all teams to self-service and quickly remediate failures at various levels: machine, service, network, or dependencies.

A good way to think about auto-remediation is in terms of workflows. A workflow is a sequence of steps to get to a desired outcome. This is not dissimilar to a manual shell script which executes what a human would otherwise do via runbook instructions. Because of this logical fit with workflows, we decided to adopt Temporal.

Temporal is a durable execution platform which is useful to gracefully manage infrastructure failures such as network outages and transient failures in external service endpoints. This capability meant we only needed to build a way to schedule “workflow” tasks and have Temporal provide reliability guarantees. This allowed us to focus on building out the orchestration system to support the control and flow of workflow execution in our data centers.

How does Temporal work?

Before we discuss the system that provides our self-healing functions, let’s explore how the workflow execution engine works, as its native architecture provided numerous benefits that we took advantage of to build a more robust foundation.

The most attractive feature Temporal offered us was the ability to write code that has reliability baked in. Some examples of these primitives are automatic retries, timeouts, rollbacks, and queueing. The Temporal Platform consists of the Temporal Cluster and Worker processes (application code that contains your custom logic).

This architecture allowed us to write our application logic as we normally would, with the added benefits of Temporal. Since Temporal Workers are external to the cluster, we can run tasks anywhere across our global network — a feature that made it easy to build an extensible, easy-to-understand framework for automating tasks.

In Temporal terms, control is provided by the basic principles used to provide workflow execution — Workflows and Activities. A Workflow is simply a sequence of Activities, which are functions that ideally do only ONE task, such as making a request to an external service or rebooting a machine.

Control of workflow behavior can be done using ActivityOptions. This is where you can define timeouts for workflow execution, retry policies, and task queues. Each worker can poll several task queues for both Workflow and Activity tasks. If no worker is polling the task queue in which a Workflow task is declared, nothing happens.
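
The relationship between Workflows, Activities, and retry policies can be illustrated without the Temporal SDK. The sketch below is plain Python and does not use Temporal's API: `RetryPolicy`, `run_activity`, and `run_workflow` are hypothetical names, and the example activities are made up. Real Temporal also persists execution state so that a workflow survives process crashes, which this sketch does not attempt.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class RetryPolicy:
    """Hypothetical stand-in for the retry settings an activity can carry."""
    max_attempts: int = 3
    initial_interval: float = 0.01  # seconds to wait before the first retry
    backoff_coefficient: float = 2.0


def run_activity(activity: Callable[[], str], policy: RetryPolicy) -> str:
    """Run one activity, retrying with exponential backoff on failure."""
    if policy.max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = policy.initial_interval
    for attempt in range(1, policy.max_attempts + 1):
        try:
            return activity()
        except Exception:
            if attempt == policy.max_attempts:
                raise  # out of attempts: surface the failure to the workflow
            time.sleep(delay)
            delay *= policy.backoff_coefficient
    raise RuntimeError("unreachable")


def run_workflow(activities: list[Callable[[], str]], policy: RetryPolicy) -> list[str]:
    """A workflow here is just an ordered sequence of activities."""
    return [run_activity(activity, policy) for activity in activities]


# Example: the second activity fails once, then succeeds on retry.
calls = {"reboot": 0}


def drain_traffic() -> str:
    return "drained"


def reboot_machine() -> str:
    calls["reboot"] += 1
    if calls["reboot"] == 1:
        raise RuntimeError("transient failure")
    return "rebooted"


results = run_workflow([drain_traffic, reboot_machine], RetryPolicy())
print(results)
```

Keeping each activity to a single task, as recommended above, means a retry repeats only the step that failed rather than the whole sequence.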

Temporal’s documentation provides a good introduction to writing Temporal workflows.

Building on Temporal

Below, we describe how our automatic remediation system works. It is essentially a way to schedule tasks across our global network with built-in reliability guarantees. With this system, teams can serve their customers more reliably. An unexpected failure mode can be recognized and immediately mitigated, while the root cause can be determined later via a more detailed analysis.

Step one: we need a coordinator

After our initial testing of Temporal, it was now possible to write workflows. But we needed a way to schedule workflow tasks from other internal services. The coordinator was built to serve this purpose, and became the primary mechanism for the authorisation and scheduling of workflows.

The most important roles of the coordinator are authorisation, workflow task routing, and safety constraints enforcement. Each consumer is authenticated via mTLS, and the coordinator uses an ACL to determine whether to permit the execution of a workflow. An ACL configuration looks like the following example.

server_config {
    enable_tls = true
    [...]
    route_rule {
      name  = "global_get"
      method = "GET"
      route_patterns = ["/*"]
      uris = ["spiffe://example.com/worker-admin"]
    }
    route_rule {
      name = "global_post"
      method = "POST"
      route_patterns = ["/*"]
      uris = ["spiffe://example.com/worker-admin"]
      allow_public = true
    }
    route_rule {
      name = "public_access"
      method = "GET"
      route_patterns = ["/metrics"]
      uris = []
      allow_public = true
      skip_log_match = true
    }
}
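
As a rough illustration of how rules like these could be evaluated, the following Python sketch matches a request against a list of route rules. The post does not describe the coordinator's matching logic, so everything here is an assumption: a request is allowed when some rule has the same method, one of its patterns matches the path as a glob, and either the rule is public or the caller's SPIFFE URI is listed. `RouteRule` and `is_allowed` are hypothetical names, and only two of the rules above are mirrored.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import Optional


@dataclass
class RouteRule:
    """Hypothetical in-memory form of one route_rule block."""
    name: str
    method: str
    route_patterns: list[str]
    uris: list[str] = field(default_factory=list)
    allow_public: bool = False


def is_allowed(rules: list[RouteRule], method: str, path: str, caller_uri: Optional[str]) -> bool:
    """Return True if any rule permits this request (assumed semantics)."""
    for rule in rules:
        if rule.method != method:
            continue
        # Treat each route pattern as a shell-style glob; "*" also matches "/".
        if not any(fnmatchcase(path, pattern) for pattern in rule.route_patterns):
            continue
        if rule.allow_public or (caller_uri is not None and caller_uri in rule.uris):
            return True
    return False


RULES = [
    RouteRule("global_get", "GET", ["/*"], ["spiffe://example.com/worker-admin"]),
    RouteRule("public_access", "GET", ["/metrics"], allow_public=True),
]

admin = "spiffe://example.com/worker-admin"
print(is_allowed(RULES, "GET", "/workflows", admin))  # admin identity on a GET route
print(is_allowed(RULES, "GET", "/metrics", None))     # public route, no identity needed
```

The caller URI would come from the client certificate presented during the mTLS handshake; extracting it is outside the scope of this sketch.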

Each workflow specifies two key characteristics in an HCL configuration file: where to run its tasks, and its safety constraints. Example constraints include whether to run only on a specific node type (such as a database), or whether multiple parallel executions are allowed. If a task has been triggered too many times, that is a sign of a wider problem that might require human intervention. The coordinator uses the Temporal Visibility API to determine the current state of the executions in the Temporal cluster.

An example of a configuration file is shown below:

task_queue_target = "<target>"

# The following entries will ensure that
# 1. This workflow is not run at the same time in a 15m window.
# 2. This workflow will not run more than once an hour.
# 3. This workflow will not run more than 3 times in one day.
#
constraint {
    kind = "concurency"
    value = "1"
    period = "15m"
}

constraint {
    kind = "maxExecution"
    value = "1"
    period = "1h"
}

constraint {
    kind = "maxExecution"
    value = "3"
    period = "24h"
    is_global = true
}
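
The following Python sketch shows one way constraints like these could be checked before a workflow is scheduled. It is an interpretation, not the coordinator's code: the names `Execution`, `Constraint`, and `may_start` are hypothetical, the kind strings are illustrative, the `is_global` flag is ignored, and the execution history is passed in as a list where the real system would query the Temporal Visibility API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Execution:
    started: datetime
    finished: Optional[datetime] = None  # None means still running


@dataclass
class Constraint:
    kind: str          # "concurrency" or "maxExecution" (illustrative names)
    value: int
    period: timedelta


def may_start(now: datetime, history: list[Execution], constraints: list[Constraint]) -> bool:
    """Return True only if starting a new execution violates no constraint."""
    for constraint in constraints:
        window_start = now - constraint.period
        recent = [e for e in history if e.started >= window_start]
        if constraint.kind == "maxExecution":
            count = len(recent)  # every start inside the window counts
        elif constraint.kind == "concurrency":
            count = sum(1 for e in recent if e.finished is None)  # only running ones
        else:
            raise ValueError(f"unknown constraint kind: {constraint.kind}")
        if count >= constraint.value:
            return False
    return True


constraints = [
    Constraint("concurrency", 1, timedelta(minutes=15)),
    Constraint("maxExecution", 1, timedelta(hours=1)),
    Constraint("maxExecution", 3, timedelta(hours=24)),
]

now = datetime(2024, 10, 9, 12, 0)
quiet_history = [Execution(now - timedelta(hours=2), now - timedelta(hours=1, minutes=50))]
busy_history = quiet_history + [Execution(now - timedelta(minutes=30), now - timedelta(minutes=20))]

print(may_start(now, quiet_history, constraints))  # last run was two hours ago
print(may_start(now, busy_history, constraints))   # a run already started within the hour
```

A denied start is the signal described above: repeated triggering suggests a wider problem, so the safer outcome is to stop and involve a human.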

Step two: Task Routing is amazing

An unforeseen benefit of using a central Temporal cluster was the discovery of Task Routing. This feature allows us to schedule a Workflow/Activity on any server that has a running Temporal Worker, and further segment by the type of server, its location, etc. For this reason, we have three primary task queues — the general queue in which tasks can be executed by any worker in the datacenter, the node type queue in which tasks can only be executed by a specific node type in the datacenter, and the individual node queue where we target a specific node for task execution.

We rely on this heavily to ensure the speed and efficiency of automated remediation. Certain tasks can be run in datacenters with known low latency to an external resource, or a node type with better performance than others (due to differences in the underlying hardware). This reduces the amount of failure and latency we see overall in task executions. Sometimes we are also constrained by certain types of tasks that can only run on a certain node type, such as a database.

Task Routing also means that we can configure certain task queues to have a higher priority for execution, although this is not a feature we have needed so far. A drawback of task routing is that every Workflow/Activity needs to be registered to the target task queue, which is a common gotcha. Thankfully, it is possible to catch this failure condition with proper testing.
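
The three queue scopes described in this section can be sketched as a simple naming function. The naming scheme below is invented for illustration (the post does not say how the queues are actually named), and `task_queue_name` and `queues_for_worker` are hypothetical helpers.

```python
from typing import Optional


def task_queue_name(datacenter: str, node_type: Optional[str] = None, node_id: Optional[str] = None) -> str:
    """Pick the narrowest queue that matches the requested target (assumed naming)."""
    if node_id is not None:
        return f"{datacenter}.node.{node_id}"    # one specific node
    if node_type is not None:
        return f"{datacenter}.type.{node_type}"  # any node of this type
    return f"{datacenter}.general"               # any worker in the datacenter


def queues_for_worker(datacenter: str, node_type: str, node_id: str) -> list[str]:
    """All queues a worker on this node would need to poll and register against."""
    return [
        task_queue_name(datacenter),
        task_queue_name(datacenter, node_type=node_type),
        task_queue_name(datacenter, node_id=node_id),
    ]


print(task_queue_name("lhr01", node_type="database"))
print(queues_for_worker("lhr01", "database", "db-17"))
```

Deriving the worker's queue list from the same function used for scheduling is one way to reduce the registration gotcha mentioned above, since both sides then agree on the names.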

Step three: when/how to self-heal?

None of this would be relevant if we didn’t put it to good use. A primary design goal for the platform was to ensure we had easy, quick ways to trigger workflows on the most important failure conditions. The next step was to determine what the best sources to trigger the actions were. The answer to this was simple: we could trigger workflows from anywhere as long as they are properly authorized and detect the failure conditions accurately.

Example triggers are an alerting system, a log tailer, a health check daemon, or an authorized engineer via a chatbot. Such flexibility allows a high level of reuse, and lets us invest more in workflow quality and reliability.

As part of the solution, we built a daemon that is able to poll a signal source for any unwanted condition and trigger a configured workflow. We have initially found Prometheus useful as a source because it contains both service-level and hardware/system-level metrics. We are also exploring more event-based trigger mechanisms, which could eliminate the need to use precious system resources to poll for metrics.
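
A polling trigger of this kind can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the daemon itself: `poll_and_trigger` is a hypothetical name, and the signal source and the workflow trigger are passed in as plain callables because the post does not describe the Prometheus queries or the coordinator API used.

```python
import time
from typing import Callable, Iterator


def poll_and_trigger(
    read_signal: Callable[[], float],
    is_unhealthy: Callable[[float], bool],
    trigger_workflow: Callable[[float], None],
    interval_seconds: float,
    max_polls: int,
) -> int:
    """Poll a signal source and trigger a workflow whenever the condition holds.

    Returns how many times the workflow was triggered.
    """
    triggered = 0
    for poll in range(max_polls):
        value = read_signal()
        if is_unhealthy(value):
            trigger_workflow(value)
            triggered += 1
        if poll < max_polls - 1:
            time.sleep(interval_seconds)  # wait before the next poll
    return triggered


# Fake temperature readings stand in for a metrics query.
readings: Iterator[float] = iter([60.0, 95.0, 70.0])
triggered_for: list[float] = []

count = poll_and_trigger(
    read_signal=lambda: next(readings),
    is_unhealthy=lambda celsius: celsius > 90.0,
    trigger_workflow=triggered_for.append,
    interval_seconds=0.0,
    max_polls=3,
)
print(count, triggered_for)
```

The daemon only decides when to ask for a workflow; whether it actually runs would still be up to the coordinator's authorization and safety constraints.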

We already had internal services that are able to detect widespread failure conditions for our customers, but were only able to page a human. With the adoption of auto-remediation, these systems are now able to react automatically. This ability to create an automatic feedback loop with our customers is the cornerstone of these self-healing capabilities and we continue to work on stronger signals, faster reaction times, and better prevention of future occurrences.

The most exciting part, however, is the future possibility. Every customer cares about any negative impact from Cloudflare. With this platform we can onboard several services (especially those that are foundational for the critical path) and ensure we react quickly to any failure conditions, even before there is any visible impact.

Step four: packaging and deployment

The whole system is written in Go, and a single binary can implement each role. We distribute it as an apt package or a container for maximum ease of deployment.

We deploy a Temporal-based worker to every server we intend to run tasks on, and a daemon in datacenters where we intend to automatically trigger workflows based on the local conditions. The coordinator is more nuanced since we rely on task routing and can trigger from a central coordinator, but we have also found value in running coordinators locally in the datacenters. This is especially useful in datacenters with less capacity or degraded performance, removing the need for a round-trip to schedule the workflows.

Step five: test, test, test

Temporal provides native mechanisms to test an entire workflow, via a comprehensive test suite that supports end-to-end, integration, and unit testing, which we used extensively to prevent regressions while developing. We also ensured proper test coverage for all the critical platform components, especially the coordinator.

Despite the ease of writing tests, we quickly discovered that they were not enough. After writing workflows, engineers need an environment as close as possible to the target conditions. This is why we configured our staging environments to support quick and efficient testing. These environments receive the latest changes and point to a different (staging) Temporal cluster, which enables experimentation and easy validation of changes.

After a workflow is validated in the staging environment, we can then do a full release to production. It seems obvious, but catching simple configuration errors before releasing has saved us many hours in development/change-related-task time.

Deploying to production

As you can guess from the title of this post, we put this in production to automatically react to server-specific errors and unrecoverable failures. To this end, we have a set of services that are able to detect single-server failure conditions based on analyzed traffic data. After deployment, we have successfully mitigated potential impact by taking any errant single sources of failure out of production.

We have also created a set of workflows to reduce internal toil and improve efficiency. These workflows can automatically test pull requests on target machines, wipe and reset servers after experiments are concluded, and take away manual processes that cost many hours in toil.

Building a system that is maintained by several SRE teams has allowed us to iterate faster, and rapidly tackle long-standing problems. We have set ambitious goals regarding toil elimination and are on course to achieve them, which will allow us to scale faster by eliminating the human bottleneck.

Looking to the future

Our immediate plans are to leverage this system to provide a more reliable platform for our customers and drastically reduce operational toil, freeing up engineering resources to tackle larger-scale problems. We also intend to leverage more Temporal features such as Workflow Versioning, which will simplify the process of making changes to workflows by ensuring that triggered workflows run expected versions.

We are also interested in how others are solving problems using durable execution platforms such as Temporal, and general strategies to eliminate toil. If you would like to discuss this further, feel free to reach out on the Cloudflare Community and start a conversation!

If you’re interested in contributing to projects that help build a better Internet, our engineering teams are hiring.

Extract insights in a 30TB time series workload with Amazon OpenSearch Serverless

2024-10-07 Satish Nandi

Post Syndicated from Satish Nandi original https://aws.amazon.com/blogs/big-data/extract-insights-in-a-30tb-time-series-workload-with-amazon-opensearch-serverless/

In today’s data-driven landscape, managing and analyzing vast amounts of data, especially logs, is crucial for organizations to derive insights and make informed decisions. However, handling large data while extracting insights is a significant challenge, prompting organizations to seek scalable solutions without the complexity of infrastructure management.

Amazon OpenSearch Serverless reduces the burden of manual infrastructure provisioning and scaling while still empowering you to ingest, analyze, and visualize your time-series data, simplifying data management and enabling you to derive actionable insights from data.

We recently announced a new capacity level of 30TB for time series data per account per AWS Region. The OpenSearch Serverless compute capacity for data ingestion and search/query is measured in OpenSearch Compute Units (OCUs), which are shared among various collections with the same AWS Key Management Service (AWS KMS) key. To accommodate larger datasets, OpenSearch Serverless now supports up to 500 OCUs per account per Region for indexing and for search, each more than double the previous limit of 200. You can configure the maximum OCU limits on search and indexing independently, which helps you manage costs effectively. You can also monitor real-time OCU usage with Amazon CloudWatch metrics to gain a better perspective on your workload’s resource consumption. With the support for 30TB datasets, you can analyze data at the 30TB level to unlock valuable operational insights and make data-driven decisions to troubleshoot application downtime, improve system performance, or identify fraudulent activities.

This post discusses how you can analyze 30TB time series datasets with OpenSearch Serverless.

Innovations and optimizations to support larger data size and faster responses

Sufficient disk, memory, and CPU resources are crucial for handling extensive data effectively and conducting thorough analysis. In time series collections, the OCU disk typically contains older shards that are not frequently accessed, referred to as warm shards. We have introduced a new feature called warm shard recovery prefetch. This feature actively monitors recently queried data blocks for a shard and prioritizes them during shard movements, such as shard balancing, vertical scaling, and deployment activities. More importantly, it accelerates auto-scaling and provides faster readiness for varying search workloads, thereby significantly improving system performance. The results later in this post detail the improvements.

A few select customers worked with us on early adoption prior to General Availability. In these trials, we observed up to 66% improvement in warm query performance for some customer workloads. This significant improvement shows the effectiveness of our new features. Additionally, we have enhanced the concurrency between coordinator and worker nodes, allowing more requests to be processed as the number of OCUs increases through auto scaling. This enhancement has resulted in up to a 10% improvement in query performance for hot and warm queries.

We have enhanced our system’s stability to handle time-series collections of up to 30 TB effectively. Our team is committed to improving system performance, as demonstrated by our ongoing enhancements to the auto-scaling system. These improvements comprise enhanced shard distribution for optimal placement after rollover, auto-scaling policies based on queue length, and a dynamic sharding strategy that adjusts shard count based on ingestion rate.

In the following section we share an example test setup of a 30TB workload that we used internally, detailing the data being used and generated, along with our observations and results. Performance may vary depending on the specific workload.

Ingest the data

You can use the load generation scripts shared in the following workshop, or you can use your own application or data generator to create a load. You can run multiple instances of these scripts to generate a burst in indexing requests. As shown in the following screenshot, we tested with an index, sending approximately 30 TB of data over a period of 15 days. We used our load generator script to send the traffic to a single index, retaining data for 15 days using a data life cycle policy.

Test methodology

We set the deployment type to ‘Enable redundancy’ to enable data replication across Availability Zones. This deployment configuration will lead to 12-24 hours of data in hot storage (OCU disk memory) and the rest in Amazon Simple Storage Service (Amazon S3). With a defined set of search performance and the preceding ingestion expectation, we set the max OCUs to be 500 for both indexing and search.

As part of the testing, we observed auto-scaling behavior and graphed it. Indexing took around 8 hours to stabilize at 80 OCUs.

On the search side, it took around 2 days to stabilize at 80 OCUs.

Observations:

Ingestion

The ingestion performance achieved was consistently over 2 TB per day.

Search

Queries were of two types, with time ranging from 15 minutes to 15 days.

{"aggs":{"1":{"cardinality":{"field":"carrier.keyword"}}},"size":0,"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-15m","lte":"now"}}}]}}}

For example

{"aggs":{"1":{"cardinality":{"field":"carrier.keyword"}}},"size":0,"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-1d","lte":"now"}}}]}}}

The following chart provides the various percentile performance on the aggregation query

The second query was

{"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-15m","lte":"now"}}}],"should":[{"match":{"originState":"State"}}]}}}

For example

{"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-15m","lte":"now"}}}],"should":[{"match":{"originState":"California"}}]}}}

The following chart provides the various percentile performance on the search query

The following table summarizes the latency percentiles for both queries across different time ranges.

Time range	Query	P50 (ms)	P90 (ms)	P95 (ms)	P99 (ms)
15 minutes	{"aggs":{"1":{"cardinality":{"field":"carrier.keyword"}}},"size":0,"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-15m","lte":"now"}}}]}}}	325	403.867	441.917	514.75
1 hour	{"aggs":{"1":{"cardinality":{"field":"carrier.keyword"}}},"size":0,"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-1h","lte":"now"}}}]}}}	1,061.66	1,397.27	1,482.75	1,719.53
4 hours	{"aggs":{"1":{"cardinality":{"field":"carrier.keyword"}}},"size":0,"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-4h","lte":"now"}}}]}}}	3,870.79	5,233.73	5,609.9	6,506.22
1 day	{"aggs":{"1":{"cardinality":{"field":"carrier.keyword"}}},"size":0,"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-1d","lte":"now"}}}]}}}	7,693.06	12,294	13,411.19	17,481.4
7 days	{"aggs":{"1":{"cardinality":{"field":"carrier.keyword"}}},"size":0,"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-7d","lte":"now"}}}]}}}	5,395.68	17,538.12	19,159.18	22,462.32
1 year	{"aggs":{"1":{"cardinality":{"field":"carrier.keyword"}}},"size":0,"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-1y","lte":"now"}}}]}}}	2,758.66	10,758	12,028	22,871.4
15 minutes	{"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-15m","lte":"now"}}}],"should":[{"match":{"originState":"California"}}]}}}	139	190	234.55	6,071.96
1 hour	{"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-1h","lte":"now"}}}],"should":[{"match":{"originState":"Washington"}}]}}}	259.167	305.8	343.3	1,125.66
4 hours	{"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-4h","lte":"now"}}}],"should":[{"match":{"originState":"Washington"}}]}}}	462.933	653.6	725.3	1,583.37
1 day	{"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-1d","lte":"now"}}}],"should":[{"match":{"originState":"California"}}]}}}	678.917	1,366.63	2,423	7,893.56
7 days	{"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-7d","lte":"now"}}}],"should":[{"match":{"originState":"Washington"}}]}}}	1,353	2,745.1	4,338.8	9,496.36
1 year	{"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-1y","lte":"now"}}}],"should":[{"match":{"originState":"Washington"}}]}}}	2,166.33	2,469.7	4,804.9	9,440.11
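
If you want to reproduce queries of this shape for your own tests, a small helper can generate them for any time window. The sketch below is plain Python using only the standard library; the function names are hypothetical, and the field names (`carrier.keyword`, `originState`, `@timestamp`) are taken from the queries shown above.

```python
import json


def time_range_filter(window: str) -> dict:
    """Range filter from now-<window> to now, for example window="15m" or "7d"."""
    return {"range": {"@timestamp": {"gte": f"now-{window}", "lte": "now"}}}


def cardinality_query(window: str, field: str = "carrier.keyword") -> dict:
    """Aggregation query: distinct count of a field within the time window."""
    return {
        "aggs": {"1": {"cardinality": {"field": field}}},
        "size": 0,
        "query": {"bool": {"filter": [time_range_filter(window)]}},
    }


def match_query(window: str, origin_state: str) -> dict:
    """Search query: documents in the time window, matching on originState."""
    return {
        "query": {
            "bool": {
                "filter": [time_range_filter(window)],
                "should": [{"match": {"originState": origin_state}}],
            }
        }
    }


# Print the aggregation query for each window used in the table.
for window in ["15m", "1h", "4h", "1d", "7d", "1y"]:
    print(json.dumps(cardinality_query(window), separators=(",", ":")))
```

The resulting dictionaries can be serialized with `json.dumps` and sent as the request body of a search call from whichever client you use.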

Conclusion

OpenSearch Serverless not only supports a larger data size than prior releases but also introduces performance improvements like warm shard recovery prefetch and concurrency optimization for better query response. These features reduce the latency of warm queries and improve auto-scaling to handle varied workloads. We encourage you to take advantage of the 30TB index support and put it to the test! Migrate your data, explore the improved throughput, and take advantage of the enhanced scaling capabilities.

To get started, refer to Log analytics the easy way with Amazon OpenSearch Serverless. To get hands-on experience with OpenSearch Serverless, follow the Getting started with Amazon OpenSearch Serverless workshop, which has a step-by-step guide for configuring and setting up an OpenSearch Serverless collection.

If you have feedback about this post, share it in the comments section. If you have questions about this post, start a new thread on the Amazon OpenSearch Service forum or contact AWS Support.

About the authors

Satish Nandi is a Senior Product Manager with Amazon OpenSearch Service. He is focused on OpenSearch Serverless and has years of experience in networking, security and AI/ML. He holds a Bachelor’s degree in Computer Science and an MBA in Entrepreneurship. In his free time, he likes to fly airplanes and hang gliders and ride his motorcycle.

Milav Shah is an Engineering Leader with Amazon OpenSearch Service. He focuses on search experience for OpenSearch customers. He has extensive experience building highly scalable solutions in databases, real-time streaming and distributed computing. He also possesses functional domain expertise in verticals like Internet of Things, fraud protection, gaming and AI/ML. In his free time, he likes to cycle, hike, and play chess.

Qiaoxuan Xue is a Senior Software Engineer at AWS leading the search and benchmarking areas of the Amazon OpenSearch Serverless Project. His passion lies in finding solutions for intricate challenges within large-scale distributed systems. Outside of work, he enjoys woodworking, biking, playing basketball, and spending time with his family and dog.

Prashant Agrawal is a Sr. Search Specialist Solutions Architect with Amazon OpenSearch Service. He works closely with customers to help them migrate their workloads to the cloud and helps existing customers fine-tune their clusters to achieve better performance and save on cost. Before joining AWS, he helped various customers use OpenSearch and Elasticsearch for their search and log analytics use cases. When not working, you can find him traveling and exploring new places. In short, he likes doing Eat → Travel → Repeat.

Designing Serverless Integration Patterns for Large Language Models (LLMs)

2024-10-04 Chris McPeek

Post Syndicated from Chris McPeek original https://aws.amazon.com/blogs/compute/designing-serverless-integration-patterns-for-large-language-models-llms/

This post is written by Josh Hart, Principal Solutions Architect, and Thomas Moore, Senior Solutions Architect.

This post explores best practice integration patterns for using large language models (LLMs) in serverless applications. These approaches optimize performance, resource utilization, and resilience when incorporating generative AI capabilities into your serverless architecture.

Overview of serverless, LLMs and example use case

Organizations of all shapes and sizes are harnessing LLMs to build generative AI applications to deliver new customer experiences. Serverless technologies such as AWS Lambda, AWS Step Functions and Amazon API Gateway enable you to move from idea to market faster without thinking about servers. The pay-for-use billing model also allows for increased agility at an optimal cost.

The examples in this post leverage Amazon Bedrock, a fully managed service to access foundation models (FMs). The same principles apply to LLMs hosted on other platforms such as Amazon SageMaker. Amazon Bedrock allows developers to consume LLMs via an API without the complexities of infrastructure management. Amazon SageMaker is a fully managed service to build, train and deploy machine learning models.

The example use case in this post uses LLMs to create compelling marketing content for the launch of a new family SUV. Images of the vehicle, shown below, were pre-generated using Amazon Titan Image Generator in Amazon Bedrock.

Example use case images generated using Titan Image Generator

As organizations adopt LLMs to power generative AI applications, serverless architectures offer an attractive approach for rapid development and cost-effective scaling. The following sections explore several serverless integration patterns to build cost-effective, performant, and fault-tolerant generative AI applications.

Direct AWS Lambda call

Direct call to Amazon Bedrock from AWS Lambda

The simplest serverless integration pattern is directly calling Bedrock in Lambda using the AWS SDK. Below is an example Lambda function using the Python SDK (boto3), calling the Bedrock InvokeModel API.

import json
import boto3
brt = boto3.client(service_name='bedrock-runtime')

def lambda_handler(event, context):
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1000,
        "messages": [{
            "role": "user",
            "content": [{
                "type": "text",
                "text":"Create a 500 word car advert given these images and the following specification: \n {}".format(event['spec'])
            },
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/jpeg",
                    "data": event['image']
                }
            }]
        }]
    })

    modelId = 'anthropic.claude-3-sonnet-20240229-v1:0'
    accept = 'application/json'
    contentType = 'application/json'
    response = brt.invoke_model(body=body, modelId=modelId, accept=accept, contentType=contentType)
    response_body = json.loads(response.get('body').read())

    return {
        'statusCode': 200,
        'body': response_body["content"][0]["text"]
    }

The above code requires the Lambda function execution role to have the correct AWS Identity and Access Management (IAM) permissions to Amazon Bedrock, specifically the bedrock:InvokeModel action.

The example uses the Anthropic Claude 3 Sonnet LLM and the Anthropic Claude Messages API for the payload. The InvokeModel call is synchronous and will therefore wait for a response from the LLM. Depending on the model and prompt, the call can take several seconds. Ensure your Lambda function timeout is set appropriately. In most cases it will need to be increased from the default of 3 seconds.

The boto3 client has a default read timeout of 60 seconds. Depending on the use case, you may need to increase the boto3 client read timeout as shown in the sample code below.

import boto3
from botocore.config import Config

# Set the read timeout to 600 seconds (10 minutes)
config = Config(read_timeout=600)

# Create the Bedrock client with the custom read timeout configuration
boto3_bedrock = boto3.client(service_name='bedrock-runtime', config=config)

When working with LLMs, the generated text is often substantial, leading to increased response times or even timeouts. Amazon Bedrock provides the ability to stream responses using InvokeModelWithResponseStream which allows you to process and consume the generated text in chunks as it becomes available. This enables a faster response to the client and allows at least a partial response even if a timeout occurs.

When using response streaming with Lambda functions you should set the boto3 read_timeout to a lower value than the function execution timeout, meaning you will have the option to return at least some content. In some situations this is preferred to no response at all. For example, you might set your Lambda function timeout to 2 minutes and your boto3 read timeout to 90 seconds. This gives you 30 seconds to take additional action. Depending on the failure scenario, you might take various actions:

Transient errors such as rate limiting or service quotas: Consider backing off and retrying the request or load-balancing requests to another region with cross-region inference.
Timeout errors when the boto3 read timeout is hit: Decide whether to retry the request with a simplified prompt (or a shorter response length) or return a partial response.
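
The partial-response behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper name collect_streamed_text, the 90-second read timeout, the "prompt" event key, and the "partial" field in the return value are hypothetical. The chunk parsing assumes the Anthropic Claude Messages API streaming format, where text arrives in content_block_delta events. Which exception class surfaces when a read timeout occurs mid-stream should be verified in your own environment, so the helper accepts the exception types as a parameter.

```python
import json


def collect_streamed_text(event_stream, timeout_errors=(TimeoutError,)):
    """Accumulate streamed text and return (text, completed)."""
    parts = []
    try:
        for event in event_stream:
            chunk = event.get("chunk")
            if chunk is None:
                continue  # ignore events that carry no payload
            payload = json.loads(chunk["bytes"])
            # Assumption: text is delivered in content_block_delta events.
            if payload.get("type") == "content_block_delta":
                parts.append(payload.get("delta", {}).get("text", ""))
    except timeout_errors:
        # The read timeout was hit mid-stream: keep what has arrived so far.
        return "".join(parts), False
    return "".join(parts), True


def lambda_handler(event, context):
    # Imported here so the helper above can be exercised without the AWS SDK.
    import boto3
    from botocore.config import Config
    from botocore.exceptions import ReadTimeoutError

    # Hypothetical values: 90-second read timeout inside a 2-minute function.
    brt = boto3.client("bedrock-runtime", config=Config(read_timeout=90))
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1000,
        "messages": [{
            "role": "user",
            "content": [{"type": "text", "text": event["prompt"]}],
        }],
    })
    response = brt.invoke_model_with_response_stream(
        body=body, modelId="anthropic.claude-3-sonnet-20240229-v1:0"
    )
    # Assumption: a mid-stream timeout surfaces as ReadTimeoutError.
    text, completed = collect_streamed_text(
        response["body"], timeout_errors=(ReadTimeoutError,)
    )
    return {"statusCode": 200, "body": text, "partial": not completed}
```

The caller can inspect the "partial" flag to decide whether to show the truncated text, retry with a simpler prompt, or fall back to another Region.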

Prompt chaining with AWS Step Functions

The direct Lambda pattern works well for simple single-prompt inference. Accomplishing complex tasks with LLMs requires a technique called prompt chaining, where tasks are broken down into smaller well-defined subtask prompts and each prompt is fed to the LLM in a defined order.

Prompt chaining inside a single Lambda function can be time consuming, and may exceed the maximum Lambda timeout of 15 minutes in some cases. AWS Step Functions can be used to solve this issue by orchestrating calls to LLMs. Bedrock has an optimized integration for Step Functions which allows you to use Run as Job (.sync). This integration pattern means Step Functions will wait for the InvokeModel request to complete before progressing to the next state. With Step Functions Standard Workflows you only pay for state transitions, which reduces the cost for Lambda idle wait time.

The example below shows prompt chaining with Step Functions using direct integrations only. The example eliminates the need for custom Lambda code.

Prompt chaining using AWS Step Functions

The user input (vehicle description) is passed to Amazon Bedrock via the Step Functions optimized integration.
The generated output of the InvokeModel API call is passed via the ResultPath to the next step.
The state machine sets the input of the next step based on the output of the previous step using the Pass state.
The output of each inference request continues to be passed between each step in the workflow.
The last step runs an inference request and the final result is returned as the output of the state machine.
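
The steps above can be sketched as a state machine definition, built here as a Python dictionary and serialized to Amazon States Language JSON. This is an illustrative sketch only: the state names, prompts, and the "spec" input field are hypothetical, and the resource ARN shown is the request-response form of the Bedrock optimized integration. Validate the payload template (in particular the "text.$" fields) against your own workflow before relying on it.

```python
import json

MODEL_ID = "anthropic.claude-3-sonnet-20240229-v1:0"


def bedrock_task(prompt_expression, result_path, next_state=None):
    """Build a Task state that calls InvokeModel through the optimized integration."""
    state = {
        "Type": "Task",
        "Resource": "arn:aws:states:::bedrock:invokeModel",
        "Parameters": {
            "ModelId": MODEL_ID,
            "Body": {
                "anthropic_version": "bedrock-2023-05-31",
                "max_tokens": 1000,
                "messages": [{
                    "role": "user",
                    # The ".$" suffix marks the value as an expression to evaluate.
                    "content": [{"type": "text", "text.$": prompt_expression}],
                }],
            },
        },
        # Keep the original input and add the model output under result_path.
        "ResultPath": result_path,
    }
    if next_state:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state


definition = {
    "StartAt": "GenerateTitle",
    "States": {
        "GenerateTitle": bedrock_task(
            "States.Format('Write a title for an advert for this vehicle: {}', $.spec)",
            "$.title_result",
            "PrepareDescriptionInput",
        ),
        # The Pass state reshapes the previous output into the next input.
        "PrepareDescriptionInput": {
            "Type": "Pass",
            "Parameters": {
                "spec.$": "$.spec",
                "title.$": "$.title_result.Body.content[0].text",
            },
            "Next": "GenerateDescription",
        },
        "GenerateDescription": bedrock_task(
            "States.Format('Write a short advert titled {} for this vehicle: {}', $.title, $.spec)",
            "$.description_result",
        ),
    },
}

asl_json = json.dumps(definition, indent=2)
```

The resulting asl_json string could be supplied as the definition when creating the state machine with your infrastructure as code tool of choice.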

Another advantage of using AWS Step Functions to invoke the LLM is the built-in error handling. Step Functions can be set up to automatically retry on error and allows you to configure a backoff rate and add jitter to help control throttling. No custom coding is required.

Built-in error handling options for an action in an AWS Step Functions workflow

Handling throttling is particularly important when you are approaching the Bedrock service quota limits, such as the number of requests processed per minute for a particular model. Be aware that some limits are hard limits and cannot be adjusted. See the Bedrock service quotas documentation for the latest information.
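
As a rough sketch of what such a retry configuration might look like, the snippet below builds a Retry entry for a Task state and computes the upper bound of the wait before each retry. The numeric values are hypothetical, and States.TaskFailed is used as a broad error matcher; a narrower throttling-specific error name would be preferable once you have confirmed it for your integration. With full jitter, the actual delay is randomized up to the computed bound.

```python
def retry_policy(interval_seconds=2, max_attempts=4, backoff_rate=2.0,
                 max_delay_seconds=30):
    """Build one entry for the Retry array of a Step Functions Task state."""
    return {
        "ErrorEquals": ["States.TaskFailed"],
        "IntervalSeconds": interval_seconds,  # wait before the first retry
        "MaxAttempts": max_attempts,          # number of retries
        "BackoffRate": backoff_rate,          # multiplier applied per retry
        "MaxDelaySeconds": max_delay_seconds, # cap on any single wait
        "JitterStrategy": "FULL",             # randomize waits to spread load
    }


def max_wait_schedule(policy):
    """Upper bound, in seconds, of the wait before each retry attempt."""
    return [
        min(policy["IntervalSeconds"] * policy["BackoffRate"] ** attempt,
            policy["MaxDelaySeconds"])
        for attempt in range(policy["MaxAttempts"])
    ]


policy = retry_policy()
task_state_fragment = {"Retry": [policy]}
schedule = max_wait_schedule(policy)
```

Computing the schedule up front is a quick way to check that the total worst-case wait still fits inside any timeout configured for the workflow.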

Parallel prompts with AWS Step Functions

The performance of the application can be improved by breaking down tasks into smaller sub-tasks and running them in parallel. This can dramatically decrease the overall response time, especially for larger models and complex prompts. In the following example, parallel processing reduced the total execution time of the state machine from 30.8 seconds to 19.2 seconds, an improvement of 37.7% when compared to the same steps run in sequence.

The example below uses the Step Functions Parallel state to perform Bedrock InvokeModel actions in parallel.

Prompt chaining example using parallel state in AWS Step Functions

The user input (vehicle description) is passed to Amazon Bedrock via the Step Functions optimized integration.
The Step Functions parallel state allows branching logic to perform multiple steps in parallel.
Complex inference tasks are run in parallel to reduce end-to-end execution time.
Shorter tasks can be combined to balance branch execution time with longer running tasks.
The generated output is combined and the final response returned.
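
A Parallel state along these lines could be sketched as below. The branch contents, state names, prompts, and the "spec" input field are all hypothetical. One branch holds a single longer task, while two shorter tasks are chained in the other branch to balance execution time.

```python
import json


def invoke_state(prompt_expression, next_state=None):
    """Task state calling Bedrock InvokeModel via the optimized integration."""
    state = {
        "Type": "Task",
        "Resource": "arn:aws:states:::bedrock:invokeModel",
        "Parameters": {
            "ModelId": "anthropic.claude-3-sonnet-20240229-v1:0",
            "Body": {
                "anthropic_version": "bedrock-2023-05-31",
                "max_tokens": 1000,
                "messages": [{
                    "role": "user",
                    "content": [{"type": "text", "text.$": prompt_expression}],
                }],
            },
        },
        # Keep the branch input so that chained tasks can still read $.spec.
        "ResultPath": "$.last_result",
    }
    if next_state:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state


def branch(steps):
    """Chain (state_name, prompt_expression) pairs into one sequential branch."""
    states = {}
    for index, (name, prompt) in enumerate(steps):
        following = steps[index + 1][0] if index + 1 < len(steps) else None
        states[name] = invoke_state(prompt, following)
    return {"StartAt": steps[0][0], "States": states}


parallel_state = {
    "Type": "Parallel",
    "Branches": [
        # One long-running task on its own branch.
        branch([("LongAdvert", "States.Format('Write a 500 word advert for: {}', $.spec)")]),
        # Two shorter tasks combined to balance branch duration.
        branch([
            ("Title", "States.Format('Write a title for: {}', $.spec)"),
            ("Tagline", "States.Format('Write a tagline for: {}', $.spec)"),
        ]),
    ],
    # The Parallel state outputs an array with one entry per branch.
    "ResultPath": "$.generated",
    "End": True,
}

parallel_json = json.dumps(parallel_state, indent=2)
```

Because each branch receives the same input, every branch can read the vehicle specification independently, and a later state can combine the entries of the output array into the final response.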

In addition to the Parallel state, the Step Functions Map state can be used to run the same action multiple times in parallel with different inputs. For example, if you wanted to generate marketing materials for 100 vehicles with data stored in Amazon S3, you could run the above workflow nested in a Distributed Map state.

Result caching

Generating text using LLMs can be a computationally intensive and time-consuming process, especially for complex prompts or long content generation. To improve performance and reduce latency, use caching where possible by storing and reusing previously generated responses. This concept is explored in detail in Mastering LLM Caching for Next-Generation AI.

Caching can be implemented at different levels within your application architecture, each with its own advantages and trade-offs. Here are some examples:

Caching inside the Lambda execution environment: if your Lambda function receives repeated prompts or inputs, you can store these results inside memory or the /tmp directory of a warmed execution environment.
External caching services: to overcome the limitations of in-memory caching and leverage more robust caching solutions, you can integrate with external services to store previous results like Amazon ElastiCache (for Redis or Memcached) or Amazon DynamoDB.
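
For the first option, a minimal sketch of caching inside the execution environment might look like the following. The names cache_key and cached_generate are hypothetical, and generate stands in for whatever function calls the model. The cache only survives for as long as the warm execution environment does, and it is not shared between concurrent environments.

```python
import hashlib

# Module-level state is reused across invocations of a warm environment.
_CACHE = {}


def cache_key(model_id, prompt):
    """Derive a fixed-length key from the model and the prompt text."""
    return hashlib.sha256(f"{model_id}\n{prompt}".encode("utf-8")).hexdigest()


def cached_generate(model_id, prompt, generate):
    """Return a cached response, calling generate only on a cache miss."""
    key = cache_key(model_id, prompt)
    if key not in _CACHE:
        _CACHE[key] = generate(model_id, prompt)
    return _CACHE[key]
```

Including the model ID in the key avoids returning a response generated by a different model for the same prompt.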

The example below uses a Step Functions workflow to check for a cached response in DynamoDB before invoking the model. The cache key in this case could be the LLM prompt. This helps to reduce costs whilst improving performance. The example generates custom vehicle descriptions based on a particular persona, for example to focus on safety features and luggage space for a family, or performance specifications for a motorsport enthusiast.

Example AWS Step Functions that uses Amazon DynamoDB to cache LLM responses

When implementing caching, it is crucial to consider factors such as cache invalidation strategies, cache size limitations, and data consistency requirements. For example, if your LLM generates dynamic or personalized content, caching may not be suitable, as the responses could be stale or incorrect for different users or contexts.
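
One simple invalidation strategy is a time-to-live (TTL) on each cached item. The sketch below assumes a table object that exposes get_item and put_item in the style of a boto3 DynamoDB Table resource. The attribute names prompt_hash, response, and expires_at, and the one-hour TTL, are hypothetical. The expiry is checked at read time because an expired item may still be returned by the table until it has been removed.

```python
import hashlib
import time


def get_or_generate(table, prompt, generate, ttl_seconds=3600, now=None):
    """Return (response, cache_hit), regenerating when the entry has expired."""
    now = int(time.time()) if now is None else now
    key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    item = table.get_item(Key={"prompt_hash": key}).get("Item")
    # Treat an expired item as a miss even if it is still stored.
    if item and int(item["expires_at"]) > now:
        return item["response"], True

    response = generate(prompt)
    table.put_item(Item={
        "prompt_hash": key,
        "response": response,
        "expires_at": now + ttl_seconds,  # epoch seconds, usable as a TTL attribute
    })
    return response, False
```

For personalized content, the persona or user context would need to be part of the hashed key so that one user's response is never served to another.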

Conclusion

This post explored integration patterns for consuming LLMs in serverless applications, enabling an efficient and reliable next generation experience for customers. Single-prompt inference can be achieved with AWS Lambda using the AWS SDK.

Responses from LLMs can be large, which often means manipulating large text responses in memory, especially for Retrieval-Augmented Generation (RAG) use cases. It’s therefore important to select an optimal memory configuration for your function, and the recommended way to do this is to use AWS Lambda Power Tuning.

When more complex prompt chaining is required it’s best practice to explore Step Functions as a way to reduce idle wait time and avoid being limited by the Lambda 15 minute timeout. Step Functions also bring the benefits of an optimized integration for Bedrock, as well as the ability to handle errors and run tasks in parallel.

Remember that model choice is also an important consideration to balance cost, performance and output capabilities. This is discussed further in Choose the best foundational model for your AI applications.

To find more serverless patterns using Amazon Bedrock take a look at Serverless Land.

Empowering builders: introducing the Dev Alliance and Workers Launchpad Cohort #4

2024-09-27 Melissa Kargiannakis

Post Syndicated from Melissa Kargiannakis original https://blog.cloudflare.com/launchpad-cohort4-dev-starter-pack

Today we’re announcing the Dev Starter Pack, an alliance of innovative tools for developers to get started with discounts and free services. We’re also excited to share an update on our Workers Launchpad Program.

Creating from the ground up often means spending countless hours piecing together the right development stack, navigating different pricing models, and managing growing costs — all of which can take your focus away from what truly matters: building your product and growing your business.

Introducing Dev Starter Pack: the tools you need to start building your startup

Hey! Dani Grant here, one of the first PMs at Cloudflare and co-founder of Jam.dev. Ten years ago (during 2014’s Birthday Week), Cloudflare launched Universal SSL, making SSL free on the Internet for the first time, and in one night doubling the size of the encrypted web.

I was a college student back then, and I immediately became enraptured by Cloudflare’s mission: helping build a better Internet. As part of this mission, Cloudflare has developed powerful tools typically accessible only to Internet giants, oftentimes offering them for free to developers and individuals alike. Heck yeah! I joined Cloudflare in January 2015, and 5 years after that, co-founded a developer tool company called Jam, inspired by the impact that I saw building tools for developers could have while at Cloudflare.

It’s now 10 years later, and a lot has changed –– “software ate the world” and it’s now powering all aspects of our lives, from health to finances to how we work. It’s more important than ever to empower every developer with the best tools available, because the faster we build software, the sooner people’s experiences improve.

Today we’re thrilled to announce the Dev Starter Pack, an alliance of like-minded dev tool companies giving away their services for free, or heavily discounting them for developers who want to start companies and build the future.

Not only does this stack include all the tools you need to build a startup, it also includes all the tools you need to build AI-powered features. We believe that the next wave of startups will be AI-native, as AI becomes as ubiquitous as the electricity that powers the servers.

We haven’t even scratched the surface of what’s possible with AI, and we hope this launch gets developers closer to solving the challenges of building non-deterministic software.

If you’re a software engineer, and you want to build a project or a company and need an off the shelf stack of dev tools to get started, go to devstarterpack.io to start using all of these tools.

Each provider is offering developers a heavily discounted or even free plan to get started building. You can redeem these services by either using the special code “devstarterpack” or selecting “Dev Starter Pack” while applying to relevant programs.

We welcome more tools to join the alliance — this is just the beginning. If you are building a developer tool and would like to include your product in the Dev Starter Pack, let us know here, so we can include you.

What will you build?

We are very excited to see what you will build. Please share with us in Cloudflare’s Discord and community forum, so we can support you however it makes sense.

Software developers are changing the world, and we believe in providing support to help you make an even greater impact. If you’re looking for additional funding or support, check out Cloudflare’s Launchpad for developers turned founders building startups.

Introducing Workers Launchpad Cohort #4

Melissa and Chris from the Cloudflare for Startups team here. Our team is blown away by what customers are demonstrating on the Developer Platform. Just a few weeks ago, our Workers Launchpad Cohort #3 wrapped up. On Demo Day, customers demoed their applications built on Cloudflare, spanning AI, dev tools, IaaS, observability, SaaS, media, and beyond. We’re incredibly proud of Cohort #3 participants, and we look forward to their continued success with Cloudflare.

Following Demo Day of Workers Launchpad Cohort #3, we’ve been excited to receive a surge of new applications from startups around the world. These startups are pushing the boundaries of innovation, particularly in areas like observability, PaaS, AI, automation, e-commerce, and other industries. Many startups that applied this go-around demonstrated that they’ve built some great applications on Cloudflare, and today, we’re excited to announce the accepted participants for our upcoming Workers Launchpad Cohort #4.

Let’s take a look at what Cohort #4 participants are building in their own words:

Adster	Hyperscale revenue powered by real-time data intelligence and AI
Almeta	Predict customer behavior on your website
Best Parents	Disruptive educational travel marketplace for Gen Z under 18
Comigo	Companion app to make therapy an engaging daily practice
Datastrato	A unified data catalog for generative AI infrastructure
Equimake	Create professional 3D projects without technical experience
Evefan	Your own Internet scale events infrastructure
Eventuall	Connecting stars with their fans in paid meet & greets and virtual experiences
Fermat	No-code solution to deploy AI models as internal tools
Fiberplane	Development tool that uses observability data to help test and debug APIs
Firetiger	An engineering observability tool that operates at scale inside customer infrastructure
Flightcast	Video-first podcast hosting & distribution
FlightLevel Technologies	AI Analytics and Footage in the aviation industry.
Gitlip	Powerful, collaborative and lightweight computing platform based on Git
GrackerAI	AI-powered organic growth engine for cybersecurity B2B SaaS
Hackernoon	Community-driven blogging network read by millions of technologists
Hanabi.REST	Prompt to REST API with AI-driven building, testing, and deployment
Infrastack	Next-gen application intelligence and observability platform for developers
June	AI productivity companion
Leed AI	Combined marketing workflows, website, and customer journey for a seamless, AI-accelerated experience
lookbk	Make the Internet more shoppable, starting with fashion on socials
Materialized Intelligence	Data-intensive inference solutions
Maxint	Multi-platform money management powered by AI
Midio	Visual tool to build software and AI agents
NikaPlanet	Transformative geospatial analytics experience with Google Colab, QGIS, ChatGPT, and Miro in one solution
NotHotDog	AI-Powered API Testing Tool
Outerbase	View, edit, query, and visualize your data with AI
Procureezy	AI procurement platform to empower hardware engineers to source smarter and launch sooner
Proma	Process management and automation platform to get work done fast
Render Better	Increase e-commerce revenue by optimizing your site speed, automatically
Sherpo	AI-first no-code platform to build and sell digital content
Speak_	AI platform to surface top talent by evaluating candidates against custom criteria
Tightknit	Embedded community engagement platform built for SaaS
Tinfoil	Powerful analytics with cryptographic privacy guarantees
Velvet	AI gateway to monitor, evaluate, and optimize features
Webstudio	An advanced visual site builder that connects to any headless CMS
Zipr	Streamlined visitor management

The Cloudflare team is ecstatic to work with the amazing participants of Cohort #4. If you want to follow along on Cohort #4’s journey, be sure to follow @CloudflareDev on X and join our Developer Discord server.

Are you a startup building on Cloudflare? Apply for Cohort #5!

Let’s Architect! Building multi-tenant SaaS systems

2024-09-26 Luca Mezzalira

Post Syndicated from Luca Mezzalira original https://aws.amazon.com/blogs/architecture/lets-architect-building-multi-tenant-saas-systems/

Software as a Service (SaaS) applications offer a transformative solution for businesses worldwide, delivering on-demand software solutions to a global audience. However, building a successful SaaS platform demands meticulous architectural planning, especially given the inherent challenges of multi-tenancy. It’s also essential to ensure that each tenant’s data remains isolated and protected from unauthorized access and that multi-tenant systems are cost-optimized and can sustain the scaling of the SaaS business provider.

In this blog post, we will explore some of the key elements and best practices for designing and deploying secure and efficient SaaS systems on AWS.

Building cost-optimized multi-tenant SaaS architectures

Cost is a key factor to consider when we design new systems. Multi-tenancy requires teams to think beyond the basics of auto scaling, adopting strategies that allow their architecture to support complex cost-scaling challenges. In this session, the speaker covers some design patterns for distributed systems to support the continually evolving scale needs of the environment, while optimizing the cost of the infrastructure.

Figure 1. The architectural model chosen for deploying multi-tenant systems—pooled, siloed, or mixed—significantly influences the cost-optimization strategy. Each approach offers distinct trade-offs in terms of resource allocation, scalability, and cost efficiency.

Take me to this video

Well-Architected SaaS Lens

The SaaS Lens for the AWS Well-Architected Framework empowers customers to assess and enhance their cloud-based architectures, fostering a deeper understanding of the business implications of their design choices. By bringing together technical leadership and diverse teams to discuss strategies for improving various aspects of the system, the AWS Well-Architected Framework facilitates collaborative decision-making. Moreover, the AWS account team can provide valuable support in conducting these assessments, offering expert guidance and insights. The AWS SaaS Lens specifically focuses on how to design, deploy, and architect multi-tenant SaaS application workloads within the AWS Cloud.

Figure 2. The microservices running in a multi-tenant environment must be able to reference and apply tenant context within each service. At the same time, it’s also our goal to limit the degree to which developers need to introduce any tenant awareness into their code.

Take me to this well-architected framework

SaaS anywhere: Designing distributed multi-tenant architectures

Not every SaaS provider has the luxury of running all the moving parts of their solution within their own infrastructure. SaaS teams might support a range of diverse system models, where architectures might include customer-hosted data, edge deployment for parts of the application, and on-premises components. In this session, you can learn the strategies to support the complexities of this distributed model without undermining the resilience, operational efficiency, and agility goals of your solution. The video covers how this influences the onboarding, deployment, and profile management of the SaaS environment.

Figure 3. In this architectural pattern, tenants are demanding to have the ML workload in their environment. So, the SaaS provider only manages the SaaS control plane where tenants deploy the application plane in their environment, including the ML workload and the necessary components around it.

Take me to this video

Deploying multi-tenant SaaS applications on Amazon ECS and AWS Fargate

Containers are frequently employed in multi-tenant SaaS environments to enhance scalability, isolation, and resource efficiency. Developing such systems requires addressing multiple challenges, including tenant isolation, tenant on-boarding, tenant-specific metering, monitoring, and other factors related to multi-tenancy. This session explores how to effectively manage all of these aspects when deploying solutions on AWS Fargate.

Figure 4. Microservices architecture can enhance security isolation by dividing applications into smaller, independent services, reducing the potential impact of a breach.

Take me to this video

AWS Serverless SaaS Workshop

Serverless helps to create multi-tenant architectures thanks to services like AWS Lambda that isolate your business logic per request, making them the perfect companion to run a SaaS platform. This workshop provides a hands-on introduction to creating serverless multi-tenant SaaS applications, helping you get started and gain practical experience.

Figure 5. This is the high-level architecture of the web application you will use in the AWS Serverless SaaS Workshop. In the labs, you will use this web application to add features that are needed to build this final SaaS application.

Take me to this workshop

See you next time!

Thanks for reading! Multi-tenant SaaS architectures require a careful design of your system. In this post, you have discovered key elements for properly designing your next SaaS workloads. In the next blog, we will talk about modern data architectures.

To revisit any of our previous posts or explore the entire series, visit the Let’s Architect! page.

Startup Program revamped: build and grow on Cloudflare with up to $250,000 in credits

2024-09-26 Christopher Rotas

Post Syndicated from Christopher Rotas original https://blog.cloudflare.com/startup-program-250k-credits

Today, we’re pleased to offer startups up to $250,000 in credits to use on Cloudflare’s Developer Platform. This new credits system will allow you to clearly see usage and associated fees to plan for a predictable future after the $250,000 in credits have been used up or after one year, whichever happens first.

You can see eligibility criteria and apply to the start-up program here.

What can you use the credits for?

Credits can be applied to all Developer Platform products, as well as Argo and Cache Reserve. Moreover, we provide participants with up to three Enterprise-level domains, which includes CDN, DDoS, DNS, WAF, Zero Trust, and other security and performance products that a participant can enable for their website.

Developer tools and building on Cloudflare

You can use credits for Cloudflare Developer Platform products, including those listed in the table below.

Note: credits for the Cloudflare Startup Program apply to Cloudflare products only; this table is illustrative of similar products in the market.

Speed and performance with Cloudflare

We know that founders need all the help they can get when starting their businesses. Beyond the Developer Platform, you can also use the Startup Program for our speed and performance products. Getting customers where they need to go within milliseconds on your website or application is the difference between closing a sale or not. You can test your speed here and learn how to optimize your speed and performance here with solutions like: Images, Argo, and Early Hints.

Security from Cloudflare

But, wait, there’s more: beyond the Developer Platform products and speed tools, you can also use Cloudflare’s many security features through the Startup Program as well. These include Web Application Firewall (WAF), DDoS Alerts, bundled protection plans, and more. The Startup Program also includes Zero Trust solutions. Learn how others are securing their technology and tools with Cloudflare Zero Trust.

For more inspiration, check out our Built with Cloudflare site, which highlights what other startups are building.

Who can use the credits?

Eligibility criteria can be found here and include:

Companies building a software-based product or service
Founded within the last 5 years (2019-2024)
Have between $50,000 and $5,000,000 in funding
- Note that for startups who have not yet raised at least $50,000, there may be other opportunities for lower credit amounts. Please apply with the promo code “BOOTSTRAPPED” if you haven’t raised $50,000 yet, but are interested in the Cloudflare Startup Program
Have a LinkedIn profile, valid website, and email address
Bonus criterion that adds to your application: being part of an approved accelerator

What will you build?

We’re excited to see what you will build. Please share what you’re up to with us so that we can help you however it makes sense. If you’re actively using Cloudflare’s Developer Platform, we’d love to hear more about what you’re building and share it on our Built with Cloudflare site.

Are you a startup looking for additional support, resources, or access to funding? Apply for our Workers Launchpad Program! The program runs for a few months, and in addition to the Startup Program, participants get access to hands-on bootcamp sessions, Solutions Architect office hours, introductions to VCs, and the opportunity to present at Demo Day.

Why does Cloudflare support founders and startups?

Founders and developers face enough challenges without having to worry about incurring egregious costs to test technology and start building in the earliest days. You have the world at your fingertips and should be empowered to build and create without limitations. Invest money in your innovation, not in the infrastructure and technology that supports it.

The Startup Program understands this founder experience deeply, as the team is made up of former founders. Cloudflare is committed to programs like this to empower founders building the next big thing. Offering up to $250,000 in credits will allow folks to leverage even more of what we have to offer: a developer experience that removes friction, saves money, and gets applications spun up in hours, not days.

We want to support founders from everywhere on earth.

Be bold and keep building! Follow @CloudflareDev and join our Developer Discord server.

Are you a startup building on Cloudflare? Apply here!

Building a three-tier architecture on a budget

2024-09-25 Adam Nemeth

Post Syndicated from Adam Nemeth original https://aws.amazon.com/blogs/architecture/building-a-three-tier-architecture-on-a-budget/

AWS customers often look for ways to run their systems within or under budget, avoiding unnecessary costs. This post offers practical advice on designing scalable and cost-efficient three-tier architectures by using serverless technologies within the AWS Free Tier.

With AWS, you can start small and scale cost-effectively as your business demand increases. You can begin with minimal investments by using the Free Tier to build a minimum viable product (MVP). Then you can expand resources as your user base grows and your needs evolve, and transition to a full-fledged, large-scale application.

In this blog post, you will learn how to build a three-tier architecture that predominantly relies on AWS service usage within the Free Tier, resulting in a highly affordable architecture.

Note: The Free Tier offerings mentioned within this blog post are subject to change. Always check the AWS Free Tier page for the most current information.

Background: Understanding the AWS Free Tier

The Free Tier provides users with access to a range of AWS services at no cost within predefined monthly usage limits. This offering helps users to run experimentation, development, and even production workloads without charges. The Free Tier is available for more than 100 AWS products today, including Amazon Simple Storage Service (Amazon S3), Amazon Elastic Compute Cloud (Amazon EC2), and Amazon Relational Database Service (Amazon RDS). Depending on the product, there are three types of Free Tier offers:

Free trials – Short-term trial offers that start when the first usage begins. After the trial period expires, you pay standard service rates.
12 months free – Offers available to new AWS customers for 12 months following their sign-up date. After the 12-month free term expires, you pay standard service rates.
Always free – Offers available to both existing and new AWS customers indefinitely.

For example, the AWS Lambda Free Tier includes one million free requests per month and 400,000 GB-seconds of compute time per month usable for functions across both x86 and AWS Graviton processors. The AWS Lambda Free Tier falls under the always free category.
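To make the GB-seconds unit concrete, here is a minimal sketch that estimates whether a workload stays inside the Lambda allowances quoted above. The function name and the sample workload are hypothetical, and the limits are the figures from this post, which may change.

```javascript
// Always-free monthly allowances quoted above (subject to change).
const FREE_REQUESTS = 1000000;
const FREE_GB_SECONDS = 400000;

// Compute usage in GB-seconds:
// invocations x average duration (seconds) x configured memory (GB).
function lambdaMonthlyUsage(invocations, avgDurationMs, memoryMb) {
  const gbSeconds = (invocations * avgDurationMs * memoryMb) / (1000 * 1024);
  return {
    gbSeconds,
    withinFreeTier: invocations <= FREE_REQUESTS && gbSeconds <= FREE_GB_SECONDS,
  };
}

// Hypothetical workload: 500,000 invocations, 200 ms each, 512 MB of memory.
console.log(lambdaMonthlyUsage(500000, 200, 512));
// gbSeconds: 50000, withinFreeTier: true
```

Because the compute allowance is measured in GB-seconds, doubling the configured memory halves the number of seconds that fit inside it.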

Walkthrough: Three-tier architecture on AWS

Cost efficiency is a prominent advantage of using AWS serverless services. These services decrease the need for provisioning and managing servers, reducing operational overhead and labor costs. Serverless services like AWS Lambda and Amazon API Gateway use a pay-as-you-go model. This way, you only pay for the resources you consume, providing significant savings compared to maintaining idle infrastructure. Serverless technologies also feature automatic scaling and built-in high availability to increase agility and optimize costs.

A three-tier architecture is a popular implementation of a multi-tier architecture and consists of a presentation tier, business logic tier, and data tier. A three-tier architecture separates an application’s functionality into distinct layers (presentation, business logic, and data) to enable scalability, modularity, and flexibility in software development. This type of architecture is suitable for building a wide range of applications such as web applications, enterprise systems, and mobile apps.

The following image is an example of a three-tier architecture fully built with AWS serverless services. In this example, users authenticate through Amazon Cognito and navigate to the website in the presentation tier. They call APIs, which invoke Lambda functions at the business logic tier. Data is stored in DynamoDB at the data tier.

Figure 1: Example of a three-tier architecture on AWS

In the following sections, we explore how to use AWS serverless services within the Free Tier to build a similar architecture.

Presentation tier

The presentation tier is where your users interact with your offering, such as a webpage or an app. You can use the following services within the Free Tier to build your presentation tier.

AWS service	How you can use it	Free Tier details*
Amazon S3	Host static and dynamic assets, like a React Single Page Application (SPA), and distribute them to your end users.	For the first year, you get 5 GB of standard storage, 20,000 GET requests and 2,000 PUT requests. See Amazon S3 pricing for details.
Amazon CloudFront	Use with Amazon S3 for a faster and more performant distribution of your assets to end users. CloudFront gives you access to the AWS content delivery network with more than 410 points of presence worldwide.	CloudFront includes an Always Free Tier, with 1 TB of data transfer out to the internet per month and 10 million HTTP(S) requests per month. See Amazon CloudFront pricing for details.
Amazon Cognito	Use Amazon Cognito user pools to authenticate your users. You can also integrate Amazon Cognito within your application’s UI for a seamless login experience.	Amazon Cognito has an Always Free Tier, including up to 50,000 monthly active users. It also includes 10 GBs of cloud sync storage and 1 million sync operations per month, valid for the first 12 months after sign-up. See Amazon Cognito pricing for details.

*as of September 2024

Business logic tier

The business logic tier is where code translates user actions to application functionality. You can use the following services within the Free Tier to build your business logic tier.

AWS service	How you can use it	Free Tier details*
Amazon API Gateway	Build a front door to your application’s backend by creating REST or WebSocket APIs.	Get 1 million monthly API calls for free, valid for the first 12 months after sign-up. See Amazon API Gateway pricing for details.
AWS Lambda	Use AWS Lambda for a serverless compute environment that integrates with API Gateway. You can embed your business logic into functions that run on AWS without the need for you to run and manage infrastructure.	The Always Free Tier offers 1 million free requests and 400,000 GB-seconds of compute time per month. See AWS Lambda pricing for details.

*as of September 2024

Data tier

The data tier is where your data is stored. You can use the following service within the Free Tier to build your data tier.

AWS service	How you can use it	Free Tier details*
Amazon DynamoDB	Use this serverless NoSQL database for storing data and tracking transactions. In the context of a three-tier architecture, DynamoDB stores and manages the application’s data, providing reliable and secure data access to the business logic tier.	The Always Free Tier offers 25GB of free storage, with up to 200 million requests, 25 write capacity units (WCUs), and 25 read capacity units (RCUs) per month.

*as of September 2024

Walkthrough: Monitoring your usage to avoid unexpected charges

If you use consolidated billing or AWS Organizations, the Free Tier usage accumulates at the management account level. Each management account receives one quota of the Free Tier.

To monitor your Free Tier usage and avoid unexpected charges, you can use the following resources:

See the Free Tier page in the AWS Billing and Cost Management console. The Free Tier page provides detailed insights into current usage per service, Region, and type.
Call the GetFreeTierUsage API to retrieve your Free Tier usage programmatically. See Using the Free Tier API for instructions.
Set up AWS Free Tier usage alert emails.

The following image shows an example of the Free Tier page in the AWS Billing and Cost Management console.

Free Tier dashboard showing usage limit, current usage, and forecasted usage for in-scope services.

Figure 2: AWS Free Tier view in the Cost and Usage Report

We also recommend configuring a zero spend budget within the AWS Billing and Cost Management console. With this budget, you receive notifications when your usage exceeds the Free Tier limits, helping you to avoid unexpected charges. The following image shows an example of this budget setup.

Choose budget type menu with Use a template and Zero spend budget both selected.

Figure 3: Zero spend budget

Conclusion

In this post, we explored how to use AWS serverless services within the Free Tier to build a three-tier application. We also explored how to monitor your Free Tier usage. The Free Tier offers a chance to experiment and develop without additional costs, helping businesses minimize infrastructure expenses early on. AWS serverless architectures bring benefits like cost savings, flexibility, and scalability.

Beyond using services within the Free Tier, you can further optimize the cost of your AWS serverless application. For instance, to prevent incurring unnecessary inter-Region data transfer costs, we recommend starting with a single Region deployment for your application.

To learn more about the Free Tier and which services it offers, check out the AWS Free Tier FAQs.

Additionally, you can explore various architectural patterns for AWS serverless multi-tier architectures and the Serverless Full Stack WebApp Starter Kit to create scalable and cost-effective solutions on AWS.

Efficiently processing batched data using parallelization in AWS Lambda

2024-08-28 Chris Munns

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/efficiently-processing-batched-data-using-parallelization-in-aws-lambda/

This post is written by Anton Aleksandrov, Principal Solutions Architect, AWS Serverless

Efficient message processing is crucial when handling large data volumes. By employing batching, distribution, and parallelization techniques, you can optimize the utilization of resources allocated to your AWS Lambda function. This post will demonstrate how to implement parallel data processing within the Lambda function handler, maximizing resource utilization and potentially reducing invocation duration and function concurrency requirements.

Overview

AWS Lambda integrates with various event sources, such as Amazon SQS, Apache Kafka, or Amazon Kinesis, using event-source mappings. When you configure an event-source mapping, Lambda continuously polls the event source and automatically invokes your function to process the retrieved data. Lambda makes more invocations of your function as the number of messages it reads from the event source increases. This can increase the utilized function concurrency and consume the available concurrency in your account. To learn more, see the Lambda documentation on consuming messages from SQS queues and Kinesis streams.

To improve the data processing throughput, you can configure event-source mapping batch window and batch size. These settings ensure that your function is invoked only when a sufficient number of messages have accumulated in the event source. For example, if you configure a batch size of 100 messages and a batch window of 10 seconds, Lambda will invoke your function when either 100 messages have accumulated or 10 seconds have elapsed, whichever happens first.

Event source mapping event batching

By processing messages in batches, rather than individually, you can improve throughput and optimize costs by reducing the number of polling requests to event sources and the number of function invocations. For instance, processing a million messages without batching would require one million function invocations, but configuring a batch size of 100 messages can reduce the number of invocations to 10,000.
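The batching arithmetic and the "whichever happens first" rule can be sketched as follows. This is a toy model with hypothetical function names, not the event-source mapping's actual implementation; in particular, it ignores the invocation payload size limit, which can also cause a batch to be delivered early.

```javascript
// Number of invocations needed when messages are delivered in full batches.
function invocationsNeeded(totalMessages, batchSize) {
  return Math.ceil(totalMessages / batchSize);
}

// Toy model of the batching rule: invoke once the batch is full or the
// batch window has elapsed, whichever happens first.
function shouldInvoke(bufferedMessages, elapsedSeconds, batchSize, batchWindowSeconds) {
  if (bufferedMessages === 0) return false; // nothing to deliver yet
  return bufferedMessages >= batchSize || elapsedSeconds >= batchWindowSeconds;
}

console.log(invocationsNeeded(1000000, 1));   // 1000000
console.log(invocationsNeeded(1000000, 100)); // 10000
console.log(shouldInvoke(100, 3, 100, 10));   // true: batch is full
console.log(shouldInvoke(40, 10, 100, 10));   // true: window elapsed
console.log(shouldInvoke(40, 3, 100, 10));    // false: keep buffering
```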

Optimizing batch processing within the Lambda execution environment

Each Lambda execution environment processes one event per invocation. With batching enabled, the event object Lambda sends to the function handler contains an array of messages retrieved and batched by the event-source mapping. Once an execution environment starts processing an event object containing a batch of messages, it won’t handle additional invocations until the current one is complete. However, simply iterating over the array of messages and processing them one by one may not fully utilize the allocated compute resources. This can lead to underutilized or idle compute resources, like CPU capacity, and hence longer overall processing times.

Underutilized Lambda environments

Underutilization of compute resources can be generally caused by two things – non-CPU-intensive blocking tasks, such as sending HTTP requests and waiting for responses, and single-threaded processing when you have more than one vCPU core. To address these concerns and maximize resource utilization, you can implement your functions to process data in parallel. This allows more efficient utilization of the allocated compute capacity, reducing invocation duration, time spent idle, and the total concurrency required. In addition, when you allocate more than 1.8GB of memory to your function, it also gets more than one vCPU, which allows threads to land on separate cores for even better performance and true parallel processing.

Improved concurrency in Lambda environment

When processing messages sequentially with a low compute utilization rate, reducing memory allocation may seem intuitive to save costs. This, however, can result in slower performance due to less CPU capacity being allocated. When your function is parallelizing data processing within the execution environment, you’re getting a higher compute utilization rate, and since raising the memory allocation also provides additional CPU capacity, it can lead to better performance. Use the Lambda Power Tuning tool to find the optimal memory configuration, balancing cost with performance.

Understanding the Lambda execution environment lifecycle

After processing an invocation, the Lambda execution environment is “frozen” by the Lambda service. Lambda runtime considers the invocation complete and “freezes” the execution environment when your function handler returns.

When the Lambda service is looking for an execution environment to process a new incoming invocation, it will first try to “thaw” and use any available execution environments that were previously “frozen”. This cycle repeats until the execution environment is eventually shut down.

Lambda worker lifecycle over time

Implementing parallel processing within the Lambda execution environment

You can implement parallel processing by running multiple threads in your function handler, but if those threads are still running when the handler returns, then they will be “frozen” together with the execution environment until the next invocation. This can lead to unexpected behavior, where the execution environment is “thawed” to process a new invocation, however, it still has background threads running and processing data from previous invocations. If you do not handle this properly, the behavior can cascade across multiple invocations, leading to delayed or unfinished processing and complicated debugging.

Threads frozen before finishing

To address this concern, you need to ensure that the background threads you spawn in the function handler are done processing data before returning from the handler. All threads spawned within a particular invocation must complete within the same invocation in order not to spill over to subsequent invocations. This is illustrated in the following diagram. You can see threads start and end within the same invocation, and only once all threads have finished, the function handler returns.

Threads returning before end of invoke

Sample code

Programming languages offer diverse techniques and terminology for parallel and concurrent processing. Java employs multi-threading and thread pools. Node.js, though single-threaded, provides an event loop and promises (for async programming), as well as child processes and worker threads (for actual multi-threading). Python supports both multi-threading (subject to the Global Interpreter Lock) and multi-processing. Concurrent routines are another technique gaining attention.

The following sample is provided for illustration purposes only and is based on Node.js promises running concurrently. The sample code uses a language-agnostic term “worker” to denote a unit of parallel processing. Your specific parallelization implementation depends on your choice of runtime language and frameworks. AWS recommends you use battle-tested frameworks like Powertools for AWS Lambda that implement concurrent batch processing when possible. Regardless of the programming language, it is crucial to ensure all background threads/workers/promises/routines/tasks spawned by the function handler are completed within the same invocation before the handler returns.

Sample implementation with Node.js

const NUMBER_OF_WORKERS = 4;

export const handler = async (event) => {
    const workers = []; 
    const messages = event.Records;
    
    // For handling partial batch processing errors
    const batchItemFailures = [];

    for (let i=0; i<NUMBER_OF_WORKERS;i++){
        // No await here! The waiting will happen later
        const worker = spawnWorker(i, messages, batchItemFailures);
        workers.push(worker);
    }
    
    // This line is crucial. This is where the handler
    // waits for all workers to complete their tasks
    const processingResults = await Promise.allSettled(workers);
    console.log('All done!');

    // Return messageIds of all messages that failed 
    // to process in order to retry
    return {batchItemFailures};
};

async function spawnWorker(id, messages, batchItemFailures){
    console.log(`worker.id=${id} spawning`);
    while (messages.length>0){
        const msg = messages.shift();
        console.log(`worker.id=${id} processing message`);
        try {
            // A blocking, but not CPU-intensive operation 
            await processMessage(msg);
        } catch (err){
            // If message processing failed, add it to 
            // the list of batch item failures
            batchItemFailures.push({ itemIdentifier: msg.messageId});
        }
    }
}

See the sample code and AWS Cloud Development Kit (CDK) stack at github.com.

Testing results

The following chart illustrates a Lambda function processing messages using an SQS event-source mapping. After enabling message processing with four workers, the invocation duration and concurrent executions dropped to a quarter of their previous values, while the function still processed the same number of messages per second. Thanks to parallelization, the new function is faster and requires less concurrency.

Function performance dashboard

Looking at the invocation log, you can see that the function handler has spawned four workers, and all of them were completed before the handler returned the result. You can also see that although the handler received 20 items, with each item taking 200ms to process, the overall duration is only 1000ms. This is because items were processed in parallel (20 items * 200ms / 4 workers = 1000ms total processing time).

START RequestId: (redacted)  Version: $LATEST
2024-06-18T03:18:03.049Z    INFO    Got messages from SQS
2024-06-18T03:18:03.049Z    INFO    messages.length=20
2024-06-18T03:18:03.049Z    INFO    worker.id=0 spawning
2024-06-18T03:18:03.049Z    INFO    worker.id=0 processing message
2024-06-18T03:18:03.049Z    INFO    worker.id=1 spawning
2024-06-18T03:18:03.049Z    INFO    worker.id=1 processing message
2024-06-18T03:18:03.050Z    INFO    worker.id=2 spawning
2024-06-18T03:18:03.050Z    INFO    worker.id=2 processing message
2024-06-18T03:18:03.050Z    INFO    worker.id=3 spawning
2024-06-18T03:18:03.050Z    INFO    worker.id=3 processing message
2024-06-18T03:18:03.250Z    INFO    worker.id=0 processing message
2024-06-18T03:18:03.250Z    INFO    worker.id=1 processing message
(redacted for brevity)
2024-06-18T03:18:03.852Z    INFO    worker.id=1 processing message
2024-06-18T03:18:03.852Z    INFO    worker.id=2 processing message
2024-06-18T03:18:03.852Z    INFO    worker.id=3 processing message
2024-06-18T03:18:04.052Z    INFO    All done!
END RequestId: (redacted)
REPORT RequestId: (redacted) Duration: 1004.48 ms
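The duration arithmetic quoted before the log can be sketched as a small helper. This is an idealized estimate with a hypothetical function name; it assumes every message takes the same time to process and that workers spend that time waiting rather than competing for CPU.

```javascript
// Idealized wall-clock time for a batch when each worker handles one
// message at a time and every message takes the same time to process.
function estimateBatchDurationMs(itemCount, perItemMs, workerCount) {
  const roundsPerWorker = Math.ceil(itemCount / workerCount);
  return roundsPerWorker * perItemMs;
}

console.log(estimateBatchDurationMs(20, 200, 1)); // 4000 (sequential)
console.log(estimateBatchDurationMs(20, 200, 4)); // 1000 (four workers)
```

The measured 1004.48 ms in the log is close to this estimate; the small difference is overhead outside the simulated 200 ms of work per item.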

Considerations

The technique and samples described in this post assume unordered message processing. In case you use ordered event sources, such as SQS FIFO Queues, and require preserving message order, you will need to address that in your implementation code. One technique might be creating a separate thread for each messageGroupId.
While providing performance and cost benefits, multi-threading and parallel processing is an advanced technique that requires proper error handling. Lambda supports partial batch responses, where you can report back to the event source that specific messages from the batch failed to be processed so they can be retried. You can collect failed message IDs from each thread and return them as your function handler response. This is illustrated in the sample above. See Handling errors for an SQS event source in Lambda and Best Practices for implementing partial batch responses for additional details.
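As a minimal sketch of the per-group technique mentioned above, the following helper groups the records of an SQS FIFO batch by message group. The function name and the sample records are hypothetical; the records are trimmed to the fields used here, with the group ID read from each record's attributes.MessageGroupId field.

```javascript
// Group SQS records by message group so that each group can be handled by
// its own sequential worker while different groups run concurrently.
function groupByMessageGroupId(records) {
  const groups = new Map();
  for (const record of records) {
    const groupId = record.attributes.MessageGroupId;
    if (!groups.has(groupId)) groups.set(groupId, []);
    groups.get(groupId).push(record); // preserves arrival order per group
  }
  return groups;
}

// Hypothetical records, trimmed to the fields used here.
const sampleRecords = [
  { messageId: "m1", attributes: { MessageGroupId: "user-a" } },
  { messageId: "m2", attributes: { MessageGroupId: "user-b" } },
  { messageId: "m3", attributes: { MessageGroupId: "user-a" } },
];

for (const [groupId, group] of groupByMessageGroupId(sampleRecords)) {
  console.log(groupId, group.map((r) => r.messageId));
}
```

Each group's array could then be handed to one worker that awaits its messages one at a time, with all workers awaited before the handler returns, as in the sample implementation.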

Conclusion

Efficiently processing large volumes of data implies efficient resource utilization. When processing batches of messages from event sources, validate whether your function would benefit from parallel or concurrent processing within the function handler thus increasing the compute capacity utilization rate. With a high compute capacity utilization rate, you can allocate more memory to your function, thus getting more CPU allocated as well, for faster and more efficient processing. Use frameworks like Powertools for AWS Lambda that implement concurrent batch processing when possible, and use the Lambda Power Tuning tool to find the best memory configuration for your functions, balancing performance and cost.

For more serverless learning resources, visit Serverless Land.

How Wesfarmers Health implemented upstream event buffering using Amazon SQS FIFO

2024-08-22 Robbie Cooray

Post Syndicated from Robbie Cooray original https://aws.amazon.com/blogs/architecture/how-wesfarmers-health-implemented-upstream-event-buffering-using-amazon-sqs-fifo/

Customers of all sizes and industries use Software-as-a-Service (SaaS) applications to host their workloads. Most SaaS solutions take care of maintenance and upgrades of the application for you, and get you up and running in a relatively short timeframe. Why spend time, money, and your precious resources to build and maintain applications when this could be offloaded?

However, working with SaaS solutions can introduce new requirements for integration. This blog post shows you how Wesfarmers Health was able to introduce an upstream architecture using serverless technologies in order to work with integration constraints.

At the end of the post, you will see the final architecture and a sample repository for you to download and adjust for your use case.

Let’s get started!

Consent capture problem

Wesfarmers Health used a SaaS solution to capture consent. When capturing consent for a user, order guarantee and delivery semantics become important. Failure to correctly capture consent choice can lead to downstream systems making non-compliant decisions. This can end up in penalties, financial or otherwise, and might even lead to brand reputation damage.

In Wesfarmers’ case, the integration options supported neither a queue with an order guarantee nor exactly-once processing. This meant that, with enough load and chance, a user’s preference might be captured incorrectly. Let’s look at two scenarios where this could happen.

In both of these scenarios, the user makes a choice and quickly changes their mind. These are considered two discrete events:

Event 1 – User confirms “yes.”
Event 2 – User then quickly changes their mind to confirm “no.”

Scenario 1: Incorrect order

In this scenario, two events end up in a queue with no order guarantee. Event 2 might be processed before Event 1, so although the user provided a “no,” the system has now captured a “yes.” This is now considered a non-compliant consent capture.

Figure 1. Animation showing messages processed in the wrong order

Scenario 2: Events processed multiple times

In this scenario, perhaps due to load, Event 1 is transmitted twice because of at-least-once processing: once before and once after Event 2. The user’s record could then be updated three times: first with Event 1 (“yes”), then with Event 2 (“no”), and finally with the retransmitted Event 1 (“yes”). The record ultimately ends up as “yes,” which is also considered a non-compliant consent capture.

Figure 2. Animation showing messages processed multiple times
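Both scenarios can be reproduced with a toy model in which the consent store simply keeps the last value written. The function and event shapes below are hypothetical and only illustrate the failure modes; they do not represent the SaaS integration itself.

```javascript
// Toy model: the consent store keeps whatever value was written last.
function finalConsent(deliveredEvents) {
  let consent = null;
  for (const event of deliveredEvents) consent = event.choice;
  return consent;
}

const event1 = { id: 1, choice: "yes" };
const event2 = { id: 2, choice: "no" };

console.log(finalConsent([event1, event2]));         // no  (intended outcome)
console.log(finalConsent([event2, event1]));         // yes (Scenario 1: wrong order)
console.log(finalConsent([event1, event2, event1])); // yes (Scenario 2: duplicate delivery)
```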

How did Amazon SQS and Amazon DynamoDB help with order?

With Amazon Simple Queue Service (Amazon SQS), queues come in two flavors: standard and first-in-first-out (FIFO). Standard queues provide best-effort ordering and at-least-once processing with high throughput, whereas FIFO queues preserve order and process each message exactly once with relatively low throughput, as shown in Figure 3.

Figure 3. Animation showing FIFO queue processing in the correct order

In Wesfarmers Health’s scenario with relatively few events per user, it made sense to deploy a FIFO queue to deliver messages in the order they arrived and also have them delivered once for each event (see more details on quotas at Amazon SQS FIFO queue quotas).

Wesfarmers Health also employed the use of message group IDs to parallelize all users using a unique userID. This means that they can guarantee order and exactly-once processing at the user level, while processing all users in parallel, as shown in Figure 4.

Figure 4. Animation showing a FIFO queue partitioned per user, in the correct order per user

The buffer implementation

Wesfarmers Health also opted to buffer messages for the same user in order to minimize race conditions. This was achieved by employing an Amazon DynamoDB table to capture the timestamp of the last message that was processed. For this, Wesfarmers Health designed the DynamoDB table shown in Figure 5.

Figure 5. Example DynamoDB schema with messageGroupId based on user, and TTL

The messageGroupId value corresponds to a unique identifier for a user. The time-to-live (TTL) value serves dual functions. First, the TTL is the value of the Unix timestamp for the last time a message from a specific user was processed, plus the desired message buffer interval (for example, 60 seconds). It also serves a secondary function of allowing DynamoDB to remove obsolete entries to minimize table size, thus improving cost for certain DynamoDB operations.

In between the Amazon SQS FIFO queue and the Amazon DynamoDB table sits an AWS Lambda function that listens to all events and transmits them to the downstream SaaS solution. The main responsibility of this Lambda function is to check the DynamoDB table for the user’s last processed timestamp before processing the event. If an event for the same user was already processed within the buffer interval, the new event is sent back to the queue with a visibility timeout that matches the interval, so that events for that user are not processed until the buffer interval has passed.

Figure 6. Amazon DynamoDB table and AWS Lambda function introducing the buffer
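The buffering decision described above can be sketched as a pure function. This is an illustration under stated assumptions, not Wesfarmers Health's code: the function name, the return shape, and the 60-second interval (taken from the example in the text) are hypothetical, and the DynamoDB and SQS calls are left out.

```javascript
const BUFFER_INTERVAL_SECONDS = 60; // example interval from the text

// ttlSeconds: stored TTL for the user (last processed time + buffer interval),
// or undefined when the table has no entry for the user.
function decide(nowSeconds, ttlSeconds) {
  if (ttlSeconds !== undefined && nowSeconds < ttlSeconds) {
    // Still inside the buffer interval: return the message to the queue.
    return { action: "delay", visibilityTimeoutSeconds: BUFFER_INTERVAL_SECONDS };
  }
  // Outside the interval: process the event and record the new TTL.
  return { action: "process", newTtlSeconds: nowSeconds + BUFFER_INTERVAL_SECONDS };
}

console.log(decide(1000, undefined)); // process, new TTL 1060
console.log(decide(1030, 1060));      // delay for 60 seconds
console.log(decide(1060, 1060));      // process, new TTL 1120
```

Keeping the decision separate from the I/O makes the buffering rule straightforward to unit test before wiring it to the table and the queue.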

Final architecture

Figure 7 shows the high-level architecture diagram that powers this integration. When users send their consent events, the events are sent to the SQS FIFO queue first. The AWS Lambda function determines, based on the timestamp stored in the DynamoDB table, whether to process or delay each message. Once the outcome is determined, the function passes the event downstream.

Figure 7. Final architecture diagram

Why serverless services were used

The Wesfarmers Health Digital Innovations team is strategically aligned towards a serverless first approach where appropriate. This team builds, maintains, and owns these solutions end-to-end. Using serverless technologies, the team gets to focus on delivering business outcomes while leaving the undifferentiated heavy lifting of managing infrastructure to AWS.

In this specific scenario, the number of requests for consent is sporadic. With serverless technologies, you pay as you go. This makes them a good fit for workloads whose request volume fluctuates throughout the day, giving the customer a cost-efficient option.

The team at Wesfarmers Health has been on the serverless journey for a while and is quite mature in developing and managing these workloads in a production setting, using the best practices mentioned above and employing the AWS Well-Architected Framework to guide their solutions.

Conclusion

SaaS solutions are a great mechanism to move fast and reduce the undifferentiated heavy lifting of building and maintaining solutions. However, integrations play a crucial part as to how these solutions work with your existing ecosystem.

Using AWS services, you can build integration patterns that are fit for purpose for your unique requirements.

AWS Serverless Patterns is a great place to get started to see what other patterns exist for your use case.

Next steps

Check out the repository hosted on AWS Patterns that sets up this architecture. You can review, modify, and extend it for your own use case.

AWS Lambda introduces recursive loop detection APIs

2024-08-20 Julian Wood

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/aws-lambda-introduces-recursive-loop-detection-apis/

This post is written by James Ngai, Senior Product Manager, AWS Lambda, and Aneel Murari, Senior Specialist SA, Serverless.

Today, AWS Lambda is announcing new recursive loop detection APIs that allow you to set recursive loop detection configuration on individual Lambda functions. This allows you to turn off recursive loop detection on functions that intentionally use recursive patterns, avoiding disruption of these workloads. You can use these APIs to avoid disruption to any intentionally recursive workflows as Lambda expands support of recursive loop detection to other AWS services.

Overview

AWS Lambda functions are triggered in response to events generated by various AWS services. These Lambda functions may interact with other AWS services by invoking the corresponding service APIs. Typically, the service and resource that generates the triggering event is distinct from the service and resource that the Lambda function calls. However, due to coding errors or configuration issues, there may be situations where these two resources are the same, leading to an infinite or recursive loop. Such misconfigurations can result in runaway workloads, which can incur unplanned usage and charges to your AWS account. For example, a Lambda function processes messages from an Amazon Simple Notification Service (SNS) topic but then puts the resulting notification back to the same SNS topic. This causes an infinite loop.

Lambda provides a built-in preventative guardrail that detects and stops functions running in a recursive or infinite loop between Lambda, Amazon Simple Queue Service (SQS), and SNS. This feature, known as recursive loop detection, is enabled by default for all Lambda functions. This serves as a protective mechanism against unintended usage and unexpected billing from runaway workloads.

Lambda uses an AWS X-Ray trace header primitive called “lineage” to track the number of times a function has been invoked with an event. When your function code sends an event using a supported AWS SDK version, Lambda increments the counter in the lineage header. If your function is then invoked with the same triggering event more than 16 times, Lambda stops the next invocation for that event and emits an Amazon CloudWatch RecursiveInvocationsDropped metric. If the function is invoked synchronously, Lambda returns a RecursiveInvocationException to the caller. For asynchronous invocations, Lambda sends the event to a dead-letter queue or on-failure destination if one is configured.
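The counting behavior described above can be illustrated with a toy simulation. This is not Lambda's implementation: the function name, the return shape, and the exact boundary (16 invocations run, the 17th is stopped) are assumptions made for illustration.

```javascript
const MAX_INVOCATIONS_PER_EVENT = 16; // threshold quoted above

// Returns how many invocations run before the chain is stopped.
function simulateRecursiveChain(attemptedInvocations, detection = "Terminate") {
  let lineageCount = 0; // stand-in for the counter carried in the lineage header
  let completed = 0;
  for (let i = 0; i < attemptedInvocations; i++) {
    if (detection === "Terminate" && lineageCount >= MAX_INVOCATIONS_PER_EVENT) {
      return { completed, dropped: true }; // the next invocation is stopped
    }
    completed++;
    lineageCount++; // incremented when the function sends the event onward
  }
  return { completed, dropped: false };
}

console.log(simulateRecursiveChain(100));          // completed: 16, dropped: true
console.log(simulateRecursiveChain(10));           // completed: 10, dropped: false
console.log(simulateRecursiveChain(100, "Allow")); // completed: 100, dropped: false
```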

You do not need to configure active X-Ray tracing for this feature to work. For more information on this feature and an example scenario, please refer to Detecting and stopping recursive loops in AWS Lambda functions.

Although AWS generally discourages this practice due to the possibility of runaway workloads, some customers intentionally employ recursive patterns in their workflows. Previously, customers running workloads that intentionally use recursive patterns could only opt out of recursive loop detection on a per-account basis by contacting AWS Support. With these new APIs, customers can selectively opt out of recursive loop detection on individual functions while maintaining this preventative guardrail for the remaining functions in their account that do not use recursive code.

Today we are introducing two new API actions for recursive loop detection:

GetFunctionRecursionConfig returns details about a function’s recursive loop detection configuration.
PutFunctionRecursionConfig sets the recursive loop detection configuration for a function. By default, recursive loop detection is turned ON for all functions.

How to use the new recursive loop detection APIs

You can configure recursive loop detection for Lambda functions through the Lambda Console, the AWS CLI, or Infrastructure as Code tools like AWS CloudFormation, AWS Serverless Application Model (AWS SAM), or AWS Cloud Development Kit (CDK). This new configuration option is supported in AWS SAM CLI version 1.123.0 and CDK v2.153.0.

If you turn recursive loop detection off for a function, the metric for RecursiveInvocationsDropped is no longer emitted for that function.

Turning off recursive loop detection on your function means that Lambda no longer prevents recursive invocations caused by misconfiguration. This may lead to unexpected usage and billing to your AWS account. You should explore alternate ways of architecting your workload that do not use recursive patterns. AWS recommends you exercise caution when turning off this guardrail feature.

Setting recursive loop detection configuration on a function using the Lambda Console

You can view and change the recursive loop detection configuration in the AWS Lambda console:

In the AWS Lambda Console, navigate to the Functions page. Select the function that uses intentionally recursive patterns.
Select Configuration. You can find recursive loop detection controls under the Concurrency and recursion detection section.

Recursive loop detection controls in the Lambda console

Recursive loop detection is turned on by default for all functions. You can change the recursive loop detection configuration of a function by choosing Edit.
To turn off recursive loop detection for a function, select Allow recursive loops and select Save.

Setting to allow recursive loops

Setting recursive loop detection configuration using the AWS CLI

You can get the current recursive loop detection configuration of a Lambda function by using the following CLI command:

aws lambda get-function-recursion-config \
--region $AWS_REGION \
--function-name $FUNCTION_NAME

You can update the recursive loop detection configuration for a Lambda function by using the following CLI command:

aws lambda put-function-recursion-config \
--region $AWS_REGION \
--function-name $FUNCTION_NAME \
--recursive-loop Allow

Make sure to set appropriate values for AWS_REGION and FUNCTION_NAME in the previous commands. Setting the --recursive-loop parameter of put-function-recursion-config to Allow turns off the default behavior of detecting recursive loops. Set this value to Terminate to switch back to the default behavior.
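The same two operations can also be called from Python. The following is a minimal sketch, assuming a recent Boto3 release whose Lambda client exposes get_function_recursion_config and put_function_recursion_config. The helper name set_recursive_loop is hypothetical and is not part of any AWS library.

```python
VALID_MODES = ("Allow", "Terminate")


def set_recursive_loop(lambda_client, function_name, mode):
    """Set recursive loop detection for one function.

    Returns True if the configuration was changed, False if it already matched.
    `lambda_client` is expected to behave like boto3.client("lambda").
    """
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")

    # Read the current setting first so an unchanged function is left alone.
    current = lambda_client.get_function_recursion_config(
        FunctionName=function_name
    )
    if current.get("RecursiveLoop") == mode:
        return False

    lambda_client.put_function_recursion_config(
        FunctionName=function_name, RecursiveLoop=mode
    )
    return True
```

With Boto3 installed and credentials configured, a call could look like set_recursive_loop(boto3.client("lambda"), "my-function", "Allow"), where my-function is a placeholder name.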

Setting recursive loop detection configuration using AWS CloudFormation

You can control the recursive loop detection configuration for a Lambda function by setting the RecursiveLoop resource property in CloudFormation. Setting the value of this property to Allow turns off the default behavior of automatically detecting recursive loops. Set this property to Terminate if you want to switch it back to the default behavior. The following CloudFormation snippet shows RecursiveLoop set to Allow.

LambdaFunction:
    Type: AWS::Lambda::Function
    Properties:
      Code:
        S3Bucket: S3_BUCKET
        S3Key: S3_KEY
      Handler: com.example.App::handleRequest
      MemorySize: 1024
      Role:
        Fn::GetAtt:
        - LambdaFunctionRole
        - Arn
      Runtime: java17
      RecursiveLoop: Allow
      Timeout: 20
      TracingConfig:
        Mode: Active

Extending recursive loop detection to additional AWS services

Today, recursive loop detection detects and stops loops between Lambda, SQS, and SNS after approximately 16 invocations. Lambda plans to extend support for recursive loop detection to additional AWS services. Using the APIs, you can turn off recursive loop detection for specific functions that use recursive patterns so that they are not impacted when Lambda expands recursive loop detection to additional AWS services in the future.

One way you can identify functions that use recursive patterns is by using the CloudWatch metric RecursiveInvocationsDropped.

Set a CloudWatch alarm on all Lambda functions for the CloudWatch metric RecursiveInvocationsDropped. Configure the alarm to trigger when the metric is greater than a threshold of zero. Refer to CloudWatch documentation to set alarms. You can use the following CLI command to set this alarm:

aws cloudwatch put-metric-alarm --alarm-name lambda-recursive-alarm --metric-name RecursiveInvocationsDropped --namespace AWS/Lambda --statistic Sum --period 60 --threshold 0 --comparison-operator GreaterThanThreshold --evaluation-periods 1 --alarm-actions $arn-of-sns-notification-topic

When Lambda detects recursive invocations, it will emit the RecursiveInvocationsDropped metric, which will trigger the alarm. Note that Lambda will only detect and stop recursive invocations if all the services within the loop support recursive loop detection.
Navigate to the CloudWatch Console and determine which function has emitted the RecursiveInvocationsDropped metric. On the Browse tab, under Metrics, choose to view metrics By Function Name and search for RecursiveInvocationsDropped. This will list all functions that have emitted that metric.

RecursiveInvocationsDropped metric

Determine if recursion is the intended pattern for that function. If so, use the recursive loop detection API to turn off recursive loop detection for this function.
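The lookup in the CloudWatch console can also be scripted. The following is a minimal sketch, assuming a client that behaves like boto3.client("cloudwatch") and its list_metrics call. The function name functions_with_dropped_invocations is hypothetical.

```python
def functions_with_dropped_invocations(cloudwatch_client):
    """Return sorted names of functions that emitted RecursiveInvocationsDropped."""
    names = set()
    kwargs = {
        "Namespace": "AWS/Lambda",
        "MetricName": "RecursiveInvocationsDropped",
    }
    while True:
        page = cloudwatch_client.list_metrics(**kwargs)
        for metric in page.get("Metrics", []):
            # Per-function metrics carry a FunctionName dimension.
            for dimension in metric.get("Dimensions", []):
                if dimension["Name"] == "FunctionName":
                    names.add(dimension["Value"])
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token  # fetch the next page of metrics
    return sorted(names)
```

Each name returned is a candidate to review. Whether recursion is intended for that function remains a manual decision.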

Conclusion

Lambda recursive loop detection automatically detects and stops recursive invocations between Lambda and supported services, preventing runaway workloads. In most cases, you should architect your workloads to avoid any recursive loops. In rare and special circumstances, you may want to turn off the default behavior on a case-by-case basis. The recursive loop detection APIs allow you to set recursive loop detection configuration on individual functions.

This feature is available in all AWS Regions where Lambda supports recursive loop detection.

To learn more about these APIs, refer to the AWS Lambda API Reference.

For more serverless learning resources, visit Serverless Land.

Unlock scalable analytics with a secure connectivity pattern in AWS Glue to read from or write to Snowflake

2024-08-19 Caio Montovani

Post Syndicated from Caio Montovani original https://aws.amazon.com/blogs/big-data/unlock-scalable-analytics-with-a-secure-connectivity-pattern-in-aws-glue-to-read-from-or-write-to-snowflake/

In today’s data-driven world, the ability to seamlessly integrate and utilize diverse data sources is critical for gaining actionable insights and driving innovation. As organizations increasingly rely on data stored across various platforms, such as Snowflake, Amazon Simple Storage Service (Amazon S3), and various software as a service (SaaS) applications, the challenge of bringing these disparate data sources together has never been more pressing.

AWS Glue is a robust data integration service that facilitates the consolidation of data from different origins, empowering businesses to use the full potential of their data assets. By using AWS Glue to integrate data from Snowflake, Amazon S3, and SaaS applications, organizations can unlock new opportunities in generative artificial intelligence (AI), machine learning (ML), business intelligence (BI), and self-service analytics or feed data to underlying applications.

In this post, we explore how AWS Glue can serve as the data integration service to bring the data from Snowflake for your data integration strategy, enabling you to harness the power of your data ecosystem and drive meaningful outcomes across various use cases.

Use case

Consider a large ecommerce company that relies heavily on data-driven insights to optimize its operations, marketing strategies, and customer experiences. The company stores vast amounts of transactional data, customer information, and product catalogs in Snowflake. However, they also generate and collect data from various other sources, such as web logs stored in Amazon S3, social media platforms, and third-party data providers. To gain a comprehensive understanding of their business and make informed decisions, the company needs to integrate and analyze data from all these sources seamlessly.

One crucial business requirement for the ecommerce company is to generate a Pricing Summary Report that provides a detailed analysis of pricing and discounting strategies. This report is essential for understanding revenue streams, identifying opportunities for optimization, and making data-driven decisions regarding pricing and promotions. After the Pricing Summary Report is generated and stored in Amazon S3, the company can use AWS analytics services to generate interactive BI dashboards and run one-time queries on the report. This allows business analysts and decision-makers to gain valuable insights, visualize key metrics, and explore the data in depth, enabling informed decision-making and strategic planning for pricing and promotional strategies.

Solution overview

The following architecture diagram illustrates a secure and efficient solution for integrating Snowflake data with Amazon S3, using the native Snowflake connector in AWS Glue. This setup uses AWS PrivateLink to provide secure connectivity between AWS services across different virtual private clouds (VPCs), eliminating the need to expose data to the public internet, which is a critical requirement for many organizations.

The following are the key components and steps in the integration process:

Establish a secure, private connection between your AWS account and your Snowflake account using PrivateLink. This involves creating VPC endpoints in both the AWS and Snowflake VPCs, making sure data transfer remains within the AWS network.
Use Amazon Route 53 to create a private hosted zone that resolves the Snowflake endpoint within your VPC. This allows AWS Glue jobs to connect to Snowflake using a private DNS name, maintaining the security and integrity of the data transfer.
Create an AWS Glue job to handle the extract, transform, and load (ETL) process on data from Snowflake to Amazon S3. The AWS Glue job uses the secure connection established by the VPC endpoints to access Snowflake data. Snowflake credentials are securely stored in AWS Secrets Manager. The AWS Glue job retrieves these credentials at runtime to authenticate and connect to Snowflake, providing secure access management. A VPC endpoint lets the job communicate with Secrets Manager without traversing the public internet, enhancing security and performance.
Store the extracted and transformed data in Amazon S3. Organize the data into appropriate structures, such as partitioned folders, to optimize query performance and data management. A VPC endpoint is also used to communicate with Amazon S3 without traversing the public internet, enhancing security and performance. We also use Amazon S3 to store AWS Glue scripts, logs, and temporary data generated during the ETL process.

This approach offers the following benefits:

Enhanced security – By using PrivateLink and VPC endpoints, data transfer between Snowflake and Amazon S3 is secured within the AWS network, reducing exposure to potential security threats.
Efficient data integration – AWS Glue simplifies the ETL process, providing a scalable and flexible solution for data integration between Snowflake and Amazon S3.
Cost-effectiveness – Using Amazon S3 for data storage, combined with the AWS Glue pay-as-you-go pricing model, helps optimize costs associated with data management and integration.
Scalability and flexibility – The architecture supports scalable data transfers and can be extended to integrate additional data sources and destinations as needed.

By following this architecture and taking advantage of the capabilities of AWS Glue, PrivateLink, and associated AWS services, organizations can achieve a robust, secure, and efficient data integration solution, enabling them to harness the full potential of their Snowflake and Amazon S3 data for advanced analytics and BI.

Prerequisites

Complete the following prerequisites before setting up the solution:

Verify that you have access to an AWS account with the necessary permissions to provision resources in services such as Route 53, Amazon S3, AWS Glue, Secrets Manager, and Amazon Virtual Private Cloud (Amazon VPC) using AWS CloudFormation, which lets you model, provision, and manage AWS and third-party resources by treating infrastructure as code.
Confirm that you have access to Snowflake hosted in AWS with required permissions to run the steps to configure PrivateLink. Refer to Enabling AWS PrivateLink in the Snowflake documentation to verify the steps, required access level, and service level to set the configurations. After you enable PrivateLink, save the value of the following parameters provided by Snowflake to use in the next step in this post:
1. privatelink-vpce-id
2. privatelink-account-url
3. privatelink_ocsp-url
4. regionless-snowsight-privatelink-url
Make sure you have a Snowflake user snowflakeUser and password snowflakePassword with required permissions to read from and write to Snowflake. The user and password are used in the AWS Glue connection to authenticate within Snowflake.
If your Snowflake user doesn’t have a default warehouse set, you will need a warehouse name. We use snowflakeWarehouse as a placeholder for the warehouse name; replace it with your actual warehouse name.
If you’re new to Snowflake, consider completing the Snowflake in 20 Minutes tutorial. By the end of the tutorial, you should know how to create required Snowflake objects, including warehouses, databases, and tables for storing and querying data.

Create resources with AWS CloudFormation

This post includes a CloudFormation template for a quick setup of the base resources. You can review and customize it to suit your needs. The CloudFormation template generates the following resources:

VPC (vpc-blog-glue-snowflake)
Subnets (one public subnet and three private subnets)
Route tables that are explicitly associated with the subnets
Security groups that are used to provision the endpoints for Secrets Manager, Amazon S3, and Snowflake, as well as used to provision the AWS Glue connection
Endpoints for Secrets Manager, Amazon S3, and Snowflake
Route 53 hosted zone, which is a container for DNS records
Route 53 record set to route traffic to the Snowflake endpoint
S3 bucket (blog-glue-snowflake-*)
AWS Identity and Access Management (IAM) role for AWS Glue (blog-glue-snowflake-GlueServiceRole-*)
AWS Glue database (db_blog_glue_snowflake)
Amazon Athena workgroup (blog-workgroup)

To create your resources, complete the following steps:

Sign in to the AWS CloudFormation console.
Choose Launch Stack to launch the CloudFormation stack.
Provide the CloudFormation stack parameters:
1. For PrivateLinkAccountURL, enter the value of the parameter privatelink-account-url obtained in the prerequisites.
2. For PrivateLinkOcspURL, enter the value of the parameter privatelink_ocsp-url obtained in the prerequisites.
3. For PrivateLinkVpceId, enter the value of the parameter privatelink-vpce-id obtained in the prerequisites.
4. For PrivateSubnet1CIDR, enter the CIDR block (IP address range) for your private subnet 1.
5. For PrivateSubnet2CIDR, enter the CIDR block (IP address range) for your private subnet 2.
6. For PrivateSubnet3CIDR, enter the CIDR block (IP address range) for your private subnet 3.
7. For PublicSubnet1CIDR, enter the CIDR block (IP address range) for your public subnet 1.
8. For RegionlessSnowsightPrivateLinkURL, enter the value of the parameter regionless-snowsight-privatelink-url obtained in the prerequisites.
9. For VpcCIDR, enter the CIDR block (IP address range) for your VPC.
Choose Next.
Select I acknowledge that AWS CloudFormation might create IAM resources.
Choose Submit and wait for the stack creation step to complete.

After the CloudFormation stack is successfully created, you can see all the resources created on the Resources tab.

Navigate to the Outputs tab to see the outputs provided by CloudFormation stack. Save the value of the outputs GlueSecurityGroupId, VpcId, and PrivateSubnet1Id to use in the next step in this post.

Update the Secrets Manager secret with Snowflake credentials for the AWS Glue connection

To update the Secrets Manager secret with user snowflakeUser, password snowflakePassword, and warehouse snowflakeWarehouse that you will use in the AWS Glue connection to establish a connection to Snowflake, complete the following steps:

On the Secrets Manager console, choose Secrets in the navigation pane.
Open the secret blog-glue-snowflake-credentials.
Under Secret value, choose Retrieve secret value.

Choose Edit.
Enter the user snowflakeUser, password snowflakePassword, and warehouse snowflakeWarehouse for the keys sfUser, sfPassword, and sfWarehouse, respectively.
Choose Save.
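The console steps above can also be scripted. The following is a minimal sketch, assuming a client that behaves like boto3.client("secretsmanager") and its put_secret_value call. The helper names are hypothetical; the key names sfUser, sfPassword, and sfWarehouse are the ones used in this post.

```python
import json


def build_snowflake_secret(user, password, warehouse):
    """Return the secret as a JSON string, using the key names from this post."""
    if not user or not password:
        raise ValueError("user and password must not be empty")
    return json.dumps(
        {"sfUser": user, "sfPassword": password, "sfWarehouse": warehouse}
    )


def update_snowflake_secret(secrets_client, secret_id, user, password, warehouse):
    """Store a new version of the secret with the Snowflake credentials."""
    return secrets_client.put_secret_value(
        SecretId=secret_id,
        SecretString=build_snowflake_secret(user, password, warehouse),
    )
```

For this walkthrough, secret_id would be blog-glue-snowflake-credentials. Avoid hard-coding real credentials in scripts; read them from a prompt or another secure source.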

Create the AWS Glue connection for Snowflake

An AWS Glue connection is an AWS Glue Data Catalog object that stores login credentials, URI strings, VPC information, and more for a particular data store. AWS Glue crawlers, jobs, and development endpoints use connections in order to access certain types of data stores. To create an AWS Glue connection to Snowflake, complete the following steps:

On the AWS Glue console, in the navigation pane, under Data catalog, choose Connections.
Choose Create connection.
For Data sources, search for and select Snowflake.
Choose Next.

For Snowflake URL, enter https://<privatelink-account-url>.

To obtain the Snowflake PrivateLink account URL, refer to parameters obtained in the prerequisites.

For AWS Secret, choose the secret blog-glue-snowflake-credentials.
For VPC, choose the VpcId value obtained from the CloudFormation stack output.
For Subnet, choose the PrivateSubnet1Id value obtained from the CloudFormation stack output.
For Security groups, choose the GlueSecurityGroupId value obtained from the CloudFormation stack output.
Choose Next.

In the Connection Properties section, for Name, enter glue-snowflake-connection.
Choose Next.

Choose Create connection.

Create an AWS Glue job

You’re now ready to define the AWS Glue job using the Snowflake connection. To create an AWS Glue job to read from Snowflake, complete the following steps:

On the AWS Glue console, under ETL jobs in the navigation pane, choose Visual ETL.

Choose the Job details tab.
For Name, enter a name, for example, Pricing Summary Report Job.
For Description, enter a meaningful description for the job.
For IAM Role, choose the role that allows the AWS Glue job to run, read the source data from Snowflake, and write to the target S3 location. You can find this role in your CloudFormation stack output, named blog-glue-snowflake-GlueServiceRole-*.
Use the default options for Type, Glue version, Language, Worker type, Number of workers, Number of retries, and Job timeout.
For Job bookmark, choose Disable.
Choose Save to save the job.

On the Visual tab, choose Add nodes.

For Sources, choose Snowflake.

Choose Data source – Snowflake in the AWS Glue Studio canvas.
For Name, enter Snowflake_Pricing_Summary.
For Snowflake connection, choose glue-snowflake-connection.
For Snowflake source, select Enter a custom query.
For Database, enter snowflake_sample_data.
For Snowflake query, add the following Snowflake query:

SELECT l_returnflag
    , l_linestatus
    , Sum(l_quantity) AS sum_qty
    , Sum(l_extendedprice) AS sum_base_price
    , Sum(l_extendedprice * (1 - l_discount)) AS sum_disc_price
    , Sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge
    , Avg(l_quantity) AS avg_qty
    , Avg(l_extendedprice) AS avg_price
    , Avg(l_discount) AS avg_disc
    , Count(*) AS count_order
FROM tpch_sf1.lineitem
WHERE l_shipdate <= Dateadd(day, - 90, To_date('1998-12-01'))
GROUP BY l_returnflag
    , l_linestatus
ORDER BY l_returnflag
    , l_linestatus;

The Pricing Summary Report provides a summary pricing report for all line items shipped as of a given date. The date is within 60–120 days of the greatest ship date contained in the database. The query lists totals for extended price, discounted extended price, discounted extended price plus tax, average quantity, average extended price, and average discount. These aggregates are grouped by RETURNFLAG and LINESTATUS, and listed in ascending order of RETURNFLAG and LINESTATUS. A count of the number of line items in each group is included.

For Custom Snowflake properties, specify Key as sfSchema and Value as tpch_sf1.
Choose Save.
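To make the aggregates in the Pricing Summary query concrete, the following sketch recomputes them in plain Python. The three line items are invented for illustration and do not come from the Snowflake sample data. The names CUTOFF, SAMPLE_ROWS, and pricing_summary are hypothetical.

```python
from datetime import date, timedelta

# Equivalent of Dateadd(day, -90, To_date('1998-12-01')) in the query.
CUTOFF = date(1998, 12, 1) - timedelta(days=90)

# Made-up line items:
# (returnflag, linestatus, quantity, extendedprice, discount, tax, shipdate)
SAMPLE_ROWS = [
    ("A", "F", 10, 1000.0, 0.10, 0.05, date(1998, 1, 15)),
    ("A", "F", 20, 2000.0, 0.00, 0.10, date(1998, 3, 1)),
    ("N", "O", 5, 500.0, 0.20, 0.00, date(1998, 11, 30)),  # shipped after the cutoff
]


def pricing_summary(rows, cutoff=CUTOFF):
    """Group by (returnflag, linestatus) and compute the query's aggregates."""
    groups = {}
    for flag, status, qty, price, discount, tax, shipdate in rows:
        if shipdate > cutoff:  # WHERE l_shipdate <= cutoff
            continue
        g = groups.setdefault((flag, status), {
            "sum_qty": 0, "sum_base_price": 0.0, "sum_disc_price": 0.0,
            "sum_charge": 0.0, "sum_discount": 0.0, "count_order": 0,
        })
        g["sum_qty"] += qty
        g["sum_base_price"] += price
        g["sum_disc_price"] += price * (1 - discount)
        g["sum_charge"] += price * (1 - discount) * (1 + tax)
        g["sum_discount"] += discount
        g["count_order"] += 1

    summary = {}
    for key in sorted(groups):  # ORDER BY l_returnflag, l_linestatus
        g = groups[key]
        n = g["count_order"]
        summary[key] = {
            "sum_qty": g["sum_qty"],
            "sum_base_price": g["sum_base_price"],
            "sum_disc_price": g["sum_disc_price"],
            "sum_charge": g["sum_charge"],
            "avg_qty": g["sum_qty"] / n,
            "avg_price": g["sum_base_price"] / n,
            "avg_disc": g["sum_discount"] / n,
            "count_order": n,
        }
    return summary
```

Calling pricing_summary(SAMPLE_ROWS) yields a single group for ("A", "F"), because the third row ships after the cutoff date and is filtered out.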

Next, you add the destination as an S3 bucket.

On the Visual tab, choose Add nodes.
For Targets, choose Amazon S3.

Choose Data target – S3 bucket in the AWS Glue Studio canvas.
For Name, enter S3_Pricing_Summary.
For Node parents, select Snowflake_Pricing_Summary.
For Format, select Parquet.
For S3 Target Location, enter s3://<YourBucketName>/pricing_summary_report/ (use the name of your bucket).
For Data Catalog update options, select Create a table in the Data Catalog and on subsequent runs, update the schema and add new partitions.
For Database, choose db_blog_glue_snowflake.
For Table name, enter tb_pricing_summary.
Choose Save.
Choose Run to run the job, and monitor its status on the Runs tab.
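If you prefer to start and monitor the job from a script rather than the console, the following is a minimal sketch. It assumes a client that behaves like boto3.client("glue") with its start_job_run and get_job_run calls. The function name run_glue_job and the default polling interval are illustrative choices.

```python
import time

# Job run states after which the run will not change any further.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR", "EXPIRED"}


def run_glue_job(glue_client, job_name, poll_seconds=30, sleep=time.sleep):
    """Start a Glue job run and poll until it reaches a terminal state.

    Returns a (run_id, final_state) tuple.
    """
    run_id = glue_client.start_job_run(JobName=job_name)["JobRunId"]
    while True:
        job_run = glue_client.get_job_run(JobName=job_name, RunId=run_id)
        state = job_run["JobRun"]["JobRunState"]
        if state in TERMINAL_STATES:
            return run_id, state
        sleep(poll_seconds)  # wait before checking the run again
```

For this walkthrough, job_name would be Pricing Summary Report Job. The sleep parameter is injectable only so that the loop can be exercised without waiting.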

You have successfully completed the steps to create an AWS Glue job that reads data from Snowflake and loads the results into an S3 bucket using a secure connectivity pattern. Optionally, if you want to transform the data before loading it into Amazon S3, you can use the AWS Glue transformations available in AWS Glue Studio. These transformations enable efficient data cleansing, enrichment, and restructuring, making sure the data is in the desired format and quality for downstream processes. Refer to Editing AWS Glue managed data transform nodes for more information.

Validate the results

After the job is complete, you can validate the output of the ETL job run in Athena, a serverless interactive analytics service. To validate the output, complete the following steps:

On the Athena console, choose Launch Query Editor.
For Workgroup, choose blog-workgroup.
If the message “All queries run in the Workgroup, blog-workgroup, will use the following settings:” is displayed, choose Acknowledge.
For Database, choose db_blog_glue_snowflake.
For Query, enter the following statement:

SELECT l_returnflag
    , l_linestatus
    , sum_qty
    , sum_base_price
FROM db_blog_glue_snowflake.tb_pricing_summary

Choose Run.
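The same validation can be run programmatically. The following is a minimal sketch, assuming a client that behaves like boto3.client("athena") with its start_query_execution, get_query_execution, and get_query_results calls. The function name run_athena_query is hypothetical, and only the first page of results is read.

```python
import time

FINISHED = {"SUCCEEDED", "FAILED", "CANCELLED"}


def run_athena_query(athena_client, sql, database, workgroup,
                     poll_seconds=2, sleep=time.sleep):
    """Run a query and return (header, rows), each cell as a string or None."""
    query_id = athena_client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        WorkGroup=workgroup,
    )["QueryExecutionId"]

    # Poll until the query leaves the QUEUED and RUNNING states.
    while True:
        execution = athena_client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in FINISHED:
            break
        sleep(poll_seconds)
    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query ended in state {state}")

    # For a SELECT, the first row of the first page holds the column names.
    result = athena_client.get_query_results(QueryExecutionId=query_id)
    table = [
        [cell.get("VarCharValue") for cell in row["Data"]]
        for row in result["ResultSet"]["Rows"]
    ]
    return table[0], table[1:]
```

For this walkthrough, database would be db_blog_glue_snowflake and workgroup would be blog-workgroup.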

You have successfully validated your data for the AWS Glue job Pricing Summary Report Job.

Clean up

To clean up your resources, complete the following tasks:

Delete the AWS Glue job Pricing Summary Report Job.
Delete the AWS Glue connection glue-snowflake-connection.
Stop any AWS Glue interactive sessions.
Delete content from the S3 bucket blog-glue-snowflake-*.
Delete the CloudFormation stack blog-glue-snowflake.

Conclusion

Using the native Snowflake connector in AWS Glue provides an efficient and secure way to integrate data from Snowflake into your data pipelines on AWS. By following the steps outlined in this post, you can establish a private connectivity channel between AWS Glue and your Snowflake account using PrivateLink, Amazon VPC, security groups, and Secrets Manager.

This architecture allows you to read data from and write data to Snowflake tables directly from AWS Glue jobs running on Spark. The secure connectivity pattern prevents data transfers over the public internet, enhancing data privacy and security.

Combining AWS data integration services like AWS Glue with data platforms like Snowflake allows you to build scalable, secure data lakes and pipelines to power analytics, BI, data science, and ML use cases.

In summary, the native Snowflake connector and private connectivity model outlined here provide a performant, secure way to include Snowflake data in AWS big data workflows. This unlocks scalable analytics while maintaining data governance, compliance, and access control. For more information on AWS Glue, visit AWS Glue.

About the Authors

Caio Sgaraboto Montovani is a Sr. Specialist Solutions Architect, Data Lake and AI/ML within AWS Professional Services, developing scalable solutions according to customer needs. His vast experience has helped customers in different industries such as life sciences and healthcare, retail, banking, and aviation build solutions in data analytics, machine learning, and generative AI. He is passionate about rock and roll and cooking, and loves to spend time with his family.

Kartikay Khator is a Solutions Architect within Global Life Sciences at AWS, where he dedicates his efforts to developing innovative and scalable solutions that cater to the evolving needs of customers. His expertise lies in harnessing the capabilities of AWS analytics services. Extending beyond his professional pursuits, he finds joy and fulfillment in the world of running and hiking. Having already completed two marathons, he is currently preparing for his next marathon challenge.

Navnit Shukla, an AWS Specialist Solution Architect specializing in Analytics, is passionate about helping clients uncover valuable insights from their data. Leveraging his expertise, he develops inventive solutions that empower businesses to make informed, data-driven decisions. Notably, Navnit is the accomplished author of the book “Data Wrangling on AWS,” showcasing his expertise in the field.

Kamen Sharlandjiev is a Sr. Big Data and ETL Solutions Architect, Amazon MWAA and AWS Glue ETL expert. He’s on a mission to make life easier for customers who are facing complex data integration and orchestration challenges. His secret weapon? Fully managed AWS services that can get the job done with minimal effort. Follow Kamen on LinkedIn to keep up to date with the latest Amazon MWAA and AWS Glue features and news!

Bosco Albuquerque is a Sr. Partner Solutions Architect at AWS and has over 20 years of experience working with database and analytics products from enterprise database vendors and cloud providers. He has helped technology companies design and implement data analytics solutions and products.

Tenant portability: Move tenants across tiers in a SaaS application

2024-08-07 Aman Lal

Post Syndicated from Aman Lal original https://aws.amazon.com/blogs/architecture/tenant-portability-move-tenants-across-tiers-in-a-saas-application/

In today’s fast-paced software as a service (SaaS) landscape, tenant portability is a critical capability for SaaS providers seeking to stay competitive. By enabling seamless movement between tiers, tenant portability allows businesses to adapt to changing needs. However, manual orchestration of portability requests can be a significant bottleneck, hindering scalability and requiring substantial resources. As tenant volumes and portability requests grow, this approach becomes increasingly unsustainable, making it essential to implement a more efficient solution.

This blog post delves into the significance of tenant portability and outlines the essential steps for its implementation, with a focus on seamless integration into the SaaS serverless reference architecture. The following diagram illustrates the tier change process, highlighting the roles of tenants and admins, as well as the impact on new and existing services in the architecture. The subsequent sections will provide a detailed walkthrough of the sequence of events shown in this diagram.

Figure 1. Incorporating tenant portability within a SaaS serverless reference architecture

Why do we need tenant portability?

Flexibility: Tier upgrades or downgrades initiated by the tenant help align with evolving customer demand, preferences, budget, and business strategies. These tier changes generally alter the service contract between the tenant and the SaaS provider.
Quality of service: Generally initiated by the SaaS admin in response to a security breach or when the tenant is reaching service limits, these incidents might require tenant migration to maintain service level agreements (SLAs).

High-level portability flow

Tenant portability is generally achieved through a well-orchestrated process that ensures seamless tier transitions. This process comprises the following steps:

Figure 2. High-level tenant portability flow

Port identity stores: Evaluate the need for migrating the tenant’s identity store to the target tier. In scenarios where the existing identity store is incompatible with the target tier, you’ll need to provision a new destination identity store and administrative users.
Update tenant configuration: SaaS applications store tenant configuration details, such as the tenant identifier and tier, that are required for operation. Update these details to reflect the target tier.
Resource management: Initiate deployment pipelines to provision resources in the target tier and update infrastructure-tenant mapping tables.
Data migration: Migrate tenant data from the old tier to the newly provisioned target tier infrastructure.
Cutover: Redirect tenant traffic to the new infrastructure, enabling zero-downtime utilization of updated resources.
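The five steps above can be sketched as a simple orchestrator that runs them in order and stops at the first failure. This is an illustrative outline only: the step names, the port_tenant function, and the handlers mapping are all hypothetical and are not part of the SaaS serverless reference architecture.

```python
# Step names mirror the high-level flow; the identifiers are invented here.
PORTABILITY_STEPS = [
    "port_identity_store",
    "update_tenant_configuration",
    "manage_resources",
    "migrate_data",
    "cutover",
]


def port_tenant(tenant_id, target_tier, handlers):
    """Run each portability step in order and report how far the run got.

    `handlers` maps a step name to a callable taking (tenant_id, target_tier).
    """
    completed = []
    for step in PORTABILITY_STEPS:
        try:
            handlers[step](tenant_id, target_tier)
        except Exception as error:
            # Stop here so later steps, such as cutover, never run on a
            # partially migrated tenant.
            return {
                "status": "failed",
                "failed_step": step,
                "error": str(error),
                "completed": completed,
            }
        completed.append(step)
    return {"status": "ported", "completed": completed}
```

Recording the completed steps gives an operator a starting point for retrying or rolling back a failed request.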

Consideration walkthrough

We’ll now delve into each step of the portability workflow, highlighting key considerations for a successful implementation.

1. Port identity stores

The key consideration for porting identity is migrating user identities while maintaining a consistent end-user experience, without requiring password resets or changes to user IDs.

Create a new identity store and associated application client that the frontend can use; after that, we’ll need a mechanism to migrate users. In the reference architecture using Amazon Cognito, a silo refers to each tenant having its own user pool, while a pool refers to multiple tenants sharing a user pool through user groups.

To ensure a smooth migration process, it’s important to communicate with users and provide them with options to avoid password resets. One approach is to notify users to log in before a deadline to avoid password resets. Employ just-in-time migration, enabling password retention during login for uninterrupted user experience with existing passwords.

However, this requires waiting for all users to migrate, potentially leading to a prolonged migration window. As a complementary measure, after the deadline, the remaining users can be migrated by using bulk import, which enforces password resets. This ensures a consistent migration within a defined timeframe, albeit inconveniencing some users.
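The two-phase approach described above can be expressed as a small decision function. This is a sketch under stated assumptions: the function name, its return labels, and the idea of a per-user migrated flag are hypothetical and only illustrate the deadline logic.

```python
from datetime import datetime


def choose_migration_path(already_migrated, now, deadline):
    """Pick how a user's identity is moved to the new identity store.

    Before the deadline, users migrate just in time at login and keep their
    passwords. After it, remaining users are bulk imported and must reset.
    """
    if already_migrated:
        return "none"
    if now <= deadline:
        return "just_in_time_on_login"
    return "bulk_import_with_password_reset"
```

For example, with a deadline of datetime(2024, 9, 1), a user who has not yet logged in is handled by just-in-time migration on August 31 and by bulk import on September 2.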

2. Update tenant configuration

SaaS providers rely on metadata stores to maintain all tenant-related configuration. Updates to tenant metadata should be completed carefully during the porting process. When you update the tenant configuration for the new tier, two key aspects must be considered:

Retain tenant IDs throughout the porting process to ensure smooth integration of tenant logging, metrics, and cost allocation post-migration, providing a continuous record of events.
Establish new API keys and a throttling mechanism tailored to the new tier to accommodate higher usage limits for the tenants.

To handle this, a new tenant portability service can be introduced in the SaaS reference architecture. This service assigns a different Amazon API Gateway usage plan to the tenant based on the requested tier change, and orchestrates calls to other downstream services. Subsequently, the existing tenant management service will need an extension to handle tenant metadata updates (tier, user-pool-id, app-client-id) based on the incoming porting request.
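The metadata update can be sketched as a pure function that swaps the tier-specific fields while leaving the tenant ID untouched. This is a minimal sketch: the `TenantConfig` shape, the field names, and the `USAGE_PLANS` mapping are assumptions for illustration, and real usage plan IDs would come from API Gateway.

```python
from dataclasses import dataclass, replace

# Hypothetical mapping from tier name to API Gateway usage plan ID.
USAGE_PLANS = {"basic": "plan-basic", "premium": "plan-premium"}


@dataclass(frozen=True)
class TenantConfig:
    tenant_id: str
    tier: str
    user_pool_id: str
    app_client_id: str
    usage_plan_id: str


def port_tenant_config(current: TenantConfig, new_tier: str,
                       user_pool_id: str, app_client_id: str) -> TenantConfig:
    """Return the tenant's configuration for the new tier.

    tenant_id is never passed to replace(), so it is preserved by construction.
    """
    if new_tier not in USAGE_PLANS:
        raise ValueError(f"unknown tier: {new_tier}")
    return replace(
        current,
        tier=new_tier,
        user_pool_id=user_pool_id,
        app_client_id=app_client_id,
        usage_plan_id=USAGE_PLANS[new_tier],
    )
```

Keeping the record immutable means the old configuration stays available for rollback until the cutover step completes.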

3. Resource management

Successful portability hinges on two crucial aspects during infrastructure provisioning:

Ensure tenant isolation constructs are respected in the porting process through mechanisms to prevent cross-tenant access. Either role-based access control (RBAC) or attribute-based access control (ABAC) can be used to ensure this. ABAC isolation is generally easier to manage during porting if the tenant identifier is preserved, as in the previous step.
Ensure instrumentation and metric collection are set up correctly in the new tier. Recreate identical metric filters to ensure monitoring visibility for SaaS operations.

To handle infrastructure provisioning and deprovisioning in the reference architecture, extend the tenant provisioning service:

Update the tenant-stack mapping table to record migrated tenant stack details.
Initiate infrastructure provisioning or destruction pipelines as needed (for example, to run destruction pipelines after the data migration and user cutover steps).

Finally, ensure new resources comply with required compliance standards by applying relevant security configurations and deploying a compliant version of the application.

By addressing these aspects, SaaS providers can ensure a seamless transition while maintaining tenant isolation and operational continuity.

4. Data migration

The data migration strategy is heavily influenced by architectural decisions such as the storage engine and isolation approach. Minimizing user downtime during migration requires a focus on accelerating the migration process, maintaining service availability, and setting up a replication channel for incremental updates. Additionally, it’s crucial to address schema changes made by tenants in a silo model to ensure data integrity and avoid data loss when transitioning to a pool model.

Extending the reference architecture, a new data porting service can be introduced to enable Amazon DynamoDB data migration between different tiers. DynamoDB partition migration can be accomplished through multiple approaches, including AWS Glue, custom scripts, or duplicating DynamoDB tables and bulk-deleting partitions. We recommend a hybrid approach to achieve zero-downtime migration. This solution applies only when the DynamoDB schema remains consistent across tiers. If the schema has changed, a custom solution is required for data migration.
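The custom-script option can be sketched as a paginated copy of one tenant's partition from the source table to the target table. This is a minimal sketch covering only the bulk copy, not the incremental replication a zero-downtime migration also needs. It assumes both arguments behave like boto3 DynamoDB `Table` resources (`query` and `batch_writer`), and the partition key name `tenantId` is an assumption.

```python
def copy_tenant_partition(source_table, target_table, tenant_id, pk_name="tenantId"):
    """Copy every item in one tenant's partition; return the number copied."""
    query_args = {
        "KeyConditionExpression": "#pk = :pk",
        "ExpressionAttributeNames": {"#pk": pk_name},
        "ExpressionAttributeValues": {":pk": tenant_id},
    }
    copied = 0
    # batch_writer buffers the puts and sends them in batches.
    with target_table.batch_writer() as batch:
        while True:
            page = source_table.query(**query_args)
            for item in page["Items"]:
                batch.put_item(Item=item)
                copied += 1
            last_key = page.get("LastEvaluatedKey")
            if not last_key:
                break
            # Continue from where the previous page stopped.
            query_args["ExclusiveStartKey"] = last_key
    return copied
```

As noted above, a straight copy like this only works when the schema is the same in both tiers.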

5. Cutover

The cutover phase involves redirecting users to the new infrastructure, disabling continuous data replication, and ensuring that compliance requirements are met. This includes running tests or obtaining audits/certifications, especially when moving to high-sensitivity silos. After a successful cutover, cleanup activities are necessary, including removing temporary infrastructure and deleting historical tenant data from the previous tier. However, before deleting data, ensure that audit trails are preserved and compliant with regulatory requirements, and that data deletion aligns with organizational policies.

Conclusion

In conclusion, portability is a vital feature for multi-tenant SaaS. It allows tenants to move data and configurations between tiers, and it can be incorporated into the reference architecture as described above. Key considerations include maintaining consistent identities, staying compliant, reducing downtime, and automating the process.

How to use Amazon Q Developer to deploy a Serverless web application with AWS CDK

2024-08-07 Riya Dani

Post Syndicated from Riya Dani original https://aws.amazon.com/blogs/devops/how-to-use-amazon-q-developer-to-deploy-a-serverless-web-application-with-aws-cdk/

Did you know that Amazon Q Developer, a new type of Generative AI-powered (GenAI) assistant, can help developers and DevOps engineers accelerate Infrastructure as Code (IaC) development using the AWS Cloud Development Kit (CDK)?

IaC is a practice where infrastructure components such as servers, networks, and cloud resources are defined and managed using code. Instead of manually configuring and deploying infrastructure, with IaC, the desired state of the infrastructure is specified in a machine-readable format, like YAML, JSON, or modern programming languages. This allows for consistent, repeatable, and scalable infrastructure management, as changes can be easily tracked, tested, and deployed across different environments. IaC reduces the risk of human errors, increases infrastructure transparency, and enables the application of DevOps principles, such as version control, testing, and automated deployment, to the infrastructure itself.

There are different IaC tools available to manage infrastructure on AWS. To manage infrastructure as code, one needs to understand the DSL (domain-specific language) of each IaC tool and/or construct interface and spend time defining infrastructure components using IaC tools. With the use of Amazon Q Developer, developers can minimize time spent on this undifferentiated task and focus on business problems. In this post, we will go over how Amazon Q Developer can help deploy a fully functional three-tier web application infrastructure on AWS using CDK. AWS CDK is an open-source software development framework to define cloud infrastructure in modern programming languages and provision it through AWS CloudFormation.

Amazon Q Developer is a generative artificial intelligence (AI)-powered conversational assistant that can help you understand, build, extend, and operate AWS applications. You can ask questions about AWS architecture, your AWS resources, best practices, documentation, support, and more. Amazon Q Developer is constantly updating its capabilities so your questions get the most contextually relevant and actionable answers.

In the following sections, we will take a real-world three-tier web application that uses serverless architecture and showcase how you can accelerate AWS CDK code development using Amazon Q Developer as an AI coding companion and thus improve developer productivity.

Prerequisites

To begin using Amazon Q Developer, the following are required:

An AWS Account
An AWS Builder ID or an AWS Identity Center login controlled by your organization
Visual Studio Code or supported JetBrains IDEs
Install AWS CDK
How to set up and chat with Amazon Q Developer

Application Overview

You are a DevOps engineer at a software company and have been tasked with building and launching a new customer-facing web application using a serverless architecture. It will have three tiers, as shown below, consisting of the presentation layer, application layer, and data layer. You have decided to utilize Amazon Q Developer to deploy the application components using AWS CDK.

Three-Tier Web Application Architecture Overview

Figure 1 – Serverless Application Architecture

Accelerating application deployment using Amazon Q Developer as an AI coding companion

Let’s dive into how Amazon Q Developer can be used as an expert companion to accelerate the deployment of the above serverless application resources using AWS CDK.

1. Deploy Presentation Layer Resources

Creating a secured Amazon S3 bucket to host static assets and front it using Amazon CloudFront

When building modern serverless web applications that host large static content, a key architecture consideration is how to efficiently and securely serve static assets such as images, CSS, and JavaScript files. Simply serving these from your application servers can lead to scaling and performance bottlenecks with increased resource utilization (e.g., CPU, I/O, network) on servers. This is where leveraging AWS services like Amazon Simple Storage Service (Amazon S3) and Amazon CloudFront can be a game-changer. By hosting your static content in a secured S3 bucket, you unlock several powerful benefits. First and foremost, you get robust security controls through S3 bucket policies and CloudFront Origin Access Control (OAC) to ensure only authorized access. This is critical for protecting your assets. Secondly, you take load off your application servers by having CloudFront directly serve static assets from its globally distributed edge locations. This improves application performance and reduces operational costs. AWS CDK helps to simplify the infrastructure provisioning by allowing developers to define S3 bucket and CloudFront resource configurations in a modern programming language using CDK constructs that enhance security and include best practices recommendations.

In this application architecture, we will use Amazon Q Developer to develop AWS CDK code to provision presentation layer resources, which include a secured S3 bucket with public access disabled and an Origin Access Control (OAC) that is used to grant CloudFront access to the S3 bucket to securely serve the static assets of the application.

Prompt: Create a cdk stack with python that creates an s3 bucket for cloudfront/s3 static asset, ensure it is secured by using Origin Access Control (OAC)

Using Amazon Q to Generate python CDK code for the presentation layer resources

Developers can customize these configurations using Amazon Q Developer based on their specific security requirements, such as implementing access controls through IAM policies or enabling bucket logging for audit trails. This approach ensures that the S3 bucket is configured securely, aligning with best practices for data protection and access management.

Let’s look at an example of adding CloudTrail logging to the S3 bucket:

Prompt: Update the code to include cloudtrail logging to the S3 bucket created

Using Amazon Q Developer to add CloudTrail logging to the S3 bucket

2. Deploy Application Layer Resources

Provision AWS Lambda and Amazon API Gateway to serve end-user requests

Amazon Q Developer makes it easy to provision serverless application backend infrastructure such as AWS Lambda and Amazon API Gateway using AWS CDK. In the above architecture, you can deploy the Lambda function hosting application code, with just a few lines of CDK code along with Lambda configuration such as function name, runtime, handler, timeouts, and environment variables. This Lambda function is fronted using Amazon API Gateway to serve user requests. Anything from a simple micro-service to a complex serverless application can be defined through code in CDK using Amazon Q assistance and deployed repeatedly through CI/CD pipelines. This enables infrastructure automation and consistent governance for applications on AWS.

Prompt: Create a CDK stack that creates a AWS Lambda function that is invoked by Amazon API Gateway

Using Amazon Q Developer to generate CDK code to create an AWS Lambda function that is invoked by Amazon API Gateway

3. Deploy Data Layer Resources

Provision Amazon DynamoDB tables to host application data

By leveraging Amazon Q Developer, we can generate CDK code to provision DynamoDB tables using CDK constructs that offer AWS default best practice recommendations. With Amazon Q Developer, using the CDK construct library, we can define DynamoDB table names, attributes, secondary indexes, encryption, and auto-scaling in just a few lines of CDK code in our programming language of choice. With CDK, this table definition is synthesized into an AWS CloudFormation template that is deployed as a stack to provision the DynamoDB table with all the desired settings. Any data layer resources can be defined this way as Infrastructure as Code (IaC) using Amazon Q Developer. Overall, Amazon Q Developer drastically simplifies deploying managed data backends on AWS through CDK while enforcing best practices around data security, access control, and scalability leveraging CDK constructs.

Prompt: Create a CDK stack of a DynamoDB table with 100 read capacity units and 100 write capacity units.

Using Amazon Q Developer to generate CDK code to create a DynamoDB with 100 read capacity units and 100 write capacity units
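To put the 100 read and 100 write capacity units from the prompt into perspective, the arithmetic can be sketched as follows. It relies on the standard DynamoDB provisioned-capacity rules: one read capacity unit covers one strongly consistent read per second (or two eventually consistent reads) of an item up to 4 KB, and one write capacity unit covers one write per second of an item up to 1 KB. The function names are illustrative, and transactional requests are not modeled.

```python
import math


def reads_per_second(rcu: int, item_kb: float, strongly_consistent: bool = True) -> int:
    """Sustained reads per second for a given item size."""
    units_per_read = max(1, math.ceil(item_kb / 4))  # reads are billed in 4 KB steps
    if strongly_consistent:
        return rcu // units_per_read
    # Eventually consistent reads cost half as much.
    return (2 * rcu) // units_per_read


def writes_per_second(wcu: int, item_kb: float) -> int:
    """Sustained writes per second for a given item size."""
    units_per_write = max(1, math.ceil(item_kb))  # writes are billed in 1 KB steps
    return wcu // units_per_write


print(reads_per_second(100, 6))          # 6 KB items need 2 units per read
print(reads_per_second(100, 6, False))   # eventually consistent reads
print(writes_per_second(100, 2.5))       # 2.5 KB items need 3 units per write
```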

4. Monitoring the Application Components

Monitoring using Amazon CloudWatch

Once the application infrastructure stack has been provisioned, it’s important to set up observability to monitor key metrics, detect any issues proactively, and alert operational teams to troubleshoot and fix issues to minimize application downtime. To get started with observability, developers can leverage Amazon CloudWatch, a fully managed monitoring service. With the use of AWS CDK, it is easy to codify CloudWatch components such as dashboards, metrics, log groups, and alarms alongside the application infrastructure and deploy them in an automated and repeatable way leveraging the AWS CDK construct library. Developers can customize these metrics and alarms to meet their workload requirements. All the monitoring configuration gets deployed as part of the infrastructure stack.

Developers can use Amazon Q Developer to assist with setting up application monitoring in AWS using CloudWatch and CDK. With Q, you can describe the resources you want to monitor, such as EC2 instances, Lambda functions, and RDS databases. Q will then generate the necessary CDK code to provision the appropriate CloudWatch alarms, metrics, and dashboards to monitor the resources. By describing what you want to monitor in natural language, Q handles the underlying complexity of generating code.

Prompt: Create a CDK stack of a Cloudwatch event rule to stop web instance EC2 instance every day at 15:00 UTC

Using Amazon Q Developer to generate CDK code to create a CloudWatch event rule to stop an EC2 instance at 15:00 UTC
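The schedule in the prompt above maps to an EventBridge cron expression, whose six fields are minutes, hours, day-of-month, month, day-of-week, and year, evaluated in UTC. A minimal sketch of building that string is shown below. The helper name is an assumption for illustration; in CDK the resulting string could be passed to `Schedule.expression(...)`.

```python
def daily_cron(hour: int, minute: int = 0) -> str:
    """Build an EventBridge cron expression that fires once a day (UTC)."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    # '?' is required in day-of-week because day-of-month is already '*'.
    return f"cron({minute} {hour} * * ? *)"


print(daily_cron(15))  # cron(0 15 * * ? *)
```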

5. Automate CI/CD of the application

Build a CDK Pipeline for Continuous Integration (CI) and Continuous Deployment (CD) of the Infrastructure

As you iterate on your serverless application, you’ll want a smooth, automated way to reliably deploy infrastructure changes using a CI/CD pipeline. This is where implementing a CDK pipeline, an automated deployment pipeline, becomes useful. As we’ve seen, AWS Cloud Development Kit (CDK) allows you to define your entire multi-tier infrastructure as a reusable, version-controlled code construct. From S3 buckets to CloudFront, API Gateway, Lambda functions, databases and more, all of these can be deployed with IaC. This CI/CD pipeline streamlines the process of deploying both infrastructure and application code, integrating seamlessly with CI/CD best practices.

Here’s how you can leverage Amazon Q Developer to streamline the process of creating a CDK pipeline.

Prompt: Using python, create a CDK pipeline that deploys a three tier serverless application.

Using Amazon Q Developer to generate CDK pipeline using Python

By leveraging Amazon Q Developer in your IDE for CDK pipeline creation, you can speed up adoption of CI/CD best practices, and receive real-time guidance on CDK deployment patterns, making the development process smoother. This CI/CD integration accelerates your CDK development experience, and allows you to focus on building robust and scalable AWS applications.

Conclusion

In this post, you learned how developers can use Amazon Q Developer, a generative AI-powered assistant, as an expert companion to accelerate Infrastructure as Code (IaC) development with AWS CDK, and you saw how to deploy and manage the AWS resources of a three-tier serverless application. Beyond IaC, Amazon Q Developer can be used to accelerate software development, minimize time spent on undifferentiated coding tasks, and help with troubleshooting, so that developers can focus on creative business problems to delight end users. See the Amazon Q Developer documentation to get started.

Happy Building with Amazon Q Developer!

Deliver Amazon CloudWatch logs to Amazon OpenSearch Serverless

2024-07-31 Balaji Mohan

Post Syndicated from Balaji Mohan original https://aws.amazon.com/blogs/big-data/deliver-amazon-cloudwatch-logs-to-amazon-opensearch-serverless/

Amazon CloudWatch Logs collects, aggregates, and analyzes logs from different systems in one place. CloudWatch provides subscriptions as a real-time feed of these logs to other services like Amazon Kinesis Data Streams, AWS Lambda, and Amazon OpenSearch Service. These subscriptions are a popular mechanism to enable custom processing and advanced analysis of log data to gain additional valuable insights. At the time of publishing this blog post, these subscription filters support delivering logs to Amazon OpenSearch Service provisioned clusters only. Customers are increasingly adopting Amazon OpenSearch Serverless as a cost-effective option for infrequent, intermittent, and unpredictable workloads.

In this blog post, we will show how to use Amazon OpenSearch Ingestion to deliver CloudWatch logs to OpenSearch Serverless in near real-time. We outline a mechanism to connect a Lambda subscription filter with OpenSearch Ingestion and deliver logs to OpenSearch Serverless without explicitly needing a separate subscription filter for it.

Solution overview

The following diagram illustrates the solution architecture.

CloudWatch Logs: Collects and stores logs from various AWS resources and applications. It serves as the source of log data in this solution.
Subscription filter: A CloudWatch Logs subscription filter filters and routes specific log data from CloudWatch Logs to the next component in the pipeline.
CloudWatch exporter Lambda function: This is a Lambda function that receives the filtered log data from the subscription filter. Its purpose is to transform and prepare the log data for ingestion into the OpenSearch Ingestion pipeline.
OpenSearch Ingestion: This is a component of OpenSearch Service. The Ingestion pipeline is responsible for processing and enriching the log data received from the CloudWatch exporter Lambda function before storing it in the OpenSearch Serverless collection.
OpenSearch Service: This is a fully managed service that stores and indexes log data, making it searchable and available for analysis and visualization. OpenSearch Service offers two configurations: provisioned domains and serverless. In this setup, we use serverless, which is an auto-scaling configuration for OpenSearch Service.

Prerequisites

Deploy the solution

With the prerequisites in place, you can create and deploy the pieces of the solution.

Step 1: Create PipelineRole for ingestion

Open the AWS Management Console for AWS Identity and Access Management (IAM).
Choose Policies, and then choose Create policy.
Select JSON and paste the following policy into the editor:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "aoss:BatchGetCollection",
                "aoss:APIAccessAll"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:aoss:us-east-1:{accountId}:collection/{collectionId}"
        },
        {
            "Action": [
                "aoss:CreateSecurityPolicy",
                "aoss:GetSecurityPolicy",
                "aoss:UpdateSecurityPolicy"
            ],
            "Effect": "Allow",
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "aoss:collection": "{collection}"
                }
            }
        }
    ]
}

// Replace {accountId}, {collectionId}, and {collection} with your own values

Choose Next, choose Next, and name your policy collection-pipeline-policy.
Choose Create policy.
Next, create a role and attach the policy to it. Choose Roles, and then choose Create role.
Select Custom trust policy and paste the following policy into the editor:

{
   "Version":"2012-10-17",
   "Statement":[
      {
         "Effect":"Allow",
         "Principal":{
            "Service":"osis-pipelines.amazonaws.com"
         },
         "Action":"sts:AssumeRole"
      }
   ]
}

Choose Next, and then search for and select the collection-pipeline-policy you just created.
Choose Next and name the role PipelineRole.
Choose Create role.

Step 2: Configure the network and data policy for OpenSearch collection

In the OpenSearch Service console, navigate to the Serverless menu.
Create a VPC endpoint by following the instructions in Create an interface endpoint for OpenSearch Serverless.
Go to Security and choose Network policies.
Choose Create network policy.
Configure the following policy:

[
  {
    "Rules": [
      {
        "Resource": [
          "collection/{collection name}"
        ],
        "ResourceType": "collection"
      }
    ],
    "AllowFromPublic": false,
    "SourceVPCEs": [
      "{VPC Endpoint Id}"
    ]
  },
  {
    "Rules": [
      {
        "Resource": [
          "collection/{collection name}"
        ],
        "ResourceType": "dashboard"
      }
    ],
    "AllowFromPublic": true
  }
]

Go to Security and choose Data access policies.
Choose Create access policy.
Configure the following policy:

[
  {
    "Rules": [
      {
        "Resource": [
          "index/{collection name}/*"
        ],
        "Permission": [
          "aoss:CreateIndex",
          "aoss:UpdateIndex",
          "aoss:DescribeIndex",
          "aoss:ReadDocument",
          "aoss:WriteDocument"
        ],
        "ResourceType": "index"
      }
    ],
    "Principal": [
      "arn:aws:iam::{accountId}:role/PipelineRole",
      "arn:aws:iam::{accountId}:role/Admin"
    ],
    "Description": "Rule 1"
  }
]
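The same data access policy can also be created programmatically. The sketch below builds the policy document shown above and submits it with the boto3 `opensearchserverless` client's `create_access_policy` call. The helper names and the policy name passed in are assumptions, and boto3 is imported lazily so the builder can be inspected without the SDK or credentials.

```python
import json

INDEX_PERMISSIONS = [
    "aoss:CreateIndex",
    "aoss:UpdateIndex",
    "aoss:DescribeIndex",
    "aoss:ReadDocument",
    "aoss:WriteDocument",
]


def build_data_access_policy(collection_name, principal_arns):
    """Return the data access policy document as a Python structure."""
    return [
        {
            "Rules": [
                {
                    "Resource": [f"index/{collection_name}/*"],
                    "Permission": INDEX_PERMISSIONS,
                    "ResourceType": "index",
                }
            ],
            "Principal": list(principal_arns),
            "Description": "Rule 1",
        }
    ]


def create_data_access_policy(policy_name, collection_name, principal_arns):
    """Create the policy in OpenSearch Serverless (requires AWS credentials)."""
    import boto3  # lazy import: only needed when actually calling AWS

    client = boto3.client("opensearchserverless")
    return client.create_access_policy(
        name=policy_name,
        type="data",
        policy=json.dumps(build_data_access_policy(collection_name, principal_arns)),
    )
```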

Step 3: Create an OpenSearch Ingestion pipeline

Navigate to the OpenSearch Service.
Go to the Ingestion pipelines section.
Choose Create pipeline.
Define the pipeline configuration.

version: "2"
cwlogs-ingestion-pipeline:
  source:
    http:
      path: /logs/ingest
  sink:
    - opensearch:
        # Provide the Amazon OpenSearch Serverless collection endpoint
        hosts: ["https://{collectionId}.{region}.aoss.amazonaws.com"]
        index: "cwl-%{yyyy-MM-dd}"
        aws:
          # Provide a Role ARN with access to the collection. This role should have a trust relationship with osis-pipelines.amazonaws.com
          sts_role_arn: "arn:aws:iam::{accountId}:role/PipelineRole"
          # Provide the region of the collection.
          region: "{region}"
          serverless: true
          serverless_options:
            network_policy_name: "{Network policy name}"

# To get the values for the placeholders:
# 1. {collectionId}: You can find the collection ID by navigating to the Amazon OpenSearch Serverless collection in the AWS Management Console, and then choosing the collection. The collection ID is listed under the "Overview" section.
# 2. {region}: This is the AWS Region where your Amazon OpenSearch Serverless collection is located. You can find this information in the AWS Management Console when you navigate to the collection.
# 3. {accountId}: This is your AWS account ID. You can find your account ID by clicking on your username in the top-right corner of the AWS Management Console and selecting "My Account" from the dropdown menu.
# 4. {Network policy name}: This is the name of the network policy you have configured for your Amazon OpenSearch Serverless collection. If you haven't configured a network policy, you can leave this placeholder as is or remove it from the configuration.
# After obtaining the necessary values, replace the placeholders in the configuration with the actual values.
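Because the sink's `index` setting uses the date pattern `cwl-%{yyyy-MM-dd}`, each day's logs land in their own index. When checking results later, it helps to know which index name to look for. The sketch below renders the name for a given day. The helper name is illustrative, and it assumes the pipeline evaluates the date in UTC.

```python
from datetime import datetime, timezone


def daily_index_name(day: datetime, prefix: str = "cwl") -> str:
    """Render the index name the pattern cwl-%{yyyy-MM-dd} yields for a day."""
    return f"{prefix}-{day:%Y-%m-%d}"


# Index expected to receive logs ingested right now (assuming UTC).
print(daily_index_name(datetime.now(timezone.utc)))
```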

Step 4: Create a Lambda function

Create a Lambda layer for the requests and requests_auth_aws_sigv4 packages. Run the following commands in AWS CloudShell.

mkdir lambda_layers
cd lambda_layers
mkdir python
cd python
pip install requests -t ./
pip install requests_auth_aws_sigv4 -t ./
cd ..
zip -r python_modules.zip .

aws lambda publish-layer-version --layer-name Data-requests --description "My Python layer" --zip-file fileb://python_modules.zip --compatible-runtimes python3.x

Create a function with Python 3.x runtime. See Create your first Lambda function.

import base64
import gzip
import json
import logging
from datetime import datetime

import jmespath
import requests
from requests_auth_aws_sigv4 import AWSSigV4

LOGGER = logging.getLogger(__name__)
LOGGER.setLevel(logging.INFO)


def lambda_handler(event, context):
    # Extract the base64-encoded, gzip-compressed data from the event
    data = jmespath.search("awslogs.data", event)

    # Decompress the logs
    cwLogs = decompress_json_data(data)

    # Construct the payload to send to OpenSearch Ingestion
    payload = prepare_payload(cwLogs)
    print(payload)

    # Ingest the set of events to the pipeline
    ingestData(payload)

    return {
        'statusCode': 200
    }


def decompress_json_data(data):
    # CloudWatch Logs delivers the payload base64-encoded and gzip-compressed
    compressed_data = base64.b64decode(data)
    uncompressed_data = gzip.decompress(compressed_data)
    return json.loads(uncompressed_data)


def prepare_payload(cwLogs):
    # Flatten each log event into one document, carrying the log group metadata
    payload = []
    for logEvent in cwLogs['logEvents']:
        request = {}
        request['id'] = logEvent['id']
        # Event timestamps are in milliseconds since the epoch
        dt = datetime.fromtimestamp(logEvent['timestamp'] / 1000)
        request['timestamp'] = dt.isoformat()
        request['message'] = logEvent['message']
        request['owner'] = cwLogs['owner']
        request['log_group'] = cwLogs['logGroup']
        request['log_stream'] = cwLogs['logStream']
        payload.append(request)
    return payload


def ingestData(payload):
    ingestionEndpoint = '{OpenSearch Pipeline Endpoint}'
    endpoint = 'https://' + ingestionEndpoint
    headers = {'Content-Type': 'application/json', 'Accept': 'application/json'}

    # Sign the request with SigV4 for the OpenSearch Ingestion service (osis)
    r = requests.request('POST', f'{endpoint}/logs/ingest', json=payload, auth=AWSSigV4('osis'), headers=headers)
    LOGGER.info('Response received: %s', r.text)
    return r

Replace {OpenSearch Pipeline Endpoint} with the endpoint of your OpenSearch Ingestion pipeline.
Attach the following inline policy to the function’s execution role.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PermitsWriteAccessToPipeline",
            "Effect": "Allow",
            "Action": "osis:Ingest",
            "Resource": "arn:aws:osis:{region}:{accountId}:pipeline/{OpenSearch Pipeline Name}"
        }
    ]
}

Deploy the function.
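Before wiring up the subscription, the decoding path of the handler can be checked locally with a synthetic event. The sketch below builds an event in the shape CloudWatch Logs sends to Lambda (a base64-encoded, gzip-compressed JSON document under `awslogs.data`) and decodes it the same way the function above does. The helper `make_subscription_event` and the sample log group, stream, and account values are made up for illustration.

```python
import base64
import gzip
import json


def make_subscription_event(log_events, log_group="/demo/app",
                            log_stream="stream-1", owner="123456789012"):
    """Build a synthetic CloudWatch Logs subscription event for local testing."""
    body = {
        "messageType": "DATA_MESSAGE",
        "owner": owner,
        "logGroup": log_group,
        "logStream": log_stream,
        "subscriptionFilters": ["demo-filter"],
        "logEvents": log_events,
    }
    compressed = gzip.compress(json.dumps(body).encode("utf-8"))
    return {"awslogs": {"data": base64.b64encode(compressed).decode("ascii")}}


def decompress_json_data(data):
    """Same decoding steps as the Lambda function: base64, then gzip, then JSON."""
    return json.loads(gzip.decompress(base64.b64decode(data)))


event = make_subscription_event(
    [{"id": "1", "timestamp": 1722384000000, "message": "Simple Lambda Test"}]
)
decoded = decompress_json_data(event["awslogs"]["data"])
print(decoded["logGroup"], decoded["logEvents"][0]["message"])
```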

Step 5: Set up a CloudWatch Logs subscription

Grant permission to a specific AWS service or AWS account to invoke the specified Lambda function. The following command grants permission to the CloudWatch Logs service to invoke the cloud-logs Lambda function for the specified log group. This is necessary because CloudWatch Logs cannot directly invoke a Lambda function without being granted permission. Run the following command in CloudShell to add permission.

aws lambda add-permission \
  --function-name "{function name}" \
  --statement-id "{function name}" \
  --principal "logs.amazonaws.com" \
  --action "lambda:InvokeFunction" \
  --source-arn "arn:aws:logs:{region}:{accountId}:log-group:{log_group}:*" \
  --source-account "{accountId}"

Create a subscription filter for a log group. The following command creates a subscription filter on the log group, which forwards all log events (because the filter pattern is an empty string) to the Lambda function. Run the following command in CloudShell to create the subscription filter.

aws logs put-subscription-filter \
  --log-group-name "{log_group}" \
  --filter-name "{filter name}" \
  --filter-pattern "" \
  --destination-arn "arn:aws:lambda:{region}:{accountId}:function:{function name}"
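The two CLI commands above can also be expressed with boto3, using the Lambda client's `add_permission` and the CloudWatch Logs client's `put_subscription_filter`. This is a minimal sketch: the helper names are assumptions, the arguments stand in for the same placeholders as the commands, and boto3 is imported lazily so the request builder can be checked without AWS credentials.

```python
def build_subscription_requests(region, account_id, log_group, function_name, filter_name):
    """Return the keyword arguments for add_permission and put_subscription_filter."""
    permission = {
        "FunctionName": function_name,
        "StatementId": function_name,
        "Principal": "logs.amazonaws.com",
        "Action": "lambda:InvokeFunction",
        "SourceArn": f"arn:aws:logs:{region}:{account_id}:log-group:{log_group}:*",
        "SourceAccount": account_id,
    }
    subscription = {
        "logGroupName": log_group,
        "filterName": filter_name,
        "filterPattern": "",  # empty pattern forwards every log event
        "destinationArn": f"arn:aws:lambda:{region}:{account_id}:function:{function_name}",
    }
    return permission, subscription


def apply_subscription(region, account_id, log_group, function_name, filter_name):
    """Grant CloudWatch Logs invoke permission, then create the filter."""
    import boto3  # lazy import: only needed when actually calling AWS

    permission, subscription = build_subscription_requests(
        region, account_id, log_group, function_name, filter_name
    )
    boto3.client("lambda", region_name=region).add_permission(**permission)
    boto3.client("logs", region_name=region).put_subscription_filter(**subscription)
```

The permission is added first because creating the filter fails if CloudWatch Logs is not yet allowed to invoke the function.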

Step 6: Testing and verification

Generate some logs in your CloudWatch log group. Run the following command in CloudShell to create sample logs in the log group.

aws logs put-log-events --log-group-name {log_group} --log-stream-name {stream_name} --log-events "[{\"timestamp\":{timestamp in millis} , \"message\": \"Simple Lambda Test\"}]"

Check the OpenSearch collection to ensure logs are indexed correctly.
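The `{timestamp in millis}` placeholder in the put-log-events command expects the number of milliseconds since the Unix epoch. A small sketch for producing the `--log-events` value with the current time is shown below; the helper name is illustrative.

```python
import json
import time


def build_log_events(messages, now_ms=None):
    """Return a JSON string suitable for the --log-events argument."""
    timestamp = int(time.time() * 1000) if now_ms is None else now_ms
    return json.dumps([{"timestamp": timestamp, "message": m} for m in messages])


print(build_log_events(["Simple Lambda Test"]))
```

The printed string can be pasted, wrapped in single quotes, as the value of `--log-events`.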

Clean up

Remove the infrastructure for this solution when not in use to avoid incurring unnecessary costs.

Conclusion

You saw how to set up a pipeline to send CloudWatch logs to an OpenSearch Serverless collection within a VPC. This integration uses CloudWatch for log aggregation, Lambda for log processing, and OpenSearch Serverless for querying and visualization. You can use this solution to take advantage of the pay-as-you-go pricing model for OpenSearch Serverless to optimize operational costs for log analysis.

To further explore, you can:

Learn more about querying and visualizing log data in OpenSearch Dashboards.
Integrate additional log sources, such as EC2 instances or container logs, into the same pipeline.
Set up alerting and notification rules based on log patterns or anomalies.

About the Authors

Balaji Mohan is a senior modernization architect specializing in application and data modernization to the cloud. His business-first approach ensures seamless transitions, aligning technology with organizational goals. Using cloud-native architectures, he delivers scalable, agile, and cost-effective solutions, driving innovation and growth.

Souvik Bose is a Software Development Engineer working on Amazon OpenSearch Service.

Muthu Pitchaimani is a Search Specialist with Amazon OpenSearch Service. He builds large-scale search applications and solutions. Muthu is interested in the topics of networking and security, and is based out of Austin, Texas.
