Tag Archives: AWS Lambda

A scalable, elastic database and search solution for 1B+ vectors built on LanceDB and Amazon S3

Post Syndicated from Audra Devoto original https://aws.amazon.com/blogs/architecture/a-scalable-elastic-database-and-search-solution-for-1b-vectors-built-on-lancedb-and-amazon-s3/

This post was co-authored with Owen Janson, Audra Devoto, and Christopher Brown of Metagenomi.

From CRISPR gene editing to industrial biocatalysis, enzymes power some of the most transformative technologies in healthcare, energy, and manufacturing. But discovering novel enzymes that can transform an industry — such as Cas9 for genome engineering — requires sifting through the billions of diverse enzymes encoded by organisms spanning the tree of life. Advances in DNA sequencing and metagenomics have enabled the growth of vast public and proprietary databases containing known protein sequences, but scanning through these collections to identify high value candidates is fundamentally a big data problem as well as a biological one.

At Metagenomi, we’re developing potentially curative therapeutics by using our extensive metagenomics database (MGXdb) to build a toolbox of novel gene editing systems. In this post, we highlight how Metagenomi is tackling the challenge of enzyme discovery at the billion protein scale by using the scalable infrastructure of Amazon Web Services (AWS) to build a high-performance protein database and search solution based on embeddings. By embedding every protein in our large proprietary database into a vector space, making the data accessible using LanceDB built on Amazon Simple Storage Service (Amazon S3), and accessed with AWS Lambda, we were able to transform enzyme discovery into a nearest neighbor search problem and rapidly access previously unexplored discovery space.

Solution overview

At the core of our solution is LanceDB. LanceDB is an open source vector database that enables rapid approximate nearest neighbor (ANN) searches on indexed vectors. LanceDB is particularly well suited for a serverless stack because it’s entirely file-based and is also compatible with Amazon S3 storage. As a result, we can store our database of embedded protein sequences on relatively low-cost Amazon S3, rather than a persistent disk storage such as Amazon Elastic Block Store (Amazon EBS). Instead of constantly running servers, all that is needed to rapidly query the database on-demand is a Lambda function that uses LanceDB to find nearest neighbors directly from the data on S3.

To overcome the challenge of ingesting and querying billions of vector embeddings representing Metagenomi’s large protein database, we devised a method for splitting the database into equal sized parts (folders) stored for low cost on Amazon S3 that can be indexed in parallel and searched with a map-reduce approach using Lambda. The following diagram illustrates this architecture.

AWS architecture showing protein vector processing workflow with ECR, Lambda, and LanceDB

The process follows four steps:

  1. Data vectorization
  2. Data bucketing
  3. Indexing and ingesting data
  4. Querying the database

Data vectorization

To make use of LanceDB’s fast ANN search capabilities, the data must be in vector form. Our metagenomics database consists of billions of proteins, each a string of amino acids. To convert each protein into a vector that captures biologically meaningful information, we run them through a protein language model (pLM), capturing the model’s hidden layers as a vector representation of that protein. Many pLMs can be used to generate protein embeddings, depending on the desired biological information and computational requirements. Here, we use the AMPLIFY_350M model, a transformer encoder model that is fast enough to scale to our entire protein database. We perform a mean-pool of the final hidden layer of the model to produce a 960-dimension vector for each protein. These vectors and their respective unique protein IDs are then stored in HDF5 files.

Data bucketing

To turn our protein vectors into a searchable database, we use LanceDB to build an index suitable for quickly finding ANNs to a query. However, indexing can take a long time and is difficult to distribute across nodes. To speed up indexing, we first divide our data into roughly evenly sized buckets. We then assign each of our embedding HDF5 files to buckets of size roughly equal to 200 million total vectors using a best-fit bin packing algorithm. The exact size packing method used to bucket data depends on the number and dimension of the vectors, as well as their format. Each bucket is ingested into a separate table that will separately reside in a single LanceDB database object store on Amazon S3.

S3 bucket structure showing LanceDB database organization with vector buckets

By bucketing our data, we can produce several smaller databases that can be indexed on separate nodes in a much shorter amount of time. We can also add more data to our database incrementally as a new bucket, instead of reindexing all the existing data.

Ingesting and indexing bucketed data

After the vectorized data has been assigned to a bucket, it’s time to turn it into a LanceDB table and index it to enable fast ANN querying. The details on how to convert your specific data into a LanceDB table can be found in the LanceDB documentation. For each of our buckets of approximately 200 million vectors, we create a LanceDB table with an IVF-PQ index on the cosine distance. For indexing, we use several partitions equal to the square root of the number of inserted rows, and several sub vectors equal to the number of dimensions of our vectors divided by 16.

To make things smoother to query, we name each table after the bucket from which it was created and upload them to a single S3 directory such that their file structure indicates a single LanceDB database with multiple tables.

The following code snippet provides an example of how you might ingest vectors from an HDF5 file containing id and embedding columns into a LanceDB database and index for fast ANN searches based on cosine distance. The only requirements for running this snippet are python >= 3.9, as well as the lancedb, pyarrow, and h5py packages. It should be noted that this snippet was tested and developed using lancedb version 0.21.1 using the asynchronous LanceDB API.

from typing import List, Iterable
from itertools import islice
from math import sqrt
import pyarrow as pa
import datetime
import asyncio
import lancedb
import h5py

def batched(iterable: Iterable, n: int) -> Iterable[List]:
    """Yield batches of n items from iterable."""
    while batch := list(islice(iterable, n)):
        yield batch

async def vectors_to_db(
    vectors: str,
    db: str,
    table_name: str,
    vector_dim: int,
    ingestion_batch_size: int,
) -> int:
    """Ingest and index vectors from an HDF5 file into a LanceDB table.
    Args:
        vectors (str): An HDF5 file containing protein IDs and their
            960-dimension vector representations.
        db (str): Path to the LanceDB database.
        table_name (str): Name of the table to create.
        vector_dim (int): Dimension of the vectors.
    """
    # create db and table
    custom_schema = pa.schema(
        [
            pa.field("embedding", pa.list_(pa.float32(), vector_dim)),
            pa.field("id", pa.string()),
        ]
    )

    # count the total number of rows as they are added to the table
    total_rows = 0

    # open a connection to the new database and create a table
    with await lancedb.connect_async(db) as db_connection:
        with await db_connection.create_table(
            table_name, schema=custom_schema
        ) as table_connection:
            # open vectors file
            with h5py.File(vectors, "r") as vectors_handle:
                # create a generator over the rows
                rows = (
                    {"embedding": e, "id": i}
                    for e, i in zip(
                        vectors_handle["embedding"],
                        vectors_handle["id"],
                    )
                )

                # insert rows in batches to avoid memory issues
                for batch in batched(rows, ingestion_batch_size):
                    total_rows += len(batch)
                    await table_connection.add(batch)

            # optimize the table and remove old data
            await table_connection.optimize(
                cleanup_older_than=datetime.timedelta(days=0)
            )

            # configure the index for the table
            index_config = lancedb.index.IvfPq(
                distance_type="cosine",
                num_partitions=int(sqrt(total_rows)),
                num_sub_vectors=int(
                    vector_dim / 16
                ),
            )

            # index the table
            await table_connection.create_index(
                "embedding", config=index_config
            )

# ingest and index your data
asyncio.run(
    vectors_to_db(
        vectors="./my_vectors.h5",
        db="./test_db",
        table_name="bucket1",
        vector_dim=960,
        ingestion_batch_size=50000
    )
)

The task of vectorizing, ingesting, indexing each bucket could be parallelized over multiple AWS Batch jobs or run on a single Amazon Elastic Compute Cloud (Amazon EC2) instance.

Querying the database

After the data has been bucketed and ingested into a LanceDB database on Amazon S3, we need a way to query it. Because LanceDB can be queried directly from Amazon S3 using the LanceDB Python API, we can use Lambda functions to take a user-provided query vector and search for ANNs, then return the data to the user. However, because our data has been bucketed across several tables in the database, we need to search for nearest neighbors in each bucket and aggregate the results before passing them back to the user.

We implement the query workflow as an AWS Step Functions state machine that manages a query process for each bucket as Lambda processes, as well as a single Lambda process at the end that aggregates the data and writes the resulting ANNs to a .csv file on Amazon S3. However, this could also be implemented as a series of AWS Batch processes or even run locally. The following snippet shows how a process assigned to one bucket could run an ANN query against one of the database’s buckets, requiring only pandas and lancedb to run on python >= 3.9. As detailed before in the ingestion section, we use the asynchronous LanceDB API and lancedb package version 0.21.1.

from typing import List, Iterable
import asyncio
import lancedb
import pandas
import random

async def run_query_async(
    lancedb_s3_uri: str,
    table_name: str,
    q_vec: List[float],
    k: int,
    vec_col: str,
    n_probes: int,
    refine_factor: int,
) -> pandas.DataFrame:
    """Run a query on a LanceDB table.
    Args:
        lancedb_s3_uri (str): S3 URI of the LanceDB database.
        table_name (str): Name of the table to query.
        q_vec (List[float]): Query vector.
        k (int): Number of nearest neighbors to return.
        vec_col (str): Column name of the vector column.
        n_probes (int): Number of probes to use for the query.
        refine_factor (int): Refine factor for the query.
    Returns:
        pandas.DataFrame: DataFrame containing the approximate nearest
        neighbors to the query vector.
    """
    # open a connection to the database and table
    with await lancedb.connect_async(
        lancedb_s3_uri, storage_options={"timeout": "120s"}
    ) as db_connection:
        with await db_connection.open_table(table_name) as table_connection:
            # query the approximate nearest neighbors to the query vector
            df = (
                await table_connection.query()
                .nearest_to(q_vec)
                .column(vec_col)
                .nprobes(n_probes)
                .refine_factor(refine_factor)
                .limit(k)
                .distance_type("cosine")
                .to_pandas()
            )

    return df

# query the example bucket we produced in the last section
bucket1_df = asyncio.run(
    snippets.run_query_async(
        lancedb_s3_uri="s3://mg-analysis/owen/20250415_lancedb_snippet_testing/test_db/",
        table_name="bucket1",
        q_vec=[random.random() for _ in range(960)],
        k=3,
        vec_col="embedding",
        n_probes=1,
        refine_factor=1,
    )
)

The preceding query will return a panda DataFrame of the following structure:

embedding id _distance
[-5.124435, 4.242000, …] id_1 0.000000
[-5.783999, 4.340500, …] id_2 0.001000
[-6.932943, 3.394850, …] id_3 0.04020

Where the embedding column contains the vector representations of the nearest neighbors, the id column their IDs, and the _distance column their cosine distances to the queried vector.

After each bucket has been independently queried across nodes and each has returned a nearest neighbors DataFrame, the results must be merged and subset to return the user. The following snippet shows how you might do this.

def aggregate_nearest_neighbors(
    dfs: List[pandas.DataFrame], k: int
):
    """Aggregate the nearest neighbors for each query vector.
    Args:
        dfs (List[pandas.DataFrame]): A list of DataFrames containing the
            nearest neighbors queried from each bucket.
        k (int): The number of nearest neighbors to aggregate.
    Returns:
        pd.DataFrame: A DataFrame with the aggregated nearest neighbors.
    """
    # concatenate the DataFrames and get the top k nearest neighbors
    return (
        pandas.concat(dfs, ignore_index=True)
        .sort_values(by=["_distance"], ascending=True)
        .reset_index(drop=True)
        .head(k)
    )

# add the dataframes from querying each bucket to a list
dfs = [bucket1_df, bucket2_df, bucket3_df, bucket4_df, bucket_5]

# aggregate the nearest neighbors across all buckets
nearest_neighbors_all_buckets_df = aggregate_nearest_neighbors(dfs, 5)

Optimizing for large batches of queries

Though querying a LanceDB database directly from its S3 object store on Lambda works well for querying the ANNs of one or a few query vectors, some use cases might require querying thousands or even millions of vectors.

One solution we’ve found that scales well to large batches of queries is to modify the preceding query implementation such that it first downloads one of the database buckets to local storage, then queries it locally using the LanceDB API. Because database buckets can have a large storage footprint, this implementation is better suited for AWS Batch jobs than Lambda, and we recommend using optimized instance storage (for example, i4i instances) rather than EBS volumes. After all query Batch jobs finish, a final job can aggregate their results before returning to the user. Orchestration of parallel query jobs and the aggregation job can be done with Nextflow. Though this implementation will have significantly more overhead and latency from downloading the buckets to disk, it can handle larger batches of queries more efficiently and still requires no continuously running server-based database.

Benchmarking results

Indexing strategies and database split sizes depend on your personal need for performance. Consider the following general optimization guidance when customizing to your use case.

An example database created by Metagenomi consisted of 3.5 billion vector embeddings produced by AMPLIFY, of dimension 960. Ingesting and indexing these 3.5B vector embeddings in split sizes of 200M vectors on i4i.8xlarge instances took 108 total compute hours. Because this solution is serverless and can be queried directly from its S3 object store, the only fixed cost of this database is its storage footprint on Amazon S3 (for an indexed database of 3.5B vectors, this is approximately 12.9 TB). Lambda queries can be an exceptionally low-cost querying solution, with many queries costing fractions of a cent.

In general, larger database splits will be more cost effective to query but will result in longer runtimes and longer indexing times. We recommend scaling up database split sizes to the maximum size that results in an acceptable query return time for a single split while also considering limits of parallelization such as maximum concurrent Lambda functions running. Metagenomi identified database splits of 200M vectors each to yield an optimal trade-off in cost and runtime for both small and large queries. We recommend ingesting and indexing on storage-optimized instances, such as those in the i4i family, for optimal performance and cost savings. If querying is to be done on an instance using a disk-based database (as opposed to Lambda and Amazon S3), we also recommend using storage-optimized instances for queries. We found the Lambda implementation could quickly handle single queries requesting up to 50,000 ANNs, or multi queries of up to 100 sequences with fewer than 5 ANNs. Runtime increases linearly with the number of ANNs requested, as shown in the following graph.

Line graph showing query runtime increasing with number of nearest neighbors

Conclusion

In this post, we showed how Metagenomi was able to store and query billions of protein embeddings at low cost using LanceDB implemented with Amazon S3 and AWS Lambda. This work expands on Metagenomi’s patient-driven mission to create curative genetic medicines by accelerating our discovery and engineering platform. Having quick access to the ANN embedding space of a query protein in seconds has enabled the integration of rapid search methods in our extensive analysis pipelines, accelerated the discovery of several diverse and novel enzyme families, and enabled protein engineering efforts by providing scientists with methods to generate and search embeddings on the fly. As Metagenomi continues to rapidly scale protein and DNA databases, horizontal scaling enabled by database splits that can be indexed and searched in parallel facilitates an embedding database solution that scales to future needs.

The solution outlined in this post focuses on vectors produced by a protein large language model (LLM) but can be applied to other vectorized datasets. To learn more about LanceDB integrated with Amazon S3, refer to the LanceDB documentation.

References

  1. Fournier, Quentin, et al. “Protein language models: is scaling necessary?.” bioRxiv (2024): 2024-09.

About the authors

Enhance the local testing experience for serverless applications with LocalStack

Post Syndicated from Patrick Galvin original https://aws.amazon.com/blogs/compute/enhance-the-local-testing-experience-for-serverless-applications-with-localstack/

Serverless applications often comprise multiple AWS services, such as AWS Lambda, Amazon Simple Queue Service (Amazon SQS), Amazon EventBridge, and Amazon DynamoDB. Although serverless architectures make it easy to build applications that are generally simple to operate and scale, testing them requires extra steps for developers. Recently, AWS brought you the capability to help developers remotely debug Lambda functions to accelerate the development process. Today, we’re excited to announce new capabilities that further simplify the local testing experience for Lambda functions and serverless applications through integration with LocalStack, an AWS Partner, in the AWS Toolkit for Visual Studio Code.

In this post, we will show you how you can enhance your local testing experience for serverless applications with LocalStack using AWS Toolkit.

Challenges with local serverless development

When building serverless applications with infrastructure as code (IaC) tools like the AWS Serverless Application Model (AWS SAM), developers often face challenges during local integration testing of applications that depend on interactions across multiple AWS services. These friction points slow down the critical code-test-debug cycle. Developers might encounter the following common roadblocks:

  • Cloud-based validation slows iteration – Developers previously needed to deploy AWS SAM templates to the cloud to test changes, introducing delays in feedback loops. AWS research shows that developers spend considerable time on deployment and testing, rather than writing code.
  • Tool context switching adds friction – Developers routinely shift between integrated development environments (IDEs), command line interfaces (CLIs), and resource emulators like LocalStack, leading to fragmented workflows.
  • Manual setup increases configuration complexity – Port mapping and code edits for local service integration tests can introduce inconsistencies between local and cloud environments.
  • Service integration debugging is limited – Troubleshooting Lambda functions in the context of AWS service integrations, such as DynamoDB, Amazon Simple Storage Service (Amazon S3), or Amazon SQS, requires manual configuration, extending the duration of troubleshooting efforts.

These challenges directly impact developer productivity and make local testing of integrated serverless applications complex.

Solution overview

Starting today, AWS helps simplify local serverless development by integrating LocalStack directly into the AWS Toolkit for VS Code. This integration helps developers test and debug serverless applications—defined using IaC tools like AWS SAM—entirely within their IDE. The enhanced local testing experience delivers four major improvements:

  • Integrated LocalStack experience – Connect to LocalStack directly within VS Code and manage local resources alongside cloud resources through a unified interface.
  • Emulated service interactions – Test Lambda functions with their interactions with other AWS services like Amazon SQS, DynamoDB, and EventBridge locally.
  • Simplified debugging – Start debugging sessions with LocalStack emulated environment, with a single click – no manual port configurations or code changes required, streamlining the debugging workflow.
  • Streamlined workflow – Deploy, test, and debug serverless applications without leaving the IDE, avoiding context switching between tools.

To set up LocalStack in VS Code (either the free version supporting over 30 core services like Lambda, Amazon S3, DynamoDB, Amazon SQS, and Amazon API Gateway, or the Ultimate version with over 110 services and advanced debugging features) you need essential development tools, including Docker, the AWS Command Line Interface (AWS CLI), AWS SAM CLI, and your preferred IDE such as VS Code. This combination enables full local integration testing of AWS services, including Lambda functions, messaging queues, databases, event-driven architectures, and serverless workflows, so you can develop and test your entire AWS application stack locally before deploying to the cloud.

Automated setup process

LocalStack is a cloud service emulator that you can use to run AWS applications locally for testing and development. To enhance your local testing capabilities, you can install the LocalStack VSCode Extension directly from AWS Walkthrough in AWS Toolkit, which offers a streamlined setup process through an intelligent wizard. After installation, the extension automatically detects whether LocalStack is configured on your system and prompts you to run the setup wizard through a notification. The entire process is quick and requires no manual configuration.

LocalStack extension has an integrated authentication wizard, that simplifies the process of connecting your development environment to LocalStack. During setup, the wizard opens a browser-based authentication flow and maintains an active connection until authentication completes. After it’s verified, it securely stores the authentication token in the ~/.localstack/auth.json file, enabling communication between your local environment and LocalStack services.

The wizard also checks if LocalStack AWS CLI profiles exist, and if not found, automatically creates them by updating the ~/.aws/config and ~/.aws/credentials files with LocalStack-specific endpoints and credentials. This seamless integration of AWS profiles enhances the development workflow by allowing developers to easily switch between different AWS environments, including the local LocalStack setup. By leveraging these profiles, developers can effortlessly point their AWS CLI or SDK to the appropriate endpoint, whether it’s a real AWS account or the LocalStack instance running on their machine. This configuration not only ensures a clear separation between local and cloud environments but also minimizes the risk of cross-environment interference. The automatic creation of these profiles streamlines the setup process, reducing manual configuration errors and saving valuable development time. Visual Studio Code (VS Code) provides real-time feedback throughout the setup. The status bar initially displays an error or warning indicator when LocalStack is not configured and then transitions to a normal or connected state once a successful connection is established. After setup completes, you’re ready to deploy, test, and debug serverless applications locally—without additional configuration. These settings persist across VS Code sessions, so the setup process is a one-time task.The following figure illustrates the process to start and verify LocalStack from VS Code.

To learn more, including installation steps, configuration examples, and troubleshooting guidance, visit the LocalStack Docs.

Test a serverless application

To demonstrate the enhanced local testing capabilities, let’s explore a practical serverless pattern: building and testing an event-driven order processing system that integrates Lambda with Amazon SQS, API Gateway, and Amazon Simple Notification Service (Amazon SNS). The application processes orders through an event-driven workflow: orders are submitted through API Gateway to an SQS queue and processed by a Lambda function, and the status is published to Amazon SNS to trigger customer email notifications.

Architechture digram depicting LocalStack emulation and user interaction with LocalStack emulated AWS Environment.

After you set up LocalStack in VS Code, you can test your entire serverless workflow without deploying to the cloud:

  • Deploy locally – Use the LocalStack AWS profile to deploy your AWS SAM application. The process mirrors cloud deployment but targets local endpoints. You can use the Application builder pane to initiate the deployment to LocalStack environment. The following figure illustrates the process of deploying a sample serverless application.

Picture depicting deploying a Serverless application on LocalStack and verifying the resources

  • Debug Lambda function deployed in LocalStack – Set breakpoints in your Lambda function and step through execution using VS Code’s integrated debugger. With the AWS Toolkit extension, you can invoke your Lambda with one click and inspect live interactions across services, all while running against a LocalStack container on your machine. This setup makes it possible to debug your AWS applications in a controlled, local environment that mimics the cloud infrastructure, without the need for deploying actual AWS services.

Picture depicting debugging a Lambda function using LocalStack and AWS VS Code Toolkit.

  • Validate end-to-end Flows – Test complete workflows from message ingestion through processing and notification, confirming all service integrations work correctly before cloud deployment.

For an in-depth technical demonstration of this LocalStack integration, refer to this youtube video.

Best practices for local Lambda function testing

In this section, we discuss various strategies and best practices for local Lambda function testing.

Optimizing your development workflow

Consider the following strategies to optimize your development workflow:

  • Start with a strong testing foundation – Use the AWS SAM CLI to perform unit tests that validate the core programmatic and business logic of your Lambda functions. Isolating function behavior early helps identify logic errors before introducing external dependencies.
  • Establish environment parity early in the development process – Many production issues stem from discrepancies between local and cloud environments. Use consistent service versions, configurations, and data structures across environments to confirm that what works locally behaves the same in production.
  • Adopt IaC from day one – Whether you choose AWS SAM, AWS CloudFormation, or another IaC framework, defining your application infrastructure as code reduces configuration drift and makes your deployments reproducible across teams and environments.
  • Apply a progressive testing strategy – Follow a structured testing pyramid that starts with fast, isolated unit tests and builds up to broader integration and system-level validation. This layered approach helps you catch issues earlier—when they’re easier and less expensive to fix—while still providing full application coverage.

A strategic approach to testing

Testing should be an integrated part of your serverless development workflow—not an afterthought. Successful teams implement layered testing strategies that use both local and cloud environments to strike a balance between speed and accuracy:

  1. Begin with unit tests that focus on isolated function logic. Use tools like the AWS SAM CLI, AWS Toolkit for VS Code and LocalStack extensions to run and debug functions locally.
  2. After validation, proceed to local integration testing using LocalStack to confirm how your Lambda functions interact with services such as Amazon SQS, DynamoDB, and Amazon SNS. These tests typically complete within minutes and catch most service integration issues before they reach production.
  3. After local testing, validate your application in the actual AWS environment. Cloud testing helps surface issues not present in local emulation, such as AWS Identity and Access Management (IAM) permission mismatches, Amazon Virtual Private Cloud (Amazon VPC) networking challenges, or service-specific nuances such as Lambda concurrency. For troubleshooting issues in the cloud environment, you can also remotely debug your Lambda functions using AWS Toolkit for VS Code.
  4. Lastly, conduct performance testing in AWS to assess how your application handles real-world traffic. These longer-running tests help validate scaling behavior and system resilience under load.

The result is higher-quality applications delivered faster, with fewer production surprises and more confident deployments.

Security considerations

When using LocalStack for local development, follow these security best practices:

  • Isolate the local environment – Use Docker networking to restrict LocalStack access and bind services to localhost to prevent external connections.
  • Use placeholder credentials – Use test credentials (for example, test/test) instead of real AWS credentials.
  • Protect your data – Use synthetic or anonymized datasets instead of production data and regularly purge local data stores to reduce risk.

When to use local versus cloud testing

Although local testing offers significant advantages, it’s important to understand when to use it versus testing in the cloud. The following table lists the potential use cases for each strategy.

Testing Scenario Local Testing Cloud Testing Reason
Function logic validation Fast feedback for core business logic
Service integration testing Quick validation of AWS service interactions
Rapid iteration during development Immediate feedback without deployment overhead
Cost-sensitive development environments Minimizes cloud resource costs during development
Offline development scenarios No internet connectivity required
Performance and scalability testing Requires actual AWS infrastructure for accurate results
IAM permission validation LocalStack doesn’t fully replicate IAM behavior
VPC networking scenarios Network configurations can’t be accurately emulated
Production-like load testing Real performance metrics only available in AWS
Final validation before deployment Supports compatibility with actual AWS environment

Conclusion

In this post, we discussed how to streamline local testing for AWS Serverless applications using LocalStack and the AWS Toolkit for VS Code. By running and debugging serverless applications directly in your IDE, you can reduce context switching, test complex integrations locally, and catch issues earlier—without deploying to the cloud.

We also showed how to apply progressive testing strategies that combine local emulation with cloud validation, optimize development costs, and build event-driven workflows with confidence.These enhancements lead to faster test cycles, lower development costs, and higher-quality deployments—all while staying fully in control of your development environment.

Have questions or feedback about this post? Connect with us on the AWS Compute Blog or join the AWS Developer community.

AWS named as a Leader in 2025 Gartner Magic Quadrant for Cloud-Native Application Platforms and Container Management

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/aws-named-as-a-leader-in-2025-gartner-magic-quadrant-for-cloud-native-application-platforms-and-container-management/

A month ago, I shared that Amazon Web Services (AWS) is recognized as a Leader in 2025 Gartner Magic Quadrant for Strategic Cloud Platform Services (SCPS), with Gartner naming AWS a Leader for the fifteenth consecutive year.

In 2024, AWS was named as a Leader in the Gartner Magic Quadrant for AI Code Assistants, Cloud-Native Application Platforms, Cloud Database Management Systems, Container Management, Data Integration Tools, Desktop as a Service (DaaS), and Data Science and Machine Learning Platforms as well as the SCPS. In 2025, we were also recognized as a Leader in the Gartner Magic Quadrant for Contact Center as a Service (CCaaS), Desktop as a Service and Data Science and Machine Learning (DSML) platforms. We strongly believe this means AWS provides the broadest and deepest range of services to customers.

Today, I’m happy to share recent Magic Quadrant reports that named AWS as a Leader in more cloud technology markets: Cloud-Native Application Platforms (aka Cloud Application Platforms) and Container Management.

2025 Gartner Magic Quadrant for Cloud-Native Application Platforms
AWS has been named a Leader in the Gartner Magic Quadrant for Cloud-Native Application Platforms for 2 consecutive years. AWS was positioned highest on “Ability to Execute”. Gartner defines cloud-native application platforms as those that provide managed application runtime environments for applications and integrated capabilities to manage the lifecycle of an application or application component in the cloud environment.

The following image is the graphical representation of the 2025 Magic Quadrant for Cloud-Native Application Platforms.

Our comprehensive cloud-native application portfolio—AWS Lambda, AWS App Runner, AWS Amplify, and AWS Elastic Beanstalk—offers flexible options for building modern applications with strong AI capabilities, demonstrated through continued innovation and deep integration across our broader AWS service portfolio.

You can simplify the service selection through comprehensive documentation, reference architectures, and prescriptive guidance available in the AWS Solutions Library, along with AI-powered, contextual recommendations from Amazon Q based on your specific requirements. While AWS Lambda is optimized for AWS to provide the best possible serverless experience, it follows industry standards for serverless computing and supports common programming languages and frameworks. You can find all necessary capabilities within AWS, including advanced features for AI/ML, edge computing, and enterprise integration.

You can build, deploy, and scale generative AI agents and applications by integrating these compute offerings with Amazon Bedrock for serverless inferences and Amazon SageMaker for artificial intelligence and machine learning (AI/ML) training and management.

Access the complete 2025 Gartner Magic Quadrant for Cloud-Native Application Platforms to learn more.

2025 Gartner Magic Quadrant for Container Management
In the 2025 Gartner Magic Quadrant for Container Management, AWS has been named as a Leader for three years and was positioned furthest for “Completeness of Vision”. Gartner defines container management as offerings that support the deployment and operation of containerized workloads. This process involves orchestrating and overseeing the entire lifecycle of containers, covering deployment, scaling, and operations, to ensure their efficient and consistent performance across different environments.

The following image is the graphical representation of the 2025 Magic Quadrant for Container Management.

AWS container services offer fully managed container orchestration with AWS native solutions and open-source technologies to focus on providing a wide range of deployment options, from Kubernetes to our native orchestrator.

You can use Amazon Elastic Container Service (Amazon ECS) and Amazon Elastic Kubernetes Service (Amazon EKS). Both can be used with AWS Fargate for serverless container deployment. Additionally, EKS Auto Mode simplifies Kubernetes management by automatically provisioning infrastructure, selecting optimal compute instances, and dynamically scaling resources for containerized applications.

You can connect on-premises and edge infrastructure back to AWS container services with EKS Hybrid Nodes and ECS Anywhere, or use EKS Anywhere for a fully disconnected Kubernetes experience supported by AWS. With flexible compute and deployment options, you can reduce operational overhead and focus on innovation and drive business value faster.

Access the complete 2025 Gartner Magic Quadrant for Container Management to learn more.

Channy

Gartner does not endorse any vendor, product or service depicted in its research publications and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

GARTNER is a registered trademark and service mark of Gartner and Magic Quadrant is a registered trademark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and are used herein with permission. All rights reserved.

AWS Weekly Roundup: Strands Agents 1M+ downloads, Cloud Club Captain, AI Agent Hackathon, and more (September 15, 2025)

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-strands-agents-1m-downloads-cloud-club-captain-ai-agent-hackathon-and-more-september-15-2025/

Last week, Strands Agents, AWS open source for agentic AI SDK just hit 1 million downloads and earned 3,000+ GitHub Stars less than 4 months since launching as a preview in May 2025. With Strands Agents, you can build production-ready, multi-agent AI systems in a few lines of code.

We’ve continuously improved features including support for multi-agent patterns, A2A protocol, and Amazon Bedrock AgentCore. You can use a collection of sample implementations to help you get started with building intelligent agents using Strands Agents. We always welcome your contribution and feedback to our project including bug reports, new features, corrections, or additional documentation.

Here is the latest research article of Amazon Science about the future of agentic AI and questions that scientists are asking about agent-to-agent communications, contextual understanding, common sense reasoning, and more. You can understand the technical topic of agentic AI with with relatable examples, including one about our personal behaviors about leaving doors open or closed, locked or unlocked.

Last week’s launches
Here are some launches that got my attention:

  • Amazon EC2 M4 and M4 Pro Mac instances – New M4 Mac instances offer up to 20% better application build performance compared to M2 Mac instances, while M4 Pro Mac instances deliver up to 15% better application build performance compared to M2 Pro Mac instances. These instances are ideal for building and testing applications for Apple platforms such as iOS, macOS, iPadOS, tvOS, watchOS, visionOS, and Safari.
  • LocalStack integration in Visual Studio Code (VS Code) – You can use LocalStack to locally emulate and test your serverless applications using the familiar VS Code interface without switching between tools or managing complex setup, thus simplifying your local serverless development process.
  • AWS Cloud Development Kit (AWS CDK) Refactor (Preview) –You can rename constructs, move resources between stacks, and reorganize CDK applications while preserving the state of deployed resources. By using AWS CloudFormation’s refactor capabilities with automated mapping computation, CDK Refactor eliminates the risk of unintended resource replacement during code restructuring.
  • AWS CloudTrail MCP Server – New AWS CloudTrail MCP server allows AI assistants to analyze API calls, track user activities, and perform advanced security analysis across your AWS environment through natural language interactions. You can explore more AWS MCP servers for working with AWS service resources.
  • Amazon CloudFront support for IPv6 origins – Your applications can send IPv6 traffic all the way to their origins, allowing them to meet their architectural and regulatory requirements for IPv6 adoption. End-to-end IPv6 support improves network performance for end users connecting over IPv6 networks, and also removes concerns for IPv4 address exhaustion for origin infrastructure.

For a full list of AWS announcements, be sure to keep an eye on the What’s New with AWS? page.

Other AWS news
Here are some additional news items that you might find interesting:

  • A city in the palm of your hand – Check out this interactive feature that explains how our AWS Trainium chip designers think like city planners, optimizing every nanometer to move data at near light speed.
  • Measuring the effectiveness of software development tools and practices – Read how Amazon developers that identified specific challenges before adopting AI tools cut costs by 15.9% year-over-year using our cost-to-serve-software framework (CTS-SW). They deployed more frequently and reduced manual interventions by 30.4% by focusing on the right problems first.
  • Become an AWS Cloud Club Captain – Join a growing network of student cloud enthusiasts by becoming an AWS Cloud Club Captain! As a Captain, you’ll get to organize events and building cloud communities while developing leadership skills. Application window is open September 1-28, 2025.

Upcoming AWS events
Check your calendars and sign up for these upcoming AWS events as well as AWS re:Invent and AWS Summits:

  • AWS AI Agent Global Hackathon – This is your chance to dive deep into our powerful generative AI stack and create something truly awesome. From September 8 to October 20, you have the opportunity to create AI agents using AWS suite of AI services, competing for over $45,000 in prizes and exclusive go-to-market opportunities.
  • AWS Gen AI Lofts – You can learn AWS AI products and services with exclusive sessions and meet industry-leading experts, and have valuable networking opportunities with investors and peers. Register in your nearest city: Mexico City (September 30–October 2), Paris (October 7–21), London (Oct 13–21), and Tel Aviv (November 11–19).
  • AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Aotearoa and Poland (September 18), South Africa (September 20), Bolivia (September 20), Portugal (September 27), Germany (October 7), and Hungary (October 16).

You can browse all upcoming AWS events and AWS startup events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Channy

Accelerate serverless testing with LocalStack integration in VS Code IDE

Post Syndicated from Micah Walter original https://aws.amazon.com/blogs/aws/accelerate-serverless-testing-with-localstack-integration-in-vs-code-ide/

Today, we’re announcing LocalStack integration in the AWS Toolkit for Visual Studio Code that makes it easier than ever for developers to test and debug serverless applications locally. This enhancement builds upon our recent improvements to the AWS Lambda development experience, including the console to IDE integration and remote debugging capabilities we launched in July 2025, continuing our commitment to simplify serverless development on Amazon Web Services (AWS).

When building serverless applications, developers typically focus on three key areas to streamline their testing experience: unit testing, integration testing, and debugging resources running in the cloud. Although AWS Serverless Application Model Command Line Interface (AWS SAM CLI) provides excellent local unit testing capabilities for individual Lambda functions, developers working with event-driven architectures that involve multiple AWS services, such as Amazon Simple Queue Service (Amazon SQS), Amazon EventBridge, and Amazon DynamoDB, need a comprehensive solution for local integration testing. Although LocalStack provided local emulation of AWS services, developers had to previously manage it as a standalone tool, requiring complex configuration and frequent context switching between multiple interfaces, which slowed down the development cycle.

LocalStack integration in AWS Toolkit for VS Code
To address these challenges, we’re introducing LocalStack integration so developers can connect AWS Toolkit for VS Code directly to LocalStack endpoints. With this integration, developers can test and debug serverless applications without switching between tools or managing complex LocalStack setups. Developers can now emulate end-to-end event-driven workflows involving services such as Lambda, Amazon SQS, and EventBridge locally, without needing to manage multiple tools, perform complex endpoint configurations, or deal with service boundary issues that previously required connecting to cloud resources.

The key benefit of this integration is that AWS Toolkit for VS Code can now connect to custom endpoints such as LocalStack, something that wasn’t possible before. Previously, to point AWS Toolkit for VS Code to their LocalStack environment, developers had to perform manual configuration and context switching between tools.

Getting started with LocalStack in VS Code is straightforward. Developers can begin with the LocalStack Free version, which provides local emulation for core AWS services ideal for early-stage development and testing. Using the guided application walkthrough in VS Code, developers can install LocalStack directly from the toolkit interface, which automatically installs the LocalStack extension and guides them through the setup process. When it’s configured, developers can deploy serverless applications directly to the emulated environment and test their functions locally, all without leaving their IDE.

Let’s try it out
First, I’ll update my copy of the AWS Toolkit for VS Code to the latest version. Once, I’ve done this, I can see a new option when I go to Application Builder and click on Walkthrough of Application Builder. This allows me to install LocalStack with a single click.

Once I’ve completed the setup for LocalStack, I can start it up from the status bar and then I’ll be able to select LocalStack from the list of my configured AWS profiles. In this illustration, I am using Application Composer to build a simple serverless architecture using Amazon API Gateway, Lambda, and DynamoDB. Normally, I’d deploy this to AWS using AWS SAM. In this case, I’m going to use the same AWS SAM command to deploy my stack locally.

I just do `sam deploy –guided –profile localstack` from the command line and follow the usual prompts. Deploying to LocalStack using AWS SAM CLI provides the exact same experience I’m used to when deploying to AWS. In the screenshot below, I can see the standard output from AWS SAM, as well as my new LocalStack resources listed in the AWS Toolkit Explorer.

I can even go in to a Lambda function and edit the function code I’ve deployed locally!

Over on the LocalStack website, I can login and take a look at all the resources I have running locally. In the screenshot below, you can see the local DynamoDB table I just deployed.

Enhanced development workflow
These new capabilities complement our recently launched console-to-IDE integration and remote debugging features, creating a comprehensive development experience that addresses different testing needs throughout the development lifecycle. AWS SAM CLI provides excellent local testing for individual Lambda functions, handling unit testing scenarios effectively. For integration testing, the LocalStack integration enables testing of multiservice workflows locally without the complexity of AWS Identity and Access Management (IAM) permissions, Amazon Virtual Private Cloud (Amazon VPC) configurations, or service boundary issues that can slow down development velocity.

When developers need to test using AWS services in development environments, they can use our remote debugging capabilities, which provide full access to Amazon VPC resources and IAM roles. This tiered approach frees up developers to focus on business logic during early development phases using LocalStack, then seamlessly transition to cloud-based testing when they need to validate against AWS service behaviors and configurations. The integration eliminates the need to switch between multiple tools and environments, so developers can identify and fix issues faster while maintaining the flexibility to choose the right testing approach for their specific needs.

Now available
You can start using these new features through the AWS Toolkit for VS Code by updating to v3.74.0. The LocalStack integration is available in all commercial AWS Regions except AWS GovCloud (US) Regions. To learn more, visit the AWS Toolkit for VS Code and Lambda documentation.

For developers who need broader service coverage or advanced capabilities, LocalStack offers additional tiers with expanded features. There are no additional costs from AWS for using this integration.

These enhancements represent another significant step forward in our ongoing commitment to simplifying the serverless development experience. Over the past year, we’ve focused on making VS Code the tool of choice for serverless developers, and this LocalStack integration continues that journey by providing tools for developers to build and test serverless applications more efficiently than ever before.

Accelerating local serverless development with console to IDE and remote debugging for AWS Lambda

Post Syndicated from Brian Krygsman original https://aws.amazon.com/blogs/compute/accelerating-local-serverless-development-with-console-to-ide-and-remote-debugging-for-aws-lambda/

Delightful developer experience is an important part of building serverless applications efficiently, whether you’re creating an automation script or developing a complex enterprise application. While AWS Lambda has transformed modern application development in the cloud with its serverless computing model, developers spend significant time working in their local environments. They rely on familiar IDEs, debugging tools, testing frameworks, and build within established organizational workflows to deliver production-ready applications.

This post covers some recent enhancements to local developer experience. Two new Lambda features, namely console to IDE and remote debugging, further bridge the gap between cloud and local development, enabling you to leverage the full power of your local tools while working with Lambda functions in the cloud.

Overview

Serverless development with Lambda spans both cloud and local environments, each with its unique strengths. While the Lambda console offers rapid deployment and prototyping, local development provides the depth and flexibility needed for a complex application development workflow that includes integration testing, deployment to shared environments, continuous integration/continuous deployment (CI/CD) pipelines, and collaboration with other team members. The local developer experience encompasses the tools, workflows, and practices that developers use on their local devices to build and maintain their applications. An intuitive local development experience helps application development teams achieve high productivity, ensure code quality, and confidently ship changes to production.

Recent local serverless development experience enhancements

Local development workflows can be seen as two distinct but interconnected loops: the inner loop of writing, testing, and debugging code locally, and the outer loop that extends to cloud deployment, integration testing, release pipeline, and monitoring, as shown in the following figure. For serverless applications, developers want immediate feedback within the inner loop, as they iterate on function code and test integrations with AWS services. AWS has been steadily enhancing the local development experience for developers building on Lambda, with a focus on accelerating the inner loop, where developers spend most of their time.

DevOps workflow diagram showing interconnected local development and cloud deployment cycles with feedback loops

Figure 1: Inner and outer loop

Visual Studio Code (VS Code) is the most popular IDE among developers according to the 2024 Stack Overflow Developer Survey. Enhanced local IDE experience enables developers to code, test, debug, and deploy Lambda-based serverless applications more efficiently in their local IDE when using VS Code. It introduced the Application Builder interface, which streamlines the entire development workflow from setup to deployment with features such as guided walkthrough for environment setup, pre-configured sample applications, build setting management, and improved local debugging capabilities. This eliminates the need to switch between multiple interfaces. This experience also integrates with AWS Infrastructure Composer, which enables visual application building directly from VS Code, and provides quick-action buttons for common tasks like building, deploying, and invoking functions both locally and in the cloud.

AWS development environment setup wizard showing required tools installation process and local development options

Figure 2 Guided walkthrough in VS Code IDE

With Serverless Land’s extensive ready-to-use pattern library available directly in VS Code, you can now browse, search, and implement a collection of curated, pre-built serverless patterns without leaving the IDE. This integration makes it easier to use proven architectures and AWS best practices while building serverless applications. Amazon CloudWatch Logs Live Tail support for Lambda functions in VS Code brings real-time log streaming and analytics capabilities directly into the IDE, enabling you to monitor and troubleshoot your Lambda functions without context switching. Whether testing a new feature or debugging an issue, you can now see the immediate impact of your code changes without leaving the IDE.

Console to IDE

Over the past decade, the Lambda console has enabled developers to quickly get started with writing Lambda functions, allowing them to rapidly iterate through changing code, testing, and deploying their functions. The console IDE experience saw a major usability refresh in 2024, including the introduction of Amazon Q Developer in the Lambda console.

As applications grow in complexity, developers often need to refactor code, add complex logic, include utility libraries as dependencies, or handle edge cases in their Lambda functions. Examples include using external libraries for complex time calculations or adding modules that perform caller-specific validations. This can make functions too bulky to manage in the console.

Developers may also want to move their functions into a software development lifecycle (SDLC) process that includes test frameworks, security scanning tools, infrastructure as code (IaC) templates, or CI/CD pipelines. This may necessitate that they use version control to collaborate across the team or develop with an AI agent steered by custom rules.

Previously, setting this up required manually configuring a local development environment, including IDE, language runtime, and build/package toolchains. Then, you had to download your function code, configuration, and integration settings and copy them into the IDE. You also had to create the required IaC template with AWS Serverless Application Model (AWS SAM). Only then could you deploy to the cloud to validate the accuracy of your code and configuration and continue with your development workflow.

The new Lambda console to IDE feature enables seamless transition from a cloud-hosted code/test cycle to a local environment, allowing you to download your function code and configuration to local VS Code IDE with just one click. From there, you can easily add dependencies and commit code into source control. Furthermore, you can sync back to the cloud for deployment or export a full AWS SAM template with the “Convert to SAM” capability and continue managing your function as if you had started locally. Console to IDE guides you through setting up the IDE on your local device, if you don’t already have one, along with any necessary configuration. The following figures show a function open in the Lambda console and thereafter in the local VS Code IDE.

AWS Lambda console showing Python function code, IoT integration, test events, and configuration settings for temperature monitoring

Figure 3: A Lambda function as seen in the console IDE

AWS Lambda local development interface showing Python IoT temperature monitoring code, terminal, and getting started guide

Figure 4: The same Lambda function as seen in a local IDE after Console to IDE export

By making it easy to transition inner loop development between cloud and local development environments, the console to IDE feature makes it easy to quickly scale an idea from proof-of-concept to a full-fledge serverless application. Visit the Lambda documentation to learn more.

Remote debugging

Developers building serverless applications with Lambda often need to test and debug cross-service integrations. While local debugging tools offer valuable capabilities, they do not fully replicate the Lambda runtime environment and its interactions with other AWS services, especially when dealing with Amazon Virtual Private Cloud (VPC) resources and AWS Identity and Access Management (IAM) permissions. Therefore, developers had to rely on print statements and verbose logging, and for complex scenarios they had to deploy their functions multiple times to diagnose and resolve issues. This process extended development cycles, particularly when troubleshooting issues specific to the production environment. Developers wished they could use advanced local development tools like debuggers to investigate issues with code running in Lambda functions deployed in the cloud.

Lambda’s new remote debugging feature now enables you to debug your functions running in the cloud directly from your local VS Code IDE using the AWS Toolkit extension. You can now debug the execution environment of the function running in the cloud in its IAM execution role’s security context with access to configured VPC resources, and trace execution through entire service flows in the cloud.

To start debugging, enable Remote debugging when invoking your function through the AWS Toolkit. Configure your local code path and payload and choose Remote Invoke. AWS Toolkit automatically adds an AWS-managed debugging Lambda layer to your function, extends the timeout, publishes a temporary version, and reverts the config change. AWS Toolkit then invokes the published debug version. You can then start debugging. This feature establishes a secure connection between your local debugger and the function running in the cloud using AWS IoT Secure Tunneling. When your debug session is finished, Lambda automatically removes the temporary function version. You can end your debug session explicitly. Otherwise, it will end automatically after 60 seconds of inactivity or when the Lambda function timeout is reached.

The following figure shows how setting a breakpoint in VS Code IDE during a remote debugging session pauses execution so that you can inspect the data with which the function running in the cloud is called, along with your function’s variables. You can continue to step forward from this point line-by-line to follow the function’s execution.

AWS Lambda debug environment displaying IoT temperature monitoring code, variable inspection, and execution logs with breakpoint paused state

Figure 5: VS Code IDE debugger attached to execution environment of a Lambda function running in the cloud

All of this means that you don’t have to set up local emulators to approximate cloud behavior, manage complex test frameworks, or continuously capture expensive logs with TRACE-level detail to understand how your code executes. Your debugger can show you exactly what invocation parameters look like, such as event and context, when they reach your function handler. You can step through how your function behaves for different inputs and inspect variable values along the way. Since your code is running in the cloud, you can even see how your function’s IAM execution role affects its behavior. As you step through, you can immediately see when an AWS SDK service call fails due to lack of permissions.

Moreover, you can combine this with the console to IDE feature described previously in this post. When you’ve downloaded your function and scaffolded your local environment with console to IDE, you can debug the function as it runs in the cloud with remote debugging. This gives you much more visibility into the Lambda developer experience, which helps you find issues more easily, fix bugs quickly, and deliver new features rapidly. Follow the steps in the documentation to get started.

Best practices

Although the improved developer experience enables you to move faster when building serverless applications using Lambda, you should incorporate AWS-recommended best practices into your application development workflow.

For large or complex functions, refactor the code following the programming language norms so that developers and AI agents can better understand it. For example, move complex business logic, such as inventory calculations, out of the function handler into a separate module. Console to IDE allows you to use your local refactoring tools to refactor function code.

For isolated cost allocation and security boundaries between development and production, use separate AWS environments for different stages of your development process. You can use console to IDE to generate an AWS SAM template for your application with properties of your function and related AWS resources, which streamlines consistent cross-environment deployments. Then, you can then automate deployments of your template and function code with a CI/CD pipeline.

During development, you should test your functions in the cloud when you can. Remote debugging makes it easier to test functions running in the cloud from your local environment, allowing you to step through your code to validate logic and least-privilege function execution permissions. To optimize cost, focus on logging just enough to recreate problem scenarios, including necessary context about function execution, rather than logging everything you need to diagnose behavior. This also means that you have smaller log volumes to sift through.

You should recreate problem scenarios in an environment where you control the flow of input and can use remote debugging. When possible, you should use a development environment where there are no other sources of invokes. There’s a small window while remote debugging applies the temporary config change where other traffic to $LATEST might cause unexpected results, such as a slower cold start. By default, the debugger does not initialize when running on $LATEST. You should also use Aliases and Versions to explicitly pin environments to the appropriate version of a function, which avoids this problem and gives you more deterministic behavior along with the ability to do canary deployments.

Conclusion

The local development experience enhancements, including debugging workflows and IDE integrations, minimize the configuration and setup needed for developers to locally build serverless applications using Lambda. This enables developers to focus on building business logic. These enhancements also provide the rapid feedback loop developers need while making sure that their local environment accurately reflects cloud behavior.

AWS continues to streamline the local developer experience for serverless applications in areas such as local testing of service integrations, IaC workflows, troubleshooting capabilities, and using AI assistance more deeply in local development workflows. All of this helps developers build more efficient and secure serverless applications.

To get started with these new capabilities, visit the Lambda developer guide for detailed walkthroughs and best practices. Share your experiences and suggestions through the Lambda GitHub issues page to help shape the future of serverless developer experience.

For more serverless learning resources, visit Serverless Land. Likewise, check out this video from an AWS Community Builder showcasing the latest capabilities.

Under the hood: how AWS Lambda SnapStart optimizes function startup latency

Post Syndicated from Ayush Kulkarni original https://aws.amazon.com/blogs/compute/under-the-hood-how-aws-lambda-snapstart-optimizes-function-startup-latency/

When building applications using AWS Lambda, optimizing function startup is an important step to improve performance for latency sensitive applications. The largest contributor to startup latency (often referred to as cold start time) is the time that Lambda spends initializing your function code. Lambda SnapStart is a feature available for Java, Python, and .NET runtimes that helps reduce variable cold start latency from several seconds (or higher) to as low as sub-second. SnapStart typically needs zero or minimal changes to your application code and makes it easier to build highly responsive and scalable applications without implementing complex performance optimizations. This post explains how SnapStart works under the hood and provides recommendations to improve application performance when using SnapStart.

If your function already initializes within hundreds of milliseconds, then AWS recommends using Lambda Provisioned Concurrency to achieve double-digit millisecond startup latency.

What is a cold-start?

Lambda runs your function code in an isolated, secure execution environment that uses Firecracker microVM technology. When you first invoke a Lambda function, Lambda creates a new execution environment for the function to run in. Lambda downloads your function code, starts the language runtime, and runs your function initialization code, which is code outside the handler. This initialization process (INIT) is called a cold start. Then, Lambda runs your function handler code to invoke the function. A Lambda execution environment only handles a single invoke request at a time. The following figure shows the lifecycle of a typical invocation request.

Figure 1. Function invocation lifecycle without SnapStart

Figure 1. Function invocation lifecycle without SnapStart

After the function finishes running, Lambda doesn’t stop the execution environment right away. When your function receives another invocation request, Lambda attempts to route the request to the idle but already running execution environment. As the INIT process has already run for this execution environment, this invoke is called a warm start. When more traffic arrives than Lambda has available idle execution environments, Lambda initializes new execution environments to serve the additional requests, performing the cold start initialization process again.

The last step of the cold start, initializing function code, typically takes the longest. This depends on the startup tasks that you execute in your code and the programming language runtime or framework you use. For languages such as Java and .NET, startup latency is impacted by just-in-time compilation of static code in loaded classes. For Python, it can be impacted if your executed code contains numerous or large modules. Other startup tasks, such as downloading machine learning (ML) models, can also take several seconds to complete, which adds to your function’s initialization latency. SnapStart is designed to optimize this last step of the cold start process and achieves this in three stages.

Stage 1: Snapshotting your Lambda function

When using SnapStart, the Lambda execution environment lifecycle changes. When you enable SnapStart for a particular function, publishing a new function version triggers the snapshotting process. The process runs the function initialization phase and takes an immutable, encrypted Firecracker microVM snapshot of the memory and disk state of the initialized execution environment, caching and chunking the snapshot for reuse. Code paths that are not executed during initialization, such as classes loaded on-demand through dependency injection, are not included in your function’s snapshot. To improve snapshot efficiency, proactively execute code paths during the initialization phase, or use runtime hooks to run code before Lambda creates a snapshot.

Snapshot creation can take a few minutes, during which your function version remains in the PENDING state, becoming ACTIVE when the snapshot is ready.

When you subsequently invoke your function, Lambda restores new execution environments from this snapshot. This optimization makes the invocation time faster and more predictable, because creating new a execution environment no longer requires an initialization.

The following figure shows the lifecycle of a SnapStart configured function.

Diagram illustrating how AWS Lambda SnapStart works. The top section shows the 'Publish Version' phase, where the function is initialized ahead of time by creating the execution environment, downloading the code, starting the runtime, and initializing the function code. At the end of this phase, a microVM snapshot is created. The bottom section shows the 'Request Lifecycle' using SnapStart: each new execution environment resumes from the pre-initialized microVM snapshot and immediately invokes the Lambda handler. This allows multiple environments to start faster by skipping initialization steps.

Figure 2. Function invocation lifecycle with SnapStart

After Lambda creates a snapshot, it periodically regenerates it to apply security patches, runtime updates, and software upgrades. Your invocation requests continue to work throughout the regeneration process.

Stage 2: Storing snapshots for low-latency retrieval at Lambda scale

Lambda operates at a high scale, processing tens of trillions of invocation requests every month. To efficiently manage and retrieve snapshots at this volume of traffic, Lambda uses storage and caching components. These consist of three layers: Amazon S3 for durable storage, a dedicated distributed cache, and a local cache on Lambda worker nodes.

Lambda stores function snapshots in Amazon S3, dividing them into 512 KB chunks to optimize retrieval latency. Retrieval latency from Amazon S3 can take up to hundreds of milliseconds for each 512 KB chunk. Therefore, Lambda uses a two-layer cache to speed-up snapshot retrieval.

When you enable SnapStart, during the optimization process, Lambda stores snapshot chunks in a layer two (L2) cache. This layer is a dedicated distributed cache instance fleet purpose-built by Lambda. Lambda stores a separate copy of each snapshot per AWS Availability Zone (AZ). To balance performance with costs, Lambda may not proactively cache unused snapshot chunks, instead caching them after they are first accessed. Chunks remain cached in the L2 fleet as long as your function version is active. The snapshot restore performance from the L2 layer is typically single digit milliseconds for a 512 KB chunk.

Lambda also maintains a layer one (L1) cache located on Lambda worker nodes, the Amazon Elastic Compute Cloud (Amazon EC2) instances handling function invocations. This layer is available locally, thus it provides the fastest performance, typically 1 millisecond for a 512 KB chunk. Functions with more frequent invocations are more likely to have their snapshot chunks cached in this layer. Functions with fewer invocations are automatically evicted from this cache, because it is bound by the worker instance disk capacity. When a snapshot chunk is not available in the L1 cache, Lambda retrieves the chunk from the L2 cache layer.

Figure 3. SnapStart tiered cache

Figure 3. SnapStart tiered cache

Stage 3: Resuming execution from restored snapshots

Resuming execution from snapshots with low latency is the final SnapStart stage. This involves loading the retrieved snapshot chunks into your function execution environment. Typically, only a subset of the retrieved snapshot is needed to serve an invocation. Storing snapshots as chunks lets Lambda optimize the resume process by proactively loading only the necessary subset of chunks. To achieve this, Lambda tracks and records the snapshot chunks that the function accesses during each function invocation, as shown in the following figure.

Figure 4. Initial invocation, record chunk access pattern

Figure 4. Initial invocation, record chunk access pattern

After the first function invocation, Lambda refers to this recorded chunk access data for subsequent invokes, as shown in the following figure. Lambda proactively retrieves and loads this “working set” of chunks before they are needed for execution. This significantly speeds up cold-start latency. If every invoke executes the same code path, then all necessary chunks are tracked after the first invoke. If your Lambda function includes a method that is conditionally invoked once every five cold starts, then Lambda adds the corresponding chunks representing this method to the chunk access metadata after five cold starts.

Figure 5. Subsequent invocation, load chunks in order of access

Figure 5. Subsequent invocation, load chunks in order of access

Understanding SnapStart function performance

The speed of restoring a snapshot depends on its contents, size, and the caching tier used. As a result, SnapStart performance can vary across individual functions.

Function performance improves with more invocations

Frequently invoked functions are more likely to have their snapshots cached in the L1 layer, which provides the fastest retrieval latency. Infrequently accessed portions of snapshots for functions with sporadic invokes are less likely to be present in the L1 layer, resulting in slower retrieval latency from the L2 and S3 cache layers. Chunk access data for functions with more invocations is also more likely to be “complete”, which speeds up snapshot restore latency.

Pre-load code paths to optimize snapshot restore latency

To maximize the benefits of SnapStart, preload dependencies, initialize resources, and perform heavy computation tasks that contribute to startup latency in your initialization code instead of in the function handler. Code paths not executed during your function’s INIT phase, such as application classes loaded on-demand through dependency injection, are not included in your function’s snapshot. You can further improve SnapStart effectiveness by proactively executing these code paths during function initialization. You can also run code using runtime hooks and invoking your handler during the initialization phase before creating the snapshot. To achieve this, refer to the documentation and posts for Spring Boot and .NET applications to implement the performance tuning.

Performance differs depending on function size

SnapStart performance depends on how quickly Lambda can retrieve and load cached snapshots into your function execution environment. Larger function sizes increase the size of snapshots, and thus the number of chunks, which causes performance to differ for functions of varying sizes.

Not all functions benefit from SnapStart

SnapStart is designed to improve startup latency when function initialization takes several seconds, due to language-specific factors or because of initializing and loading software dependencies and frameworks. If your functions initialize within hundreds of milliseconds, you are unlikely to experience a significant performance improvement with SnapStart. For these scenarios, we recommend Provisioned Concurrency, which pre-initializes execution environments, delivering double-digit millisecond latency.

Conclusion

AWS Lambda SnapStart can deliver as low as sub-second startup performance for Java, .NET, and Python functions with long initialization times. This post explores how the Lambda lifecycle changes with SnapStart and how Lambda efficiently stores and loads snapshots to improve start up performance. SnapStart helps developers build highly responsive and scalable applications without provisioning resources or implementing complex performance optimizations.

To learn more about SnapStart, refer to the documentation and launch posts for Java, and Python and .NET. For performance tuning, refer to the SnapStart best practices section for your preferred language runtime. This post outlines approaches to pre-load code paths to further optimize startup latency. Find more information and sample applications built using SnapStart on Serverlessland.com.

Effectively building AI agents on AWS Serverless

Post Syndicated from Anton Aleksandrov original https://aws.amazon.com/blogs/compute/effectively-building-ai-agents-on-aws-serverless/

Imagine an AI assistant that doesn’t just respond to prompts – it reasons through goals, acts, and integrates with real-time systems. This is the promise of agentic AI.

According to Gartner, by 2028 over 33% of enterprise applications will embed agentic capabilities – up from less than 1% today. While early generative AI efforts focused on GPUs and model training, agentic systems shift the focus to CPUs, orchestration, and integration with live data – the places where organizations are starting to see real return on investment (ROI).

In this post, you’ll learn how to build and run serverless AI agents on AWS using services such as Amazon Bedrock AgentCore (preview as of this post publication), AWS Lambda, and Amazon Elastic Container Service (Amazon ECS), which provide scalable compute foundations for agentic workloads. You’ll also explore architectural patterns, state management, identity, observability, and tool usage to support production-ready deployments.

Overview

Early AI assistants were stateless and reactive – each prompt processed in isolation, with no memory of prior interactions or awareness of broader context. Gradually, AI assistants became more capable by injecting system prompts, preserving conversation history, and incorporating enterprise knowledge using Retrieval-Augmented Generation (RAG), as illustrated in the following diagram.

Despite these improvements, traditional AI assistants still lacked true autonomy. They couldn’t reason through multi-step goals, make decisions on their own, or adjust workflows dynamically based on outcomes. As a result, they worked well for simpler Q&A or predefined workflows, but struggled with dynamic, more complex, real-world tasks that require planning, using external tools, and making decisions along the way.

Agentic AI systems shift from passive content generation to autonomous, goal-driven behavior. Powered by Large Language Models (LLMs) and enhanced with memory, planning, and tool use, these systems can break down complex tasks into smaller steps, reason through each step, and take real-time actions, such as calling APIs, executing tools, or interacting with live data. By referencing the LLM within a control cycle that manages context, memory, and decision-making, these systems can choose the right tools, adapt workflows, and integrate deeply into enterprise environments, with use cases ranging from travel booking and financial analysis to DevOps automation and code debugging. This is referred to as an agentic loop. In this system, the agent relies on the LLM’s reasoning output to execute tools, capture tool results, and feed these results to the LLM as updated context (as shown in the following diagram). This happens in a loop until LLM instructs the agent to return the final output to the caller.

While agentic loop is a lightweight approach to structuring these systems, other control flow paradigms, such as graph, swarm, and workflows, are also available in open-source frameworks like LangGraph.

Introducing Strands Agents SDK

Strands Agents SDK is a code-first framework to build production-ready AI agents with minimal boilerplate. It utilizes the above-mentioned agentic loop system and abstracts common challenges like memory management, tool integration, and multi-step reasoning in a lightweight, modular Python framework. Strands SDK handles state, tool orchestration, and multi-step reasoning so agents can remember past conversations, call external APIs, enforce business rules, and adapt to changing inputs. This allows you to focus on the application’s business logic.

Because agents built with Strands SDK are essentially Python apps, they’re portable and can run across different compute options, such as Bedrock AgentCore Runtime, Lambda functions, ECS tasks, or even locally. This makes Strands Agents SDK a powerful foundation for building scalable and goal-driven AI systems. The following sections assume you’re running your AI agents built with Strands Agents SDK on Lambda functions.

Building your first serverless AI agent

Imagine you’re building an AI-powered corporate travel assistant on AWS, and you have the following technical requirements:

  1. Define the system prompts, memory, and model you want to use
  2. Integrate tools for API calls, business logic, and knowledge bases
  3. Ensure authentication and observability

Strands SDK handles heavy lifting, so you can focus on building smart, responsive agents with minimal overhead. The following code snippet creates a simple agent, according to your configuration.

from strands import Agent

agent = Agent(
    system_prompt=
      """You're a travel assistant that helps 
         employees book business trips 
         according to policy.""",
    model=my_model,
    tools=[get_policies, get_hotels, get_cars, book_travel]
)

response = agent("Book me a flight to NYC next Monday.")

That’s it. Your agent now has a personality, memory, and ability to use external tools. The Agent class in the Strands SDK abstracts agentic logic, such as maintaining conversation history, handling LLM interactions, orchestrating tools and external knowledge sources, and running the full agentic loop.

Session state management

Session state management is critical for agentic workflows. It allows agents to track goals across interactions – enabling coherent conversations, retaining context, and providing personalized experiences. Without state management, each prompt is handled in isolation, making it impossible for the agent to reference prior context or track ongoing tasks. In cloud environments, where applications need to be stateless and scalable, the solution is to externalize session state to persistent storage, such as Amazon Simple Storage Service (Amazon S3). This allows any agent instance to reconstruct the conversation history on demand, delivering a seamless, stateful user experience while keeping the agentic app itself stateless for scalability and resilience.

AI agents built with Strands store conversation history in the agent.messages property (see documentation). To support stateless compute environments, you can externalize the agent state, persisting it after each interaction and restoring it before the next. This preserves continuity across invocations while keeping your agent instances stateless. In user-aware agentic applications, you want to persist state for each user, typically associated with the user’s unique ID. The following example illustrates how you can do it with the built-in S3SessionManager class when running your agent in a stateless environment such as a Lambda function:

    session_manager = S3SessionManager(
        session_id=f"session_for_user_{user.id}",
        bucket=SESSION_STORE_BUCKET_NAME,
        prefix="agent_sessions"
    )

    agent = Agent(
        session_manager=session_manager
    )

When using Bedrock AgentCore, use the fully managed, serverless AgentCore Memory primitive to manage sessions and long-term memory. It provides relevant context to models while helping agents learn from past interactions. You can make Strands’ session manager work with AgentCore Memory similar to S3SessionManager.

Authentication and authorization

For enterprise AI agents to operate safely, they must know who the user is and what they are allowed to do. This goes beyond basic identity validation – AI agents often act on behalf of users, so they might need to enforce role-based access controls, support audit, and comply with corporate policies.

AWS services like Amazon CognitoAmazon Identity and Access Management (IAM), and Amazon API Gateway provide a solid foundation for authentication and authorization. For example, you can use Cognito to authenticate users through user pools or federated identity providers, combined with API Gateway and Lambda authorizer to validate user access permissions before forwarding requests to the agent, as shown in the preceding diagram. IAM policies define what the agent is allowed to do. After the user is both authenticated and authorized, the agent can extract the identity context, for example, from a JSON Web Token (JWT), to personalize prompts, enforce rules, or dynamically restrict actions.

The following code snippet illustrates retrieving user’s identity from the Authorization header and passing it to an agent:

def handler(event: dict, ctx):
    user_id = extract_user_id(event["headers"]["Authorization"])
    user_prompt: dict = json.loads(event["body"])["prompt"]
    agent_response = agent.prompt(user_id, user_prompt)
  
    return {
        "statusCode": 200,
        "body": json.dumps({"text": agent_response.text})
    }

The identity context can become a part of the agent’s execution loop. An agent might check the user’s department before booking travel or restrict access to sensitive tools unless the user has the appropriate permissions. By integrating authentication early, you not only enhance security, but also unlock rich personalization and audit capabilities that make agents enterprise-ready from day one.

When using Bedrock AgentCore, the AgentCore Identity primitive allows your AI agents to securely access AWS services and third-party tools either on behalf of users or as themselves with pre-authorized user consent. It provides managed OAuth 2.0 supported providers for both inbound and outbound authentication. During the preview phase, AgentCore Identity supports identity providers like Amazon Cognito, Auth0 by Okta, Microsoft Entra ID, GitHub, Google, Salesforce, and Slack. Refer to the samples for implementation details.

Building portable Strands agents on AWS

Strands Agents SDK is compute-agnostic. The agents you build are standard Python applications, which can run on any compute type.

For portability and maintainability, separate your agent’s business logic from the interface layer. By doing this, you can reuse the same core agent code across environments, whether invoked through API Gateway and Lambda functions, accessed through Application Load Balancer and Amazon ECS, running on AgentCore Runtime, or even executed locally during development, as shown in the following figure.

The following code snippets illustrate this technique.

Lambda handler code:

def handler(event: dict, ctx):
     user_id = extract_user_id(event)
     user_prompt = json.loads(event["body"])["prompt"]
     agent_response = call_agent(user_id, user_prompt)
     return {
          "statusCode":200,
          "body": json.dumps({
               "text": agent_response.mesage
          })
     }

AgentCore code:

@app.entrypoint
def invoke(payload):
     user_id = extract_user_id(payload)
     user_prompt = payload.get("prompt")
     agent_response = call_agent(user_id, user_prompt)
     return {"result": agent_response.message)

HTTP Handler code:

@app.post("/prompt")
async def prompt(request: Request, prompt_request: PromptRequest):
    user_id=extract_user_id(request)
    user_prompt = prompt_request.prompt
    agent_response = call_agent(user_id, user_prompt)
    return {"text": agent_response.message)

For local testing:

if __name__ == "__main__":
     user_id="local-testing-user"
     user_prompt="book me a trip to NYC"
     agent_response = call_agent(user_id, user_prompt)
     return agent_response.message

Agent code:

def call_agent(user_id, user_prompt):
     agent = Agent(
          system_prompt="You’re a travel agent…",
          model=my_model,
          session_manager = my_session_manager,    
      )
     agent_response = agent(user_prompt)
     return agent_response

Extending agent functionality with tools

A key strength of agentic systems is their ability to invoke tools that perform actions or retrieve real-time data, enabling agents to interact with the outside world, not just generate text. The Strands Agents SDK includes built-in tools and allows you to define your own custom tools, as either in-process Python functions or external tools accessible over HTTP using the Model Context Protocol (MCP). These tools can fetch data, call APIs, or trigger workflows, and can be registered for the agent to reason over and use during execution.

The following snippet illustrates creating an in-process tool. See the documentation for more examples.

from strands import tool 

@tool
def get_weather(city: str) -> str:
    weather = call_weather_api(city)
    return f"The current weather in {city} is {weather}"

Integrating with remote MCP servers

Model Context Protocol (MCP) is an open standard that decouples agents from tools using a client-server model. Instead of embedding tool logic directly into the agent, your agent becomes an MCP client that connects to one or more MCP servers – each exposing tools, resources, and reusable prompts.

Running remote MCP servers is especially valuable when tools span multiple business domains or are provided by third-party vendors, just like how microservices separate responsibilities across teams and systems. This separation allows each domain team to manage their own tools independently while exposing a consistent, standardized interface to agents. It also enables reuse, versioning, and centralized governance without tightly coupling logic into the agent itself. By decoupling tools from agents, MCP unlocks composability, scalability, and long-term ecosystem growth.

The following snippet illustrates configuring an MCP client to connect to a remote MCP Server, retrieving the list of tools, and integrating those tools with an agent.

mcp_client = MCPClient(lambda: streamablehttp_client(
    url=mcp_endpoint,
    headers={"Authorization": f"Bearer {token}"},
))

with mcp_client:
  tools = mcp_client.list_tools_sync()
  agent = Agent(tools=tools)

When using Bedrock AgentCore, you can operate MCP at scale through AgentCore Gateway. It provides an easy and secure way for developers to build, deploy, discover, and connect to remote tools like above at scale. With AgentCore Gateway, developers can convert APIs, Lambda functions, and existing services into Model Context Protocol (MCP)-compatible tools and make them available to agents through Gateway endpoints with just a few lines of code.

Monitoring and observability

Observability is essential when running AI agents. Beyond traditional metrics such as uptime and latency, agentic systems introduce new telemetry dimensions, such as LLM latency, token consumption, and tracing reasoning cycles. These new metrics are essential for understanding both the performance and cost of your agentic systems.

When deploying agents using AWS services such as Bedrock AgentCore, Lambda, or ECS, you inherit the built-in observability capabilities, such as seamless integration with Amazon CloudWatch for metrics, logs, and distributed tracing. This simplifies tracking invocation counts, errors, request duration, and concurrency, as shown in the following figure – essential for operating reliable and scalable agentic applications.

In addition, the Strands Agents SDK provides built-in agent observability features. It uses OpenTelemetry (OTEL) to automatically trace each agent interaction, including spans for LLM calls, tool usage, and context updates. It also exports detailed metrics such as token counts, tool execution times, and decision cycle durations. These metrics can be sent to any OTEL-compatible backend, giving you deep, real-time visibility into how your agents reason, act, and adapt. The following snippet shows built-in token usage metrics:

{
  "accumulated_usage": {
    "inputTokens": 1539,
    "outputTokens": 122,
    "totalTokens": 1661
  },
  "average_cycle_time": 0.881234884262085,
  "total_cycles": 2,
  "total_duration": 1.881234884262085,
  ... redacted ...
}

Learn more about observability and evaluation of Strands agents from this sample code.

When using Bedrock AgentCore, the AgentCore Observability primitive helps you to log and capture metrics and traceability from other AgentCore primitives like runtime, memory, and gateway, as described in this tutorial.

Security considerations

You should build secure communication and access controls layers deploying AI agents that integrate with remote MCP servers. All client-server interactions should be encrypted using TLS, ideally with mutual TLS for bidirectional authentication. Access to tools should be validated through authorization checks with fine-grained permissions to enforce least privilege access. Deploying MCP servers behind an API Gateway provides additional security layers like DDoS protection, WAF, and centralized authentication. Use API Gateway logging capabilities to capture caller identity and execution outcomes. Using trusted, versioned MCP repositories helps protect against supply chain attacks and ensures consistent tool governance across teams. Protocols such as MCP are evolving rapidly, you should always use the most recent versions to minimize potential security vulnerabilities risk.

In addition, you should leverage security best practices described in the AWS Well-Architected Framework Security Pillar, such as enforcing strict IAM role scoping, integrating with identity providers for user context, encrypting all data in transit and at rest, and using VPC endpoints and PrivateLink to limit network exposure. To protect against prompt injection attacks, sanitize inputs, and ensure you maintain comprehensive audit logs for compliance and governance.

Sample project

Follow instructions in this GitHub repo to deploy a sample project implementing the practices described in this post using the AWS Serverless compute. The repo includes a travel agent implemented with Strands Agents SDK and a remote MCP server, both running as Lambda functions.

Conclusion

Agentic AI moves beyond simple prompt-response interactions to enable dynamic, goal-driven workflows. In this post, you learned how to build scalable, production-ready agents on AWS using the Strands Agents SDK and serverless services such as Lambda and Amazon ECS.

By externalizing state, integrating authentication, and adding observability, agents can operate securely and at scale. With support for in-process and remote tools through the MCP, you can cleanly separate responsibilities and build composable, enterprise-ready systems. You can combine these patterns to deliver intelligent, adaptable AI agents that fit naturally into modern cloud and event-driven architectures.

Useful resources

To learn more about Serverless architectures see Serverless Land.

Enhance Amazon EMR observability with automated incident mitigation using Amazon Bedrock and Amazon Managed Grafana

Post Syndicated from Yu-Ting Su original https://aws.amazon.com/blogs/big-data/enhance-amazon-emr-observability-with-automated-incident-mitigation-using-amazon-bedrock-and-amazon-managed-grafana/

Maintaining high availability and quick incident response for Amazon EMR clusters is important in data analytics environments. In this post, we show you how to build an automated observability system that combines Amazon Managed Grafana with Amazon Bedrock to detect and remediate EMR cluster issues. We demonstrate how to integrate real-time monitoring with AI-powered remediation suggestions, combining Amazon Managed Grafana for visualization, Amazon Bedrock for intelligent response recommendations, and AWS Systems Manager for automated remediation actions on Amazon Web Services (AWS).

Solution overview

This solution helps you improve EMR cluster observability through a comprehensive four-layer architecture—comprising monitoring, notification, remediation, and knowledge management—to provide the following features:

  • Real-time monitoring of EMR clusters using Amazon Managed Service for Prometheus and Amazon Managed Grafana
  • Automated first-aid remediation through Systems Manager
  • AI-powered incident response suggestions using Amazon Bedrock
  • Integration with the AWS Premium Support knowledge base
  • Historical incident data archival and analysis

The implementation of this architecture delivers the following key benefit:

  • Reduced Mean time to resolution (MTTR)
  • Proactive incident prevention
  • Automated first-response actions
  • Knowledge base enrichment through machine learning

The following diagram illustrates the solution architecture.

End-to-end AWS monitoring solution diagram integrating Knowledge Center, Support, CloudWatch metrics with EventBridge rules and Lambda processing

The architecture comprises the following core components:

  • Monitoring layer – The monitoring layer uses Amazon Managed Service for Prometheus and Amazon CloudWatch to capture real-time metrics from EMR clusters. Amazon Managed Grafana serves as the visualization layer, offering comprehensive dashboards for Apache YARN, HDFS, Apache HBase, and Apache Hudi performance monitoring. Advanced alerting mechanisms trigger notifications based on predefined query results.
  • Notification layer – To provide timely and reliable alert delivery, the notification layer uses Amazon Simple Notification Service (Amazon SNS) for distribution and Amazon Simple Queue Service (Amazon SQS) for message queuing. This architecture prevents message delays and provides a robust trigger mechanism for AWS Lambda functions.
  • Remediation layer – The remediation layer enables automatic issue resolution through:
    • Lambda functions for orchestration
    • Systems Manager for script execution
    • Amazon Bedrock (amazon.nova-lite-v1:0) for generating intelligent response recommendations
  • Knowledge management layer – To maintain an up-to-date knowledge base, the solution:

We provide an AWS CloudFormation template to deploy the solution resources.

Prerequisites

Before starting this walkthrough, make sure you have access to the following AWS resources and configurations:

  • An AWS account
  • Access to the US East (N. Virginia) AWS Region
    • Add access to Amazon Bedrock foundation models (amazon.nova-lite-v1:0)

  • Amazon EMR version 6.15.0 (used in this demo)
  • Archived technical or troubleshooting articles
  • AWS IAM Identity Center enabled with at least one role that can become a Grafana administrator
  • (Optional) AWS Premium Support with a business support plan or higher for enhanced troubleshooting capabilities

Throughout this walkthrough, we provide detailed instructions to set up and configure these prerequisites if you haven’t already done so.

Configure resources using AWS CloudFormation

Complete the following steps to configure your resources:

  1. Launch the CloudFormation stack:

launch stack

  1. Provide emrobservability as the stack name.
  2. Select a virtual private cloud (VPC) and assign a public subnet.
  3. For EMRClusterName, enter a name for your cluster (default: emrObservability).
  4. Enter an existing Amazon S3 location as the Apache HBase root directory location (for example, s3://mybucket/my/hbase/rootdir/).
  5. For MasterInstanceType and CoreInstanceType, enter your instance types (default: m5.xlarge for both).
  6. For CoreInstanceCount, enter your instance count (default: 2).
  7. For SSHIPRange, use CheckIp and enter your IP (for example, 10.1.10/32).
  8. Choose the release label (default: 6.15.0).
  9. For KeyName, enter a key name to SSH to Amazon Elastic Compute Cloud (Amazon EC2) instances.
  10. For LatestAmiId, enter your AMI (default: /aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2).
  11. For KBS3Bucket, enter a name for your S3 bucket (for example, mykbbucket).
  12. For SubscriptionEndpoint, enter an email address to receive notifications and responses (for example, [email protected]).

Accept subscription confirmation

Accept the subscription confirmation sent to the email address you specified in the CloudFormation stack parameters. The following screenshot shows an example of the email you receive.

AWS email confirmation for SNS topic subscription to QA Lambda function responses with opt-out instructions

Prepare the knowledge base

Complete the following steps to populate the S3 bucket with archived technical articles and cases:

  1. On the Lambda console, choose Functions in the navigation pane.
  2. Choose the function CustomFunctionCopyKCArticlesToS3Bucket.

AWS Lambda console displaying Functions page with CustomFunctionCopyKCArticlesToS3Bucket function details

  1. Manually invoke the function by choosing Test on the Test tab.

AWS Lambda Test tab interface with event configuration options

  1. Verify successful execution by checking the CloudWatch logs.

AWS Lambda successful function execution result with null output

  1. Repeat the process for the Lambda function CustomFunctionCopyCasesToS3Bucket.

Lambda function interface displaying CustomFunctionCopyCasesToS3Bucket configuration with CloudFormation ID and description panel

AWS Lambda test interface showing Test event configuration options and action buttons

AWS Lambda function execution success message with null response and SHA-256 code

  1. Confirm the S3 bucket has been populated with archived technical articles and cases.

Amazon S3 bucket interface showing two folders with action buttons and search functionality

Sync data to the Amazon Bedrock knowledge base

Complete the following steps to sync the data to your knowledge base:

  1. On the Lambda console, choose Functions in the navigation pane.
  2. Choose the function KBDataSourceSync.

AWS Lambda console displaying filtered functions with CloudFormation tags, Python runtime versions, and modification timestamps

  1. Manually invoke the function by choosing Test on the Test tab.

This task might take 10–15 minutes to complete.

AWS Lambda console test configuration panel with CloudWatch integration and event creation controls

  1. Verify successful execution by checking the CloudWatch logs.

Lambda function execution results showing successful completion status and details

Configure your Amazon Managed Grafana workspace

Complete the following steps to configure your Amazon Managed Grafana workspace:

  1. On the Amazon Managed Grafana console, choose Workspaces in the navigation pane.
  2. Open your workspace.
  3. Choose Assign new user or group.

Amazon Grafana workspace showing IAM configuration notice and user assignment button

  1. Select your IAM Identity Center role and choose Assign users and groups.

Amazon Grafana IAM Identity Center user assignment panel with search and selection controls

  1. On the Admin dropdown menu, choose Make admin.

Amazon Grafana user list showing assigned viewer with admin action options

  1. Enable Grafana alerting, then choose Save changes.

Amazon Grafana alerting configuration panel showing disabled status with navigation tabs and edit button

Amazon Grafana configuration panel showing enabled alerting and plugin management settings

  1. Wait 10 minutes for the workspace to become active.
  2. When it’s active, sign in to the Grafana workspace. (For more information, refer to Connect to your workspace.)

Configure data sources

Add and configure the following data sources:

  1. For Service, choose CloudWatch, then select your Region and add CloudWatch as a data source.

  1. Choose Amazon Managed Service for Prometheus as a second data source and select your Region.

  1. Validate CloudWatch connectivity:
    1. Run test queries (for example, Namespace: AWS/EC2, Metric name: CPUUtilization, Statistic: Maximum).
      Amazon Managed Gragana interface showing CPU utilization query setup for EC2 instance.
    2. Verify CloudWatch metric retrieval.
      Line graph showing CPU utilization over time with peak at 40%.
  1. Validate Amazon Managed Service for Prometheus connectivity:
    1. Run test queries (for example, Metric: hadoop_hbase_numregionservers, Label filters: cluster_id = <Amazon EMR cluster ID>).
      Amazon Managed Grafana query interface showing Hadoop HBase metric configuration.
    2. Verify Prometheus metric retrieval.
      Amazon Managed Grafana monitoring dashboard showing a graph with HBase Region Server amount from 0 to 2

Confirm SNS notification channels

Complete the following steps to confirm your SNS notification is set up:

  1. On the Amazon SNS console, choose Topics in the navigation pane.
  2. Locate and note the ARNs for -LambdaFunctionTopic and -QALambdaFunctionTopic.

AWS SNS Topics list showing 4 topics with names, types, and ARNs

AWS SNS Topics console showing filtered search results for "LambdaFunctionTopic"

AWS SNS Topics console showing filtered search results for "QALambdaFunctionTopic"

  1. Choose Contact points under Alerting.

  1. Create the first contact point:
    1. For Name, enter SNS_SSM.
    2. For Integration, choose AWS SNS.
    3. For Topic, enter the ARN for LambdaFunctionTopic.
    4. For Auth Provider, choose Workspace IAM role.
    5. For Alert Message format, choose JSON.

  1. Create the second contact point:
    1. For Name, enter SNS_QA.
    2. For Integration, choose AWS SNS.
    3. For Topic, enter the ARN for QALambdaFunctionTopic.
    4. For Auth Provider, choose Workspace IAM role.
    5. For Alert Message format, choose JSON.

Create alert rules

Complete the following steps to set up two critical alert rules:

  1. Choose Alert rules under Alerting.

  1. Set up alerting if the Apache HBase region server status is abnormal:
    1. For Alert name, enter HBase region server down.
    2. For Data source, choose Amazon Managed Service for Prometheus.
    3. For Metric, choose hadoop_hbase_numregionservers.
      Alert rule configuration interface for HBase region server monitoring
    4. For Threshold, configure to alert if the region server count is less than 2 for 3 minutes.
      Amazon Managed Grafana alert rule configuration interface with expressions setup
    5. For Evaluation interval, set to 1 minute.
      New evaluation group creation modal showing P0_RegionServer name input and 1m interval settingHBase alert configuration panel showing P0_RegionServer group and 3m pending period
    6. For Contact point, choose SNS_SSM.
      Amazon Managed Grafana alert configuration interface showing labels and notifications setup with AWS SNS integration
  1. Create a second alert for if Amazon EC2 CPU utilization is abnormal:
    1. For Alert name, enter EC2 CPU utilization too high.
    2. For Data source, choose Amazon CloudWatch.
    3. For Namespace, choose AWS/EC2.
    4. For Metric name, choose CPUUtilization
    5. For Statistic, choose Maximum.
      Amazon CloudWatch query interface for setting up EC2 CPU utilization alert conditions
    6. For Threshold, configure to alert if CPU utilization is more than 95% for 3 minutes.
      Amazon Managed Grafana alert interface with Reduce and Threshold expressions for alert condition management
    7. For Evaluation interval, configure to 1 minute.
      New evaluation group configuration modal showing CPU utilization monitoring setup with 1-minute interval
      AWS Managed Grafana alert rule configuration screen showing evaluation behavior settings
    8. For Contact point, choose SNS_QA.Amazon Managed Grafana alert configuration showing customizable labels, contact point selection for SNS_QA integration
  1. On the alert rule creation page, scroll to 5. Add annotations and for Summary, add a clear description of the alert, for example, CPU utilization on EC2 instance is too high.

Alert configuration summary field with "CPU utilization on EC2 instance is too high" warning message

Apache HBase region server incident test

To confirm the system is working as expected, complete the following Apache HBase region server incident test:

  1. SSH into an EMR core instance.
  2. Stop the Apache HBase region server using systemctl:
 # Stop HBase region server service 
 sudo systemctl stop hbase-regionserver.service 

  1. Verify the service status:
 # Check the current state of HBase region server service 
 sudo systemctl status hbase-regionserver.service
  1. Observe Amazon Managed Grafana alert progression:
    1. Monitor alert status changes.
      Alert dashboard showing HBase region server alert status in pending state
      Alert dashboard showing HBase region server alert in firing state
    2. Verify SNS message generation.
    3. Confirm SQS message queuing.
    4. Track the Lambda function triggered for remediation.

Terminal output showing HBase RegionServer service status and daemon processes

HBase monitoring interface displaying region server status with health indicators and action buttons

CPU utilization stress test

Complete the following CPU utilization stress test:

  1. SSH into the EMR primary instance.
  2. Install stress testing tools:
 sudo amazon-linux-extras install epel -y
 sudo yum install stress -y 

  1. Verify the installation:
 stress --version 

  1. Generate high CPU load using the stress command and the following command structure:
 sudo stress [options] 

For our Amazon EMR test, use the following command:

 # For m5.xlarge instances (4 vCPUs) sudo stress --cpu 4 

-c 4 in the command creates 4 CPU-bound processes (one for each vCPU).The following are instance type vCPUs for your reference:

  • m5.xlarge: 4 vCPUs
  • m5.2xlarge: 8 vCPUs
  • m5.4xlarge: 16 vCPUs
  1. Monitor system response:
    1. Observe Amazon Managed Grafana alert status changes.
      Amazon Managed Grafana dashboard header showing rules status
    2. Verify Amazon Bedrock recommendation generation.
    3. Check SNS email notification delivery.
      AWS SNS notification email showing troubleshooting steps for high CPU usageCode snippet showing CPU usage troubleshooting steps in red text

Best practices and considerations

Monitoring infrastructure requires precise alert prioritization and threshold configuration. Alert aggregation techniques prevent notification overload by consolidating event streams and reducing redundant alerts. Operational teams must maintain dashboards through consistent updates and metric integration, providing real-time visibility into system performance and health.

Security implementations focus on least-privilege AWS Identity and Access Management (IAM) roles, restricting access to critical resources and minimizing potential breach vectors. Data protection strategies involve encryption protocols for information at rest and in transit, using AES-256 standards. Automated security audit processes scan automation scripts, identifying potential vulnerabilities through code analysis and runtime inspection.

Performance optimization in serverless architectures uses Lambda extensions to cache knowledge base content, reducing latency and improving response times. Retry mechanisms for API calls implement exponential backoff strategies, mitigating transient network exceptions and enhancing system resilience. Execution time monitoring of Lambda functions enables detection of anomalies through statistical analysis, providing insights into potential system-wide incidents or performance degradations.

Clean up

To avoid incurring future charges, delete the resources by deleting the parent stack on the AWS CloudFormation console.

Conclusion

This solution provides a robust framework for automated EMR cluster monitoring and incident response. By combining real-time monitoring with AI-powered remediation suggestions and automated execution, organizations can significantly reduce MTTR for common Amazon EMR issues while building a knowledge base for future incident response.

Try out this solution for your own use case, and leave your feedback in the comments section.


About the authors

Author Yu-ting Su, Sr. Hadoop System Engineer, AWS Support Engineering. Yu-Ting is a Sr. Hadoop Systems Engineer at Amazon Web Services (AWS). Her expertise is in Amazon EMR and Amazon OpenSearch Service. She’s passionate about distributing computation and helping people to bring their ideas to life.

AWS Weekly Roundup: OpenAI models, Automated Reasoning checks, Amazon EVS, and more (August 11, 2025)

Post Syndicated from Veliswa Boya original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-openai-models-automated-reasoning-checks-amazon-evs-and-more-august-11-2025/

AWS Summits in the northern hemisphere have mostly concluded but the fun and learning hasn’t yet stopped for those of us in other parts of the globe. The community, customers, partners, and colleagues enjoyed a day of learning and networking last week at the AWS Summit Mexico City and the AWS Summit Jakarta.


Last week’s launches
These are the launches from last week that caught my attention:

  • OpenAI open weight models on AWSOpenAI open weight models (gpt-oss-120b and gpt-oss-20b) are now available on AWS. These open weight models excel at coding, scientific analysis, and mathematical reasoning, with performance comparable to leading alternatives.
  • Amazon Elastic VMware Service — Amazon Elastic VMware Service (Amazon EVS), a new AWS service that lets you run VMware Cloud Foundation (VCF) environments directly within your Amazon Virtual Private Cloud (Amazon VPC), is now generally available.
  • Automated Reasoning checks — Automated Reasoning checks, a new Amazon Bedrock Guardrails policy that was previewed during AWS re:Invent, is now generally available. Automated Reasoning checks helps you validate the accuracy of content generated by foundation models (FMs) against a domain knowledge. Read more in Danilo’s post on how this can help prevent factual errors that can be caused by AI hallucinations.
  • Multi-Region application recovery service — In this post, Sébastien writes about the announcement of Amazon Application Recovery Controller (ARC) Region switch, a fully managed, highly available capability that enables organizations to plan, practice, and orchestrate Region switches with confidence, eliminating the uncertainty around cross-Region recovery operations.

Additional updates
I thought these projects, blog posts, and news items were also interesting:

Upcoming AWS events
Keep a look out and be sure to sign up for these upcoming events:

AWS re:Invent 2025 (December 1-5, 2025, Las Vegas) — AWS’s flagship annual conference offering collaborative innovation through peer-to-peer learning, expert-led discussions, and invaluable networking opportunities.

AWS Summits — Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Coming up soon are the summits at São Paulo (August 13) and Johannesburg (August 20).

AWS Community Days — Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Australia (August 15), Adria (September 5), Baltic (September 10), Aotearoa (September 18), and South Africa (September 20).

Join the AWS Builder Center to learn, build, and connect with builders in the AWS community. Browse here for upcoming in-person and virtual developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Veliswa.

Understanding and Remediating Cold Starts: An AWS Lambda Perspective

Post Syndicated from Aakash Bhattacharya original https://aws.amazon.com/blogs/compute/understanding-and-remediating-cold-starts-an-aws-lambda-perspective/

Cold starts are an important consideration when building applications on serverless platforms. In AWS Lambda, they refer to the initialization steps that occur when a function is invoked after a period of inactivity or during rapid scale-up. While typically brief and infrequent, cold starts can introduce additional latency, making it essential to understand them, especially when optimizing performance in responsive and latency-sensitive workloads.

In this article, you’ll gain a deeper understanding of what cold starts are, how they may affect your application’s performance, and how you can design your workloads to reduce or eliminate their impact. With the right strategies and tools provided by AWS, you can efficiently manage cold starts and deliver consistent, low-latency experience for your users.

What is a cold start?

Cold starts occur because serverless platforms like AWS Lambda are designed for cost-efficiency – you don’t pay for compute resources when your code isn’t running. As a result, Lambda only provisions resources when needed. A cold start happens when there isn’t an existing execution environment available and a new one must be created. This can happen, for example, when a function is invoked for the first time after a period of inactivity or during a burst in traffic that triggers scale-up.

When this occurs, Lambda rapidly provisions and initializes a new execution environment for running your function code. This initialization adds a small amount of latency to the request, but it only occurs once for the lifecycle of that execution environment.

Cold starts consist of several steps that make up the Initialization Phase, which occurs before the function begins running. These steps take place when the Lambda service creates a new execution environment, contributing to the latency commonly referred to as the INIT duration of the function, as illustrated in a following diagram:

Alternative Text: AWS Lambda cold start flowchart showing sequence from left to right: AWS Lambda Function icon connects to Function Invocation, which leads to an Initialization Phase box (highlighted in blue) containing four steps: Container Provisioning, Runtime Initialization, Function Code Loading, and Dependency Resolution. After Initialization Phase, flow continues to Function Execution (in green) and ends with Function Return. A time arrow spans the bottom of the diagram indicating temporal progression from left to right.

Figure 1: Lambda Function Execution Lifecycle: Cold Start Components

  • Container Provisioning: Lambda allocates the necessary compute resources to run the function, based on its configured memory.
  • Runtime Initialization: Lambda loads the language runtime environment (Node.js, Python, Java, etc.) into the container. You can also define a custom runtime using the Lambda Runtime Interface.
  • Function Code Loading: Lambda downloads and unpacks the function code into the container.
  • Dependency Resolution: Lambda loads required libraries and packages so the function can execute successfully.

While cold starts typically affect less than 1% of requests, they can introduce performance variability in workloads where Lambda needs to create new execution environments more frequently, such as after periods of inactivity or during rapid scaling. This variability can impact perceived response times, especially in latency-sensitive applications such as user-facing APIs.

Why do cold starts occur?

Cold starts are a natural aspect of the serverless computing model due to its core design principles:

  • Resource Efficiency: To optimize cost and resource usage, AWS Lambda automatically shuts down idle execution environments after a period of inactivity. When the function is invoked again, a new environment must be provisioned.
  • Security and Isolation: Each Lambda execution environment runs in an isolated container to ensure strong security boundaries between invocations. This container-level isolation requires a fresh initialization process, which adds startup latency.
  • Auto-Scaling: Lambda automatically creates new environments to handle increased traffic or concurrent invocations. Each new environment requires provisioning and initialization, which contributes to cold start latency.

Understanding and optimizing cold start factors

The following sections explore factors contributing to cold starts, and optimization techniques to initialize your functions faster.

Runtime selection

Lambda supports multiple programming languages through runtimes, including the ability to create custom runtimes. A runtime handles core responsibilities such as relaying invocation events, context, and responses between the Lambda service and your function code. The time it takes to initialize a runtime can vary depending on the language. Interpreted languages, such as Python and Node.js, typically initialize faster, while compiled languages like Java or .NET may take longer due to additional startup steps such as loading classes. Custom, or OS-only runtimes commonly provide fastest cold start performance as they typically run compiled binaries on the underlying Linux environment.

Runtimes are regularly maintained and updated by AWS, with newer versions typically offering improvements in performance, security, and startup latency. To take advantage of these enhancements, AWS recommends keeping your functions up to date with the latest supported runtimes.

Packaging and layers

AWS Lambda supports two packaging options for deploying your function code – ZIP archives and container images. Each approach offers unique advantages and may influence cold start latency depending on how it’s used.

For ZIP-based deployments, you can upload your function code directly (up to 50MB) or via Amazon Simple Storage Service (Amazon S3) (up to 250MB unzipped). To promote reusability, Lambda also supports Lambda layers, allowing you to share common code, libraries, or runtime dependencies across multiple functions. However, larger packages can impact cold start latency due to factors such as increased S3 download time, ZIP extraction overhead, layer mounting and initialization. The size and number of dependencies directly affects initialization time – each added dependency increases the deployment artifact size, which Lambda must download, unpack, and initialize during the INIT phase.

To optimize cold start performance, keep your deployment ZIP packages small, remove unused dependencies with techniques like tree shaking, prioritize lightweight libraries, exclude unnecessary files like tests or docs, and structure your layers efficiently.

When using container-based deployments, you push your function image to Amazon Elastic Container Registry (Amazon ECR) first. This option provides greater flexibility and control over the runtime environment, especially useful when your function code exceeds 250MB or when you require specific language version or system libraries not included in the AWS-managed runtimes. While container images allow for highly customized deployments, pulling large images from ECR might contribute to cold start latency. Similar to ZIP-based approach, make sure to keep your image sizes minimal by removing unnecessary artifacts.

Resource allocation

Memory allocation plays a key role in both the performance and cost of your Lambda functions. When you assign more memory to a function, Lambda also allocates more CPU power, which can help reduce the time it takes to initialize and run your code – often improving cold start performance.

Use the AWS Lambda Power Tuning tool to balance performance benefits with added cost of allocating more memory. This tool runs your function with different memory settings and analyzes the trade-offs between speed and cost. This makes it easier to find the most cost-effective configuration for your workload.

Network configuration

By default, your Lambda functions are connected to the public internet, however you can attach them to your own Amazon Virtual Private Cloud (Amazon VPC) instead, for example when your functions need to access VPC-hosted resources such as databases. When this happens, the Lambda service creates an Elastic Network Interface (ENI) to attach your functions to. This process involves multiple steps, such as creation of network interfaces, subnets, security groups, route table and so on. While Lambda service tries to minimize added latency, applying this configuration might introduce additional latency, therefore you should only use it when access to VPC resources is necessary.

Design considerations

Optimizing your function initialization code can help to reduce cold start latencies. Streamline your function code to load and prepare quickly, alongside its runtime environment and dependencies. Employ lightweight libraries and implement lazy loading for resources to further cut initialization time. Minimize code size by eliminating unnecessary dependencies. Consider your architecture carefully: break down large functions into smaller, more focused units based on invocation patterns. This approach allows for quicker initialization of individual components. These smaller, task-specific functions offer the added benefits of improved modularity, easier testing, and simpler maintenance. However, always strike a balance between function size and functionality to maintain overall system efficiency. By implementing these optimization strategies, you can substantially mitigate cold start impacts while preserving your application’s core functionality and performance.

Provisioned Concurrency

Provisioned Concurrency addresses cold starts by pre-initializing function environments and keeping them “warm”, always ready to respond to incoming function invocations. By maintaining pre-initialized execution environments, Provisioned Concurrency delivers consistent performance for frequently invoked functions while eliminating throttling during peak loads. Provisioned Concurrency results in predictable performance for a function by providing consistent latency at some cost for reserved instances. Provisioned Concurrency is beneficial for high-traffic applications that requires consistent performance during heavy traffic and latency sensitive applications that requires fast responses for an interactive application, thereby reducing cold starts benefitting overall performance. The customer success story from Smartsheet demonstrates significant improvement in user experience with reduced latencies and better cost efficiency.

A side-by-side sequence diagram comparing two AWS Lambda execution patterns. Both sides show interactions between three components: Client, AWS Lambda Function, and Execution Environment. The left diagram labeled "Without Provisioned Concurrency" shows a longer sequence including create, ready, invoke, and response steps with an INIT phase. The right diagram labeled "With Provisioned Concurrency" shows a simplified sequence with fewer steps, skipping the initialization overhead. Both diagrams use standard sequence diagram notation with dashed vertical lifelines and horizontal arrows indicating message flow.

Figure 2: AWS Lambda execution flow comparison: standard vs. Provisioned Concurrency

SnapStart

Lambda SnapStart improves cold invoke latency by reducing the time it takes for a function to initialize and become ready to handle incoming requests. When SnapStart is enabled for a function, Lambda creates an encrypted snapshot of the initialized execution environment when you publish a new function version. This triggers an optimized INIT phase of the function where an immutable, encrypted snapshot of the memory and disk is taken. This snapshot is cached for reuse later. When a SnapStart-enabled function is invoked again, Lambda restores the execution environment from the cached snapshot instead of creating a new environment, thus moderating a cold invoke. SnapStart minimizes the invocation latency of a function, since creating a new execution environment no longer requires a dedicated INIT phase.

SnapStart is an efficient cold start solution, currently available for Java, Python, and .NET functions. It is particularly useful for functions with long initialization times. Inactive snapshots are automatically removed after 14 days without invocation. For detailed pricing information, check out our pricing page.

AWS Lambda SnapStart architecture diagram showing two main workflow paths. Top path: Lambda function (New Version Published) triggers INIT Phase (Optimized), which creates an Encrypted Snapshot (Memory + Disk State). Bottom path: Function Invocation connects to Cache, which restores the execution environment, leading to Function Execution and Function Return. A dashed line connects the Encrypted Snapshot to Cache, indicating snapshot caching. AWS Lambda icons are shown in orange, with process steps in different colored boxes: blue for processing phases, cream for cache, pink for snapshot, green for restored environment, and light blue for execution states. All components are connected by arrows showing the data flow direction.

Figure 3: Lambda SnapStart architecture: optimizing cold starts through snapshot-based initialization

Observability

Use out-of-the-box observability facilities provided by AWS Lambda to investigate whether your functions or user experience are affected by cold starts and identify most impactful optimization areas. Monitoring Lambda cold start performance using built-in metrics such as INIT duration, invocation duration, and error rates is crucial for identifying bottlenecks and refining the function for optimal performance and cost-effectiveness. Use the following metrics:

  • INIT duration: The INIT duration metric, found in the REPORT section of function logs, measures the time taken for the function to initialize and become ready to handle invocation.
  • REPORT Message: Lambda reports total invocation time, such as initialization, in the REPORT log message at the end of each invocation. Monitoring this metric helps identify potential bottlenecks within the function code.
  • Error Rates: Monitoring error rates helps identify issues within the function, thus guaranteeing reliability and stability.
  • Concurrency Metrics: Concurrency metrics help understand if a function is hitting the concurrency limits that can contribute to potential increases in cold start durations and throttling.

Conclusion

In this post you’ve learned a detailed breakdown and insights about various aspects of Lambda cold starts, offering a comprehensive understanding of the challenges and solutions in this space. While cold starts commonly affect less than 1% of requests, understanding their nature and implementing appropriate remediation strategies early can help to minimizing their impact in the most latency-sensitive applications.

Near real-time streaming analytics on protobuf with Amazon Redshift

Post Syndicated from Konstantinos Tzouvanas original https://aws.amazon.com/blogs/big-data/near-real-time-streaming-analytics-on-protobuf-with-amazon-redshift/

Organizations must often deal with a vast array of data formats and sources in their data analytics workloads. This range of data types, such as structured relational data, semi-structured formats like JSON and XML and even binary formats like Protobuf and Avro, has presented new challenges for companies looking to extract valuable insights.

Protocol Buffers (protobuf) has gained significant traction in industries that require efficient data serialization and transmission, particularly in streaming data scenarios. Protobuf’s compact binary representation, language-agnostic nature, and strong typing make it an attractive choice for companies in sectors such as finance, gaming, telecommunications, and ecommerce, where high-throughput and low-latency data processing is crucial.

Although protobuf offers advantages in efficient data serialization and transmission, its binary nature poses challenges when it comes to analytics use cases. Unlike formats like JSON or XML, which can be directly queried and analyzed, protobuf data requires an additional deserialization step to convert it from its compact binary format into a structure suitable for processing and analysis. This extra conversion step introduces complexity into data analytics pipelines and tools. It can potentially slow down data exploration and analysis, especially in scenarios where near real-time insights are crucial.

In this post, we explore an end-to-end analytics workload for streaming protobuf data, by showcasing how to handle these data streams with Amazon Redshift Streaming Ingestion, deserializing and processing them using AWS Lambda functions, so that the incoming streams are immediately available for querying and analytical processing on Amazon Redshift.

The solution provides a solid foundation for handling protobuf data in Amazon Redshift. You can further enhance the architecture to support schema evolution by incorporating AWS Glue Schema Registry. By integrating the AWS Glue Schema Registry, you can make sure your Lambda function uses the latest schema version for deserialization, even as your data structure changes over time. However, for the purpose of this post and to maintain simplicity, we focus on demonstrating how to invoke Lambda from Amazon Redshift to convert protobuf messages to JSON format, which serves as a solid foundation for handling binary data in near real-time analytics scenarios.

Solution overview

The following architecture diagram describes the AWS services and features needed to set up a fully functional protobuf streaming ingestion pipeline for near real-time analytics.

Protobuf deserialization flow

The workflow consists of the following steps:

  1. An Amazon Elastic Compute Cloud (Amazon EC2) event producer generates events and forwards them to a message queue. The events are created and serialized using protobuf.
  2. A message queue using Amazon Managed Streaming for Apache Kafka (Amazon MSK) or Amazon Kinesis accepts the protobuf messages sent by the event producer. For this post, we use Amazon MSK Serverless.
  3. A Redshift cluster (provisioned or serverless), in which a materialized view with an external schema is configured, points to the message queue. For this post, we use Amazon Redshift Serverless.
  4. A Lambda protobuf deserialization function is triggered by Amazon Redshift during ingestion and deserializes protobuf data into JSON data.

Schema

To showcase protobuf’s deserialization functionality, we use a sample protobuf schema that represents a financial trade transaction. This schema will be used across the AWS services mentioned in this post.

// trade.proto 
syntax = "proto3"; 
message Trade{
   int32 userId = 1;   
   string userName = 2;   
   int32 volume = 3;   
   int32 pair = 4;   
   int32 action = 5;
   string TimeStamp = 6;
}

Amazon Redshift materialized view

In order for Amazon Redshift to ingest streaming data from Amazon MSK or Kinesis, an appropriate role needs to be assigned to Amazon Redshift and a materialized view needs to be properly defined. For detailed instructions on how to accomplish this, refer to Streaming ingestion to a materialized view or Simplify data streaming ingestion for analytics using Amazon MSK and Amazon Redshift.

In this section, we focus on the materialized view definition that makes it possible to deserialize protobuf data. Our example focuses on streaming ingestion from Amazon MSK. Typically, the materialized view ingests the Kafka metadata fields and the actual data (kafka_value) like in the following example:

CREATE MATERIALIZED VIEW trade_events AUTO REFRESH YES AS 
SELECT
     kafka_partition,     
     kafka_offset,
     kafka_timestamp_type,
     kafka_timestamp,
     kafka_key,
     JSON_PARSE(kafka_value) as Data,
     kafka_headers 
FROM     
     "dev"."msk_external_schema"."entity" 
WHERE     
     CAN_JSON_PARSE(kafka_value)

When the incoming kafka_value is of type JSON, you can apply the built-in JSON_PARSE function and create a column of type SUPER so you can directly query the data.

Amazon Redshift Lambda user-defined function

In our case, accepting protobuf encoded data requires some additional steps. The first step is to create an Amazon Redshift Lambda user-defined function (UDF). This Amazon Redshift function is the link to a Lambda function that executes the actual deserialization. This way, when data is ingested, Amazon Redshift calls the Lambda function for deserialization.

Creating or updating our Amazon Redshift Lambda UDF is straightforward, as illustrated in the following code. Additional examples are available in the GitHub repo.

CREATE OR REPLACE EXTERNAL FUNCTION f_deserialize_protobuf(VARCHAR(MAX)) 
RETURNS VARCHAR(MAX) IMMUTABLE 
LAMBDA 'f-redshift-deserialize-protobuf' IAM_ROLE ':RedshiftRole';

Because Lambda functions don’t (at the time of writing) accept binary data as input, you must first convert incoming binary data to its hex representation, prior to calling the function. You can do this by using the TO_HEX Amazon Redshift function.
Considering the hex conversation and with the Lambda UDF available, you can now use it in your materialized view definition:

CREATE MATERIALIZED VIEW trade_events AUTO REFRESH YES AS 
SELECT      
       kafka_partition,
       kafka_offset,
       kafka_timestamp_type,
       kafka_timestamp,
       kafka_key,
       kafka_value,
       kafka_headers,
       JSON_PARSE(f_deserialize_protobuf(to_hex(kafka_value)))::super as json_data 
FROM      
       "dev"."msk_external_schema"."entity";

Lambda layer

Lambda functions require access to appropriate protobuf libraries, so that deserialization can take place. You can implement this through a Lambda layer. The layer is provided as a zip file, respecting the following folder structure, and contains the protobuf library, its dependencies, and user-provided code inside the custom folder, which includes the protobuf generated classes:

python
      custom
      google
      Protobuf-4.25.2.dist-info

Because we implemented the Lambda functions in Python, the root folder of the zip file is the python folder. For additional languages, refer to the documentation on how to properly structure your folder structure.

Lambda function

A Lambda function converts incoming protobuf records to JSON records. As a first step, you must import your custom classes from the lambda Layer custom folder:

# Import generated protobuf classes
 from custom import trade_pb2

You can now deserialize incoming hex encoded binary data to objects. This is implemented in a two-step process. The first step is to decode the hex encoded binary data:

 # convert incoming hex data to binary  
binary_data = bytes.fromhex(record)

Next, you instantiate the protobuf defined classes and execute the actual deserialization process using the protobuf library method ParseFromString:

 # Instantiate class  
trade_event = trade_pb2.Trade()
               
# Deserialize into class  
trade_event.ParseFromString(binary_data)

After you run deserialization and instantiate your objects, you can convert to other formats. In our case, we serialize into JSON format, so that Amazon Redshift ingests the JSON content in a single field of type SUPER:

# Serialize into json  
elems = trade_event.ListFields() 
fields = {} 
for elem in elems:     
       fields[elem[0].name] = elem[1] 
json_elem = json.dumps(fields)

Combining these steps together, the Lambda function should look as follows:

import json

# Import the generated protobuf classes
from custom import trade_pb2  

def lambda_handler(event, context):
    
    results = []
    
    recordSets = event['arguments']
    for recordSet in recordSets:
        for record in recordSet:

            # convert incoming hex data to binary data
            binary_data = bytes.fromhex(record)
            
            # Instantiate class
            trade_event = trade_pb2.Trade()
            
            # Deserialize into class
            trade_event.ParseFromString(binary_data)
            
            # Serialize into json 
            elems = trade_event.ListFields()
            fields = {}
            for elem in elems:
                fields[elem[0].name] = elem[1]
            json_elem = json.dumps(fields)

            # Append to results            
            results.append(json_elem)
    
    print('OK')
    
    return json.dumps({"success": True,"num_records": len(results),"results": results})

Batch mode

In the preceding code sample, Amazon Redshift is calling our function in batch mode, meaning that a number of records are sent during a single Lambda function call. More specifically, Amazon Redshift is batching records into the arguments property of the request. Therefore, you must loop through the incoming array of data and apply your deserialization logic per record. At the time of writing, this behavior is internal to Amazon Redshift and can’t be configured or controlled through a configuration option. An Amazon Redshift streaming consumer client will read new records on the message queue since the last time it read. The following is a sample of the payload the Lambda handler function receives:

     "user": "IAMR:Admin",
     "cluster": "arn:aws:redshift:*********************************",
     "database": "dev",
     "external_function": "fn_lambda_protobuf_to_json",
     "query_id": 5583858,
     "request_id": "17955ee8-4637-42e6-897c-5f4881db1df5",     
     "arguments": [         
         [             
"088a1112087374723a3231383618c806200128093217323032342d30332d32302031303a34363a33382e363932"         ],         [             "08a74312087374723a3836313518f83c200728093217323032342d30332d32302031303a34363a33382e393031"         ],         [             "08b01e12087374723a3338383818f73d20f8ffffffffffffffff0128053217323032342d30332d32302031303a34363a33392e303134"         
]     
],     
      "num_records":3 
}

Insights from ingested data

With your data stored in Amazon Redshift after the deserialization process, you can now execute queries against your streaming data and directly gain insights. In this section, we present some sample queries to illustrate functionality and behavior.

Examine lag query

To examine the difference between the most recent timestamp value of our streaming source vs. the current date/time (wall clock), we calculate the most recent point in time at which we ingested data. Because streaming data is expected to flow into the system continuously, this metric also reveals the ingestion lag between our streaming source and Amazon Redshift.

select top 1      
      (GETDATE() - kafka_timestamp) as ingestion_lag 
from     
      trade_events 
order by
      kafka_timestamp desc

Examine content query: Fraud detection on an incoming stream

By applying the query functionality available in Amazon Redshift, we can discover behavior hidden in our data in real time. With the following query, we try to match opposite trade volumes played by different users during the last 5 minutes that result in a zero sum game and could support a potential fraud detection concept:

select  
json_data.volume, 
LISTAGG(json_data.userid::int, ', ') as users, 
LISTAGG(json_data.pair::int, ', ') as pairs 
from     
        trade_events 
where      
        trade_events.kafka_timestamp >= DATEADD(minute, -5, GETDATE()) 
group by      
        json_data.volume 
having      
        sum(json_data.pair) = 0  
and min(abs(json_data.pair)) = max(abs(json_data.pair)) 
and count(json_data.pair) > 1

This query is a rudimentary example of how we can use live data to protect systems from fraudsters.

For a more comprehensive example, see Near-real-time fraud detection using Amazon Redshift Streaming Ingestion with Amazon Kinesis Data Streams and Amazon Redshift ML. In this use case, an Amazon Redshift ML model for anomaly detection is trained using the incoming Amazon Kinesis Data Streams data that is streamed into Amazon Redshift. After sufficient training (for example, 90% accuracy for the model is achieved), the trained model is put into inference mode for inferencing decisions on the same incoming credit card data.

Examine content query: Join with non-streaming data

Having our protobuf records streaming in Amazon Redshift makes it possible to join streaming with non-streaming data. A typical example is combining incoming trades with user information data already recorded in the system. In the following query, we join the incoming stream of trades with user information, like email, to get a list of possible alerts targets:

select   
    user_info.email 
from     
    trade_events
inner join    
    user_info 
on user_info.userId = trade_events.json_data.userid 
where     
    trade_events.json_data.volume > 1000 
and trade_events.kafka_timestamp >= DATEADD(minute, -5, GETDATE())

Conclusion

The ability to effectively analyze and derive insights from data streams, regardless of their format, is crucial for data analytics. Although protobuf offers compelling advantages for efficient data serialization and transmission, its binary nature can pose challenges and perhaps impact performance when it comes to analytics workloads. The solution outlined in this post provides a robust and scalable framework for organizations seeking to gain valuable insights, detect anomalies, and make data-driven decisions with agility, even in scenarios where high-throughput and low-latency processing is crucial. By using Amazon Redshift Streaming Ingestion in conjunction with Lambda functions, organizations can seamlessly ingest, deserialize, and query protobuf data streams, enabling near real-time analysis and insights.

For more information about Amazon Redshift Streaming Ingestion, refer to Streaming ingestion to a materialized view.


About the authors

Konstantinos Tzouvanas is a Senior Enterprise Architect on AWS, specializing in data science and AI/ML. He has extensive experience in optimizing real-time decision-making in High-Frequency Trading (HFT) and applying machine learning to genomics research. Known for leveraging generative AI and advanced analytics, he delivers practical, impactful solutions across industries.

Marios Parthenios is a Senior Solutions Architect working with Small and Medium Businesses across Central and Eastern Europe. He empowers organizations to build and scale their cloud solutions with a particular focus on Data Analytics and Generative AI workloads. He enables businesses to harness the power of data and artificial intelligence to drive innovation and digital transformation.

Pavlos Kaimakis is a Senior Solutions Architect at AWS who helps customers design and implement business-critical solutions. With extensive experience in product development and customer support, he focuses on delivering scalable architectures that drive business value. Outside of work, Pavlos is an avid traveler who enjoys exploring new destinations and cultures.

John Mousa is a Senior Solutions Architect at AWS. He helps power and utilities and healthcare and life sciences customers as part of the regulated industries team in Germany. John has interest in the areas of service integration, microservices architectures, as well as analytics and data lakes. Outside of work, he loves to spend time with his family and play video games.

AWS Weekly Roundup: Amazon DocumentDB, AWS Lambda, Amazon EC2, and more (August 4, 2025)

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-documentdb-aws-lambda-amazon-ec2-and-more-august-4-2025/

This week brings an array of innovations spanning from generative AI capabilities to enhancements of foundational services. Whether you’re building AI-powered applications, managing databases, or optimizing your cloud infrastructure, these updates help build more advanced, robust, and flexible applications.

Last week’s launches
Here are the launches that got my attention this week:

Additional updates
Here are some additional projects, blog posts, and news items that I found interesting:

Upcoming AWS events
Check your calendars so that you can sign up for these upcoming events:

AWS re:Invent 2025 (December 1-5, 2025, Las Vegas) — AWS’s flagship annual conference offering collaborative innovation through peer-to-peer learning, expert-led discussions, and invaluable networking opportunities.

AWS Summits — Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Mexico City (August 6) and Jakarta (August 7).

AWS Community Days — Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Australia (August 15), Adria (September 5), Baltic (September 10), and Aotearoa (September 18).

Join the AWS Builder Center to learn, build, and connect with builders in the AWS community. Browse here upcoming in-person and virtual developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Danilo

Introducing v2 of Powertools for AWS Lambda (Java)

Post Syndicated from Philipp Page original https://aws.amazon.com/blogs/compute/introducing-v2-of-powertools-for-aws-lambda-java/

Modern applications increasingly rely on Serverless technologies such as Amazon Web Services (AWS) Lambda to provide scalability, cost efficiency, and agility. The Serverless Applications Lens for the AWS Well-Architected Framework focuses on how to design, deploy, and architect your Serverless applications to overcome some of these challenges.

Powertools for AWS Lambda is a developer toolkit that helps you implement Serverless best practices and directly translates AWS Well-Architected recommendations into actionable, developer friendly utilities. Following the community’s continued successful adoption of Powertools for AWS in Python, Java, TypeScript, and .NET, this post announces the general availability of Powertools for AWS Lambda (Java) v2 coming with major performance improvements, enhanced core utilities, and a brand-new Kafka utility.

Powertools for AWS (Java) v2 provides three updated core utilities:

  • Logging: A re-designed Java idiomatic logging module providing structured logging that streamlines log aggregation and analysis.
  • Metrics: An improved metrics experience allowing custom metrics collection using CloudWatch Embedded Metric Format (EMF).
  • Tracing: An annotation-based way to collect distributed tracing data with AWS X-Ray to visualize and analyze request flows.

Along with the updated core utilities, v2 of the developer toolkit adds two brand new features:

  • GraalVM native image support: Native image support for GraalVM across all core utilities reducing Lambda cold start times up to 75.61% (p95).
  • Kafka utility: This new utility integrates with Amazon Managed Streaming for Apache Kafka (Amazon MSK) and self-managed Kafka event sources on Lambda and allows developers to deserialize directly into Kafka native types such as ConsumerRecords.

Learn more about how to migrate to v2 in our upgrade guide.

Getting started using Powertools for AWS Lambda (Java) v2

Powertools for AWS Lambda (Java) v2 is readily accessible as a Java package on Maven Central and integrates with popular build tools such as Maven and Gradle. This post focuses on Maven-based implementation samples to help you get started quickly. Gradle examples are available for all utilities in the documentation and the examples repository.

The toolkit is compatible with Java 11 and newer versions, making sure you can use modern Java features while building Serverless applications. Examples on how to install each utility are outlined in each section of the post and complete configuration examples are also available in the Powertools documentation.

Logging

The Logging utility helps implement structured logging when running on Lambda while still using familiar Java logging libraries such as slf4j, log4j, and logback. v2 of Logging allows you to do the following:

  • Output structured JSON logs enriched with Lambda context
  • Choose the logging backend of your choice among log4j2 and logback
  • Add structured arguments to logs that get serialized into arbitrarily nested JSON objects
  • Add global log keys using the slf4j default Mapped Diagnostic Context (MDC)

To add the logging utility to your project, include it as a dependency in your Java Maven project. The following example shows how to add the log4j2 logging backend to your application:

<!-- In the dependencies section -->
<dependency>
    <groupId>software.amazon.lambda</groupId>
    <artifactId>powertools-logging-log4j</artifactId>
    <!-- Alternatively, if you wish to use the logback backend
    <artifactId>powertools-logging-logback</artifactId> 
    -->
    <version>2.1.1</version>
</dependency>
<!-- In the build plugins section -->
<plugin>
    <groupId>dev.aspectj</groupId>
    <artifactId>aspectj-maven-plugin</artifactId>
    <configuration>
        <aspectLibraries>
            <aspectLibrary>
                <groupId>software.amazon.lambda</groupId>
                <artifactId>powertools-logging</artifactId>
                <version>2.1.1</version>
            </aspectLibrary>
        </aspectLibraries>
    </configuration>
</plugin>

Create a custom JsonTemplateLayout appender in your log4j2.xml file:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration>
    <Appenders>
        <Console name="JsonAppender" target="SYSTEM_OUT">
            <JsonTemplateLayout eventTemplateUri="classpath:LambdaJsonLayout.json" />
        </Console>
    </Appenders>
    <Loggers>
        <Logger name="JsonLogger" level="INFO" additivity="false">
            <AppenderRef ref="JsonAppender"/>
        </Logger>
        <Root level="info">
            <AppenderRef ref="JsonAppender"/>
        </Root>
    </Loggers>
</Configuration>

To add structured logging to your functions, apply the @Logging annotation to your Lambda handler and use the familiar slf4j Java API when writing log statements. This allows you to adopt the logging utility without major code refactoring. Powertools handles routing to the correct logging backend for you. The following example shows how to add global log keys using MDC, and add a structured entry argument to your log message:

public class App implements RequestHandler<SQSEvent, String> {
    private static final Logger log = LoggerFactory.getLogger(App.class);

    @Logging
    public String handleRequest(final SQSEvent input, final Context context) {
        // Add a global log key using Mapped Diagnostic Context MDC
        MDC.put("myCustomKey", "willBeLoggedForAllLogStatements");

        // Log a message with a structured argument (any JSON serializable Object)
        log.info("My message", entry("anotherCustomKey", Map.of("nested", "object")));

        // ... return response
    }
}

Lambda sends the following JSON-formatted output to Amazon CloudWatch Logs (note how the Java Map gets auto-serialized into a JSON object):

{
  "level": "INFO",
  "message": "My message",
  "cold_start": true,
  "function_arn": "arn:aws:lambda:us-east-1:012345678912:function:AppFunction",
  "function_memory_size": 512,
  "function_name": "AppFunction",
  "function_request_id": "0150a2a4-c5aa-4277-9345-17bad039f6c0",
  "function_version": "$LATEST",
  "sampling_rate": 0.1,
  "service": "powertools-java-sample",
  "timestamp": "2025-05-20T08:35:28.565Z",
  "myCustomKey": "willBeLoggedForAllLogStatements",
  "anotherCustomKey": {
    "nested": "object"
  }
}

Metrics

CloudWatch offers essential built-in service metrics for monitoring application throughput, error rates, and resource usage. Users also need to capture workload specific custom metrics relevant to their business use-case following AWS Well-Architected best-practices.

Powertools for AWS (Java) enables you to create custom metrics asynchronously by outputting metrics in CloudWatch EMF directly to standard output—an approach that needs no other configuration. The Lambda service sends the EMF formatted metrics to CloudWatch on your behalf.

The Metrics utility allows you to:

  • Create custom metrics asynchronously using CloudWatch EMF
  • Reduce latency by avoiding synchronous metric publishing
  • Automatically track cold starts in a custom CloudWatch metric
  • Avoid manually validating your output against the EMF specification
  • Keep you code clean by avoiding manual flushing to standard output

To add the Metrics utility to your project, add the following Maven dependency:

<!-- In the dependencies section -->
<dependency>
    <groupId>software.amazon.lambda</groupId>
    <artifactId>powertools-metrics</artifactId>
    <version>2.1.1</version>
</dependency>
<!-- In the build plugins section -->
<plugin>
    <groupId>dev.aspectj</groupId>
    <artifactId>aspectj-maven-plugin</artifactId>
    <configuration>
        <aspectLibraries>
            <aspectLibrary>
                <groupId>software.amazon.lambda</groupId>
                <artifactId>powertools-metrics</artifactId>
                <version>2.1.1</version>
            </aspectLibrary>
        </aspectLibraries>
    </configuration>
</plugin>

To add custom metrics to your Lambda function, place the @FlushMetrics annotation on your Lambda handler. The library takes care of validating and flushing your metrics to standard output before the Lambda function terminates. The following example shows how you can automatically capture a cold start metric and emit your own custom metrics:

public class App implements RequestHandler<SQSEvent, String> {
    private static final Logger log = LoggerFactory.getLogger(App.class);
    private static final Metrics metrics = MetricsFactory.getMetricsInstance();

    // This configures a default namespace and service dimension for all metrics
    @FlushMetrics(namespace = "ServerlessAirline", service = "payment", captureColdStart = true)
    public String handleRequest(final SQSEvent input, final Context context) {
        // The Metrics instance is a singleton
        metrics.addMetric("CustomMetric1", 1, MetricUnit.COUNT);

        // Publish metrics with non-default configuration options
        DimensionSet dimensionSet = new DimensionSet();
        dimensionSet.addDimension("Service", "AnotherService");
        metrics.flushSingleMetric("CustomMetric2", 1, MetricUnit.COUNT, "AnotherNamespace", dimensionSet);

        // ... return response
    }
}
AWS CloudWatch Metrics Graph View of metrics generated by Metrics utility example.

Figure 1. AWS CloudWatch Metrics Graph View

Tracing

The Tracing utility provides an annotation-based integration with X-Ray for distributed tracing with minimal configuration. Tracing allows you to:

  • Gain visibility into your own methods calls and AWS service interactions visualized in the X-Ray console
  • Automatically capture method responses and errors
  • Automatically capture Lambda cold start information as part of your traces
  • Add custom metadata to traces for more context and debugging information
  • Enable or disable tracing features through environment variables without code changes

To add the Tracing utility to your project, add the following Maven dependency:

<!-- In the dependencies section -->
<dependency>
    <groupId>software.amazon.lambda</groupId>
    <artifactId>powertools-tracing</artifactId>
    <version>2.1.1</version>
</dependency>
<!-- In the build plugins section -->
<plugin>
    <groupId>dev.aspectj</groupId>
    <artifactId>aspectj-maven-plugin</artifactId>
    <configuration>
        <aspectLibraries>
            <aspectLibrary>
                <groupId>software.amazon.lambda</groupId>
                <artifactId>powertools-tracing</artifactId>
                <version>2.1.1</version>
            </aspectLibrary>
        </aspectLibraries>
    </configuration>
</plugin>

To enable tracing in your Lambda function, annotate your Lambda handler and your custom methods that you want to trace with the @Tracing annotation. Each annotation maps to a sub-segment of your main Lambda handler in X-Ray and becomes visible in the console.

public class App implements RequestHandler<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent> {
    private static final Logger log = LoggerFactory.getLogger(App.class);

    @Tracing
    public APIGatewayProxyResponseEvent handleRequest(final APIGatewayProxyRequestEvent input, final Context context) {
        // ... business logic
        
        // Get calling IP with tracing
        String location = getCallingIp("https://checkip.amazonaws.com");

        // ... return response
    }

    @Tracing(segmentName = "Location service")
    private String getCallingIp(String address) {
        // Implementation to get IP address
        log.info("Retrieving caller IP address");
        
        // Add custom metadata to current sub-segment
        URL url = new URL(address);
        putMetadata("getCallingIp", address);
        
        // ...
        return "127.0.0.1";
    }
}

The X-Ray console displays a generated service map when traffic begins flowing through your application. Applying the Tracing annotation to your Lambda function handler method or any other methods in the execution chain provides you with comprehensive visibility into the traffic patterns throughout your application. The following figure shows how the custom metadata added in the example is associated with the custom sub-segment.

Picture showing the generated traces in the AWS X-Ray console. Shows the custom named Location service trace along with its metadata as a JSON object.

Figure 2. AWS X-Ray waterfall trace view

Reducing Lambda cold start duration

A key feature in Powertools for AWS Lambda (Java) v2 is GraalVM native image support for all core utilities. Compiling your Lambda functions to native executables allows you to significantly reduce cold start times and memory usage. Using Powertools v2 with GraalVM allows you to reduce cold starts up to 75.61% (p95) compared to using the managed Java runtime. The following benchmark compares the cold start times of an application using all core utilities (logging, metrics, tracing) on the managed java21 runtime as compared to the Lambda provided.al2023 runtime running a GraalVM compiled native image (go to the supported Lambda runtimes):

Environment p95 (ms) Min (ms) Avg (ms) Max (ms) Max Memory (MB) N
Powertools for AWS (Java) v2: JVM 1682.92 1224.55 1224.55 2229.81 205.04 234
Powertools for AWS (Java) v2: GraalVM 542.86 404.92 504.77 752.85 93.46 369

This improvement is particularly valuable for latency-sensitive applications and functions that scale frequently. Check out a full working example on GitHub.

Lambda MSK Event Source Mapping Integration

The new Kafka utility introduced with Powertools for AWS Lambda (Java) v2 streamlines working with the Lambda MSK Event Source Mapping (ESM) and self-managed Kafka event sources. It provides a familiar experience for developers working with Apache Kafka by allowing direct conversion from Lambda events to Kafka’s native types. The key features include:

  • Direct deserialization into Kafka ConsumerRecords<K, V> objects while using the Lambda-native RequestHandler interface
  • Support for deserializing JSON, Avro, and Protobuf encoded records for key and value fields with and without usage of a Schema Registry when producing the messages

To add the Kafka utility to your project, include the powertools-kafka library as a Maven dependency in your pom.xml:

<!-- In the dependencies section -->
<dependency>
    <groupId>software.amazon.lambda</groupId>
    <artifactId>powertools-kafka</artifactId>
    <version>2.1.1</version>
</dependency>
<!-- Kafka clients dependency - compatibility works for >= 3.0.0 -->
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-clients</artifactId>
    <version>4.0.0</version>
</dependency>

Use the @Deserialization annotation on your Lambda handler to deserialize messages as native Kafka ConsumerRecords. Make sure to specify the deserializer type. The following example shows how to deserialize Avro encoded record values with String keys. As in a regular Lambda handler, declare the input type to your function in the RequestHandler generic parameters and the utility discovers the deserialization types automatically. The AvroProduct class in the following example is an auto-generated Java class using the Java org.apache.avro.avro library.

public class App implements RequestHandler<ConsumerRecords<String, AvroProduct>, Void> {
    private static final Logger log = LoggerFactory.getLogger(App.class);

    @Deserialization(type = DeserializationType.KAFKA_AVRO)
    public Void handleRequest(ConsumerRecords<String, AvroProduct> consumerRecords, Context context) {
        log.info("Deserialized {} records.", consumerRecords.records().size()); 

        // ... Business logic 
        
        return null;
    }
}

Conclusion

Powertools for AWS Lambda (Java) v2 represents the next evolution in the toolkit for building robust, observable, and high-performing Serverless applications. Throughout this post, we’ve explored the enhanced core observability utilities with their new features, the performance gains through GraalVM native image support, and the new Kafka utility that supports using familiar Kafka patterns when working on Lambda.

Powertools also offers more utilities to handle common Serverless design patterns. Each utility is designed with the same principles of clarity and minimal overhead.To learn more:

  1. Visit the documentation for detailed guides and examples
  2. Try the sample applications
  3. Join the community on GitHub to share your experience and get help

Your next Serverless application awaits with Powertools for AWS Lambda (Java) v2. We would love to hear your feedback!

Streamlining AWS Serverless workflows: From AWS Lambda orchestration to AWS Step Functions

Post Syndicated from Diego Casas original https://aws.amazon.com/blogs/compute/streamlining-aws-serverless-workflows-from-aws-lambda-orchestration-to-aws-step-functions/

This blog post discusses the AWS Lambda as orchestrator anti-pattern and how to redesign serverless solutions using AWS Step Functions with native integrations.

Step Functions is a serverless workflow service that you can use to build distributed applications, automate processes, orchestrate microservices, and create data and machine learning (ML) pipelines. Step Functions provides native integrations with over 200 AWS services in addition to external third-party APIs. You can use these integrations to deploy production-ready solutions with less effort, reducing code complexity, improving long-term maintainability, and minimizing technical debt when operating at scale.

The Lambda as orchestrator anti-pattern

Let’s examine a common anti-pattern: using a Lambda function as an orchestrator for message distribution across multiple channels. Consider this real-world scenario where a system needs to send notifications through SMS or email channels based on user preferences, as shown in the following diagram.

The payload examples for this scenario are:

  1. Send SMS only:
    {
        "body": {
            "channel": "sms",
            "message": "Hello from AWS Lambda!",
            "phoneNumber": "+1234567890",
            "metadata": {
                "priority": "high",
                "category": "notification"
            }
        }
    }

  2. Send email only:
    {
        "body": {
            "channel": "email",
            "message": "Hello from AWS Lambda!",
            "email": {
                "to": "[email protected]",
                "subject": "Test Notification",
                "from": "[email protected]"
            },
            "metadata": {
                "priority": "normal",
                "category": "notification"
            }
        }
    }

  3. Send both SMS and email:
    {
        "body": {
            "channel": "both",
            "message": "Hello from AWS Lambda!",
            "phoneNumber": "+1234567890",
            "email": {
                "to": "[email protected]",
                "subject": "Test Notification",
                "from": "[email protected]"
            },
            "metadata": {
                "priority": "high",
                "category": "notification"
            }
        }
    }

Here’s how it typically starts—with a Lambda function acting as an orchestrator:

import boto3
import json
# Initialize Lambda client
# You can specify region if needed: boto3.client('lambda', region_name='us-east-1')
lambda_client = boto3.client('lambda')
def lambda_handler(event, context):
    try:
        # Parse the incoming event
        body = json.loads(event['body'])
        
        # Validate required fields
        if 'channel' not in body:
            return {
                'statusCode': 400,
                'body': json.dumps('Missing channel parameter')
            }
        
        if 'message' not in body:
            return {
                'statusCode': 400,
                'body': json.dumps('Missing message content')
            }
        
        if body['channel'] == 'both':
            # Invoke SMS Lambda function
            lambda_client.invoke(
                FunctionName='send-sns',
                InvocationType='Event',
                Payload=json.dumps(body)
            )
            
            # Invoke Email Lambda function
            lambda_client.invoke(
                FunctionName='send-email',
                InvocationType='Event',
                Payload=json.dumps(body)
            )
        else:
            # Validate channel value
            if body['channel'] not in ['sms', 'email']:
                return {
                    'statusCode': 400,
                    'body': json.dumps('Invalid channel specified')
                }
            
            # Invoke function based on specified channel
            function_name = 'send-sns' if body['channel'] == 'sms' else 'send-email'
            lambda_client.invoke(
                FunctionName=function_name,
                InvocationType='Event',
                Payload=json.dumps(body)
            )
        
        return {
            'statusCode': 200,
            'body': json.dumps('Messages sent successfully')
        }
        
    except json.JSONDecodeError:
        return {
            'statusCode': 400,
            'body': json.dumps('Invalid JSON in request body')
        }
    except Exception as e:
        return {
            'statusCode': 500,
            'body': json.dumps(f'Error: {str(e)}')
        }

This approach has the following problems:

  • Complex error handling: The orchestrator needs to manage errors from multiple function invocations.
  • Tight coupling: Functions are directly dependent on each other.
  • Limited execution time: The orchestrator Lambda function continues running while sub Lambda functions execute. This could lead to the orchestrator Lambda function timing out.
  • Idle resources: Because the orchestrator Lambda function is sitting idle waiting for returns from other Lambda functions, in this case, the user is now paying for idle resources.

Rearchitecting with Step Functions

You can rebuild the logic using Step Functions and Amazon States Language to replace the Lambda orchestrator function. You can use the Choice state in Amazon States Language to define logical conditions to follow a specific path. This approach reduces code maintenance complexity because you define the conditions using Amazon States Language. You can also use it to to extend the functionality with minimal changes to the codebase.

The following Step Functions workflow diagram shows the rearchitected version of the previous Orchestrator Lambda function:

The following Amazon State Language represents the workflow:

{
  "Comment": "Multi-channel notification workflow",
  "StartAt": "ValidateInput",
  "States": {
    "ValidateInput": {
      "Type": "Choice",
      "Choices": [
        {
          "And": [
            {
              "Variable": "$.message",
              "IsPresent": true
            },
            {
              "Variable": "$.channel",
              "IsPresent": true
            }
          ],
          "Next": "DetermineChannel"
        }
      ],
      "Default": "ValidationError"
    },
    "ValidationError": {
      "Type": "Fail",
      "Error": "ValidationError",
      "Cause": "Required fields missing: message and/or channel"
    },
    "DetermineChannel": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.channel",
          "StringEquals": "both",
          "Next": "ParallelNotification"
        },
        {
          "Variable": "$.channel",
          "StringEquals": "sms",
          "Next": "SendSMSOnly"
        },
        {
          "Variable": "$.channel",
          "StringEquals": "email",
          "Next": "SendEmailOnly"
        }
      ],
      "Default": "FailState"
    },
    "ParallelNotification": {
      "Type": "Parallel",
      "Branches": [
        {
          "StartAt": "SendSMS",
          "States": {
            "SendSMS": {
              "Type": "Task",
              "Resource": "arn:aws:states:::sns:publish",
              "Parameters": {
                "Message.$": "$.message",
                "PhoneNumber.$": "$.phoneNumber"
              },
              "End": true
            }
          }
        },
        {
          "StartAt": "SendEmail",
          "States": {
            "SendEmail": {
              "Type": "Task",
              "Parameters": {
                "FromEmailAddress.$": "$.email.from",
                "Destination": {
                  "ToAddresses.$": "States.Array($.email.to)",
                  "CcAddresses.$": "States.ArrayGetItem(States.JsonToString($.email.cc), $)",
                  "BccAddresses.$": "States.ArrayGetItem(States.JsonToString($.email.bcc), $)"
                },
                "Content": {
                  "Simple": {
                    "Subject": {
                      "Data.$": "$.email.subject",
                      "Charset": "UTF-8"
                    },
                    "Body": {
                      "Text": {
                        "Data.$": "$.message",
                        "Charset": "UTF-8"
                      },
                      "Html": {
                        "Data.$": "$.email.htmlBody",
                        "Charset": "UTF-8"
                      }
                    }
                  }
                },
                "ReplyToAddresses.$": "States.Array($.email.replyTo)",
                "EmailTags": [
                  {
                    "Name": "channel",
                    "Value": "email"
                  },
                  {
                    "Name": "messageType",
                    "Value.$": "$.email.messageType"
                  }
                ],
                "ConfigurationSetName.$": "$.email.configurationSet",
                "ListManagementOptions": {
                  "ContactListName.$": "$.email.contactList",
                  "TopicName.$": "$.email.topic"
                }
              },
              "Resource": "arn:aws:states:::aws-sdk:sesv2:sendEmail",
              "End": true
            }
          }
        }
      ],
      "End": true
    },
    "SendSMSOnly": {
      "Type": "Task",
      "Resource": "arn:aws:states:::sns:publish",
      "Parameters": {
        "Message.$": "$.message",
        "PhoneNumber.$": "$.phoneNumber"
      },
      "End": true
    },
    "SendEmailOnly": {
      "Type": "Task",
      "Parameters": {
        "FromEmailAddress.$": "$.email.from",
        "Destination": {
          "ToAddresses.$": "States.Array($.email.to)"
        },
        "Content": {
          "Simple": {
            "Subject": {
              "Data.$": "$.email.subject",
              "Charset": "UTF-8"
            },
            "Body": {
              "Text": {
                "Data.$": "$.message",
                "Charset": "UTF-8"
              },
              "Html": {
                "Data.$": "$.email.htmlBody",
                "Charset": "UTF-8"
              }
            }
          }
        }
      },
      "Resource": "arn:aws:states:::aws-sdk:sesv2:sendEmail",
      "End": true
    },
    "FailState": {
      "Type": "Fail",
      "Cause": "Invalid channel specified"
    }
  }
}

This Step Functions implementation offers several advantages:

  • Native service integration: Direct integration with Amazon Simple Notification Service (Amazon SNS), Amazon Simple Email Service (Amazon SES), Amazon DynamoDB, and Amazon CloudWatch eliminates the need for wrapper Lambda functions
  • Visual workflow: The execution flow is visible and maintainable through the AWS Management Console
  • Built-in error handling: Retry policies and error states can be defined declaratively
  • Parallel execution: The Parallel state handles multiple channel delivery efficiently
  • Simplified logic: The Choice state replaces complex if-else statements
  • Centralized data flow: Input and output are managed consistently across states
  • Enhanced workflow duration capabilities: Step Functions Standard workflows support executions that run for up to one year, compared to the 15-minute maximum execution time for Lambda functions

Comparing Lambda function as orchestrator to Step Functions

The summary of different features implemented on Lambda function as orchestrator and Step Functions is reflected in the following table:

Feature Lambda function as orchestrator Step Functions
Orchestration logic Implemented in Python with nested if-else statements. Defined declaratively using the Choice state
Multi-channel delivery Sequential function invocations. Parallel execution using function’s logic. Parallel execution using the Parallel state
Service integration Requires SDK calls or separate Lambda functions. Direct integration with AWS services (Amazon SNS, DynamoDB)
Error handling Custom try-except blocks in Python. Built-in error states and retry policies
Data persistance Custom code to interact with DynamoDB. Native DynamoDB integration with putItem task
Metrics logging Custom code to call CloudWatch. CloudWatch Metrics SDK integration

Implementation considerations

Review the following considerations when re-architecting a Lambda function orchestrator to Step Functions:

  • State machine type: Choose between Standard (up to 1 year runtime) and Express (up to 5 minutes) workflows based on your needs.
  • Input/output management: Parameters manipulation reduces the development effort and give flexible alternatives to implement the workflow:
    • Parameters: Selects specific input fields to pass to the next state
    • ResultSelector: Filters the state response to include only relevant fields
    • ResultPath: Stores the processed result in a specific path of the state input
    • OutputPath: Determines what data passes to the next state
      A code snippet for these features is:

      {
          "ProcessOrder": {
              "Type": "Task",
              "Resource": "arn:aws:states:::lambda:invoke",
              "Parameters": {
                  "FunctionName": "ProcessOrderFunction",
                  "Payload": {
                      "orderId.$": "$.orderId",
                      "customerId.$": "$.customerId"
                  }
              },
              "ResultSelector": {
                  "orderStatus.$": "$.Payload.status",
                  "processedDate.$": "$.Payload.timestamp"
              },
              "ResultPath": "$.orderProcessing",
              "OutputPath": "$.orderProcessing",
              "Next": "NotifyCustomer"
          }
      }

  • Error handling: Implement retry policies and catch errors at both the task and state machine levels.
  • Monitoring: Set up CloudWatch logs and metrics for your state machine to track executions and performance.

Benefits of using Step Functions

Using Step Functions for rearchitecting scenarios bring the following benefits:

  • Reduced code complexity: The business logic is now defined in Amazon States Language rather than distributed across multiple Lambda functions.
  • Improved maintainability: Developers can make workflow changes by modifying the Amazon States Language, often modifying several Lambda functions.
  • Native AWS service integrations: Step Functions offers direct integrations with over 200 AWS services, which you can use to connect and coordinate AWS resources without writing custom integration code.
  • Cost optimization: By using direct service integrations, there are fewer Lambda invocations and reduced costs.
  • Long-running processes: Step Functions can manage workflows that run for up to a year, beyond the 15-minute limit for Lambda functions.

Conclusion

Rearchitecting Lambda-based applications with Step Functions can significantly improve maintainability, scalability, and operational efficiency. By moving orchestration logic into Step Functions and using its native service integrations, you can create more robust and manageable serverless applications.

While this post focused on a message distribution workflow, the principles apply to many serverless architectures. As you develop your applications, consider how Step Functions can help you build more resilient and scalable solutions.

To learn more about serverless architectures visit Serverless Land.

How Zapier runs isolated tasks on AWS Lambda and upgrades functions at scale

Post Syndicated from Anton Aleksandrov original https://aws.amazon.com/blogs/architecture/how-zapier-runs-isolated-tasks-on-aws-lambda-and-upgrades-functions-at-scale/

Zapier is a leading no-code automation provider whose customers use their solution to automate workflows and move data across over 8,000 applications such as Slack, Salesforce, Asana, and Dropbox. Zapier runs these automations through integrations called Zaps, which are implemented using a serverless architecture running on Amazon Web Services (AWS). Each Zap is powered by an AWS Lambda function.

In this post, you’ll learn how Zapier has built their serverless architecture focusing on three key aspects: using Lambda functions to build isolated Zaps, operating over a hundred thousand Lambda functions through Zapier’s control plane infrastructure, and enhancing security posture while reducing maintenance efforts by introducing automated function upgrades and cleanup workflows into their platform architecture.

Architecting a secure and isolated runtime environment

Zaps created by Zapier’s users implement tenant-specific business logic, hence they require cross-tenant compute isolation. Code implementing one Zap can’t share an execution environment with code implementing another Zap. Moreover, the same Zap type used by two different tenants can’t share execution environments as well.

To achieve the required level of isolation, Zapier’s engineering team adopted AWS Lambda, a serverless compute service that runs code in response to events and automatically manages cloud compute resources. Minimal operational overhead, built-in high availability, automated scaling, high level of isolation, and pay-per-use model made Lambda a great fit for this use case. Currently, Zapier’s architecture is running over a hundred thousand Lambda functions to support their customer’s integration workflows.

Because they’re powered by the open source Firecracker microVMs, each function is completely isolated from the others. Moreover, each execution environment belonging to the same function (sometimes referred to as function instances) is also isolated from other execution environments. The following architecture topology diagram uses red lines to represent isolation boundaries. Each execution environment of every function is isolated from its peers and is getting its own virtual resources such as disk, memory, and CPU. For more details, read Security in AWS Lambda.

Isolation boundary

Zapier’s control plane is architected using Amazon Elastic Kubernetes Service (Amazon EKS). A designated database is used to maintain the up-to-date function inventory. Whenever a user creates a new Zap, the control plane creates a corresponding Lambda function and stores a reference in the inventory database. When a Zap is triggered, the control plane retrieves information about a relevant Lambda function and invokes it to facilitate the integration workflow, as illustrated in the following diagram.

Control and Data planes

Understanding the runtime deprecation process

When building architectures using the traditional non-serverless compute, cloud engineers are the ones responsible for keeping operating systems and software on their compute instances up to date and applying security and maintenance patches. With serverless architectures and Lambda functions, security patches and minor runtime upgrades are handled by AWS automatically, which means customers can focus on delivering business value instead of the undifferentiated heavy lifting of infrastructure management.

When a major Lambda managed runtime version reaches end-of-life, AWS initiates a deprecation process through the AWS Health Dashboard and direct email communications to affected customers. Because deprecated runtimes eventually lose access to security updates and support, organizations must upgrade to supported runtime versions to avoid potential security risks. Read more about the shared responsibility model, runtime use after deprecation, and receiving runtime deprecation notifications.

As Zapier’s user base and architectural complexity – and consequently the number of Zaps – were growing, keeping all functions on the most up-to-date major runtime versions became a laborious task. Top contributing factors were:

  • High number of functions. At its peak, the Zapier platform was running Zaps using hundreds of thousands of unique Lambda functions. Approximately 35% of these functions were using a runtime that was scheduled for deprecation in the next 12 months.
  • Zapier architected their data plane environment to be ephemeral – the control plane creates and deletes Lambda functions on demand and manages their lifecycle dynamically. Identifying a specific owner for each affected function wasn’t always straightforward.
  • Security is paramount at Zapier and upgrading affected functions runtime prior to the deprecation date was an absolute must. At no point could Zapier functions use runtimes after their deprecation date. This was a task which required extra resources.
  • The upgrade process shouldn’t have had any impact on the end customer experience. At no point should customer experience be affected.

With a short runway, high-volume workload, and the strict requirements of not impacting customer experience, Zapier’s Platform Engineering team took on this challenge of maintaining high security posture in their platform architecture.

Applying the solution

The solution had three work streams:

  1. Reducing the risk by analyzing the architecture and identifying and cleaning up unused functions.
  2. Prioritizing upgrades by identifying the most critical and impactful functions.
  3. Empowering engineering teams with automated tools and knowledge to streamline the upgrade process in future.

Identify and clean up unused functions

The first step in streamlining the upgrade process was identifying and removing unused functions. This reduced the total number of functions in Zapier’s architecture that required upgrades, eliminating unnecessary work for the team.

Zapier started by augmenting the function inventory with runtime information using AWS Trusted Advisor and Amazon Cloud Intelligence Trusted Advisor dashboards, as illustrated in the following diagram.

Gathering data

This meant the team could build a detailed inventory of functions that were running on soon-to-be deprecated runtimes. Using Amazon CloudWatch, Zapier’s platform team started to monitor metrics such as number of invocations. They identified which functions were active, which functions weren’t used for an extended period, and which functions didn’t have an active owner and could be removed.

One of the primary mechanisms for ownership validation within the organization was using resource tags. Functions that were active, but didn’t have clear ownership, were flagged for additional review before removal. Functions that were confirmed as unused or didn’t have an active owner were marked for deletion. Removing such functions allowed Zapier to significantly simplify their architecture and reduce the number of functions that had to be upgraded.

Prioritizing upgrades

With a smaller volume of functions to upgrade, Zapier’s platform team prioritized function upgrades based on usage patterns, criticality, and potential customer impact. Three primary prioritization categories were:

  • Customer-facing functions – Any functions directly involved in executing user Zaps were marked as high priority. These had to be upgraded first to avoid service disruptions.
  • Backend infrastructure functions – Internal functions that supported system operations were evaluated based on their importance to platform stability.
  • High-volume functions – Functions with the highest execution frequency were prioritized because upgrading them would have the greatest impact on reducing operational risk.

Using these factors, Zapier’s platform team has created an upgrade roadmap, ensuring that critical assets were addressed first while minimizing potential disruptions.

Refer to Retrieve data about Lambda functions that use a deprecated runtime in the Lambda Developer Guide to learn how to identify most commonly and most frequently used Lambda functions in your serverless architecture.

Empowering engineering teams with automated tools and knowledge

To ensure a smooth and efficient upgrade process across their serverless architecture, Zapier’s team empowered engineering teams with clear guidelines and automated solutions. The platform incorporated two main approaches: Terraform-managed functions and a custom-built Lambda runtime canary tool. Implementing and adopting these tools and practices resulted in reducing the number of functions using soon-to-be deprecated runtimes by 95%.

For functions managed through infrastructure-as-code (IaC), Zapier’s team developed standardized Terraform modules that specified supported runtime versions. Development teams implemented these modules in their configurations:

resource "aws_lambda_function" "example" {
    runtime = "python3.13"  # Updated to supported runtime
}

After applying the new module version, teams validated changes by testing the new runtime in staging environments and monitoring Terraform plan outputs to ensure proper runtime version updates.

To efficiently manage most Lambda functions in their architecture, Zapier developed the Lambda runtime canary tool suite. Using this solution, they automated the runtime upgrade process for thousands of active Lambda functions with minimal manual intervention. The tool suite implements several key features:

  • Architected for gradual traffic shifting with the Lambda built-in routing mechanism through function version and aliasing. The tool can gradually shift traffic distribution from an old to a new function version. During this gradual traffic shift, the system monitors CloudWatch metrics for errors and automatically rolls back if error rates exceed acceptable thresholds.
  • Optimistic upgrade strategy implements direct upgrades for infrequently used functions using a flag value stored in a cache to detect potential issues during the first post-upgrade invocation. If this invocation fails, the control plane retries it using the previous function version. If the retried invocation succeeds, Zapier’s control plane initiates a rollback, assuming the error is most likely due to the runtime upgrade. After rollback, it will log the error and alert relevant stakeholders.
  • Integration with existing infrastructure uses an administrative interface and task queue for automated traffic shifting. A database ledger maintains tracking of function states and rollback information.
  • Operational controls provide manual rollback capabilities and implement centralized control switches for process management. After a function was upgraded to a new runtime and no rollback activity was detected within a set time period, an automated pruning task cleans up older versions.

Zapier’s Lambda canary tool, through its integration of gradual traffic shifting, real-time CloudWatch monitoring, and automated rollback mechanisms, established a sustainable framework for managing runtime upgrades across their serverless architecture. This approach not only automated the upgrade process and minimized operational risks but also created a scalable solution that provides continuous runtime upgrades, preventing the use of deprecated runtimes at any point. By allowing continuous function runtime updates with minimal disruption to end user experience, Zapier maintains security and stability while requiring minimal manual intervention. This framework efficiently manages their growing serverless infrastructure, providing both security and operational efficiency for future runtime updates.

Conclusion

In this post, you’ve learned how Zapier architected their software-as-a-service (SaaS) platform to provide secure, isolated execution environments using AWS Lambda and Amazon EKS, enabling their customers to create hundreds of thousands of Zaps. You’ve learned how Zapier’s team implemented the function runtime upgrade process at scale and reduced the number of functions running on soon-to-be deprecated runtimes by 95%. You’ve seen best practices that were established and techniques that helped Zapier to keep high security posture without impacting customer experience.

Use the following links to learn more about Lambda runtimes and upgrading your functions to the latest runtime versions:


About the authors

AWS Weekly Roundup: Kiro, AWS Lambda remote debugging, Amazon ECS blue/green deployments, Amazon Bedrock AgentCore, and more (July 21, 2025)

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-kiro-aws-lambda-remote-debugging-amazon-ecs-blue-green-deployments-amazon-bedrock-agentcore-and-more-july-21-2025/

I’m writing this as I depart from Ho Chi Minh City back to Singapore. Just realized what a week it’s been, so let me rewind a bit. This week, I tried my first Corne keyboard, wrapped up rehearsals for AWS Summit Jakarta with speakers who are absolutely raising the bar, and visited Vietnam to participate as a technical keynote speaker in AWS Community Day Vietnam, an energetic gathering of hundreds of cloud practitioners and AWS enthusiasts who shared knowledge through multiple technical tracks and networking sessions.

What I presented was a keynote titled “Reinvent perspective as modern developers”, featuring serverless, containers, and how we can cut the learning curves and be more productive with Amazon Q Developer and Kiro. I got a chance to discuss with a couple of AWS Community Builders and community developers, who shared how Amazon Q Developer actually addressed their challenges on building applications, with several highlighting significant productivity improvements and smoother learning curves in their cloud development journeys.

As I head back to Singapore, I’m carrying with me not just memories of delicious cà phê sữa đá (iced milk coffee), but also fresh perspectives and inspirations from this vibrant community of cloud innovators.

Introducing Kiro
One of the highlights from last week was definitely Kiro, an AI IDE that helps you deliver from concept to production through a simplified developer experience for working with AI agents. Kiro goes beyond “vibe coding” with features like specs and hooks that help get prototypes into production systems with proper planning and clarity.

Join the waitlist to get notified when it becomes available.

Last week’s AWS Launches
In other news, last week we had AWS Summit in New York, where we released several services. Here are some launches that caught my attention:

Console to IDE Integration

ECS Blue-Green Deployments

AWS Free Tier Enhanced Benefits

  • Monitor and debug event-driven applications with new Amazon EventBridge logging — Amazon EventBridge now provides enhanced logging capabilities that offer comprehensive event lifecycle tracking with detailed information about successes, failures, and status codes. This new observability feature addresses microservices and event-driven architecture monitoring challenges by providing visibility into the complete event journey.

EventBridge Enhanced Logging

S3 Vectors Overview

  • Amazon EKS enables ultra-scale AI/ML workloads with support for 100k nodes per cluster — Amazon EKS now supports up to 100,000 worker nodes in a single cluster, enabling customers to scale up to 1.6 million AWS Trainium accelerators or 800K NVIDIA GPUs. This industry-leading scale empowers customers to train trillion-parameter models and advance AGI development while maintaining Kubernetes conformance and familiar developer experience.

EKS Ultra-Scale Performance Improvements

From AWS Builder Center
In case you missed it, we just launched AWS Builder Center and integrated community.aws. Here are my top picks from the posts:

Upcoming AWS events
Check your calendars and sign up for upcoming AWS and AWS Community events:

  • AWS re:Invent – Register now to get a head start on choosing your best learning path, booking travel and accommodations, and bringing your team to learn, connect, and have fun. If you’re an early-career professional, you can apply to the All Builders Welcome Grant program, which is designed to remove financial barriers and create diverse pathways into cloud technology.
  • AWS Builders Online Series – If you’re based in one of the Asia Pacific time zones, join and learn fundamental AWS concepts, architectural best practices, and hands-on demonstrations to help you build, migrate, and deploy your workloads on AWS.
  • AWS Summits — Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Taipei (July 29), Mexico City (August 6), and Jakarta (June 26–27).
  • AWS Community Days — Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Singapore (August 2), Australia (August 15), Adria (September 5), Baltic (September 10), and Aotearoa (September 18).

You can browse all upcoming AWS led in-person and virtual developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Donnie

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!


Join Builder ID: Get started with your AWS Builder journey at builder.aws.com

Simplify serverless development with console to IDE and remote debugging for AWS Lambda

Post Syndicated from Micah Walter original https://aws.amazon.com/blogs/aws/simplify-serverless-development-with-console-to-ide-and-remote-debugging-for-aws-lambda/

Today, we’re announcing two significant enhancements to AWS Lambda that make it easier than ever for developers to build and debug serverless applications in their local development environments: console to IDE integration and remote debugging. These new capabilities build upon our recent improvements to the Lambda development experience, including the enhanced in-console editing experience and the improved local integrated development environment (IDE) experience launched in late 2024.

When building serverless applications, developers typically focus on two areas to streamline their workflow: local development environment setup and cloud debugging capabilities. While developers can bring functions from the console to their IDE, they’re looking for ways to make this process more efficient. Additionally, as functions interact with various AWS services in the cloud, developers want enhanced debugging capabilities to identify and resolve issues earlier in the development cycle, reducing their reliance on local emulation and helping them optimize their development workflow.

Console to IDE integration

To address the first challenge, we’re introducing console to IDE integration, which streamlines the workflow from the AWS Management Console to Visual Studio Code (VS Code). This new capability adds an Open in Visual Studio Code button to the Lambda console, enabling developers to quickly move from viewing their function in the browser to editing it in their IDE, eliminating the time-consuming setup process for local development environments.

The console to IDE integration automatically handles the setup process, checking for VS Code installation and the AWS Toolkit for VS Code. For developers that have everything already configured, choosing the button immediately opens their function code in VS Code, so they can continue editing and deploy changes back to Lambda in seconds. If VS Code isn’t installed, it directs developers to the download page, and if the AWS Toolkit is missing, it prompts for installation.

To use console to IDE, look for the Open in VS Code button in either the Getting Started popup after creating a new function or the Code tab of existing Lambda functions. After selecting, VS Code opens automatically (installing AWS Toolkit if needed). Unlike the console environment, you now have access to a full development environment with integrated terminal – a significant improvement for developers who need to manage packages (npm install, pip install), run tests, or use development tools like linters and formatters. You can edit code, add new files/folders, and any changes you make will trigger an automatic deploy prompt. When you choose to deploy, the AWS Toolkit automatically deploys your function to your AWS account.

Screenshot showing Console to IDE

Remote debugging

Once developers have their functions in their IDE, they can use remote debugging to debug Lambda functions deployed in their AWS account directly from VS Code. The key benefit of remote debugging is that it allows developers to debug functions running in the cloud while integrated with other AWS services, enabling faster and more reliable development.

With remote debugging, developers can debug their functions with complete access to Amazon Virtual Private Cloud (VPC) resources and AWS Identity and Access Management (AWS IAM) roles, eliminating the gap between local development and cloud execution. For example, when debugging a Lambda function that interacts with an Amazon Relational Database Service (Amazon RDS) database in a VPC, developers can now debug the execution environment of the function running in the cloud within seconds, rather than spending time setting up a local environment that might not match production.

Getting started with remote debugging is straightforward. Developers can select a Lambda function in VS Code and enable debugging in seconds. AWS Toolkit for VS Code automatically downloads the function code, establishes a secure debugging connection, and enables breakpoint setting. When debugging is complete, AWS Toolkit for VS Code automatically cleans up the debugging configuration to prevent any impact on production traffic.

Let’s try it out

To take remote debugging for a spin, I chose to start with a basic “hello world” example function, written in Python. I had previously created the function using the AWS Management Console for AWS Lambda. Using the AWS Toolkit for VS Code, I can navigate to my function in the Explorer pane. Hovering over my function, I can right-click (ctrl-click in Windows) to download the code to my local machine to edit the code in my IDE. Saving the file will ask me to decide if I want to deploy the latest changes to Lambda.

Screenshot view of the Lambda Debugger in VS Code

From here, I can select the play icon to open the Remote invoke configuration page for my function. This dialog will now display a Remote debugging option, which I configure to point at my local copy of my function handler code. Before choosing Remote invoke, I can set breakpoints on the left anywhere I want my code to pause for inspection.

My code will be running in the cloud after it’s invoked, and I can monitor its status in real time in VS Code. In the following screenshot, you can see I’ve set a breakpoint at the print statement. My function will pause execution at this point in my code, and I can inspect things like local variable values before either continuing to the next breakpoint or stepping into the code line by line.

Here, you can see that I’ve chosen to step into the code, and as I go through it line by line, I can see the context and local and global variables displayed on the left side of the IDE. Additionally, I can follow the logs in the Output tab at the bottom of the IDE. As I step through, I’ll see any log messages or output messages from the execution of my function in real time.

Enhanced development workflow

These new capabilities work together to create a more streamlined development experience. Developers can start in the console, quickly transition to VS Code using the console to IDE integration, and then use remote debugging to debug their functions running in the cloud. This workflow eliminates the need to switch between multiple tools and environments, helping developers identify and fix issues faster.

Now available

You can start using these new features through the AWS Management Console and VS Code with the AWS Toolkit for VS Code (v3.69.0 or later) installed. Console to IDE integration is available in all commercial AWS Regions where Lambda is available, except AWS GovCloud (US) Regions. Learn more about it in Lambda and AWS Toolkit for VS Code documentation. To learn more about remote debugging capability, including AWS Regions it is available in, visit the AWS Toolkit for VS Code and Lambda documentation.

Console to IDE and remote debugging are available to you at no additional cost. With remote debugging, you pay only for the standard Lambda execution costs during debugging sessions. Remote debugging will support Python, Node.js, and Java runtimes at launch, with plans to expand support to additional runtimes in the future.

These enhancements represent a significant step forward in simplifying the serverless development experience, which means developers can build and debug Lambda functions more efficiently than ever before.

Modernizing SOAP applications using Amazon API Gateway and AWS Lambda

Post Syndicated from Daniel Abib original https://aws.amazon.com/blogs/compute/modernizing-soap-applications-using-amazon-api-gateway-and-aws-lambda/

This post demonstrates how you can modernize legacy SOAP applications using Amazon API Gateway and AWS Lambda to create bidirectional proxy architectures that enable integration between SOAP and REST systems without disrupting existing business operations.

Many organizations today face the challenge of maintaining critical business systems that were built decades ago. These legacy applications power essential business operations despite relying on outdated technologies and integration patterns. Although complete system replacement would be ideal, practical constraints such as budget limitations, resource availability, technical complexity, and missing documentation often make modernization efforts challenging.

This post first shows proxy architecture patterns to expose a legacy SOAP server over a REST API. It then shows how to integrate a legacy SOAP client with applications using a REST API.

While SOAP and REST APIs share HTTP as their foundation, SOAP has some limitations compared to REST, like limited HTTP methods (GET/POST only) and mandatory XML formatting. REST is more flexible with multiple HTTP methods and diverse payload formats (plain text, binary, HTML, JSON, XML).

Using API Gateway and Lambda to proxy SOAP service

Consider a legacy solution that only supports SOAP. The following diagram shows the architecture for a SOAP proxy server using API Gateway and Lambda.

Figure 1: SOAP Server Proxy for modernized architecture

Figure 1: SOAP Server Proxy for modernized architecture

The proxy exposes the APIs hosted on the SOAP Server (on the right side of the image) over a REST interface. A SOAP service expects the HTTP Content-Type header set to text/xml, and a XML format payload that follows the WSDL specification defined by the server.

In the proposed architecture, the Lambda function is the core transformation engine, handling the bidirectional conversion between JSON and XML formats. Lambda functions can be developed in multiple programming languages such as Python, Node.js, Java, C#, Go, Ruby, and PowerShell, allowing you to use your existing development expertise. The serverless nature of Lambda provides automatic scaling to handle traffic spikes without needing infrastructure management or capacity planning.

API Gateway acts as the intelligent front door, managing all incoming requests and routing them appropriately. It provides enterprise-grade features such as request throttling to protect backend systems from overload, comprehensive authentication and authorization mechanisms, API key management for partner access control, request and response validation, caching capabilities for improved performance, and detailed monitoring and logging. These built-in features remove the need for custom middleware development and provide immediate operational benefits. API Gateway can receive multiple payload format such as XML, JSON, binary data, and plain text. This makes it suitable for diverse integration scenarios.

Using API Gateway and Lambda to support legacy SOAP clients

The previous section focused on exposing SOAP services over REST APIs. Organizations also face the reverse challenge where legacy SOAP client applications must access REST services. The architecture for supporting legacy SOAP clients follows a similar pattern but with reversed data flow. In this case, the legacy SOAP client sends XML-formatted requests to what it believes is a SOAP server. However, behind the scenes API Gateway and Lambda work together to translate these requests into REST API calls.

Figure 2: Legacy SOAP client modernization architecture

Figure 2: Legacy SOAP client modernization architecture

The legacy SOAP client application sends XML SOAP messages to API Gateway. The Lambda function receives these SOAP requests, extracts the relevant data from the XML envelope, and transforms it into JSON format for the modern REST service.

The Lambda function wraps the JSON response from the REST services into the SOAP XML format that the legacy client expects. It recreates the appropriate XML structure, SOAP headers, and ensures that the response conforms to the WSDL specification that the client application was designed to consume.

Example scenario

Let’s suppose our legacy client application needs to send a SOAP request to convert an integer number to its word form. The SOAP envelop to convert the number 1519 to its long form “one thousand, five hundred and nine” looks like this:

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
    <soap:Body> \
        <ConvertNumberToWordsSoapIn>
            <NumberToWordsRequest>1519</NumberToWordsRequest>
        </ConvertNumberToWordsSoapIn>
    </soap:Body>
</soap:Envelope>

The REST conversion service expects a JSON payload the as follownig:

jsonObject = {
	"data" : 1519
}

The following code block shows a sample Lambda function implementation for this. This function converts the SOAP XML envelop to JSON, changes the http header to application/json, and converts response from REST service to SOAP format.

var parseString = require('xml2js').parseString;
const axios = require('axios');

exports.handler = async (event, context) => {
    var valueNumber;
    
    try {
        console.log("Parsing XML string");

        // Parsing the XML to obtain data needed for conversion (number to words)
        parseString(event.body, function (err, result) {
            if (!err) {
                valueNumber = result['soap:Envelope']['soap:Body'][0]
                              ['ConvertNumberToWordsSoapIn'][0]
                              ['NumberToWordsRequest'][0];
            } else { 
                console.log (err);
                throw (err);
            }
        });
        console.log("Creating JSON for calling the service");
        // Creating JSON to call service
        var jsonObject = {
            "data" : valueNumber
        }
        
        console.log("Calling Microservice (NumberToWords)");
        const headers = { 
            'Content-Type': 'Application/json'
        };
        
        console.log ("Parameter for NumberToWords URL:" + 
                    JSON.stringify(process.env.NumberToWordMicroservice));

        // Calling numberToWords REST Server
        var resultNumberToWords = await 
            axios.post(process.env.NumberToWordMicroservice, jsonObject, { headers });
        
        // Creating the response
        console.log("Creating response XML");

        var resp =  create_response (JSON.stringify(resultNumberToWords.data.message));
        console.log("Response in XML: "+ resp);
        
        // Returning the value in XML using text/xml content type
        let response = {'statusCode': 200, headers: {"content-type": "text/xml"}, 
                        'body': JSON.stringify(resp)}
        return response;
        
    } catch (err) {
        console.log ("Error: " + err);
        let response = {'statusCode': 500, 
                        headers: {"content-type": "text/xml"}, 'body': err}
        return response;
    }
};

// Function to create a SOAP XML envelope with the result value
function create_response(numberInWords) {
  return '<?xml version="1.0" encoding="utf-8"?> \
            <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">\
            <soap:Body>\
              <m:ConvertNumberToWordsResponse xmlns:m="http://www.dataaccess.com/webservicesserver/"> \
                  <m:ConvertNumberToWordsResponseResult>' + numberInWords + '</m:ConvertNumberToWordsResponseResult> \
              </m:ConvertNumberToWordsResponse> \
            </soap:Body>\
            </soap:Envelope>';
}

With this approach, you can maintain your existing SOAP client applications without modification, allowing them to consume modern REST services. You can preserve investments in legacy client applications while gradually modernizing the overall system. This architecture is particularly valuable in scenarios where multiple legacy SOAP clients need to access the same modern REST services. This is because a single proxy can serve multiple client applications simultaneously. The serverless nature of the architecture makes sure that it scales automatically based on the number of client requests, providing cost-effective operation regardless of usage patterns.

Alternative approach using API Gateway transformation capabilities

The Lambda-based approach provides maximum flexibility and control. API Gateway also offers built-in transformation capabilities that can handle certain SOAP modernization scenarios without the need for compute resources.

The native API Gateway transformation uses Apache Velocity Template Language mapping templates. It converts the payload directly at the gateway, offering a streamlined solution for specific modernization scenarios.

The VTL approach works by defining mapping templates that handle the conversion process between different payload formats. When modernizing SOAP services, these templates can intercept REST requests with JSON payloads, restructure the data into XML format compatible with your legacy SOAP endpoints, and reverse the process for responses returning to the client.

Figure 3: API Gateway with velocity template language transformation

Figure 3: API Gateway with velocity template language transformation

This gateway-native transformation strategy offers several operational advantages. You benefit from streamlined architecture because the transformation logic resides entirely within the API Gateway service. There are no other infrastructure components to manage or monitor, and the solution avoids the complexity of coordinating between multiple AWS services. Cost efficiency is another key benefit, as there are no compute charges beyond the standard API Gateway pricing.

Consider the previous example of converting a number to its word format. The VTL transformation in API Gateway will look like this:

## Parse the SOAP envelope and extract the number value
#set(\$xmlDoc = \$input.path('\$'))
#set(\$numberToWords = \$xmlDoc.Envelope.Body.ConvertNumberToWordsSoapIn.NumberToWordsRequest)

## Convert to integer if it's a string
#if(\$numberToWords.toString().matches("^\d+\$"))
  #set(\$dataValue = \$numberToWords.toInteger())
#else
  #set(\$dataValue = \$numberToWords)
#end

{
  "data": \$dataValue
}

You should consider VTL transformations when your SOAP services have predictable, stable schemas with relatively direct XML structures. This approach works particularly well for legacy systems that rarely undergo changes and have clear request-response patterns. For more dynamic environments or complex transformation requirements, the Lambda-based solution provides superior flexibility and maintainability.

Security considerations

An important consideration when working with legacy SOAP services is understanding their authentication mechanisms. SOAP protocols often implement authentication through security standards, where authentication credentials and security tokens are embedded directly within the SOAP envelope headers. This includes username tokens, digital signatures, and encryption elements that are part of the XML structure.

When SOAP envelopes contain unencrypted authentication information in the headers, the proxy architecture typically functions without more modifications. This is because the Lambda function can pass through these authentication elements transparently to the backend SOAP service. However, due to the nature of SOAP authentication being tightly integrated with the XML envelope structure, certain scenarios may need custom handling within the Lambda function.

For example, if the SOAP service uses timestamp-based authentication tokens, session management, or needs specific security header modifications, the Lambda function may need customization to properly handle, validate, or refresh these authentication elements during the JSON-to-XML transformation process. Organizations should carefully analyze their SOAP service authentication requirements to determine if more Lambda logic is needed to maintain security compliance.

Moreover, make sure that any SOAP authentication credentials processed by the Lambda function are handled securely and never logged in plain text.

Conclusion

In this post, you learned how cloud-native services can bridge the gap between legacy systems and modern application architectures, allowing you to use your existing investments while adopting contemporary development practices and technologies.

Amazon API Gateway and AWS Lambda enable organizations to create REST services that proxy legacy SOAP servers, allowing modern applications to consume legacy services through JSON payloads while preserving existing SOAP infrastructure. This serverless solution provides cost-effectiveness, automatic scaling, and reduced operational overhead while facilitating company modernization through scalable APIs without abandoning legacy software investments.

This modernization strategy allows you to gradually transition from legacy SOAP services to modern REST APIs without disrupting existing business operations. As your modernization journey progresses, you can extend this pattern to support more SOAP services or implement more sophisticated transformation logic based on your specific business requirements.

For more serverless learning resources, visit Serverless Land.

Orchestrating document processing with AWS AppSync Events and Amazon Bedrock

Post Syndicated from Mehdi Amrane original https://aws.amazon.com/blogs/compute/orchestrating-document-processing-with-aws-appsync-events-and-amazon-bedrock/

Many organizations implement intelligent document processing pipelines in order to extract meaningful insights from an increasing volume of unstructured content (such as insurance claims, loan applications and more). Traditionally, these pipelines require significant engineering efforts, as the implementation often involves using several machine learning (ML) models and orchestrating complex workflows.

As organizations integrate these pipelines to customer facing applications (such as web applications for customers to upload documents such as insurance claims, loan approval documents and more), they set goals to provide insights in real time to increase the end customer experience. These organizations also aim to run and scale these workloads with minimal operational overhead and optimizing on costs. In addition, these organizations require the implementation of common security practices such as identity and access management, to make sure that only authorized and authenticated users are allowed to perform specific actions or access specific resources.

In this post, we show you a solution to simplify the creation of an intelligent document processing pipeline, with a web application for customers to upload their files (documents and images) and derive insights from it (summarization, fields extraction and classification). The solution primarily use serverless technologies, it includes a web socket to receive insights in real time and offers several benefits, such as automatic scaling, built-in high availability, and a pay-per-use billing model to optimize on costs. The solution also includes an authentication layer and an authorization layer to manage identities and permissions.

Solution overview

In this post, we provide an operational overview of the solution, and then describe how to set it up with the following services:

The solution architecture is illustrated in the following diagram:

Step 1: The user authenticates to the web application (hosted in AWS Amplify).
Step 2: Amazon Cognito validates the authentication details. After this, the user is now logged in the web application.
Steps 3aand 3b:

  • Step 3a: The web application (AWS Amplify) subscribes to an AWS AppSync Events web socket.
  • Step 3b: The AWS AppSync Events web socket calls an AWS Lambda authorizer to confirm that the user is authorized to subscribe to the web socket.

Step 4: The user uploads a file (document or image) using the web application.
Step 5: The web application (hosted in AWS Amplify) calls Amazon Cognito (identity pool) to confirm that the user is authorized to upload a file.
Step 6: The file is uploaded in an Amazon S3 bucket.
Steps 7a and 7b: Upon reception of an Amazon S3 upload event (which notifies that the file was uploaded in the Amazon S3 bucket) in the default Amazon Event Bridge bus, an Amazon Event Bridge bus rule triggers the execution of an AWS Step Functions state machine to start the orchestration workflow.
Step 8 (Step to extract fields from a file and classify it):

  • Step 8a: The first AWS Lambda function starts a new Amazon Bedrock Automation job (this job extracts specific fields from the uploaded file and classify it)
  • Step 8b: Once the job is completed, the results are stored in an Amazon S3 bucket.
  • Step 8c and 8d: Upon reception of an Amazon S3 event (which notifies that the results were stored in the Amazon S3 bucket) in the default Amazon Event Bridge, an Amazon Event Bridge bus rule triggers the execution of an AWS Lambda function
  • Step 8e: An AWS Lambda function publishes the results to the web socket.

Steps 9a and 9b: The second AWS Lambda function submits a prompt to an Amazon Bedrock foundation model (Sonnet 3), to request a summarization in streaming of the uploaded file. The AWS Lambda function publishes the streaming data to the web socket.

After Step 8e and Step 9b, the user can now consult the summarization result and extraction insights of the uploaded file in the web application.

Pre-requisites

To follow along and set up this solution, you must have the following:

  • An AWS account
  • A device with access to your AWS account with the following:
    • Python 3.12 installed (including pip)
    • Node.js 20.12.0 installed
  • Enable Model Access to the Claude 3 Sonnet model in Amazon Bedrock


Note: Deploying this solution will incur costs. Review the pricing page of each AWS service used in this post for details on costs. The cost of running this solution will primarily depend on:

  • The number of documents (and the size of each document)
  • The number of active users

Setup Amazon Bedrock Data Automation

In this section, we setup an Amazon Bedrock Data Automation project and an Amazon Bedrock blueprint.

A project contains a list of blueprints, and each blueprint defines the fields to extract from different types of files (such as documents or images). In this post, we define a blueprint for a driving license.

Complete the following steps to create an Amazon Bedrock Data Automation project and a driving license blueprint:

  1. Clone the GitHub repository
    git clone https://github.com/aws-samples/sample-create-idp-with-appsyncevents-and-amazonbedrock.git

  2. Go to the sample-create-idp-with-appsyncevents-and-amazonbedrock folder
    cd sample-create-idp-with-appsyncevents-and-amazonbedrock

  3. Initialize the environment (make the shell script files, from the GitHub repository, ready to be used)
    chmod +x ./init-env.sh && source ./init-env.sh

  4. Run the script setup-bda-project.sh to create an Amazon Bedrock Data Automation project and a sample driving license blueprint:
    ./setup-bda-project.sh

Create the web socket and orchestration backend

In this section, we create the following resources:

  • A user directory for web authentication and authorization, created with an Amazon Cognito user pool. An Amazon Cognito identity pool is also created to validate that users are authorized to upload files via the web application.
  • A web socket using AWS AppSync Events. This allows our web application to receive real time updates for summarization and extraction results. An authorization layer is also created to protect the web socket from unauthorized users. This is implemented with a Lambda authorizer function to validate that incoming requests include valid authorization details.
  • A state machine using AWS Step Functions and AWS Lambda to orchestrate the summarization and extraction operations from the unstructured content
  • Amazon S3 buckets to store files for document processing, and code files for AWS Lambda functions

Complete the following steps to create the web socket and the orchestration backend of the solution, using AWS CloudFormation templates:

  1. Create Amazon S3 buckets used by the solution by running the following script. These buckets will store the files uploaded by users and code files of the AWS Lambda functions used in this solution.
    cd $CURRENT_DIR/s3; ./create-s3-buckets.sh

  2. Create the Amazon Cognito user pool and identity pool by running the create-cognito-userpool.sh script:
    cd $CURRENT_DIR/cognito; ./create-cognito-userpool.sh

  3. Create the AWS AppSync Events web socket by running the following script:
    cd $CURRENT_DIR/appsync/; ./create-appsync-api.sh

  4. Create the AWS Step Functions state machine (including AWS Lambda functions) by running the following scripts:
    cd $CURRENT_DIR/orchestration/; ./create-orchestration.sh

Configure the Amazon Cognito user pool

In this section, we create a user in our Amazon Cognito user pool. This user will log in to our web application.

Run the script create-cognito-testuser.sh to create the user (make sure to provide your email address):

cd $CURRENT_DIR/cognito; ./create-cognito-testuser.sh #your-email-address#

After you create the user, you should receive an email with a temporary password in this format: “Your username is #your-email-address# and temporary password is #temporary-password#.”

Keep note of these login details (email address and temporary password) to use later when testing the web application.

Create the web application

In this section, we build a web application using AWS Amplify and publish it to make it accessible through an endpoint URL.

Complete the following steps to create the web application:

  1. Run the script create-webapp.sh to create the web application with AWS Amplify:
    cd $CURRENT_DIR/amplify/; ./create-webapp.sh

  2. Run the script deploy.sh to deploy the web application
    cd $CURRENT_DIR/amplify/amplify-idp; ./deploy.sh

The web application is now available for testing and a URL should be displayed, as shown in the following screenshot. Take note of the URL to use in the following section.

Test the web application

In this section, we test the web application and upload a file to be processed:

  1. Open the URL of the AWS Amplify application in your web browser.
  2. Enter your login information (your email and the temporary password you received earlier while configuring the user pool in Amazon Cognito) and choose Sign in.
  3. When prompted, enter a new password and choose Change Password.
  4. You should now be able to see a web interface.
  5. Download the sample driving license at this location and upload it via the web application using either your camera or a file in your local device, as illustrated

Once the file is uploaded, you should start receiving responses in the web application. When all the operations are completed, you should see a result equivalent to what is shown in the following screenshot:

Note: If you are planning to use other driving license sample images with other formats, you may have to update the existing Bedrock Data Automation blueprint we created earlier or define a new blueprint in your Bedrock Data Automation project we created earlier for these new images to work. For more information, please review the Bedrock Data Automation documentation.

Clean up

To make sure that no additional cost is incurred, remove the resources provisioned in your account. Make sure you’re in the correct AWS account before deleting the following resources.

Important note: You should exercise caution when performing the preceding steps. Make sure you are deleting the resources in the correct AWS account.

You can either navigate to the AWS CloudFormation console to delete the CloudFormation stacks associated to the resources provisioned or use the cleanup helper script cleanup.sh available at the root of the sample-create-idp-with-appsyncevents-and-amazonbedrock folder:

./cleanup.sh #region#

Conclusion

In this post, we walked through a solution to create a document processing pipeline, with a web application using serverless services. Via the web application, we were able to upload a file and receive responses in real time for different types of operations (summarization, extraction of specific fields and classification). First, we created an Amazon Bedrock Data Automation project (with a driving license blueprint). Then we created a web socket along with an orchestration solution using a state machine (AWS Step Functions and AWS Lambda functions). We also configured a user pool to grant a user access to the web application. Finally, we created the frontend of the web application in AWS Amplify.

To dive deeper into this solution, a self-paced workshop is available in AWS Workshop Studio.