Tag Archives: contributed

Starting up faster with AWS Lambda SnapStart

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/starting-up-faster-with-aws-lambda-snapstart/

This blog is written by Tarun Rai Madan, Sr. Product Manager, AWS Lambda, and Mike Danilov, Sr. Principal Engineer, AWS Lambda.

AWS Lambda SnapStart is a new performance optimization developed by AWS that can significantly improve the startup time for applications. Announced at AWS re:Invent 2022, the first capability to feature SnapStart is Lambda SnapStart for Java. This feature delivers up to 10x faster function startup times for latency-sensitive Java applications at no extra cost, and with minimal or no code changes.

Overview

When applications start up, whether it’s an app on your phone, or a serverless Lambda function, they go through initialization. The initialization process can vary based on the application and the programming language, but even the smallest applications written in the most efficient programming languages require some kind of initialization before they can do anything useful. For a Lambda function, the initialization phase involves downloading the function’s code, starting the runtime and any external dependencies, and running the function’s initialization code. Ordinarily, for a Lambda function, this initialization happens every time your application scales up to create a new execution environment.

With SnapStart, the function’s initialization is done ahead of time when you publish a function version. Lambda takes a Firecracker microVM snapshot of the memory and disk state of the initialized execution environment, encrypts the snapshot, and caches it for low-latency access. When your application starts up and scales to handle traffic, Lambda resumes new execution environments from the cached snapshot instead of initializing them from scratch, improving startup performance.

The following diagram compares a cold start request lifecycle for a non-SnapStart function and a SnapStart function. The time it takes to initialize the function, which is the predominant contributor to high startup latency, is replaced by a faster resume phase with SnapStart.

Diagram of a non-SnapStart function versus a SnapStart function

Request lifecycle for a non-SnapStart function versus a SnapStart function

Front loading the initialization phase can significantly improve the startup performance for latency-sensitive Lambda functions, such as synchronous microservices that are sensitive to initialization time. Because Java is a dynamic language with its own runtime and garbage collector, Lambda functions written in Java can be amongst the slowest to initialize. For applications that require frequent scaling, the delay introduced by initialization, commonly referred to as a cold start, can lead to a suboptimal experience for end users. Such applications can now start up faster with SnapStart.

AWS’ work in Firecracker makes it simple to use SnapStart. Because SnapStart uses micro Virtual Machine (microVM) snapshots to checkpoint and restore full applications, the approach is adaptable and general purpose. It can be used to speed up many kinds of application starts. While microVMs have long been used for strong secure isolation between applications and environments, the ability to front-load initialization with SnapStart means that microVMs can also augment performance savings at scale.

SnapStart and uniqueness

Lambda SnapStart speeds up applications by re-using a single initialized snapshot to resume multiple execution environments. As a result, unique content included in the snapshot during initialization is reused across execution environments, and so may no longer remain unique. A class of applications where uniqueness of state is a key consideration is cryptographic software, which assumes that the random numbers are truly random (both random and unpredictable). If content such as a random seed is saved in the snapshot during initialization, it is re-used when multiple execution environments resume and may produce predictable random sequences.

To maintain uniqueness, you must verify before using SnapStart that any unique content previously generated during the initialization now gets generated after that initialization. This includes unique IDs, unique secrets, and entropy used to generate pseudo-randomness.

Multiple execution environments resumed from a shared snapshot

SnapStart life cycle

However, we have implemented a few things to make it easier for customers to maintain uniqueness.

First, it is not common or a best practice for applications to generate these unique items directly. Still, it’s worth confirming that your application handles uniqueness correctly. That’s usually a matter of checking for any unique IDs, keys, timestamps, or “homemade” entropy in the initializer methods for your function.

Lambda offers a SnapStart scanning tool that checks for certain categories of code that assume uniqueness, so customers can make changes as required. The SnapStart scanning tool is an open-source SpotBugs plugin that runs static analysis against a set of rules and reports “potential SnapStart bugs”. We are committed to engaging with the community to expand the set of rules against which the scanning tool checks the code.

As an example, the following Lambda function creates a unique log stream for each execution environment during initialization. This unique value is re-used across execution environments when they re-use a snapshot.

public class LambdaUsingUUID implements RequestHandler<Map<String, String>, String> {

    private final AWSLogsClient logs;
    private final UUID sandboxId;

    public LambdaUsingUUID() {
        sandboxId = UUID.randomUUID(); // <-- unique content created during initialization
        logs = new AWSLogsClient();
    }

    @Override
    public String handleRequest(Map<String, String> event, Context context) {
        // The log stream name reuses the ID captured at initialization time.
        CreateLogStreamRequest request = new CreateLogStreamRequest(
            "myLogGroup", sandboxId + ".log9.txt");
        logs.createLogStream(request);
        return "Hello world!";
    }
}

When you run the scanning tool on the previous code, the following message helps identify a potential implementation that assumes uniqueness. One way to address such cases is to move the generation of the unique ID inside your function’s handler method.

H C SNAP_START: Detected a potential SnapStart bug in Lambda function initialization code. At LambdaUsingUUID.java: [line 7]
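
One way to apply that advice is sketched below. This is a minimal, SDK-free illustration rather than the authors' fix: the class name HandlerScopedIdSketch and the method logStreamNameFor are hypothetical, and the CloudWatch Logs call from the original function is left out so that the snippet runs with only the JDK. It contrasts an ID created in the constructor, which would be captured once in the snapshot, with an ID created inside the handler path, which is generated anew on every call.

```java
import java.util.UUID;

public class HandlerScopedIdSketch {

    // Created during initialization: under SnapStart this value would be part of
    // the snapshot and shared by every execution environment resumed from it.
    private final UUID initTimeId = UUID.randomUUID();

    public UUID initTimeId() {
        return initTimeId;
    }

    // Hypothetical stand-in for the body of handleRequest: the unique ID is
    // generated per call, which happens after any snapshot has been restored.
    public String logStreamNameFor(String requestId) {
        UUID sandboxId = UUID.randomUUID();
        return sandboxId + "-" + requestId + ".log";
    }

    public static void main(String[] args) {
        HandlerScopedIdSketch fn = new HandlerScopedIdSketch();
        // The init-time ID never changes for this instance.
        System.out.println("init-time ID: " + fn.initTimeId());
        // Each call produces a different stream name.
        System.out.println(fn.logStreamNameFor("req-1"));
        System.out.println(fn.logStreamNameFor("req-2"));
    }
}
```

Whether a per-invocation or a per-environment ID is the right granularity depends on the application; for per-environment values, the runtime hooks described later in this post are an alternative.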

A best practice used by many applications is to rely on the system libraries and kernel for uniqueness. These have long handled other cases where keys and IDs may be inadvertently duplicated, such as when forking or cloning processes. AWS has worked with upstream kernel maintainers and open source developers so that the existing protection mechanisms use the open standard VM Generation ID (vmgenid) that SnapStart supports. vmgenid is an emulated device that exposes a 128-bit, cryptographically random integer identifier to the kernel, and the identifier is statistically unique across all resumed microVMs.

Lambda’s included versions of Amazon Linux 2, OpenSSL (1.0.2), and java.security.SecureRandom all automatically re-initialize their randomness and secrets after a SnapStart. Software that always gets random numbers from the operating system (for example, from /dev/random or /dev/urandom) does not need any updates to maintain randomness. Because Lambda always reseeds /dev/random and /dev/urandom when restoring a snapshot, random numbers are not repeated even when multiple execution environments resume from the same snapshot.

Lambda’s request IDs are already unique for each invocation and are available using the getAwsRequestId() method of the Lambda context object. Most Lambda functions should require no modification to run with SnapStart enabled. It’s generally recommended that for SnapStart, you do not include unique state in the function’s initialization code, and use cryptographically secure random number generators (CSPRNGs) when needed.
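
As a rough illustration of the CSPRNG recommendation, the following sketch draws random bytes from java.security.SecureRandom each time a token is needed, instead of computing a token once during initialization. The class name SecureTokenSketch and the method newToken are hypothetical, and the token format (URL-safe Base64) is an arbitrary choice made for this example.

```java
import java.security.SecureRandom;
import java.util.Base64;

public class SecureTokenSketch {

    // Holding the generator in a field is fine; the concern is caching its
    // *output* (tokens, seeds, IDs) during initialization.
    private static final SecureRandom RANDOM = new SecureRandom();

    // Returns a fresh URL-safe token built from numBytes random bytes.
    public static String newToken(int numBytes) {
        byte[] bytes = new byte[numBytes];
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    public static void main(String[] args) {
        // Call this from the handler path so every invocation gets its own value.
        System.out.println(newToken(16));
        System.out.println(newToken(16));
    }
}
```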

Second, if you do want to create unique data directly in a Lambda function initialization phase, Lambda supports two new runtime hooks. Runtime hooks are available as part of the open-source Coordinated Restore at Checkpoint (CRaC) project. You can use the beforeCheckpoint hook to run code immediately before a snapshot is taken, and use the afterRestore hook to run code immediately after restoring a snapshot. This helps you delete any unique content before the snapshot is created, and restore any unique content after the snapshot is restored. For an example of how to use CRaC with a reference application, see the CRaC GitHub repository.
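
The delete-then-recreate pattern can be sketched without the CRaC dependency as follows. Everything here is an assumption made for illustration: SnapshotHooks is a hypothetical local interface that only mirrors the two hook names mentioned above, and simulateSnapshotCycle stands in for what the runtime would do. In a real function you would implement the interfaces that the CRaC project provides and register the resource with the runtime, so check the CRaC documentation for the actual types and signatures.

```java
import java.util.UUID;

public class UniqueIdHooksSketch {

    // Hypothetical stand-in for the CRaC hook interface; NOT the real API.
    public interface SnapshotHooks {
        void beforeCheckpoint();
        void afterRestore();
    }

    public static class SandboxIdHolder implements SnapshotHooks {
        private UUID sandboxId = UUID.randomUUID(); // created during initialization

        @Override
        public void beforeCheckpoint() {
            sandboxId = null; // drop unique content so it never enters the snapshot
        }

        @Override
        public void afterRestore() {
            sandboxId = UUID.randomUUID(); // recreate it in each resumed environment
        }

        public UUID sandboxId() {
            return sandboxId;
        }
    }

    // Simulates the order in which a runtime would invoke the hooks.
    public static UUID simulateSnapshotCycle(SandboxIdHolder holder) {
        holder.beforeCheckpoint();
        // ... the snapshot would be taken here and restored later ...
        holder.afterRestore();
        return holder.sandboxId();
    }

    public static void main(String[] args) {
        SandboxIdHolder holder = new SandboxIdHolder();
        UUID before = holder.sandboxId();
        UUID after = simulateSnapshotCycle(holder);
        System.out.println("ID changed after restore: " + !before.equals(after));
    }
}
```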

Conclusion

This blog describes how SnapStart optimizes startup performance under the hood, and outlines considerations around uniqueness. We also introduce the new interfaces that AWS Lambda provides (via scanning tool and runtime hooks) to customers to maintain uniqueness for their SnapStart functions.

SnapStart is made possible by several pieces of open-source work, including Firecracker, Linux, CRaC, OpenSSL and more. AWS is grateful to the maintainers and developers who have made this possible. With this work, we’re excited to launch Lambda SnapStart for Java as what we hope is the first amongst many other capabilities to benefit from the performance savings and enhanced security that SnapStart microVMs provide.

For more serverless learning resources, visit Serverless Land.

Introducing payload-based message filtering for Amazon SNS

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/introducing-payload-based-message-filtering-for-amazon-sns/

This post is written by Prachi Sharma (Software Development Manager, Amazon SNS), Mithun Mallick (Principal Solutions Architect, AWS Integration Services), and Otavio Ferreira (Sr. Software Development Manager, Amazon SNS).

Amazon Simple Notification Service (SNS) is a messaging service for Application-to-Application (A2A) and Application-to-Person (A2P) communication. The A2A functionality provides high-throughput, push-based, many-to-many messaging between distributed systems, microservices, and event-driven serverless applications. These applications include Amazon Simple Queue Service (SQS), Amazon Kinesis Data Firehose, AWS Lambda, and HTTP/S endpoints. The A2P functionality enables you to communicate with your customers via mobile text messages (SMS), mobile push notifications, and email notifications.

Today, we’re introducing the payload-based message filtering option of SNS, which augments the existing attribute-based option, enabling you to offload additional filtering logic to SNS and further reduce your application integration costs. For more information, see Amazon SNS Message Filtering.

Overview

You use SNS topics to fan out messages from publisher systems to subscriber systems, addressing your application integration needs in a loosely-coupled way. Without message filtering, subscribers receive every message published to the topic, and require custom logic to determine whether an incoming message needs to be processed or filtered out. This results in undifferentiating code, as well as unnecessary infrastructure costs. With message filtering, subscribers set a filter policy to their SNS subscription, describing the characteristics of the messages in which they are interested. Thus, when a message is published to the topic, SNS can verify the incoming message against the subscription filter policy, and only deliver the message to the subscriber upon a match. For more information, see Amazon SNS Subscription Filter Policies.

However, up until now, the message characteristics that subscribers could express in subscription filter policies were limited to metadata in message attributes. As a result, subscribers could not benefit from message filtering when the messages were published without attributes. Examples of such messages include AWS events published to SNS from 60+ other AWS services, like Amazon Simple Storage Service (S3), Amazon CloudWatch, and Amazon CloudFront. For more information, see Amazon SNS Event Sources.

The new payload-based message filtering option in SNS empowers subscribers to express their SNS subscription filter policies in terms of the contents of the message. This new capability further enables you to use SNS message filtering for your event-driven architectures (EDA) and cross-account workloads, specifically where subscribers may not be able to influence a given publisher to have its events sent with attributes. With payload-based message filtering, you have a simple, no-code option to further prevent unwanted data from being delivered to and processed by subscriber systems, thereby simplifying the subscribers’ code as well as reducing costs associated with downstream compute infrastructure. This new message filtering option is available across SNS Standard and SNS FIFO topics, for JSON message payloads.

Applying payload-based filtering in a use case

Consider an insurance company moving their lead generation platform to a serverless architecture based on microservices, adopting enterprise integration patterns to help them develop and scale these microservices independently. The company offers a variety of insurance types to its customers, including auto and home insurance. The lead generation and processing workflow for each insurance type is different, and entails notifying different backend microservices, each designed to handle a specific type of insurance request.

Payload filtering example

The company uses multiple frontend apps to interact with customers and receive leads from them, including a web app, a mobile app, and a call center app. These apps submit the customer-generated leads to an internal lead storage microservice, which then uploads the leads as XML documents to an S3 bucket. Next, the S3 bucket publishes events to an SNS topic to notify that lead documents have been created. Based on the contents of each lead document, the SNS topic forks the workflow by delivering the auto insurance leads to an SQS queue and the home insurance leads to another SQS queue. These SQS queues are respectively polled by the auto insurance and the home insurance lead processing microservices. Each processing microservice applies its business logic to validate the incoming leads.

The following S3 event, in JSON format, refers to a lead document uploaded with key auto-insurance-2314.xml to the S3 bucket. S3 automatically publishes this event to SNS, which in turn matches the S3 event payload against the filter policy of each subscription in the SNS topic. If the event matches the subscription filter policy, SNS delivers the event to the subscribed SQS queue. Otherwise, SNS filters the event out.

{
  "Records": [{
    "eventVersion": "2.1",
    "eventSource": "aws:s3",
    "awsRegion": "sa-east-1",
    "eventTime": "2022-11-21T03:41:29.743Z",
    "eventName": "ObjectCreated:Put",
    "userIdentity": {
      "principalId": "AWS:AROAJ7PQSU42LKEHOQNIC:demo-user"
    },
    "requestParameters": {
      "sourceIPAddress": "177.72.241.11"
    },
    "responseElements": {
      "x-amz-request-id": "SQCC55WT60XABW8CF",
      "x-amz-id-2": "FRaO+XDBrXtx0VGU1eb5QaIXH26tlpynsgaoJrtGYAWYRhfVMtq/...dKZ4"
    },
    "s3": {
      "s3SchemaVersion": "1.0",
      "configurationId": "insurance-lead-created",
      "bucket": {
        "name": "insurance-bucket-demo",
        "ownerIdentity": {
          "principalId": "A1ATLOAF34GO2I"
        },
        "arn": "arn:aws:s3:::insurance-bucket-demo"
      },
      "object": {
        "key": "auto-insurance-2314.xml",
        "size": 17,
        "eTag": "1530accf30cab891d759fa3bb8322211",
        "sequencer": "00737AF379B2683D6C"
      }
    }
  }]
}

To express its interest in auto insurance leads only, the SNS subscription for the auto insurance lead processing microservice sets the following filter policy. Note that, unlike attribute-based policies, payload-based policies support property nesting.

{
  "Records": {
    "s3": {
      "object": {
        "key": [{
          "prefix": "auto-"
        }]
      }
    },
    "eventName": [{
      "prefix": "ObjectCreated:"
    }]
  }
}

Likewise, to express its interest in home insurance leads only, the SNS subscription for the home insurance lead processing microservice sets the following filter policy.

{
  "Records": {
    "s3": {
      "object": {
        "key": [{
          "prefix": "home-"
        }]
      }
    },
    "eventName": [{
      "prefix": "ObjectCreated:"
    }]
  }
}

Note that each filter policy uses the string prefix matching capability of SNS message filtering. In this use case, this matching capability enables the filter policy to match only the S3 objects whose key property value starts with the insurance type it’s interested in (either auto- or home-). Note as well that each filter policy matches only the S3 events whose eventName property value starts with ObjectCreated, as opposed to ObjectRemoved. For more information, see Amazon S3 Event Notifications.
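
To make the matching behavior concrete, here is a simplified sketch of how the two policies route an event. It is not how SNS is implemented: the class PrefixFilterSketch and its methods are hypothetical, the event is reduced to just its object key and event name, and the queue names are those used in the SAM template for this use case. It shows that both prefix conditions in a policy must hold for a match, and that an event matching no policy is filtered out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PrefixFilterSketch {

    // Queue name -> required object key prefix, mirroring the two filter policies.
    private static final Map<String, String> KEY_PREFIX_BY_QUEUE = new LinkedHashMap<>();
    static {
        KEY_PREFIX_BY_QUEUE.put("auto-insurance-events-queue", "auto-");
        KEY_PREFIX_BY_QUEUE.put("home-insurance-events-queue", "home-");
    }

    private static final String EVENT_NAME_PREFIX = "ObjectCreated:";

    // Both conditions must hold for a policy to match (logical AND across properties).
    public static boolean matches(String objectKey, String eventName, String keyPrefix) {
        return objectKey.startsWith(keyPrefix) && eventName.startsWith(EVENT_NAME_PREFIX);
    }

    // Returns the queues whose policy matches; an empty list means "filtered out".
    public static List<String> route(String objectKey, String eventName) {
        List<String> targets = new ArrayList<>();
        for (Map.Entry<String, String> entry : KEY_PREFIX_BY_QUEUE.entrySet()) {
            if (matches(objectKey, eventName, entry.getValue())) {
                targets.add(entry.getKey());
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        System.out.println(route("auto-insurance-2314.xml", "ObjectCreated:Put"));
        System.out.println(route("home-insurance-0001.xml", "ObjectCreated:Put"));
        System.out.println(route("auto-insurance-2314.xml", "ObjectRemoved:Delete"));
    }
}
```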

Deploying the resources and filter policies

To deploy the AWS resources for this use case, you need an AWS account with permissions to use SNS, SQS, and S3. On your development machine, install the AWS Serverless Application Model (SAM) Command Line Interface (CLI). You can find the complete SAM template for this use case in the aws-sns-samples repository in GitHub.

The SAM template has a set of resource definitions, as presented below. The first resource definition creates the SNS topic that receives events from S3.

InsuranceEventsTopic:
    Type: AWS::SNS::Topic
    Properties:
      TopicName: insurance-events-topic

The next resource definition creates the S3 bucket where the insurance lead documents are stored. This S3 bucket publishes an event to the SNS topic whenever a new lead document is created.

InsuranceEventsBucket:
    Type: AWS::S3::Bucket
    DeletionPolicy: Retain
    DependsOn: InsuranceEventsTopicPolicy
    Properties:
      BucketName: insurance-doc-events
      NotificationConfiguration:
        TopicConfigurations:
          - Topic: !Ref InsuranceEventsTopic
            Event: 's3:ObjectCreated:*'

The next resource definitions create the SQS queues to be subscribed to the SNS topic. As presented in the architecture diagram, there’s one queue for auto insurance leads, and another queue for home insurance leads.

AutoInsuranceEventsQueue:
    Type: AWS::SQS::Queue
    Properties:
      QueueName: auto-insurance-events-queue
      
HomeInsuranceEventsQueue:
    Type: AWS::SQS::Queue
    Properties:
      QueueName: home-insurance-events-queue

The next resource definitions create the SNS subscriptions and their respective filter policies. Note that, in addition to setting the FilterPolicy property, you need to set the FilterPolicyScope property to MessageBody in order to enable the new payload-based message filtering option for each subscription. The default value for the FilterPolicyScope property is MessageAttributes.

AutoInsuranceEventsSubscription:
    Type: AWS::SNS::Subscription
    Properties:
      Protocol: sqs
      Endpoint: !GetAtt AutoInsuranceEventsQueue.Arn
      TopicArn: !Ref InsuranceEventsTopic
      FilterPolicyScope: MessageBody
      FilterPolicy:
        '{"Records":{"s3":{"object":{"key":[{"prefix":"auto-"}]}}
        ,"eventName":[{"prefix":"ObjectCreated:"}]}}'
  
HomeInsuranceEventsSubscription:
    Type: AWS::SNS::Subscription
    Properties:
      Protocol: sqs
      Endpoint: !GetAtt HomeInsuranceEventsQueue.Arn
      TopicArn: !Ref InsuranceEventsTopic
      FilterPolicyScope: MessageBody
      FilterPolicy:
        '{"Records":{"s3":{"object":{"key":[{"prefix":"home-"}]}}
        ,"eventName":[{"prefix":"ObjectCreated:"}]}}'

Once you download the full SAM template from GitHub to your local development machine, run the following command in your terminal to build the deployment artifacts.

sam build -t SNS-Payload-Based-Filtering-SAM.template

Once SAM has finished building the deployment artifacts, run the following command to deploy the AWS resources and the SNS filter policies. The command guides you through the process of setting deployment preferences, which you can answer based on your requirements. For more information, refer to the SAM Developer Guide.

sam deploy --guided

Once SAM has finished deploying the resources, you can start testing the solution in the AWS Management Console.

Testing the filter policies

Go to the AWS CloudFormation console, choose the stack created by the SAM template, then choose the Outputs tab. Note the name of the S3 bucket created.

S3 bucket name

Now switch to the S3 console, and choose the bucket with the corresponding name. Once on the bucket details page, upload a test file whose name starts with the auto- prefix. For example, you can name your test file auto-insurance-7156.xml. The upload triggers an S3 event, typed as ObjectCreated, which is then routed through the SNS topic to the SQS queue that stores auto insurance leads.

Insurance bucket contents

Now switch to the SQS console, and choose to receive messages for the SQS queue storing auto insurance leads. Note that the SQS queue for home insurance leads is empty.

SQS home insurance queue empty

If you want to check the filter policy configured, you may switch to the SNS console, choose the SNS topic created by the SAM template, and choose the SNS subscription for auto insurance leads. Once on the subscription details page, you can view the filter policy, in JSON format, alongside the filter policy scope set to “Message body”.

SNS filter policy

You may repeat the testing steps above, now with another file whose name starts with the home- prefix, and see how the S3 event is routed through the SNS topic to the SQS queue that stores home insurance leads.

Monitoring the filtering activity

CloudWatch provides visibility into your SNS message filtering activity, with dedicated metrics, which also enables you to create alarms. You can use the NumberOfNotificationsFilteredOut-MessageBody metric to monitor the number of messages filtered out due to payload-based filtering, as opposed to attribute-based filtering. For more information, see Monitoring Amazon SNS topics using CloudWatch.

Moreover, you can use the NumberOfNotificationsFilteredOut-InvalidMessageBody metric to monitor the number of messages filtered out due to having malformed JSON payloads. You can have these messages with malformed JSON payloads moved to a dead-letter queue (DLQ) for troubleshooting purposes. For more information, see Designing Durable Serverless Applications with DLQ for Amazon SNS.

Cleaning up

To delete all the AWS resources that you created as part of this use case, run the following command from the project root directory.

sam delete

Conclusion

In this blog post, we introduce the use of payload-based message filtering for SNS, which provides event routing for JSON-formatted messages. This enables you to write filter policies based on the contents of the messages published to SNS. This also removes the message parsing overhead from your subscriber systems, as well as any custom logic from your publisher systems to move message properties from the payload to the set of attributes. Lastly, payload-based filtering can facilitate your event-driven architectures (EDA) by enabling you to filter events published to SNS from 60+ other AWS event sources.

For more information, see Amazon SNS Message Filtering, Amazon SNS Event Sources, and Amazon SNS Pricing. For more serverless learning resources, visit Serverless Land.

Introducing container, database, and queue utilization metrics for the Amazon MWAA environment

Post Syndicated from David Boyne original https://aws.amazon.com/blogs/compute/introducing-container-database-and-queue-utilization-metrics-for-the-amazon-mwaa-environment/

This post is written by Uma Ramadoss (Senior Specialist Solutions Architect), and Jeetendra Vaidya (Senior Solutions Architect).

Today, AWS is announcing the availability of container, database, and queue utilization metrics for Amazon Managed Workflows for Apache Airflow (Amazon MWAA). This is a new collection of metrics published by Amazon MWAA in addition to existing Apache Airflow metrics in Amazon CloudWatch. With these new metrics, you can better understand the performance of your Amazon MWAA environment, troubleshoot issues related to capacity and delays, and get insights on right-sizing your Amazon MWAA environment.

Previously, customers were limited to Apache Airflow metrics such as DAG processing parse times, pool running slots, and scheduler heartbeat to measure the performance of the Amazon MWAA environment. While these metrics are often effective in diagnosing Airflow behavior, they lack the ability to provide complete visibility into the utilization of the various Apache Airflow components in the Amazon MWAA environment. This could limit the ability for some customers to monitor the performance and health of the environment effectively.

Overview

Amazon MWAA is a managed service for Apache Airflow. There are a variety of deployment techniques with Apache Airflow. The Amazon MWAA deployment architecture of Apache Airflow is carefully chosen to allow customers to run workflows in production at scale.

Amazon MWAA has a distributed architecture with multiple schedulers, auto-scaled workers, and a load-balanced web server. Each is deployed in its own Amazon Elastic Container Service (ECS) cluster using the AWS Fargate compute engine. An Amazon Simple Queue Service (SQS) queue is used to decouple Airflow workers and schedulers as part of the Celery Executor architecture. Amazon Aurora PostgreSQL-Compatible Edition is used as the Apache Airflow metadata database. From today, you can get complete visibility into the scheduler, worker, web server, database, and queue metrics.

In this post, you can learn about the new metrics published for the Amazon MWAA environment, build a sample application with a pre-built workflow, and explore the metrics using a CloudWatch dashboard.

Container, database, and queue utilization metrics

  1. In the CloudWatch console, in Metrics, select All metrics.
  2. From the metrics console that appears on the right, expand AWS namespaces and select the MWAA tile.

    MWAA metrics

  3. You can see a tile of dimensions, each corresponding to the container (cluster), database, and queue metrics.

    MWAA metrics drilldown

Cluster metrics

The base MWAA environment comes up with three Amazon ECS clusters – scheduler, one worker (BaseWorker), and a web server. Workers can be configured with minimum and maximum numbers. When you configure more than one minimum worker, Amazon MWAA creates another ECS cluster (AdditionalWorker) to host the workers from 2 up to n where n is the max workers configured in your environment.

When you select Cluster from the console, you can see the list of metrics for all the clusters. To learn more about the metrics, visit the Amazon ECS product documentation.

MWAA metrics list

CPU usage is the most important factor for schedulers due to DAG file processing. When you have many DAGs, CPU usage can be higher. You can improve the performance by setting min_file_process_interval higher. Similarly, you can apply other techniques described in the Apache Airflow Scheduler page to fine tune the performance.

Higher CPU or memory utilization in the worker can be due to moving large files or doing computation on the worker itself. This can be resolved by offloading the compute to purpose-built services such as Amazon ECS, Amazon EMR, and AWS Glue.

Database metrics

Amazon Aurora DB clusters used by Amazon MWAA come up with a primary DB instance and a read replica to support the read operations. Amazon MWAA publishes database metrics for both READER and WRITER instances. When you select the Database tile, you can view the list of metrics available for the database cluster.

Database metrics

Amazon MWAA uses a connection pooling technique, so the database connections from the scheduler, workers, and web servers are taken from the connection pool. If you have many DAGs scheduled to start at the same time, it can overload the scheduler and increase the number of database connections at a high frequency. This can be minimized by staggering the DAG schedule.

SQS metrics

An SQS queue helps decouple the scheduler and workers so they can scale independently. When workers read the messages, they are considered in-flight and not available for other workers. Messages become available for other workers to read if they are not deleted before the 12-hour visibility timeout expires. Amazon MWAA publishes the in-flight message count (RunningTasks), the count of messages available for reading (QueuedTasks), and the approximate age of the earliest non-deleted message (ApproximateAgeOfOldestTask).

Database metrics

Getting started with container, database and queue utilization metrics for Amazon MWAA

The following sample project explores some key metrics using an Amazon CloudWatch dashboard to help you find the number of workers running in your environment at any given moment.

The sample project deploys the following resources:

  • Amazon Virtual Private Cloud (Amazon VPC).
  • Amazon MWAA environment of size small with 2 minimum workers and 10 maximum workers.
  • A sample DAG that fetches NOAA Global Historical Climatology Network Daily (GHCN-D) data, uses AWS Glue Crawler to create tables and AWS Glue Job to produce an output dataset in Apache Parquet format that contains the details of precipitation readings for the US between year 2010 and 2022.
  • Amazon MWAA execution role.
  • Two Amazon S3 buckets – one for Amazon MWAA DAGs, one for AWS Glue job scripts and weather data.
  • AWSGlueServiceRole to be used by AWS Glue Crawler and AWS Glue job.

Prerequisites

There are a few tools required to deploy the sample application. Ensure that you have each of the following in your working environment:

Setting up the Amazon MWAA environment and associated resources

  1. From your local machine, clone the project from the GitHub repository.

    git clone https://github.com/aws-samples/amazon-mwaa-examples

  2. Navigate to the mwaa_utilization_cw_metric directory.

    cd usecases/mwaa_utilization_cw_metric

  3. Run the makefile.

    make deploy

  4. The makefile runs the Terraform template from the infra/terraform directory. While the template is being applied, you are prompted to confirm whether you want to perform these actions.

    MWAA utilization terminal

This provisions the resources and copies the necessary files and variables for the DAG to run. This process can take approximately 30 minutes to complete.

Generating metric data and exploring the metrics

  1. Log in to your AWS account through the AWS Management Console.
  2. In the Amazon MWAA environment console, you can see your environment with the Airflow UI link on the right of the console.

    MWAA environment console

  3. Select the link Open Airflow UI. This loads the Apache Airflow UI.

    Apache Airflow UI

  4. From the Apache Airflow UI, enable the DAG using the Pause/Unpause DAG toggle button and run the DAG using the Trigger DAG link.
  5. You can see the Tree view of the DAG run with the tasks running.
  6. Navigate to the Amazon CloudWatch dashboard in another browser tab. You can see a dashboard named MWAA_Metric_Environment_env_health_metric_dashboard.
  7. Access the dashboard to view different key metrics across cluster, database, and queue.

    MWAA dashboard

  8. After the DAG run is complete, you can look into the dashboard for worker count metrics. Worker count started with 2 and increased to 4.

When you trigger the DAG, it runs 13 tasks in parallel to fetch weather data for 2010 through 2022. With two small-size workers, the environment can run 10 parallel tasks. The remaining tasks wait either for running tasks to complete or for automatic scaling to start. Because the tasks take more than a few minutes to finish, MWAA automatic scaling adds workers to handle the workload. The worker count graph now plots higher, with the AdditionalWorker count increased from 1 to 3.

Cleanup

To delete the sample application infrastructure, use the following command from the usecases/mwaa_utilization_cw_metric directory.

make undeploy

Conclusion

This post introduces the new Amazon MWAA container, database, and queue utilization metrics. The example shows the key metrics and how you can use them to answer a common question: how many workers an Amazon MWAA environment is running. These metrics are available today for all versions supported by Amazon MWAA at no additional cost.

Start using this feature in your account to monitor the health and performance of your Amazon MWAA environment, troubleshoot issues related to capacity and delays, and get insights into right-sizing the environment.

Build your own CloudWatch dashboard using the metrics data JSON and Airflow metrics. To deploy more solutions in Amazon MWAA, explore the Amazon MWAA samples GitHub repo.

For more serverless learning resources, visit Serverless Land.

Apache, Apache Airflow, and Airflow are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.

Using the AWS Parameter and Secrets Lambda extension to cache parameters and secrets

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/using-the-aws-parameter-and-secrets-lambda-extension-to-cache-parameters-and-secrets/

This post is written by Pal Patel, Solutions Architect, and Saud ul Khalid, Sr. Cloud Support Engineer.

Serverless applications often rely on AWS Systems Manager Parameter Store or AWS Secrets Manager to store configuration data, encrypted passwords, or connection details for a database or API service.

Previously, you had to make runtime API calls to Parameter Store or Secrets Manager every time you wanted to retrieve a parameter or a secret inside the execution environment of an AWS Lambda function. This involved configuring and initializing the AWS SDK client, and managing when to store values in memory to optimize the function duration and avoid unnecessary latency and cost.

The new AWS Parameters and Secrets Lambda extension provides a managed parameters and secrets cache for Lambda functions. The extension is distributed as a Lambda layer that provides an in-memory cache for parameters and secrets. It allows functions to persist values through the Lambda execution lifecycle, and provides a configurable time-to-live (TTL) setting.

When you request a parameter or secret in your Lambda function code, the extension retrieves the data from the local in-memory cache, if it is available. If the data is not in the cache or it is stale, the extension fetches the requested parameter or secret from the respective service. This helps to reduce external API calls, which can improve application performance and reduce cost. This blog post shows how to use the extension.

Overview

The following diagram provides a high-level view of the components involved.

High-level architecture showing how parameters or secrets are retrieved when using the Lambda extension

The extension can be added to new or existing Lambda functions. It works by exposing a local HTTP endpoint to the Lambda environment, which provides the in-memory cache for parameters and secrets. When retrieving a parameter or secret, the extension first queries the cache for a relevant entry. If an entry exists, the query checks how much time has elapsed since the entry was first put into the cache, and returns the entry if the elapsed time is less than the configured cache TTL. If the entry is stale, it is invalidated, and fresh data is retrieved from either Parameter Store or Secrets Manager.
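The cache lookup described above can be sketched in a few lines of Python. This is only an illustration of the TTL logic, not the extension's actual implementation; the TTLCache class, its clock argument, and the fetch callback are hypothetical names introduced here.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Illustrative in-memory cache with a time-to-live (hypothetical, not the extension's code)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        # key -> (value, time the entry was first put into the cache)
        self._entries: Dict[str, Tuple[Any, float]] = {}

    def get(self, key: str, fetch: Callable[[str], Any]) -> Any:
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None:
            value, stored_at = entry
            if now - stored_at < self.ttl_seconds:
                return value        # fresh entry: serve from the cache
            del self._entries[key]  # stale entry: invalidate it
        value = fetch(key)          # miss or stale: call the backing service
        self._entries[key] = (value, now)
        return value
```

With a TTL of 0 the freshness check never succeeds, so every lookup in this sketch goes to the backing service.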

The extension uses the same Lambda IAM execution role permissions to access Parameter Store and Secrets Manager, so you must ensure that the IAM policy is configured with the appropriate access. Permissions may also be required for AWS Key Management Service (AWS KMS) if you are using this service. You can find an example policy in the example’s AWS SAM template.

Example walkthrough

Consider a basic serverless application with a Lambda function connecting to an Amazon Relational Database Service (Amazon RDS) database. The application loads a configuration stored in Parameter Store and connects to the database. The database connection string (including user name and password) is stored in Secrets Manager.

This example walkthrough is composed of:

  • A Lambda function.
  • An Amazon Virtual Private Cloud (VPC).
  • A Multi-AZ Amazon RDS instance running MySQL.
  • An AWS Secrets Manager database secret that holds the database connection details.
  • AWS Systems Manager Parameter Store parameter that holds the application configuration.
  • An AWS Identity and Access Management (IAM) role that the Lambda function uses.

Lambda function

This Python code shows how to retrieve the secrets and parameters using the extension:

import pymysql
import urllib3
import os
import json

### Load in Lambda environment variables
port = os.environ['PARAMETERS_SECRETS_EXTENSION_HTTP_PORT']
aws_session_token = os.environ['AWS_SESSION_TOKEN']
env = os.environ['ENV']
app_config_path = os.environ['APP_CONFIG_PATH']
creds_path = os.environ['CREDS_PATH']
full_config_path = '/' + env + '/' + app_config_path

### Reuse a single connection pool across invocations
http = urllib3.PoolManager()

### Define function to retrieve values from the extension's local HTTP server cache
def retrieve_extension_value(url):
    url = 'http://localhost:' + port + url
    headers = {"X-Aws-Parameters-Secrets-Token": aws_session_token}
    response = http.request("GET", url, headers=headers)
    return json.loads(response.data)

def lambda_handler(event, context):
       
    ### Load Parameter Store values from extension
    print("Loading AWS Systems Manager Parameter Store values from " + full_config_path)
    parameter_url = ('/systemsmanager/parameters/get/?name=' + full_config_path)
    config_values = retrieve_extension_value(parameter_url)['Parameter']['Value']
    print("Found config values: " + json.dumps(config_values))

    ### Load Secrets Manager values from extension
    print("Loading AWS Secrets Manager values from " + creds_path)
    secrets_url = ('/secretsmanager/get?secretId=' + creds_path)
    secret_string = json.loads(retrieve_extension_value(secrets_url)['SecretString'])
    #print("Found secret values: " + json.dumps(secret_string))

    rds_host =  secret_string['host']
    rds_db_name = secret_string['dbname']
    rds_username = secret_string['username']
    rds_password = secret_string['password']
    
    
    ### Connect to RDS MySQL database
    try:
        conn = pymysql.connect(host=rds_host, user=rds_username, password=rds_password, database=rds_db_name, connect_timeout=5)
    except pymysql.MySQLError as e:
        # Keep the original error as the cause so it appears in the logs
        raise Exception("An error occurred when connecting to the database!") from e

    return "DemoApp successfully loaded config " + config_values + " and connected to RDS database " + rds_db_name + "!"

In the global scope the environment variable PARAMETERS_SECRETS_EXTENSION_HTTP_PORT is retrieved, which defines the port the extension HTTP server is running on. This defaults to 2773.

The retrieve_extension_value function calls the extension’s local HTTP server, passing in the X-Aws-Parameters-Secrets-Token as a header. This is a required header that uses the AWS_SESSION_TOKEN value, which is present in the Lambda execution environment by default.

The Lambda handler code uses the extension cache on every Lambda invoke to obtain configuration data from Parameter Store and secret data from Secrets Manager. This data is used to make a connection to the RDS MySQL database.
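Parameter names and secret IDs often contain characters such as / or :, so it can help to URL-encode them before placing them in the query string. The following is a minimal sketch under that assumption; the helper names parameter_url and secret_url are hypothetical, and the endpoint paths are the ones used in the function above.

```python
from urllib.parse import quote

# Default port of the extension's local HTTP server, as noted above.
DEFAULT_PORT = 2773


def parameter_url(name: str, port: int = DEFAULT_PORT) -> str:
    """Build the local extension URL for a Parameter Store parameter (hypothetical helper)."""
    # safe='' also encodes '/', so hierarchical names such as /dev/appconfig are escaped
    return f"http://localhost:{port}/systemsmanager/parameters/get/?name={quote(name, safe='')}"


def secret_url(secret_id: str, port: int = DEFAULT_PORT) -> str:
    """Build the local extension URL for a Secrets Manager secret (hypothetical helper)."""
    return f"http://localhost:{port}/secretsmanager/get?secretId={quote(secret_id, safe='')}"


print(parameter_url("/dev/appconfig"))
print(secret_url("dev/dbcreds"))
```

The resulting URLs can be passed to an HTTP client together with the X-Aws-Parameters-Secrets-Token header, in the same way as in retrieve_extension_value.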

Prerequisites

  1. Git installed
  2. AWS SAM CLI version 1.58.0 or greater.

Deploying the resources

  1. Clone the repository and navigate to the solution directory:
    git clone https://github.com/aws-samples/parameters-secrets-lambda-extension-sample.git


  2. Build and deploy the application using the following commands:
    sam build
    sam deploy --guided

This template takes the following parameters:

  • pVpcCIDR — IP range (CIDR notation) for the VPC. The default is 172.31.0.0/16.
  • pPublicSubnetCIDR — IP range (CIDR notation) for the public subnet. The default is 172.31.3.0/24.
  • pPrivateSubnetACIDR — IP range (CIDR notation) for the private subnet A. The default is 172.31.2.0/24.
  • pPrivateSubnetBCIDR — IP range (CIDR notation) for the private subnet B. The default is 172.31.1.0/24.
  • pDatabaseName — Database name for the DEV environment. The default is devDB.
  • pDatabaseUsername — Database user name for the DEV environment. The default is myadmin.
  • pDBEngineVersion — The version number of the SQL database engine to use. The default is 5.7.

Adding the Parameter Store and Secrets Manager Lambda extension

To add the extension:

  1. Navigate to the Lambda console, and open the Lambda function you created.
  2. In the Function Overview pane, select Layers, and then select Add a layer.
  3. In the Choose a layer pane, keep the default selection of AWS layers and in the dropdown choose AWS Parameters and Secrets Lambda Extension.
  4. Select the latest version available and choose Add.

The extension supports several configurable options that can be set up as Lambda environment variables.

This example explicitly sets an extension port and TTL value:

Lambda environment variables from the Lambda console

Testing the example application

To test:

  1. Navigate to the function created in the Lambda console and select the Test tab.
  2. Give the test event a name, keep the default values and then choose Create.
  3. Choose Test. The function runs successfully:

Lambda execution results visible from Lambda console after successful invocation.

To evaluate the performance benefits of the Lambda extension cache, three tests were run using the open source tool Artillery to load test the Lambda function. Artillery can invoke the function through its Lambda function URL. The following Artillery configuration snippet shows the duration and requests per second for the test:

config:
  target: "https://lambda.us-east-1.amazonaws.com"
  phases:
    -
      duration: 60
      arrivalRate: 10
      rampTo: 40

scenarios:
  -
    flow:
      -
        post:
          url: "https://abcdefghijjklmnopqrst.lambda-url.us-east-1.on.aws/"

  • Test 1: The extension cache is disabled by setting the TTL environment variable to 0. This results in 1650 GetParameter API calls to Parameter Store over 60 seconds.
  • Test 2: The extension cache is enabled with a TTL of 1 second. This results in 106 GetParameter API calls over 60 seconds.
  • Test 3: The extension cache is enabled with a TTL value of 300 seconds. This results in only 18 GetParameter API calls over 60 seconds.

In test 3, the TTL value is longer than the test duration. The 18 GetParameter calls correspond to the number of Lambda execution environments created by Lambda to run requests in parallel. Each execution environment has its own in-memory cache and so each one needs to make the GetParameter API call.

In this test, using the extension reduced GetParameter API calls by ~99% (from 1650 to 18). Fewer API calls result in reduced function execution time, and therefore reduced cost.
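The reduction can be worked out directly from the call counts reported for the three tests. The short sketch below only does that arithmetic; the function name reduction_percent is introduced here for illustration.

```python
def reduction_percent(baseline_calls: int, cached_calls: int) -> float:
    """Percentage of API calls avoided relative to the run with the cache disabled."""
    return 100.0 * (baseline_calls - cached_calls) / baseline_calls


# GetParameter call counts reported above for the 60-second tests
baseline = 1650  # Test 1: cache disabled (TTL 0)
for label, calls in [("TTL 1 second", 106), ("TTL 300 seconds", 18)]:
    pct = reduction_percent(baseline, calls)
    print(f"{label}: {calls} calls, {pct:.1f}% fewer than with the cache disabled")
```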

Cleanup

After you test this example, delete the resources created by the template to avoid continuing charges to your account. Use the following command from the same project directory:

sam delete

Conclusion

Caching data retrieved from external services is an effective way to improve the performance of your Lambda function and reduce costs. Implementing a caching layer has been made simpler with this AWS-managed Lambda extension.

For more information on the Parameter Store, Secrets Manager, and Lambda extensions, refer to:

For more serverless learning resources, visit Serverless Land.

Introducing cross-account access capabilities for AWS Step Functions

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/introducing-cross-account-access-capabilities-for-aws-step-functions/

This post is written by Siarhei Kazhura, Senior Solutions Architect, Serverless.

AWS Step Functions allows you to integrate with more than 220 AWS services by using optimized integrations (for services such as AWS Lambda), and AWS SDK integrations. These capabilities provide the ability to build robust solutions using AWS Step Functions as the engine behind the solution.

Many customers are using multiple AWS accounts for application development. Until today, customers had to rely on resource-based policies to make cross-account access for Step Functions possible. With resource-based policies, you can specify who has access to the resource and what actions they can perform on it.

Not all AWS services support resource-based policies. For example, it is possible to enable cross-account access via resource-based policies with services like AWS Lambda, Amazon SQS, or Amazon SNS. However, services such as Amazon DynamoDB do not support resource-based policies, so your workflows can only use Step Functions’ direct integration if the resource belongs to the same account.

Now, you can take advantage of identity-based policies in Step Functions so that your workflow can directly invoke resources in other AWS accounts, allowing cross-account service API integrations.

Overview

This example demonstrates how to use cross-account capability using two AWS accounts:

  • A trusted AWS account (account ID 111111111111) with a Step Functions workflow named SecretCacheConsumerWfw, and an IAM role named TrustedAccountRl.
  • A trusting AWS account (account ID 222222222222) with a Step Functions workflow named SecretCacheWfw, and two IAM roles named TrustingAccountRl and SecretCacheWfwRl.

AWS Step Functions cross-account workflow example

At a high level:

  1. The SecretCacheConsumerWfw workflow runs under the TrustedAccountRl role in account 111111111111. The TrustedAccountRl role has permissions to assume the TrustingAccountRl role from account 222222222222.
  2. The FetchConfiguration Step Functions task fetches the TrustingAccountRl role ARN, the SecretCacheWfw workflow ARN, and the secret ARN (all these resources belong to the trusting AWS account).
  3. The GetSecretCrossAccount Step Functions task has a Credentials field with the TrustingAccountRl role ARN specified (fetched in step 2).
  4. The GetSecretCrossAccount task assumes the TrustingAccountRl role during the SecretCacheConsumerWfw workflow execution.
  5. The SecretCacheWfw workflow (that belongs to the account 222222222222) is invoked by the SecretCacheConsumerWfw workflow under the TrustingAccountRl role.
  6. The results are returned to the SecretCacheConsumerWfw workflow that belongs to the account 111111111111.

The SecretCacheConsumerWfw workflow definition specifies the Credentials field and the RoleArn. This allows the GetSecretCrossAccount step to assume an IAM role that belongs to a separate AWS account:

{
  "StartAt": "FetchConfiguration",
  "States": {
    "FetchConfiguration": {
      "Type": "Task",
      "Next": "GetSecretCrossAccount",
      "Parameters": {
        "Name": "<ConfigurationParameterName>"
      },
      "Resource": "arn:aws:states:::aws-sdk:ssm:getParameter",
      "ResultPath": "$.Configuration",
      "ResultSelector": {
        "Params.$": "States.StringToJson($.Parameter.Value)"
      }
    },
    "GetSecretCrossAccount": {
      "End": true,
      "Type": "Task",
      "ResultSelector": {
        "Secret.$": "States.StringToJson($.Output)"
      },
      "Resource": "arn:aws:states:::aws-sdk:sfn:startSyncExecution",
      "Credentials": {
        "RoleArn.$": "$.Configuration.Params.trustingAccountRoleArn"
      },
      "Parameters": {
        "Input.$": "$.Configuration.Params.secret",
        "StateMachineArn.$": "$.Configuration.Params.trustingAccountWorkflowArn"
      }
    }
  }
}

Permissions

AWS Step Functions cross-account permissions setup example

At a high level:

  1. The TrustedAccountRl role belongs to the account 111111111111.
  2. The TrustingAccountRl role belongs to the account 222222222222.
  3. A trust relationship is set up between the TrustedAccountRl role and the TrustingAccountRl role.
  4. The SecretCacheConsumerWfw workflow is executed under the TrustedAccountRl role in the account 111111111111.
  5. The SecretCacheWfw is executed under the SecretCacheWfwRl role in the account 222222222222.

The TrustedAccountRl role (1) has the following trust policy, which allows the SecretCacheConsumerWfw workflow to assume (4) the role.

{
  "RoleName": "<TRUSTED_ACCOUNT_ROLE_NAME>",
  "AssumeRolePolicyDocument": {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": {
          "Service": "states.<REGION>.amazonaws.com"
        },
        "Action": "sts:AssumeRole"
      }
    ]
  }
}

The TrustedAccountRl role (1) has the following permissions configured that allow it to assume (3) the TrustingAccountRl role (2).

{
  "RoleName": "<TRUSTED_ACCOUNT_ROLE_NAME>",
  "PolicyDocument": {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Action": "sts:AssumeRole",
        "Resource":  "arn:aws:iam::<TRUSTING_ACCOUNT>:role/<TRUSTING_ACCOUNT_ROLE_NAME>",
        "Effect": "Allow"
      }
    ]
  }
}

The TrustedAccountRl role (1) has the following permissions configured that allow it to access Parameter Store, a capability of AWS Systems Manager, and fetch the required configuration.

{
  "RoleName": "<TRUSTED_ACCOUNT_ROLE_NAME>",
  "PolicyDocument": {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Action": [
          "ssm:DescribeParameters",
          "ssm:GetParameter",
          "ssm:GetParameterHistory",
          "ssm:GetParameters"
        ],
        "Resource": "arn:aws:ssm:<REGION>:<TRUSTED_ACCOUNT>:parameter/<CONFIGURATION_PARAM_NAME>",
        "Effect": "Allow"
      }
    ]
  }
}

The TrustingAccountRl role (2) has the following trust policy that allows it to be assumed (3) by the TrustedAccountRl role (1). Notice the Condition field setup. This field allows us to further control which account and state machine can assume the TrustingAccountRl role, preventing the confused deputy problem.

{
  "RoleName": "<TRUSTING_ACCOUNT_ROLE_NAME>",
  "AssumeRolePolicyDocument": {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": {
          "AWS": "arn:aws:iam::<TRUSTED_ACCOUNT>:role/<TRUSTED_ACCOUNT_ROLE_NAME>"
        },
        "Action": "sts:AssumeRole",
        "Condition": {
          "StringEquals": {
            "sts:ExternalId": "arn:aws:states:<REGION>:<TRUSTED_ACCOUNT>:stateMachine:<CACHE_CONSUMER_WORKFLOW_NAME>"
          }
        }
      }
    ]
  }
}
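When the same trust policy has to be produced for several accounts or workflows, it can be generated rather than edited by hand. The sketch below builds the policy document shown above as a Python dictionary; the function name and the sample role, Region, and workflow values are assumptions made for illustration only.

```python
import json


def build_trusting_role_trust_policy(trusted_account: str, trusted_role_name: str,
                                     region: str, consumer_workflow_name: str) -> dict:
    """Build the trust policy document for the trusting account role (hypothetical helper)."""
    # The ExternalId condition pins the caller to one specific state machine
    state_machine_arn = (
        f"arn:aws:states:{region}:{trusted_account}:stateMachine:{consumer_workflow_name}"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "AWS": f"arn:aws:iam::{trusted_account}:role/{trusted_role_name}"
                },
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": state_machine_arn}},
            }
        ],
    }


# Sample values only; the Region is an assumption for this example
policy = build_trusting_role_trust_policy(
    "111111111111", "TrustedAccountRl", "us-east-1", "SecretCacheConsumerWfw"
)
print(json.dumps(policy, indent=2))
```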

The TrustingAccountRl role (2) has the following permissions configured that allow it to start Step Functions Express Workflows execution synchronously. This capability is needed because the SecretCacheWfw workflow is invoked by the SecretCacheConsumerWfw workflow under the TrustingAccountRl role via a StartSyncExecution API call.

{
  "RoleName": "<TRUSTING_ACCOUNT_ROLE_NAME>",
  "PolicyDocument": {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Action": "states:StartSyncExecution",
        "Resource": "arn:aws:states:<REGION>:<TRUSTING_ACCOUNT>:stateMachine:<SECRET_CACHE_WORKFLOW_NAME>",
        "Effect": "Allow"
      }
    ]
  }
}

The SecretCacheWfw workflow runs under a separate identity, the SecretCacheWfwRl role. This role has permissions that allow it to get secrets from AWS Secrets Manager, read/write to a DynamoDB table, and invoke Lambda functions.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "secretsmanager:GetSecretValue"
            ],
            "Resource": "arn:aws:secretsmanager:<REGION>:<TRUSTING_ACCOUNT>:secret:*",
            "Effect": "Allow"
        },
        {
            "Action": "dynamodb:GetItem",
            "Resource": "arn:aws:dynamodb:<REGION>:<TRUSTING_ACCOUNT>:table/<SECRET_CACHE_DDB_TABLE_NAME>",
            "Effect": "Allow"
        },
        {
            "Action": "lambda:InvokeFunction",
            "Resource": [
                "arn:aws:lambda:<REGION>:<TRUSTING_ACCOUNT>:function:<CACHE_SECRET_FUNCTION_NAME>",
                "arn:aws:lambda:<REGION>:<TRUSTING_ACCOUNT>:function:<CACHE_SECRET_FUNCTION_NAME>:*"
            ],
            "Effect": "Allow"
        }
    ]
}

Comparing with resource-based policies

To implement the solution above using resource-based policies, you must front the SecretCacheWfw with a resource that supports resource-based policies. You can use Lambda for this purpose. A Lambda function has a resource permissions policy that allows access by the SecretCacheConsumerWfw workflow.

The function proxies the call to the SecretCacheWfw, waits for the workflow to finish (synchronous call), and yields the result back to the SecretCacheConsumerWfw. However, this approach has a few disadvantages:

  • Extra cost: With Lambda you are charged based on the number of requests for your function, and the duration it takes for your code to run.
  • Additional code to maintain: The code must take the payload from the SecretCacheConsumerWfw workflow and pass it to the SecretCacheWfw workflow.
  • No out-of-the-box error handling: The code must handle errors correctly, retry the request in case of a transient error, provide the ability to do exponential backoff, and provide a circuit breaker in case of persistent errors. Error handling capabilities are provided natively by Step Functions.
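To give a sense of the error-handling code such a proxy function would have to carry, here is a minimal retry helper with exponential backoff. It is a generic sketch, not code from the solution repository; TransientError and call_with_backoff are hypothetical names, and a real proxy would also need to decide which errors count as transient.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical error type for failures that are worth retrying."""


def call_with_backoff(operation, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call operation(), retrying transient failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            # The delay doubles on every retry: base, 2*base, 4*base, ...
            delay = base_delay * 2 ** (attempt - 1)
            # Small random jitter so that parallel callers do not retry in lockstep
            sleep(delay + random.uniform(0, delay / 10))
```

With Step Functions, the equivalent behavior is declared in the state definition instead of being written and maintained as function code.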

AWS Step Functions cross-account setup using resource-based policies

The identity-based policy permission solution provides multiple advantages over the resource-based policy permission solution in this case.

However, resource-based policy permissions provide some advantages and can be used in conjunction with identity-based policies. Identity-based policies and resource-based policies are both permissions policies and are evaluated together:

  • Single point of entry: Resource-based policies are attached to a resource. With resource-based permissions policies, you control what identities that do not belong to your AWS account have access to the resource at the resource level. This allows for easier reasoning about what identity has access to the resource. AWS Identity and Access Management Access Analyzer can help with the identity-based policies, providing an ability to identify resources that are shared with an external identity.
  • The principal that accesses a resource via a resource-based policy still works in the trusted account and does not have to give up its permissions to receive the cross-account role permissions. In this example, SecretCacheConsumerWfw still runs under the TrustedAccountRl role, and does not need to assume an IAM role in the trusting AWS account to access the Lambda function.

Refer to the article on how IAM roles differ from resource-based policies for more information.

Solution walkthrough

To follow the solution walkthrough, visit the solution repository. The walkthrough explains:

  1. Prerequisites required.
  2. Detailed solution deployment walkthrough.
  3. Solution testing.
  4. Cleanup process.
  5. Cost considerations.

Conclusion

This post demonstrates how to create a Step Functions Express Workflow in one account and call it from a Step Functions Standard Workflow in another account using a new credentials capability of AWS Step Functions. It provides an example of a cross-account IAM roles setup that allows for the access. It also provides a walk-through on how to use AWS CDK for TypeScript to deploy the example.

For more serverless learning resources, visit Serverless Land.

Node.js 18.x runtime now available in AWS Lambda

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/node-js-18-x-runtime-now-available-in-aws-lambda/

This post is written by Suraj Tripathi, Cloud Consultant, AppDev.

You can now develop AWS Lambda functions using the Node.js 18 runtime. This version is in active LTS status and considered ready for general use. When creating or updating functions, specify a runtime parameter value of nodejs18.x or use the appropriate container base image to use this new runtime.

This runtime version is supported by functions running on either Arm-based AWS Graviton2 processors or x86-based processors. Using the Graviton2 processor architecture option allows you to get up to 34% better price performance.

This blog post explains the major changes available with the Node.js 18 runtime in Lambda.

AWS SDK for JavaScript upgrade to v3

Lambda’s Node.js runtimes include the AWS SDK for JavaScript. This enables customers to use the AWS SDK to connect to other AWS services from their function code, without having to include the AWS SDK in their function deployment. This is especially useful when creating functions in the AWS Management Console. It’s also useful for Lambda functions deployed as inline code in CloudFormation templates.

Up until Node.js 16, Lambda’s Node.js runtimes have included the AWS SDK for JavaScript version 2. This has since been superseded by the AWS SDK for JavaScript version 3, which was released in December 2020. With this release, Lambda has upgraded the version of the AWS SDK for JavaScript included with the runtime from v2 to v3.

If your existing Lambda functions are using the included SDK v2, then you must update your function code to use the SDK v3 when upgrading to the Node.js 18 runtime. This is the recommended approach when upgrading existing functions to Node.js 18. Alternatively, you can use the Node.js 18 runtime without updating your existing code if you deploy the SDK v2 together with your function code.

Version 3 of the SDK for JavaScript offers many benefits over version 2. Most importantly, it is modular, so your code only loads the modules it needs. Modularity also reduces your function size if you choose to deploy the SDK with your function code rather than using the version built into the Lambda runtime. Learn more about optimizing Node.js dependencies in Lambda here.

For example, for a function interacting with Amazon S3 using the v2 SDK, you import the entire SDK, even though you don’t use most of it:

const AWS = require("aws-sdk");

With the v3 SDK, you only import the modules you need, such as ListBucketsCommand, and a service client like S3Client.

import { S3Client, ListBucketsCommand } from "@aws-sdk/client-s3";

Another difference between SDK v2 and SDK v3 is the default setting for TCP connection reuse. In SDK v2, connection reuse is disabled by default. In SDK v3, it is enabled by default. In most cases, enabling connection reuse improves function performance. To stop TCP connection reuse, set the AWS_NODEJS_CONNECTION_REUSE_ENABLED environment variable to false. You can also stop keeping the connections alive on a per-service client basis.

For more information, see Why and how you should use AWS SDK for JavaScript (v3) on Node.js 18.

Support for ES module resolution using NODE_PATH

Another change in the Node.js 18 runtime is added support for ES module resolution via the NODE_PATH environment variable.

ES modules are supported by Lambda’s Node.js 14 and Node.js 16 runtimes. They enable top-level await, which can lower cold start latency when used with Provisioned Concurrency. However, by default Node.js does not search the folders in the NODE_PATH environment variable when importing ES modules. This makes it difficult to import ES modules from folders outside of the /var/task/ folder in which the function code is deployed, for example when loading the AWS SDK included in the runtime as an ES module, or when loading ES modules from Lambda layers.

The Node.js 18.x runtime for Lambda searches the folders listed in NODE_PATH when loading ES modules. This makes it easier to include the AWS SDK as an ES module or load ES modules from Lambda layers.

Node.js 18 language updates

The Lambda Node.js 18 runtime also enables you to take advantage of new Node.js 18 language features. These include improved performance for class fields and private class methods, JSON import assertions, and experimental features such as the Fetch API, Test Runner module, and Web Streams API.

JSON import assertion

The import assertions feature allows module import statements to include additional information alongside the module specifier. Now the following code is valid:

// index.mjs

// static import
import fooData from './foo.json' assert { type: 'json' };

// dynamic import
const { default: barData } = await import('./bar.json', { assert: { type: 'json' } });

export const handler = async(event) => {

    console.log(fooData)
    // logs data in foo.json file
    console.log(barData)
    // logs data in bar.json file

    const response = {
        statusCode: 200,
        body: JSON.stringify('Hello from Lambda!'),
    };
    return response;
};

foo.json

{
  "foo1" : "1234",
  "foo2" : "4678"
}

bar.json

{
  "bar1" : "0001",
  "bar2" : "0002"
}

Experimental features

While still experimental, the global fetch API is available by default in Node.js 18. The API includes a fetch function, making fetch polyfills and third-party HTTP packages redundant.

// index.mjs 

export const handler = async(event) => {
    
    const res = await fetch('https://nodejs.org/api/documentation.json');
    if (res.ok) {
      const data = await res.json();
      console.log(data);
    }

    const response = {
        statusCode: 200,
        body: JSON.stringify('Hello from Lambda!'),
    };
    return response;
};

Experimental features in Node.js can be enabled or disabled via the NODE_OPTIONS environment variable. For example, to disable the experimental fetch API, create a Lambda environment variable named NODE_OPTIONS and set its value to --no-experimental-fetch.

With this change, if you run the previous code for the fetch API in your Lambda function, it throws a reference error because the experimental fetch API is now disabled.

Conclusion

Node.js 18 is now supported by Lambda. When building your Lambda functions using the zip archive packaging style, use a runtime parameter value of nodejs18.x to get started building with Node.js 18.

You can also build Lambda functions in Node.js 18 by deploying your function code as a container image using the Node.js 18 AWS base image for Lambda. You can learn more about writing functions in Node.js 18 by reading about the Node.js programming model in the Lambda documentation.

For existing Node.js functions, review your code for compatibility with Node.js 18, including deprecations, then migrate to the new runtime by changing the function’s runtime configuration to nodejs18.x.

For more serverless learning resources, visit Serverless Land.

Introducing attribute-based access controls (ABAC) for Amazon SQS

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/introducing-attribute-based-access-controls-abac-for-amazon-sqs/

This post is written by Vikas Panghal (Principal Product Manager), and Hardik Vasa (Senior Solutions Architect).

Amazon Simple Queue Service (SQS) is a fully managed message queuing service that makes it easier to decouple and scale microservices, distributed systems, and serverless applications. SQS queues enable asynchronous communication between different application components and ensure that each of these components can keep functioning independently without losing data.

Today we’re announcing support for attribute-based access control (ABAC) using queue tags with the SQS service. As an AWS customer, if you use multiple SQS queues to achieve better application decoupling, it is often challenging to manage access to individual queues. In such cases, using tags can enable you to classify these resources in different ways, such as by owner, category, or environment.

This blog post demonstrates how to use tags to allow conditional access to SQS queues. You can use attribute-based access control (ABAC) policies to grant access rights to users through policies that combine attributes together. ABAC can be helpful in rapidly growing environments, where policy management for each individual resource can become cumbersome.

ABAC for SQS is supported in all Regions where SQS is currently available.

Overview

SQS supports tagging of queues. Each tag is a label comprising a customer-defined key and an optional value that can make it easier to manage, search for, and filter resources. Tags allow you to assign metadata to your SQS resources. This can help you track and manage the costs associated with your queues, provide enhanced security in your AWS Identity and Access Management (IAM) policies, and filter through thousands of queues.

SQS queue options in the console

The preceding image shows an SQS queue in the AWS Management Console with two tags: ‘auto-delete’ with a value of ‘no’ and ‘environment’ with a value of ‘prod’.

Attribute-based access control (ABAC) is an authorization strategy that defines permissions based on tags attached to users and AWS resources. With ABAC, you can use tags to configure IAM access permissions and policies for your queues, which makes permissions management easier to scale. You can author a single permissions policy in IAM using tags that you create per business role, and you no longer need to update the policy each time you add a new resource.

You can also attach tags to AWS Identity and Access Management (IAM) principals to create an ABAC policy. These ABAC policies can be designed to allow SQS operations when the tag on the IAM user or role making the call matches the SQS queue tag.

ABAC provides granular and flexible access control based on attributes and values, reduces security risk because of misconfigured role-based policy, and easily centralizes auditing and access policy management.

ABAC enables two key use cases:

  1. Tag-based Access Control: You can use tags to control access to your SQS queues, including control plane and data plane API calls.
  2. Tag-on-Create: You can enforce tags during the creation of SQS queues and deny the creation of SQS resources without tags.

Tagging for access control

Let’s take a look at a couple of examples of using tags for access control.

Suppose you want to deny an IAM user all SQS actions on any queue that has a resource tag with the key environment and the value prod. The following IAM policy fulfills this requirement.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyAccessForProd",
            "Effect": "Deny",
            "Action": "sqs:*",
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/environment": "prod"
                }
            }
        }
    ]
}

Now suppose you need to deny any SQS operation in which a tag with the key environment and the value production is passed as an argument within the API call. The following IAM policy fulfills this requirement.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyAccessForStageProduction",
            "Effect": "Deny",
            "Action": "sqs:*",
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "aws:RequestTag/environment": "production"
                }
            }
        }
    ]
}
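The two policies above differ only in the condition key: aws:ResourceTag matches a tag already attached to the queue, while aws:RequestTag matches a tag passed in the API call. As a minimal sketch of that difference, the following Python helper builds either policy document. The function name deny_by_tag and its parameters are hypothetical and are not part of any AWS SDK; the helper only produces JSON and does not call AWS.

```python
import json


def deny_by_tag(sid, condition_key, tag_key, tag_value):
    """Build a deny-all-SQS policy document conditioned on a single tag.

    condition_key is "aws:ResourceTag" (tag already on the queue) or
    "aws:RequestTag" (tag passed as an argument in the API call).
    This helper is illustrative only; it does not call AWS.
    """
    if condition_key not in ("aws:ResourceTag", "aws:RequestTag"):
        raise ValueError(f"unsupported condition key: {condition_key}")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": sid,
                "Effect": "Deny",
                "Action": "sqs:*",
                "Resource": "*",
                "Condition": {
                    "StringEquals": {f"{condition_key}/{tag_key}": tag_value}
                },
            }
        ],
    }


# Same values as the two policies shown above.
resource_policy = deny_by_tag(
    "DenyAccessForProd", "aws:ResourceTag", "environment", "prod"
)
request_policy = deny_by_tag(
    "DenyAccessForStageProduction", "aws:RequestTag", "environment", "production"
)

print(json.dumps(request_policy, indent=4))
```

Generating the document from one helper keeps the two variants consistent and makes the single differing field easy to review.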

Creating IAM user and SQS queue using AWS Management Console

Configuring ABAC on SQS resources is a two-step process. The first step is to tag your SQS resources. You can use the AWS API, the AWS CLI, or the AWS Management Console to tag your resources. The second step is to create an IAM policy that allows or denies access to SQS resources based on their tags.

This post reviews the step-by-step process of creating ABAC policies for controlling access to SQS queues.

Create an IAM user

  1. Navigate to the AWS IAM console and choose Users from the left navigation pane.
  2. Choose Add Users and provide a name in the User name text box.
  3. Check the Access key – Programmatic access box and choose Next:Permissions.
  4. Choose Next:Tags.
  5. Add the tag key environment with the tag value beta.
  6. Select Next:Review and then choose Create user.
  7. Copy the access key ID and secret access key and store them in a secure location.

IAM configuration

Add IAM user permissions

  1. Select the IAM user you created.
  2. Choose Add inline policy.
  3. In the JSON tab, paste the following policy:
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowAccessForSameResTag",
                "Effect": "Allow",
                "Action": [
                    "sqs:SendMessage",
                    "sqs:ReceiveMessage",
                    "sqs:DeleteMessage"
                ],
                "Resource": "*",
                "Condition": {
                    "StringEquals": {
                        "aws:ResourceTag/environment": "${aws:PrincipalTag/environment}"
                    }
                }
            },
            {
                "Sid": "AllowAccessForSameReqTag",
                "Effect": "Allow",
                "Action": [
                    "sqs:CreateQueue",
                    "sqs:DeleteQueue",
                    "sqs:SetQueueAttributes",
                    "sqs:tagqueue"
                ],
                "Resource": "*",
                "Condition": {
                    "StringEquals": {
                        "aws:RequestTag/environment": "${aws:PrincipalTag/environment}"
                    }
                }
            },
            {
                "Sid": "DenyAccessForProd",
                "Effect": "Deny",
                "Action": "sqs:*",
                "Resource": "*",
                "Condition": {
                    "StringEquals": {
                        "aws:ResourceTag/stage": "prod"
                    }
                }
            }
        ]
    }
    
  4. Choose Review policy.
  5. Choose Create policy.
    Create policy

The preceding permissions policy ensures that the IAM user can call SQS APIs only if the value of the request tag within the API call matches the value of the environment tag on the IAM principal. It also makes sure that the resource tag applied to the SQS queue matches the IAM tag applied on the user.
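To make the matching logic concrete, here is a deliberately simplified Python model of how the three statements interact for a user tagged environment=beta. It is a sketch under stated assumptions, not IAM's actual evaluation engine: it handles only StringEquals conditions, the ${aws:PrincipalTag/...} variable, and the rule that an explicit Deny overrides any Allow. The function names are hypothetical.

```python
# Condensed copy of the inline policy above ("Resource": "*" omitted for brevity).
POLICY = [
    {
        "Sid": "AllowAccessForSameResTag",
        "Effect": "Allow",
        "Action": ["sqs:SendMessage", "sqs:ReceiveMessage", "sqs:DeleteMessage"],
        "Condition": {"StringEquals": {
            "aws:ResourceTag/environment": "${aws:PrincipalTag/environment}"}},
    },
    {
        "Sid": "AllowAccessForSameReqTag",
        "Effect": "Allow",
        "Action": ["sqs:CreateQueue", "sqs:DeleteQueue",
                   "sqs:SetQueueAttributes", "sqs:tagqueue"],
        "Condition": {"StringEquals": {
            "aws:RequestTag/environment": "${aws:PrincipalTag/environment}"}},
    },
    {
        "Sid": "DenyAccessForProd",
        "Effect": "Deny",
        "Action": "sqs:*",
        "Condition": {"StringEquals": {"aws:ResourceTag/stage": "prod"}},
    },
]


def _resolve(value, principal_tags):
    """Expand a ${aws:PrincipalTag/<key>} policy variable, if present."""
    prefix = "${aws:PrincipalTag/"
    if value.startswith(prefix) and value.endswith("}"):
        return principal_tags.get(value[len(prefix):-1])
    return value


def _matches(statement, action, context, principal_tags):
    """True when the action and every StringEquals condition match."""
    actions = statement["Action"]
    if isinstance(actions, str):
        actions = [actions]
    # IAM action names are case-insensitive; "sqs:*" covers every SQS action.
    if not any(a == "sqs:*" or a.lower() == action.lower() for a in actions):
        return False
    for key, expected in statement["Condition"]["StringEquals"].items():
        resolved = _resolve(expected, principal_tags)
        if resolved is None or context.get(key) != resolved:
            return False
    return True


def is_allowed(statements, action, context, principal_tags):
    """Explicit Deny wins; otherwise at least one Allow must match."""
    matched = [s for s in statements if _matches(s, action, context, principal_tags)]
    if any(s["Effect"] == "Deny" for s in matched):
        return False
    return any(s["Effect"] == "Allow" for s in matched)


principal = {"environment": "beta"}
create_prod = is_allowed(POLICY, "sqs:CreateQueue",
                         {"aws:RequestTag/environment": "prod"}, principal)
create_beta = is_allowed(POLICY, "sqs:CreateQueue",
                         {"aws:RequestTag/environment": "beta"}, principal)
send_beta = is_allowed(POLICY, "sqs:SendMessage",
                       {"aws:ResourceTag/environment": "beta"}, principal)
print(create_prod, create_beta, send_beta)  # False True True
```

In this model, creating a queue tagged prod is refused because no Allow statement matches (an implicit deny), whereas a queue carrying the tag stage=prod would be refused by the explicit Deny statement.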

Creating IAM user and SQS queue using AWS CloudFormation

Here is the sample CloudFormation template to create an IAM user with an inline policy attached and an SQS queue.

AWSTemplateFormatVersion: "2010-09-09"
Description: "CloudFormation template to create IAM user with custom in-line policy"
Resources:
    IAMPolicy:
        Type: "AWS::IAM::Policy"
        Properties:
            PolicyDocument: |
                {
                    "Version": "2012-10-17",
                    "Statement": [
                        {
                            "Sid": "AllowAccessForSameResTag",
                            "Effect": "Allow",
                            "Action": [
                                "sqs:SendMessage",
                                "sqs:ReceiveMessage",
                                "sqs:DeleteMessage"
                            ],
                            "Resource": "*",
                            "Condition": {
                                "StringEquals": {
                                    "aws:ResourceTag/environment": "${aws:PrincipalTag/environment}"
                                }
                            }
                        },
                        {
                            "Sid": "AllowAccessForSameReqTag",
                            "Effect": "Allow",
                            "Action": [
                                "sqs:CreateQueue",
                                "sqs:DeleteQueue",
                                "sqs:SetQueueAttributes",
                                "sqs:tagqueue"
                            ],
                            "Resource": "*",
                            "Condition": {
                                "StringEquals": {
                                    "aws:RequestTag/environment": "${aws:PrincipalTag/environment}"
                                }
                            }
                        },
                        {
                            "Sid": "DenyAccessForProd",
                            "Effect": "Deny",
                            "Action": "sqs:*",
                            "Resource": "*",
                            "Condition": {
                                "StringEquals": {
                                    "aws:ResourceTag/stage": "prod"
                                }
                            }
                        }
                    ]
                }
                
            Users: 
              - "testUser"
            PolicyName: tagQueuePolicy

    IAMUser:
        Type: "AWS::IAM::User"
        Properties:
            Path: "/"
            UserName: "testUser"
            Tags: 
              - 
                Key: "environment"
                Value: "beta"

Testing tag-based access control

Create queue with tag key as environment and tag value as prod

This walkthrough uses the AWS CLI to demonstrate the permission model. If you do not have the AWS CLI, you can download and configure it for your machine.

Run this AWS CLI command to create the queue:

aws sqs create-queue --queue-name prodQueue --region us-east-1 --tags "environment=prod"

You receive an AccessDenied error from the SQS endpoint:

An error occurred (AccessDenied) when calling the CreateQueue operation: Access to the resource <queueUrl> is denied.

This is because the tag value on the IAM user does not match the tag passed in the CreateQueue API call. Remember that we applied a tag to the IAM user with key as ‘environment’ and value as ‘beta’.

Create queue with tag key as environment and tag value as beta

aws sqs create-queue --queue-name betaQueue --region us-east-1 --tags "environment=beta"

You see a response similar to the following, which shows the successful creation of the queue.

{
"QueueUrl": "<queueUrl>"
}

Sending message to the queue

aws sqs send-message --queue-url <queueUrl> --message-body testMessage

You will get a successful response from the SQS endpoint. The response will include MD5OfMessageBody and MessageId of the message.

{
"MD5OfMessageBody": "<MD5OfMessageBody>",
"MessageId": "<MessageId>"
}

The response shows successful message delivery to the SQS queue, because the IAM user’s permissions allow sending messages to a queue whose environment tag is ‘beta’.

Benefits of attribute-based access controls

The following are benefits of using attribute-based access controls (ABAC) in Amazon SQS:

  • ABAC for SQS requires fewer permissions policies – You do not have to create different policies for different job functions. You can use the resource and request tags that apply to more than one queue. This reduces the operational overhead.
  • Using ABAC, teams can scale quickly – Permissions for new resources are automatically granted based on tags when resources are appropriately tagged upon creation.
  • Use permissions on the IAM principal to restrict resource access – You can create tags for the IAM principal and allow a specific action only if the resource tag matches the tag on the IAM principal. This helps automate granting of request permissions.
  • Track who is accessing resources – Easily determine the identity of a session by looking at the user attributes in AWS CloudTrail to track user activity in AWS.

Conclusion

In this post, we have seen how Attribute-based access control (ABAC) policies allow you to grant access rights to users through IAM policies based on tags defined on the SQS queues.

ABAC for SQS supports all SQS API actions. Managing the access permissions via tags can save you engineering time creating complex access permissions as your applications and resources grow. With the flexibility of using multiple resource tags in the security policies, the data and compliance teams can now easily set more granular access permissions based on resource attributes.

For additional details on pricing, see Amazon SQS pricing. For additional details on programmatic access to SQS, see Actions in the Amazon SQS API Reference. For more information on SQS security, see the SQS security public documentation page. To get started with attribute-based access control for SQS, navigate to the SQS console.

For more serverless learning resources, visit Serverless Land.

Building serverless .NET applications on AWS Lambda using .NET 7

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-serverless-net-applications-on-aws-lambda-using-net-7/

This post is written by James Eastham, Senior Cloud Architect, Beau Gosse, Senior Software Engineer, and Samiullah Mohammed, Senior Software Engineer

Today, AWS is announcing tooling support to enable applications running .NET 7 to be built and deployed on AWS Lambda. This includes applications compiled using .NET 7 native AOT. .NET 7 is the latest version of .NET and brings many performance improvements and optimizations.

Native AOT enables .NET code to be ahead-of-time compiled to native binaries for up to 86% faster cold starts when compared to the .NET 6 managed runtime. The fast execution and lower memory consumption of native AOT can also result in reduced Lambda costs. This post walks through how to get started running .NET 7 applications on AWS Lambda with native AOT.

Overview

Customers can use .NET 7 with Lambda in two ways. First, Lambda has released a base container image for .NET 7, enabling customers to build and deploy .NET 7 functions as container images. Second, you can use Lambda’s custom runtime support to run functions compiled to native code using .NET 7 native AOT. Lambda has not released a managed runtime for .NET 7, since it is not a long-term support (LTS) release.

Native AOT allows .NET applications to be pre-compiled to a single binary, removing the need for JIT (Just In Time compilation) and the .NET runtime. To use this binary in a custom runtime, it needs to include the Lambda runtime client. The runtime client integrates your application code with the Lambda runtime API, which enables your application code to be invoked by Lambda.

The enhanced tooling announced today streamlines the tasks of building .NET applications using .NET 7 native AOT and deploying them to Lambda using a custom runtime. This tooling comprises three tools. The AWS Lambda extension to the ‘dotnet’ CLI (Amazon.Lambda.Tools) contains the commands to build and deploy Lambda functions using .NET. The dotnet CLI can be used directly, and is also used by the AWS Toolkit for Visual Studio, and the AWS Serverless Application Model (AWS SAM), an open-source framework for building serverless applications.

Native AOT compiles code for a specific OS version. If you run the dotnet publish command on your machine, the compiled code only runs on the OS version and processor architecture of your machine. For your application code to run in Lambda using native AOT, the code must be compiled on the Amazon Linux 2 (AL2) OS. The new tooling supports compiling your Lambda functions within an AL2-based Docker image, with the compiled application stored on your local hard drive.

Develop Lambda functions with .NET 7 native AOT

In this section, we’ll discuss how to develop your Lambda function code to be compatible with .NET 7 native AOT. This is the first GA version of native AOT Microsoft has released. It may not suit all workloads, since it does come with trade-offs. For example, dynamic assembly loading and the System.Reflection.Emit library are not available. Native AOT also trims your application code, resulting in a small binary that contains the essential components for your application to run.

Prerequisites

Getting Started

To get started, create a new Lambda function project using a custom runtime from the .NET CLI.

dotnet new lambda.NativeAOT -n LambdaNativeAot
cd ./LambdaNativeAot/src/LambdaNativeAot/
dotnet add package Amazon.Lambda.APIGatewayEvents
dotnet add package AWSSDK.Core

To review the project settings, open the LambdaNativeAot.csproj file. The target framework in this template is set to net7.0. To enable native AOT, add a new property named PublishAot, with value true. This PublishAot flag is an MSBuild property required by the .NET SDK so that the compiler performs native AOT compilation.

When using Lambda with a custom runtime, the Lambda service looks for an executable file named bootstrap within the packaged ZIP file. To enable this, the OutputType is set to exe and the AssemblyName to bootstrap.

The correctly configured LambdaNativeAot.csproj file looks like this:

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net7.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
    <AWSProjectType>Lambda</AWSProjectType>
    <AssemblyName>bootstrap</AssemblyName>
    <PublishAot>true</PublishAot>
  </PropertyGroup> 
  …
</Project>
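Because a missing PublishAot, OutputType, or AssemblyName setting only surfaces at build or deploy time, a small pre-flight check can help. The following Python sketch parses a project file and reports any of the three settings described above that are missing or different. The function name check_csproj and the sample project text are illustrative assumptions; this is not part of the AWS or .NET tooling.

```python
import xml.etree.ElementTree as ET

# Settings the text above describes for a native AOT custom-runtime function.
REQUIRED = {"OutputType": "Exe", "AssemblyName": "bootstrap", "PublishAot": "true"}

# Illustrative project file; a real one would be read from disk.
SAMPLE_CSPROJ = """<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net7.0</TargetFramework>
    <AssemblyName>bootstrap</AssemblyName>
    <PublishAot>true</PublishAot>
  </PropertyGroup>
</Project>"""


def check_csproj(xml_text):
    """Return a list of problems; an empty list means all settings match."""
    root = ET.fromstring(xml_text)
    found = {}
    # Collect every property from every PropertyGroup element.
    for group in root.iter("PropertyGroup"):
        for prop in group:
            found[prop.tag] = (prop.text or "").strip()
    problems = []
    for name, expected in REQUIRED.items():
        actual = found.get(name)
        if actual is None:
            problems.append(f"{name} is missing")
        elif actual.lower() != expected.lower():
            problems.append(f"{name} is {actual!r}, expected {expected!r}")
    return problems


problems = check_csproj(SAMPLE_CSPROJ)
print(problems or "csproj is configured for native AOT")
```

Values are compared case-insensitively here, which is an assumption made so that both exe and Exe are accepted.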

Function code

Running .NET with a custom runtime uses the executable assembly feature of .NET. To do this, your function code must define a static Main method. Within the Main method, you must initialize the Lambda runtime client, and configure the function handler and the JSON serializer to use when processing Lambda events.

The Amazon.Lambda.RuntimeSupport NuGet package is added to the project to enable this runtime initialization. The LambdaBootstrapBuilder.Create() method is used to configure the handler and the ILambdaSerializer implementation to use for (de)serialization.

private static async Task Main(string[] args)
{
    Func<string, ILambdaContext, string> handler = FunctionHandler;
    await LambdaBootstrapBuilder.Create(handler, new DefaultLambdaJsonSerializer())
        .Build()
        .RunAsync();
}

Assembly trimming

Native AOT trims application code to optimize the compiled binary, which can cause two issues. The first is with de/serialization. Common .NET libraries for working with JSON like Newtonsoft.Json and System.Text.Json rely on reflection. The second is with any third party libraries not yet updated to be trim-friendly. The compiler may trim out parts of the library that are required for the library to function. However, there are solutions for both issues.

Working with JSON

Source generated serialization is a language feature introduced in .NET 6. It allows the code required for de/serialization to be generated at compile time instead of relying on reflection at runtime. One drawback of native AOT is that the System.Reflection.Emit library is no longer available. Source generated serialization enables developers to work with JSON while also using native AOT.

To use the source generator, you must define a new empty partial class that inherits from System.Text.Json.JsonSerializerContext. On the empty partial class, add the JsonSerializable attribute for any .NET type that your application must de/serialize.

In this example, the Lambda function needs to receive events from API Gateway. Create a new class in the project named HttpApiJsonSerializerContext and copy the code below:

[JsonSerializable(typeof(APIGatewayHttpApiV2ProxyRequest))]
[JsonSerializable(typeof(APIGatewayHttpApiV2ProxyResponse))]
public partial class HttpApiJsonSerializerContext : JsonSerializerContext
{
}

When the application is compiled, static classes, properties, and methods are generated to perform the de/serialization.

This custom serializer must now also be passed in to the Lambda runtime to ensure that event inputs and outputs are serialized and deserialized correctly. To do this, pass a new instance of the serializer context into the runtime when bootstrapped. Here is an example of a Lambda function using API Gateway as a source:

using System.Text.Json.Serialization;
using Amazon.Lambda.APIGatewayEvents;
using Amazon.Lambda.Core;
using Amazon.Lambda.RuntimeSupport;
using Amazon.Lambda.Serialization.SystemTextJson;
namespace LambdaNativeAot;
public class Function
{
    /// <summary>
    /// The main entry point for the custom runtime.
    /// </summary>
    private static async Task Main()
    {
        Func<APIGatewayHttpApiV2ProxyRequest, ILambdaContext, Task<APIGatewayHttpApiV2ProxyResponse>> handler = FunctionHandler;
        await LambdaBootstrapBuilder.Create(handler, new SourceGeneratorLambdaJsonSerializer<HttpApiJsonSerializerContext>())
            .Build()
            .RunAsync();
    }

    public static async Task<APIGatewayHttpApiV2ProxyResponse> FunctionHandler(APIGatewayHttpApiV2ProxyRequest apigProxyEvent, ILambdaContext context)
    {
        // API Handling logic here
        return new APIGatewayHttpApiV2ProxyResponse()
        {
            StatusCode = 200,
            Body = "OK"
        };
    }
}

Third party libraries

The .NET compiler provides the capability to control how applications are trimmed. For native AOT compilation, this enables us to exclude specific assemblies from trimming. For any libraries used in your applications that may not yet be trim-friendly, this is a powerful way to still use native AOT. This is important for any of the Lambda event source NuGet packages like Amazon.Lambda.APIGatewayEvents. Without controlling this, the C# objects for the Amazon API Gateway event sources are trimmed, leading to serialization errors at runtime.

Currently, the AWSSDK.Core library used by all the .NET AWS SDKs must also be excluded from trimming.

To control the assembly trimming, create a new file in the project root named rd.xml. Full details on the rd.xml format are found in the Microsoft documentation. Adding assemblies to the rd.xml file excludes them from trimming.

The following example shows how to exclude the AWSSDK.Core library, the API Gateway event library, and the function library from trimming:

<Directives xmlns="http://schemas.microsoft.com/netfx/2013/01/metadata">
	<Application>
		<Assembly Name="AWSSDK.Core" Dynamic="Required All"></Assembly>
		<Assembly Name="Amazon.Lambda.APIGatewayEvents" Dynamic="Required All"></Assembly>
		<Assembly Name="bootstrap" Dynamic="Required All"></Assembly>
	</Application>
</Directives>

Once added, the csproj file must be updated to reference the rd.xml file. Edit the csproj file for the Lambda project and add this ItemGroup:

<ItemGroup>
  <RdXmlFile Include="rd.xml" />
</ItemGroup>

When the function is compiled, assembly trimming skips the three libraries specified. If you are using .NET 7 native AOT with Lambda, we recommend excluding both the AWSSDK.Core library and the specific libraries for any event sources your Lambda function uses. If you are using the AWS X-Ray SDK for .NET to trace your serverless application, this must also be excluded.
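When a function uses several event sources, the list of excluded assemblies grows, and generating rd.xml from a list can keep it consistent. The following Python sketch produces a file in the same shape as the example above. The function name build_rd_xml is hypothetical, the assembly list is the one used in this post, and ET.indent requires Python 3.9 or later.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the rd.xml example above.
RD_NAMESPACE = "http://schemas.microsoft.com/netfx/2013/01/metadata"


def build_rd_xml(assemblies):
    """Return rd.xml text that marks each assembly as 'Required All'."""
    # Register the namespace as the default so no prefix is emitted.
    ET.register_namespace("", RD_NAMESPACE)
    directives = ET.Element(f"{{{RD_NAMESPACE}}}Directives")
    application = ET.SubElement(directives, f"{{{RD_NAMESPACE}}}Application")
    for name in assemblies:
        ET.SubElement(
            application,
            f"{{{RD_NAMESPACE}}}Assembly",
            {"Name": name, "Dynamic": "Required All"},
        )
    ET.indent(directives, space="\t")  # pretty-print; Python 3.9+
    return ET.tostring(directives, encoding="unicode")


rd_xml = build_rd_xml(
    ["AWSSDK.Core", "Amazon.Lambda.APIGatewayEvents", "bootstrap"]
)
print(rd_xml)
```

The resulting text can be written to rd.xml in the project root, next to the csproj file that references it.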

Deploying .NET 7 native AOT applications

We’ll now explain how to build and deploy .NET 7 native AOT functions on Lambda, using each of the three deployment tools.

Using the dotnet CLI

Prerequisites

  • Docker (if compiling on a non-Amazon Linux 2 based machine)

Build and deploy

To package and deploy your Native AOT compiled Lambda function, run:

dotnet lambda deploy-function

When compiling and packaging your Lambda function code using the Lambda tools CLI, the tooling checks for the PublishAot flag in your project. If set to true, the tooling pulls an AL2-based Docker image and compiles your code inside. It mounts your local file system to the running container, allowing the compiled binary to be stored back to your local file system ready for deployment. By default, the generated ZIP file is output to the bin/Release directory.

Once the deployment completes, run the following command to invoke the created function, replacing FUNCTION_NAME with the name of the function chosen during deployment.

dotnet lambda invoke-function FUNCTION_NAME

Using the Visual Studio Extension

AWS is also announcing support for compiling and deploying native AOT-based Lambda functions from within Visual Studio using the AWS Toolkit for Visual Studio.

Prerequisites

Getting Started

As part of this release, templates are available in Visual Studio 2022 to get started using native AOT with AWS Lambda. From within Visual Studio, select File -> New Project. Search for Lambda .NET 7 native AOT to start a new project pre-configured for native AOT.

Create a new project

Build and deploy

Once the project is created, right-click the project in Visual Studio and choose Publish to AWS Lambda.

Solution Explorer

Complete the steps in the publish wizard and press upload. The log messages created by Docker appear in the publish window as it compiles your function code for native AOT.

Uploading function

You can now invoke the deployed function from within Visual Studio by setting the Example request dropdown to API Gateway AWS Proxy and pressing the Invoke button.

Invoke example

Using the AWS SAM CLI

Prerequisites

  • Docker (If compiling on a non-AL2 based machine)
  • AWS SAM v1.6.4 or later

Getting started

Support for compiling and deploying .NET 7 native AOT is built into the AWS SAM CLI. To get started, initialize a new AWS SAM project:

sam init

In the new project wizard, choose:

  1. What template source would you like to use? 1 – AWS Quick Start Template
  2. Choose an AWS Quick start application template. 1 – Hello World example
  3. Use the most popular runtime and package type? – N
  4. Which runtime would you like to use? aot.dotnet7 (provided.al2)
  5. Enable X-Ray Tracing? N
  6. Choose a project name

The cloned project includes the configuration to deploy to Lambda.

One new AWS SAM metadata property called ‘BuildMethod’ is required in the AWS SAM template:

HelloWorldFunction:
  Type: AWS::Serverless::Function
  Properties:
    Runtime: 'provided.al2' # // Use provided to deploy to AWS Lambda for .NET 7 native AOT
    Architectures:
      - x86_64
  Metadata:
    BuildMethod: 'dotnet7' # // But build with new build method for .NET 7 that calls into Amazon.Lambda.Tools 

Build and deploy

Build and deploy your serverless application, completing the guided deployment steps:

sam build
sam deploy --guided

The AWS SAM CLI uses the Amazon.Lambda.Tools CLI to pull an AL2-based Docker image and compile your application code inside a container. You can use AWS SAM accelerate to speed up the update of serverless applications during development. It uses direct API calls instead of deploying changes through AWS CloudFormation, automating updates whenever you change your local code base. Learn more in the AWS SAM development documentation.

Conclusion

AWS now supports .NET 7 native AOT on Lambda. Read the Lambda Developer Guide for more getting started information. For more details on the performance improvements from using .NET 7 native AOT on Lambda, see the serverless-dotnet-demo repository on GitHub.

To provide feedback for .NET on AWS Lambda, contact the AWS .NET team on the .NET Lambda GitHub repository.

For more serverless learning resources, visit Serverless Land.

Better together: AWS SAM CLI and HashiCorp Terraform

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/better-together-aws-sam-cli-and-hashicorp-terraform/

This post is written by Suresh Poopandi, Senior Solutions Architect and Seb Kasprzak, Senior Solutions Architect.

Today, AWS is announcing the public preview of AWS Serverless Application Model CLI (AWS SAM CLI) support for local development, testing, and debugging of serverless applications defined using HashiCorp Terraform configuration.

AWS SAM and Terraform are open-source frameworks for building applications using infrastructure as code (IaC). Both frameworks allow building, changing, and managing cloud infrastructure in a repeatable way by defining resource configurations.

Previously, you could use the AWS SAM CLI to build, test, and debug applications defined by AWS SAM templates or through the AWS Cloud Development Kit (CDK). With this preview release, you can also use AWS SAM CLI to test and debug serverless applications defined using Terraform configurations.

Walkthrough of Terraform support

This blog post contains a sample Terraform template, which shows how developers can use AWS SAM CLI to build locally, test, and debug AWS Lambda functions defined in Terraform. This sample application has a Lambda function that stores a book review score and review text in an Amazon DynamoDB table. An Amazon API Gateway book review API uses Lambda proxy integration to invoke the book review Lambda function.

Demo application architecture


Prerequisites

Before running this example:

  • Install the AWS CLI.
    • Configure with valid AWS credentials.
    • Note that AWS CLI now requires Python runtime.
  • Install HashiCorp Terraform.
  • Install the AWS SAM CLI.
  • Install Docker (required to run AWS Lambda function locally).

Since Terraform support is currently in public preview, you must provide the --beta-features flag while executing AWS SAM commands. Alternatively, set this flag in the samconfig.toml file by adding beta_features="true".

Deploying the example application

This Lambda function interacts with DynamoDB. For the example to work, it requires an existing DynamoDB table in an AWS account. Deploying this creates all the required resources for local testing and debugging of the Lambda function.

To deploy:

  1. Clone the aws-sam-terraform-examples repository locally:
    git clone https://github.com/aws-samples/aws-sam-terraform-examples
  2. Change to the project directory:
    cd aws-sam-terraform-examples/zip_based_lambda_functions/api-lambda-dynamodb-example/

    Terraform must store the state of the infrastructure and configuration it creates. Terraform uses this state to map cloud resources to configuration and track changes. This example uses a local backend to store the state file on the local filesystem.

  3. Open the main.tf file and review its contents. Locate the provider section of the code and update the region field with the target deployment Region of this sample solution:
    provider "aws" {
        region = "<AWS region>" # e.g. us-east-1
    }
  4. Initialize a working directory containing Terraform configuration files:
    terraform init
  5. Deploy the application using the Terraform CLI. When prompted by “Do you want to perform these actions?”, enter yes.
    terraform apply

Terraform deploys the application, as shown in the terminal output.

Terminal output


After completing the deployment process, the AWS account is ready for use by the Lambda function with all the required resources.

Terraform Configuration for local testing

Lambda functions require application dependencies bundled together with function code as a deployment package (typically a .zip file) to run correctly. Terraform does not natively create the deployment package, so a separate build process handles package creation.

This sample application uses Terraform’s null_resource and local-exec provisioner to trigger a build process script. This installs Python dependencies in a temporary folder and creates a .zip file with dependencies and function code. It contains this logic within the main.tf file of the example application.
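The packaging step described above can be sketched in Python as follows. This is a hedged illustration of the general technique (dependencies plus function code zipped into one archive), not the build script from the example repository. The function name build_package and the file names are hypothetical, and the dependency installation itself (for example with pip) is assumed to have already happened.

```python
import tempfile
import zipfile
from pathlib import Path


def build_package(source_dir, dependencies_dir, output_zip):
    """Zip installed dependencies and function code into one archive.

    Files are stored relative to their own base directory, so both the
    handler module and the dependencies end up at the archive root.
    """
    with zipfile.ZipFile(output_zip, "w", zipfile.ZIP_DEFLATED) as archive:
        for base in (Path(dependencies_dir), Path(source_dir)):
            for path in sorted(base.rglob("*")):
                if path.is_file():
                    archive.write(path, path.relative_to(base).as_posix())
    return output_zip


# Demonstration with throwaway directories standing in for real code.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    src, deps = tmp / "src", tmp / "deps"
    src.mkdir()
    (deps / "somelib").mkdir(parents=True)
    (src / "index.py").write_text(
        "def lambda_handler(event, context):\n    return {'statusCode': 200}\n"
    )
    (deps / "somelib" / "__init__.py").write_text("")
    package = build_package(src, deps, tmp / "function.zip")
    with zipfile.ZipFile(package) as archive:
        names = sorted(archive.namelist())

print(names)  # ['index.py', 'somelib/__init__.py']
```

Keeping the output archive outside both input directories avoids accidentally zipping the archive into itself.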

To explain each code segment in more detail:

Terraform example

  1. aws_lambda_function: This sample defines a Lambda function resource. It contains properties such as environment variables (in this example, the DynamoDB table_id) and the depends_on argument, which creates the .zip package before deploying the Lambda function.

    Terraform example

  2. null_resource: When the AWS SAM CLI build command runs, AWS SAM reviews Terraform code for any null_resource starting with sam_metadata_ and uses the information contained within this resource block to gather the location of the Lambda function source code and .zip package. This information allows the AWS SAM CLI to start the local execution of the Lambda function. This special resource should contain the following attributes:
    • resource_name: The Lambda function address as defined in the current module (aws_lambda_function.publish_book_review)
    • resource_type: Packaging type of the Lambda function (ZIP_LAMBDA_FUNCTION)
    • original_source_code: Location of Lambda function code
    • built_output_path: Location of .zip deployment package

Local testing

With the backend services now deployed, run local tests to see if everything is working. The locally running sample Lambda function interacts with the services deployed in the AWS account. Run sam build after each code update so that the local AWS SAM testing environment reflects the changes.

  1. Local Build: To create a local build of the Lambda function for testing, use the sam build command:
    sam build --hook-name terraform --beta-features
  2. Local invoke: The first test is to invoke the Lambda function with a mocked event payload from the API Gateway. These events are in the events directory. Run this command, passing in a mocked event:
    export AWS_DEFAULT_REGION=<Your Region Name>
    sam local invoke aws_lambda_function.publish_book_review -e events/new-review.json --beta-features

    AWS SAM mounts the Lambda function runtime and code and runs it locally. The function makes a request to the DynamoDB table in the cloud to store the information provided via the API. It returns a 200 response code, signaling the successful completion of the function.

  3. Local invoke from AWS CLI
    Another test is to run a local emulation of the Lambda service using “sam local start-lambda” and invoke the function directly using AWS SDK or the AWS CLI. Start the local emulator with the following command:

    sam local start-lambda
    Terminal output

    AWS SAM starts the emulator and exposes a local endpoint for the AWS CLI or a software development kit (SDK) to call. With the start-lambda command still running, run the following command to invoke this function locally with the AWS CLI:

    aws lambda invoke --function-name aws_lambda_function.publish_book_review --endpoint-url http://127.0.0.1:3001/ response.json --cli-binary-format raw-in-base64-out --payload file://events/new-review.json

    The AWS CLI invokes the local function and returns a status report of the service to the screen. The response from the function itself is in the response.json file. The window shows the following messages:

    Invocation results
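
The same local invocation can also be made from an AWS SDK. The following is a minimal sketch using boto3, assuming the emulator from the previous command is still listening on 127.0.0.1:3001. The helper names and the default Region are assumptions for illustration, and boto3 must be installed for invoke_local to run.

```python
import json
from pathlib import Path

LOCAL_ENDPOINT = "http://127.0.0.1:3001"  # address used in the AWS CLI example above


def build_invoke_kwargs(function_name: str, event_path: str) -> dict:
    """Build keyword arguments for the Lambda Invoke call from an event file."""
    payload = json.loads(Path(event_path).read_text())  # fail early on invalid JSON
    return {"FunctionName": function_name, "Payload": json.dumps(payload).encode("utf-8")}


def invoke_local(function_name: str, event_path: str, region: str = "us-east-1") -> dict:
    """Invoke a function on the local emulator and return the decoded response."""
    # boto3 is imported lazily so the helper above works without it installed.
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    client = boto3.client(
        "lambda",
        endpoint_url=LOCAL_ENDPOINT,
        region_name=region,  # assumed default; use the Region you deployed to
        # Requests to the local emulator do not need to be signed.
        config=Config(signature_version=UNSIGNED, retries={"max_attempts": 0}),
    )
    response = client.invoke(**build_invoke_kwargs(function_name, event_path))
    return json.loads(response["Payload"].read())
```

For example, invoke_local("aws_lambda_function.publish_book_review", "events/new-review.json") sends the mocked event to the locally running function.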

  4. Debugging the Lambda function

Developers can use AWS SAM with a variety of AWS toolkits and debuggers to test and debug serverless applications locally. For example, developers can perform local step-through debugging of Lambda functions by setting breakpoints, inspecting variables, and running function code one line at a time.

The AWS Toolkit Integrated Development Environment (IDE) plugins support these common debugging tasks. AWS Toolkits make it easier to develop, debug, and deploy serverless applications defined using AWS SAM. They provide an experience for building, testing, debugging, deploying, and invoking Lambda functions that is integrated into the IDE. The AWS SAM documentation lists the common IDE/runtime combinations that support step-through debugging of AWS SAM applications.

Visual Studio Code keeps debugging configuration information in a launch.json file in a workspace .vscode folder. Here is a sample launch configuration file to debug Lambda code locally using AWS SAM and Visual Studio Code.

{
    "version": "0.2.0",
    "configurations": [
          {
            "name": "Attach to SAM CLI",
            "type": "python",
            "request": "attach",
            "address": "localhost",
            "port": 9999,
            "localRoot": "${workspaceRoot}/sam-terraform/book-reviews",
            "remoteRoot": "/var/task",
            "protocol": "inspector",
            "stopOnEntry": false
          }
    ]
}

After adding the launch configuration, start a debug session in Visual Studio Code.

Step 1: Uncomment the following two lines in zip_based_lambda_functions/api-lambda-dynamodb-example/src/index.py

Enable debugging in the Lambda function

Step 2: Run the Lambda function in debug mode and wait for Visual Studio Code to attach to this debugging session:

sam local invoke aws_lambda_function.publish_book_review -e events/new-review.json -d 9999

Step 3: Select the Run and Debug icon in the Activity Bar on the side of VS Code. In the Run and Debug view, select “Attach to SAM CLI” and choose Run.

For this example, set a breakpoint at the first line of lambda_handler. This breakpoint allows viewing the input data coming into the Lambda function. Also, it helps in debugging code issues before deploying to the AWS Cloud.

Debugging in the IDE

Lambda Terraform module

A community-supported Terraform module for Lambda (terraform-aws-lambda) has added support for the SAM metadata null_resource. When using the latest version of this module, the AWS SAM CLI automatically supports local invocation of the Lambda function, without additional resource blocks required.

Conclusion

This blog post shows how to use the AWS SAM CLI together with HashiCorp Terraform to develop and test serverless applications in a local environment. With AWS SAM CLI’s support for HashiCorp Terraform, developers can now use the AWS SAM CLI to test their serverless functions locally while choosing their preferred infrastructure as code tooling.

For more information about the features supported by AWS SAM, visit AWS SAM. For more information about the Metadata resource, visit HashiCorp Terraform.

Support for the Terraform configuration is currently in preview, and the team is asking for feedback and feature request submissions. The goal is for both communities to help improve the local development process using AWS SAM CLI. Submit your feedback by creating a GitHub issue here.

For more serverless learning resources, visit Serverless Land.

Introducing the AWS Lambda Telemetry API

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/introducing-the-aws-lambda-telemetry-api/

This blog post is written by Anton Aleksandrov, Principal Solution Architect and Shridhar Pandey, Senior Product Manager

Today AWS is announcing the AWS Lambda Telemetry API. This provides an easier way to receive enhanced function telemetry directly from the Lambda service and send it to custom destinations. Developers and operators can now more easily monitor and observe their Lambda functions using Lambda extensions from their preferred observability tool providers.

Extensions can use the Lambda Logs API to collect logs generated by the Lambda service and code running in their Lambda function. While the Logs API provides extensions with access to logs, it does not provide a way to collect additional telemetry, such as traces and metrics, which the Lambda service generates during initialization and invocation of your Lambda function.

Previously, observability tools retrieved traces from AWS X-Ray using the AWS X-Ray API or built their own custom tracing libraries to generate traces during Lambda function invocation. Tools required customers to modify AWS Identity and Access Management (IAM) policies to grant access to the traces from X-Ray. This caused additional complexity for tools to collect traces and metrics from multiple sources and introduced latency in seeing Lambda function traces in observability tool dashboards.

The Lambda Telemetry API is a new API that enhances the existing Lambda Logs API functionality. With the new Telemetry API, observability tools can receive function and extension logs, and also events, traces, and metrics directly from within the Lambda service. You do not need to install additional tracing libraries. This reduces latency and simplifies access permissions, as the extension does not require additional access to X-Ray.

Today you can use Telemetry API-enabled extensions to send telemetry data to Coralogix, Datadog, Dynatrace, Lumigo, New Relic, Sedai, Site24x7, Serverless.com, Sumo Logic, Sysdig, Thundra, or your own custom destinations.

Overview

To receive telemetry, extensions subscribe using the new Lambda Telemetry API.

Lambda Telemetry API

The Lambda service then streams the telemetry events directly to the extension. The events include platform events, trace spans, function and extension logs, and additional Lambda platform metrics. The extension can then process, filter, and route them to any preferred destination.

You can add an extension from the tooling provider of your choice to your Lambda function. You can deploy extensions, including ones that use the Telemetry API, as Lambda layers, with the AWS Management Console and AWS Command Line Interface (AWS CLI). You can also use infrastructure as code tools such as AWS CloudFormation, the AWS Serverless Application Model (AWS SAM), Serverless Framework, and Terraform.

Lambda Extensions from the AWS Partner Network (APN) available at launch

Today, you can use Lambda extensions that use the Telemetry API from the following Lambda partners:

  • The Coralogix AWS Lambda Telemetry Exporter extension now offers improved monitoring and alerting for Lambda functions by further streamlining collection and correlation of logs, metrics, and traces.
  • The Datadog extension further simplifies how you visualize the impact of cold starts, and monitor and alert on latency, duration, and payload size of your Lambda functions by collecting logs, traces, and real-time metrics from your function in a simple and cost-effective way.
  • Dynatrace now provides a simplified observability configuration for AWS Lambda through a seamless integration. The new solution delivers low-latency telemetry, enables monitoring at scale, and helps reduce monitoring costs for your serverless workloads.
  • The Lumigo lambda-log-shipper extension simplifies aggregating and forwarding Lambda logs to third-party tools. It now also makes it easy for you to detect Lambda function timeouts.
  • The New Relic extension now provides a unified observability view for your Lambda functions with insights that help you better understand and optimize the performance of your functions.
  • Sedai now uses the Telemetry API to help you improve the performance and availability of your Lambda functions by gathering insights about your function and providing recommendations for manual and autonomous remediation in a cost-effective manner.
  • The Site24x7 extension now offers new metrics, which enable you to get deeper insights into the different phases of the Lambda function lifecycle, such as initialization and invocation.
  • Serverless.com now uses the Telemetry API to provide real-time performance details for your Lambda function through the Dev Mode feature of their new Serverless Console V.2 offering, which simplifies debugging in the AWS Cloud.
  • Sumo Logic now makes it easier, faster, and more cost-effective for you to get your mission-critical Lambda function telemetry sent directly to Sumo Logic so you can quickly analyze and remediate errors and exceptions.
  • The Sysdig Monitor extension generates and collects real-time metrics directly from the Lambda platform. The simplified instrumentation offers lower latency, reduced MTTR (mean time to resolution) for critical issues, and cost benefits while monitoring your serverless applications.
  • The Thundra extension enables you to export logs, metrics, and events for Lambda execution environment lifecycle events emitted by the Telemetry API to a destination of your choice such as an S3 bucket, a database, or a monitoring backend.

Seeing example Telemetry API extensions in action

This demo shows an example of using a telemetry extension to receive telemetry, batch it, and send it to a desired destination.

To set up the example, visit the GitHub repo for the extension implemented in the language of your choice and follow the instructions in the README.md file.

To configure the batching behavior, which controls when the extension sends the data, set the Lambda environment variable DISPATCH_MIN_BATCH_SIZE. When the number of buffered events reaches this threshold, the extension POSTs the batch of telemetry events to the destination specified in the DISPATCH_POST_URI environment variable.

You can configure an example DISPATCH_POST_URI for the extension to deliver the telemetry data using https://webhook.site/.
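
The batching behavior described above can be sketched as follows. This is an illustrative sketch, not the code from the sample extensions: the class name TelemetryBatcher, the fallback batch size of 10, and the injectable sender are assumptions. Only the two environment variable names come from the example.

```python
import json
import os
import urllib.request
from typing import Callable, List, Optional


def post_json(uri: str, events: List[dict]) -> None:
    """POST a batch of events as JSON to the destination URI."""
    request = urllib.request.Request(
        uri,
        data=json.dumps(events).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5):
        pass


class TelemetryBatcher:
    """Buffer events and send them once the minimum batch size is reached."""

    def __init__(self, min_batch_size: Optional[int] = None, uri: Optional[str] = None,
                 sender: Callable[[str, List[dict]], None] = post_json):
        # Fallback of 10 is an assumed default for this sketch.
        self.min_batch_size = min_batch_size or int(os.environ.get("DISPATCH_MIN_BATCH_SIZE", "10"))
        self.uri = uri or os.environ.get("DISPATCH_POST_URI", "")
        self.sender = sender
        self.buffer: List[dict] = []

    def add(self, events: List[dict], force: bool = False) -> bool:
        """Queue events; dispatch when the threshold is met or force is set.

        Returns True when a batch was sent. Pass force=True on SHUTDOWN so
        events from the last invocation are not left in the buffer.
        """
        self.buffer.extend(events)
        if self.buffer and (force or len(self.buffer) >= self.min_batch_size):
            batch, self.buffer = self.buffer, []
            self.sender(self.uri, batch)
            return True
        return False
```

Injecting the sender keeps the batching logic testable without a network call. An extension would use the default post_json sender.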

Lambda environment variables

Telemetry events for one invoke may be received and processed during the next invocation. Events for the last invoke may be processed during the SHUTDOWN event.

Test and invoke the function from the Lambda console, or AWS CLI. You can see that the webhook receives the telemetry data.

Webhook receiving telemetry data

You can also view the function and extension logs in CloudWatch Logs. The example extension includes verbose logging to understand the extension lifecycle.

CloudWatch Logs showing extension verbose logging

Sample Telemetry API events

When the extension receives telemetry data, each event contains a JSON dictionary with additional information, such as related metrics or trace spans. The following example shows the events for a function initialization. You can see that the function initializes with on-demand concurrency. The runtime version is Node.js 14, the initialization is successful, and the initialization duration is 123 milliseconds.

{
  "time": "2022-08-02T12:01:23.521Z",
  "type": "platform.initStart",
  "record": {
    "initializationType": "on-demand",
    "phase":"init",
    "runtimeVersion": "nodejs-14.v3",
    "runtimeVersionArn": "arn"
  }
}

{
  "time": "2022-08-02T12:01:23.521Z",
  "type": "platform.initRuntimeDone",
  "record": {
    "initializationType": "on-demand",
    "status": "success"
  }
}

{
  "time": "2022-08-02T12:01:23.521Z",
  "type": "platform.initReport",
  "record": {
    "initializationType": "on-demand",
    "phase": "init",
    "metrics": {
      "durationMs": 123.0
    }
  }
}
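
To illustrate how an extension might read these events, the following sketch folds the three initialization events into one summary. The function name summarize_init is hypothetical. The field names are the ones shown in the events above.

```python
def summarize_init(events: list) -> dict:
    """Collect init type, runtime version, status, and duration from platform.init* events."""
    summary = {}
    for event in events:
        record = event.get("record", {})
        event_type = event.get("type")
        if event_type == "platform.initStart":
            summary["initializationType"] = record.get("initializationType")
            summary["runtimeVersion"] = record.get("runtimeVersion")
        elif event_type == "platform.initRuntimeDone":
            summary["status"] = record.get("status")
        elif event_type == "platform.initReport":
            summary["durationMs"] = record.get("metrics", {}).get("durationMs")
    return summary
```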

Function invocation events include the associated requestId and tracing information connecting this invocation with the X-Ray tracing context, and platform spans showing response latency and response duration as well as invocation metrics such as duration in milliseconds.

{
  "time": "2022-08-02T12:01:23.521Z",
  "type": "platform.start",
  "record": {
    "requestId": "e6b761a9-c52d-415d-b040-7ba94b9452f3",
    "version": "$LATEST",
    "tracing": {
      "spanId": "54565fb41ac79632",
      "type": "X-Amzn-Trace-Id",
      "value": "Root=1-62e900b2-710d76f009d6e7785905449a;Parent=0efbd19962d95b05;Sampled=1"
    }
  }
}

{
  "time": "2022-08-02T12:01:23.521Z",
  "type": "platform.runtimeDone",
  "record": {
    "requestId": "e6b761a9-c52d-415d-b040-7ba94b9452f3",
    "status": "success",
    "tracing": {
      "spanId": "54565fb41ac79632",
      "type": "X-Amzn-Trace-Id",
      "value": "Root=1-62e900b2-710d76f009d6e7785905449a;Parent=0efbd19962d95b05;Sampled=1"
    },
    "spans": [
      {
        "name": "responseLatency",
        "start": "2022-08-02T12:01:23.521Z",
        "durationMs": 23.02
      },
      {
        "name": "responseDuration",
        "start": "2022-08-02T12:01:23.521Z",
        "durationMs": 20
      }
    ],
    "metrics": {
      "durationMs": 200.0,
      "producedBytes": 15
    }
  }
}

{
  "time": "2022-08-02T12:01:23.521Z",
  "type": "platform.report",
  "record": {
    "requestId": "e6b761a9-c52d-415d-b040-7ba94b9452f3",
    "metrics": {
      "durationMs": 220.0,
      "billedDurationMs": 300,
      "memorySizeMB": 128,
      "maxMemoryUsedMB": 90,
      "initDurationMs": 200.0
    },
    "tracing": {
      "spanId": "54565fb41ac79632",
      "type": "X-Amzn-Trace-Id",
      "value": "Root=1-62e900b2-710d76f009d6e7785905449a;Parent=0efbd19962d95b05;Sampled=1"
    }
  }
}

Building a Telemetry API extension

Lambda extensions run as independent processes in the execution environment and continue to run after the function invocation is fully processed. Because extensions run as separate processes, you can write them in a language different from the function code. We recommend implementing extensions using a compiled language as a self-contained binary. This makes the extension compatible with all the supported runtimes.

Extensions that use the Telemetry API have the following lifecycle.

Telemetry API lifecycle

  1. The extension registers itself using the Lambda Extension API and subscribes to receive INVOKE and SHUTDOWN events. With the Telemetry API, the registration response body contains additional information, such as function name, function version, and account ID.
  2. The extension starts a telemetry listener. This is a local HTTP or TCP endpoint. We recommend using HTTP rather than TCP.
  3. The extension uses the Telemetry API to subscribe to the desired telemetry event streams.
  4. The Lambda service POSTs telemetry stream data to your telemetry listener. We recommend batching the telemetry data as it arrives at the listener. You can perform any custom processing on this data and send it on to an S3 bucket, other custom destination, or an external observability service.
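
The lifecycle steps above can be sketched in Python using only the standard library. Treat this as a rough outline under stated assumptions rather than a reference implementation: the endpoint paths, header names, schema version, listener host name (sandbox.localdomain), listener port, fallback Runtime API address, and buffering values are assumptions based on the Lambda Extensions API and Telemetry API documentation, and should be checked there before use.

```python
import json
import os
import urllib.request
from http.server import BaseHTTPRequestHandler
from queue import Queue

# Lambda sets AWS_LAMBDA_RUNTIME_API in the execution environment; the fallback is a placeholder.
RUNTIME_API = os.environ.get("AWS_LAMBDA_RUNTIME_API", "127.0.0.1:9001")
LISTENER_PORT = 8080  # assumed port for this sketch
EVENTS: Queue = Queue()  # telemetry events waiting to be batched and dispatched


def register_extension(name: str) -> str:
    """Step 1: register for INVOKE and SHUTDOWN; returns the extension identifier."""
    request = urllib.request.Request(
        f"http://{RUNTIME_API}/2020-01-01/extension/register",
        data=json.dumps({"events": ["INVOKE", "SHUTDOWN"]}).encode("utf-8"),
        headers={"Lambda-Extension-Name": name},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.headers["Lambda-Extension-Identifier"]


class TelemetryHandler(BaseHTTPRequestHandler):
    """Step 2: HTTP listener that receives the telemetry batches POSTed in step 4."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        for event in json.loads(self.rfile.read(length) or b"[]"):
            EVENTS.put(event)
        self.send_response(200)
        self.end_headers()

    def log_message(self, *args):  # silence the default per-request logging
        pass


def build_subscription(port: int = LISTENER_PORT) -> dict:
    """Step 3: request body that subscribes to platform, function, and extension streams."""
    return {
        "schemaVersion": "2022-07-01",
        "types": ["platform", "function", "extension"],
        "buffering": {"maxItems": 1000, "maxBytes": 256 * 1024, "timeoutMs": 100},
        "destination": {"protocol": "HTTP", "URI": f"http://sandbox.localdomain:{port}"},
    }


def subscribe(extension_id: str, port: int = LISTENER_PORT) -> None:
    """Step 3: send the subscription request to the Telemetry API."""
    request = urllib.request.Request(
        f"http://{RUNTIME_API}/2022-07-01/telemetry",
        data=json.dumps(build_subscription(port)).encode("utf-8"),
        headers={"Lambda-Extension-Identifier": extension_id, "Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request):
        pass
```

An extension would call register_extension, serve TelemetryHandler with http.server.HTTPServer on a background thread, call subscribe, and then drain EVENTS in its event loop.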

See the Telemetry API documentation and sample extensions for additional details.

The Lambda Telemetry API supersedes the Lambda Logs API. While the Logs API remains fully functional, AWS recommends using the Telemetry API. New functionality is only available with the Telemetry API. An extension can subscribe to either the Logs API or the Telemetry API, but not both: after subscribing to one of them, any attempt to subscribe to the other returns an error.

Mapping Telemetry API schema to OpenTelemetry spans

The Lambda Telemetry API schema is semantically compatible with OpenTelemetry (OTEL). You can use events received from the Telemetry API to build and report OTEL spans. Three Telemetry API lifecycle events represent a single function invocation: platform.start, platform.runtimeDone, and platform.report. You should represent these as a single OTEL span. You can add more detail to your spans using the information available in platform.runtimeDone events under the spans property of the record.

Mapping of Telemetry API events to OTEL spans is described in the Telemetry API documentation.
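
As a rough illustration of this mapping, the following sketch folds the three invocation events into one span-like dictionary. It does not use the OpenTelemetry SDK, and the output shape (name, attributes, children) is an assumption chosen for readability, not the OTEL data model.

```python
from datetime import datetime, timedelta


def parse_time(value: str) -> datetime:
    """Parse the ISO 8601 timestamps used by Telemetry API events."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def build_invocation_span(events: list) -> dict:
    """Fold platform.start, platform.runtimeDone, and platform.report into one span."""
    span = {"name": "invocation", "attributes": {}, "children": []}
    for event in events:
        record = event["record"]
        kind = event["type"]
        if kind == "platform.start":
            span["requestId"] = record["requestId"]
            span["start"] = parse_time(event["time"])
            span["traceHeader"] = record.get("tracing", {}).get("value")
        elif kind == "platform.runtimeDone":
            span["status"] = record.get("status")
            # Platform spans such as responseLatency become child entries.
            for child in record.get("spans", []):
                span["children"].append({"name": child["name"], "durationMs": child["durationMs"]})
        elif kind == "platform.report":
            span["attributes"].update(record.get("metrics", {}))
    # Derive the end time from the reported duration when both are known.
    if "start" in span and "durationMs" in span["attributes"]:
        span["end"] = span["start"] + timedelta(milliseconds=span["attributes"]["durationMs"])
    return span
```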

Metrics and pricing

The Telemetry API introduces new per-invoke metrics to help you understand the impact of extensions on your function’s performance. The metrics are available within the platform.runtimeDone event.

  • platform.runtime measures the time taken by the Lambda Runtime to run your function handler code.
  • producedBytes measures the number of bytes returned during the invoke phase.

There are also two new trace spans available within the platform.runtimeDone event:

  • responseLatency measures the time taken by the Runtime to send a response.
  • responseDuration measures the time taken by the Runtime to finish sending the response from when it starts streaming it.

Extensions using Telemetry API, like other extensions, share the same billing model as Lambda functions. When using Lambda functions with extensions, you pay for requests served, and the combined compute time used to run your code and all extensions, in 1-ms increments. To learn more about the billing for extensions, visit the Lambda pricing page.

Conclusion

The Lambda Telemetry API allows you to receive enhanced telemetry data more easily using your preferred monitoring and observability tools. The Telemetry API enhances the functionality of the Logs API to receive logs, metrics, and traces directly from the Lambda service. Developers and operators can send telemetry to destinations without custom libraries, with reduced latency, and simplified permissions.

To see how the Telemetry API works, try the demos in the GitHub repository.

Build your own extensions using the Telemetry API today, or use extensions provided by the Lambda observability partners.

For more serverless learning resources, visit Serverless Land.

Enriching operational events with AWS Serverless

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/enriching-operational-events-with-aws-serverless/

This post was written by Ben Moses, Senior Solutions Architect, Enterprise.

AWS Serverless is a fit for many IT automation and operations use cases, especially for reacting to events. Infrastructure events are a useful way to understand the health of the infrastructure that supports your applications and customers. This blog post examines using serverless services to enrich these operational events.

The scenario used in this post shows how an infrastructure event can be intercepted in real time, enriched with additional information from your AWS environment and workloads, and sent to a downstream consumer with the added information.

This example focuses on Amazon EC2 state change events. The concept applies to any type of event, for example those emitted by other AWS services to Amazon CloudWatch Events. These events could also include events produced by AWS Config, and some of AWS CloudTrail’s events, including CloudTrail Insights.

The purpose is to add more valuable information and context to events in real-time. Operators and downstream consumers can then identify emerging patterns in near real-time.

How does this happen today?

It is common for existing solutions to store infrastructure events in whatever format the source system generates, or in a standardized open or proprietary format. Operations staff and systems then analyze these logs to understand patterns and to support root cause analysis. This data must often be enriched by other sources to give it context and meaning. This is done either in a scheduled batch operation by using CSV data from other systems, or by integrating with other enterprise tooling.

The state of your cloud infrastructure changes frequently due to the elasticity and disposability of resources. This can cause an issue with your data quality when using the scheduled batch method. When you come to enrich an infrastructure event, the state may have changed by the time your scheduled batch runs. This leads to gaps or inaccuracies in data, which makes it harder for operators to spot trends and anomalies.

A serverless approach

This example uses serverless services and concepts from event driven architecture (EDA). With this architecture, you only pay when events happen and are enriched. There’s no need for any third-party tooling, and your events are enriched in near real-time.

The EC2 “State Change Event” is enriched by obtaining the instance’s name tag, if it has one. The end-to-end journey looks like this:

Overview

  1. An EC2 instance’s state changes (for example, it shuts down or restarts).
  2. An Amazon EventBridge rule that matches the event pattern triggers a target action to run an AWS Step Functions state machine.
  3. The state machine transforms inputs, makes a native AWS API SDK call to the EC2 service to find a name tag, and emits a newly enriched event back to EventBridge.
  4. An EventBridge rule matching the enriched event triggers an action to send an email via Amazon SNS to simulate a downstream consumer.

EventBridge is a serverless event bus that can be used with event driven architectures on AWS. An EventBridge rule is defined with a pattern, and if an event matches that pattern, then the rule’s target action is triggered. In this example, the rule is:

{
  "detail-type": ["EC2 Instance State-change Notification"],
  "source": ["aws.ec2"]
}

An EC2 state change event looks like this:

{
  "version": "0",
  "id": "672123fe-53aa-3b22-3b37-1fae26df2aff",
  "detail-type": "EC2 Instance State-change Notification",
  "source": "aws.ec2",
  "account": "1234567890",
  "time": "2022-08-17T18:25:01Z",
  "region": "eu-west-1",
  "resources": [
    "arn:aws:ec2:eu-west-1:1234567890:instance/i-1234567890"
  ],
  "detail": {
    "instance-id": "i-1234567890",
    "state": "running"
  }
}

See the detail-type and source fields in the event. These match the rule and this entire event payload is passed on to the next component of the architecture: the Step Functions state machine.
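
To make the matching concrete, here is a simplified sketch of how a pattern like the one above is evaluated against an event. It is an approximation for illustration only: the function name matches is hypothetical, and it handles exact-value lists and nested objects but none of the richer EventBridge pattern operators.

```python
def matches(pattern: dict, event: dict) -> bool:
    """Return True when every pattern field lists the event's value for that field.

    Simplified: supports exact-value lists and nested objects only, not the
    full EventBridge pattern syntax (prefix, numeric, anything-but, and so on).
    """
    for key, expected in pattern.items():
        actual = event.get(key)
        if isinstance(expected, dict):
            # Nested pattern: the event must have a matching nested object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True
```

Fields that the pattern does not mention, such as account or region, are ignored, which is why the whole EC2 event passes through to the target.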

Step Functions uses JSONPath to select, transform, and move data through the states within a state machine. This flexibility means that, in this example, no compute resources such as AWS Lambda are required. This can mean less custom code, lower cost, and less complexity.

Step Functions Workflow Studio lets you design workflows visually. These are the key actions that take place when the state machine runs using the EC2 state change event:

Step Functions state machine

1. Remove problem characters from input

Pass states allow us to transform inputs and outputs. In this architecture, a Pass state is used to remove any problem characters from the incoming event that are known to cause issues in future steps, such as API calls to services.

In this example, the parameters for the API call used in Step 2 require the EC2 instance ID. This information is in the detail of the original event, but the API action can’t use a field name with a hyphen in it.

To solve this, use a JSONPath Parameter to effectively rewrite this information without the hyphen. This creates a new field named instanceid, which is assigned the value from the original event’s detail.

{
  "instanceid.$": "$.detail.instance-id"
}
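
The effect of this Pass state can be mimicked in a few lines of Python. This is a sketch under assumptions: the helper names are hypothetical, the path resolver only understands simple dotted paths, and the result location detail.refined is inferred from the paths used in the later steps rather than stated for this state.

```python
import copy


def resolve_path(document: dict, path: str):
    """Resolve a simple dotted JSONPath such as $.detail.instance-id."""
    value = document
    for part in [p for p in path.lstrip("$").split(".") if p]:
        value = value[part]
    return value


def apply_pass_state(state_input: dict, parameters: dict, result_path: list) -> dict:
    """Evaluate `.$` parameters against the input and store the result at result_path."""
    result = {}
    for key, value in parameters.items():
        if key.endswith(".$"):
            result[key[:-2]] = resolve_path(state_input, value)  # dynamic value
        else:
            result[key] = value  # static value, copied as-is
    output = copy.deepcopy(state_input)  # leave the caller's input untouched
    target = output
    for part in result_path[:-1]:
        target = target.setdefault(part, {})
    target[result_path[-1]] = result
    return output
```

Calling apply_pass_state(event, {"instanceid.$": "$.detail.instance-id"}, ["detail", "refined"]) returns the original event with a new detail.refined.instanceid field.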

2. Get instance name from Tag

The “EC2: DescribeInstances” task in Step Functions is an example of a native SDK integration with an AWS service. This action expects a single parameter to the API, an array of EC2 instance IDs.

{
  "InstanceIds.$": "States.Array($.detail.refined.instanceid)"
}

The States.Array() intrinsic function is used to wrap the instance ID from the re-written field created in step 1. This single-member array is then passed to the EC2 Describe Instances API.

When a response is received from the EC2 Describe Instances API call, it is passed to a Result Selector. The purpose of this is to extract the value of a “Name” tag, if one was returned from the EC2 Describe Instances API.

Step Functions supports the use of JSONPath filter expressions.

{
  "instancename.$": "$..Reservations[0].Instances[0].Tags[?(@.Key=='Name')].Value",
  "instanceid.$": "$.Reservations[0].Instances[0].InstanceId"
}

To understand the advanced JSONPath filter expression used in this example, read this blog post.
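
For readers more familiar with Python than JSONPath, the Result Selector above is roughly equivalent to the following sketch. The function names are hypothetical, and unlike the state machine this version returns an empty list, rather than raising an error, when the instance has no tags.

```python
def name_tag_values(describe_instances_response: dict) -> list:
    """Return the values of every tag whose Key is "Name" on the first instance."""
    instance = describe_instances_response["Reservations"][0]["Instances"][0]
    return [tag["Value"] for tag in instance.get("Tags", []) if tag.get("Key") == "Name"]


def select_result(describe_instances_response: dict) -> dict:
    """Mirror the two fields produced by the Result Selector."""
    instance = describe_instances_response["Reservations"][0]["Instances"][0]
    return {
        "instancename": name_tag_values(describe_instances_response),  # a list, like the filter result
        "instanceid": instance["InstanceId"],
    }
```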

If an error occurs with the API call, or the filter expression is unable to find a “Name” tag on the EC2 instance, then Step Functions allows you to handle these errors within the workflow.

3. Convert instance name to a string

The output from the previous state returns an array, but an EC2 instance can only have one unique “Name” tag. A pass state is used again, with a parameter as seen in Step 1. This parameter expression takes the first element from the array and stores it in a new field named instancename.

{
  "instancename.$": "$.detail.refined.instancename[0]",
  "instanceid.$": "$.detail.refined.instanceid"
}

As with previous steps, the instanceid is re-written as part of the output, and both of these values are appended to the state’s output.

4. Get default name from Parameter Store

If the filter expression in the result selector in step 2 fails for any reason, then Step Functions error handling moves here.

Failures can happen for a variety of reasons, and with Step Functions, you can branch error handling by error type. In this example, all errors are handled in the same way, whether the cause is a missing “Name” tag or a permissions issue. In this architecture, a default placeholder value is used in place of the name of the instance. In your context, a different approach may be more suitable.

The default placeholder name is stored as a static value in AWS Systems Manager Parameter Store. The native Systems Manager: GetParameter action within Step Functions can retrieve this value directly. An advantage of this approach is that the parameter can be updated externally without having to make any changes to the Step Functions state machine itself.

5. Add ID back to refined

A Pass state is used to format the response from the Parameter Store API, and a parameter expression then appends the default instance name to the output.

Whether the workflow execution followed the intended execution path, or encountered an error, there is now an enriched event payload with an instance name.
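
The two execution paths (steps 3 to 5) can be summarized in a short sketch. The names refine and DEFAULT_NAME are hypothetical, and the hard-coded placeholder stands in for the value that the workflow reads from Parameter Store.

```python
DEFAULT_NAME = "unnamed-instance"  # hypothetical placeholder; the workflow reads it from Parameter Store


def refine(instance_id: str, name_values, get_default=lambda: DEFAULT_NAME) -> dict:
    """Return the refined block: the first Name tag value, or the default when none exists."""
    try:
        name = name_values[0]  # step 3: take the single element from the array
    except (IndexError, TypeError):
        name = get_default()  # steps 4 and 5: fall back to the stored default
    return {"instancename": name, "instanceid": instance_id}
```

Passing get_default as a callable mirrors the workflow: the default is only fetched when the happy path fails.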

6. Emit enriched event

The EventBridge: PutEvents native SDK action within Step Functions is used to construct and emit the enriched event.

{
  "Entries": [
    {
      "Detail": {
        "Message.$": "$"
      },
      "DetailType": "EnrichedEC2Event",
      "EventBusName": "serverless-event-enrichment-ApplicationEventBus",
      "Source": "custom.enriched.ec2"
    }
  ]
}

The DetailType and Source of the enriched event are custom values, specified in the last step of the state machine. As you consider schemas for your events within your organization, note that the aws prefix is reserved for AWS service events.
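
If you emit a similar event from your own code instead of Step Functions, the entry can be built as in the following sketch. The function name is hypothetical. One difference from the state machine definition is that, when calling PutEvents through an SDK, Detail is passed as a JSON string rather than an object.

```python
import json


def build_put_events_entry(event: dict, bus_name: str,
                           source: str = "custom.enriched.ec2",
                           detail_type: str = "EnrichedEC2Event") -> dict:
    """Build one PutEvents entry that wraps the enriched event under Message."""
    if source.startswith("aws."):
        raise ValueError("sources starting with 'aws.' are reserved for AWS services")
    return {
        # Serialize the detail, because the SDK call expects a JSON string here.
        "Detail": json.dumps({"Message": event}),
        "DetailType": detail_type,
        "EventBusName": bus_name,
        "Source": source,
    }
```

With boto3, the returned dictionary would be passed as Entries=[entry] to the put_events call of an events client.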

The enriched event payload looks like this:

{
  "version": "0",
  "id": "a80e378b-e9a7-8007-1f18-b947e6d72c4b",
  "detail-type": "EnrichedEC2Event",
  "source": "custom.enriched.ec2",
  "account": "123456789",
  "time": "2022-08-17T18:25:03Z",
  "region": "eu-west-1",
  "resources": [
    "arn:aws:states:eu-west-1:123456789:stateMachine:EventEnrichmentStateMachine-2T5jFlCPOha1",
    "arn:aws:states:eu-west-1:123456789:execution:EventEnrichmentStateMachine-2T5jFlCPOha1:672123fe-53aa-3b22-3b37-1fae26df2aff_90821b68-ba92-2374-5015-8804c8da5769"
  ],
  "detail": {
    "Message": {
      "version": "0",
      "id": "672123fe-53aa-3b22-3b37-1fae26df2aff",
      "detail-type": "EC2 Instance State-change Notification",
      "source": "aws.ec2",
      "account": "123456789",
      "time": "2022-08-17T18:25:01Z",
      "region": "eu-west-1",
      "resources": [
        "arn:aws:ec2:eu-west-1:123456789:instance/i-123456789"
      ],
      "detail": {
        "instance-id": "i-123456789",
        "state": "running",
        "refined": {
          "instancename": "ec2-enrichment-demo-instance",
          "instanceid": "i-123456789"
        }
      }
    }
  }
}

Consuming enriched events

When enriching event data in real-time, the events are only valuable if they’re consumed. To use these enriched events, a consuming service must create and own a new EventBridge rule on the custom application bus. In this architecture, an appropriate rule pattern is:

{
  "detail-type": ["EnrichedEC2Event"],
  "source": ["custom.enriched.ec2"]
}

The target of the rule depends on the use case. For operational events, service management applications or log aggregation services may make the most sense. In this example, the rule has an SNS topic as the target. When SNS receives a message, it is sent to an operator via email. With EventBridge, future consumers can add their own rules to match the enriched events, and add their specific target actions to suit their use case.

Conclusion

This post shows how you can create rules in EventBridge to react to operational events from AWS services. These events are routed to Step Functions, which runs a workflow consisting of steps to enrich the event, handle errors, and emit the enriched event. The example shows how to consume the enriched events, resulting in an operator receiving an email.

This example is available on GitHub as an AWS Serverless Application Model (AWS SAM) template. It contains instructions to deploy, test, and then remove all of the resources when you’ve finished.

For more serverless learning resources, visit Serverless Land.

Server-side rendering micro-frontends – the architecture

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/server-side-rendering-micro-frontends-the-architecture/

This post is written by Luca Mezzalira, Principal Specialist Solutions Architect, Serverless.

Microservices are a common pattern for building distributed systems. As frontend developers have modified their approaches to build architectures at scale, many are building micro-frontends.

This blog series explores how to implement micro-frontends using a server-side rendering (SSR) approach with AWS services. This first article covers the architecture characteristics and building blocks for designing a successful micro-frontends architecture in the AWS Cloud.

What are micro-frontends?

Micro-frontends are the technical representation of a business subdomain. They allow independent teams to work in parallel, reducing external dependencies and increasing delivery throughput. They embody several microservices characteristics such as governance decentralization, design for failure, and evolutionary design.

The main difference between micro-frontends and components is related to the domain ownership present inside a micro-frontend. With components, the domain knowledge is usually delegated to its container, which knows how to use the component’s property based on the context. Owning the domain inside a micro-frontend enables the independence that you expect in a distributed system. This doesn’t mean that micro-frontends cannot communicate or share resources, but the mindset is different compared with components.

If you are using microservices today, you may benefit from micro-frontends for scaling your frontend applications. Before micro-frontends, scaling was based primarily on developers’ expertise. Micro-frontends allow you to modernize frontend applications iteratively like you would with microservices. Every user downloads only the code needed for accomplishing a specific task, improving the performance and user experience of a web application.

Architecture characteristics

This blog series builds a product details page of an example ecommerce website using micro-frontends with serverless infrastructure.

Page layout

The page is composed of:

  • A template that includes a header. This could include more common parts but uses one in this example.
  • A notifications micro-frontend that is client-side rendered. The notifications system must react to user interactions, so cannot be server-side rendered with the rest of the page.
  • A product details micro-frontend.
  • A reviews micro-frontend.

Every micro-frontend is independent and can be developed by different teams working on the same project. This can reduce external dependencies and potential bugs across the entire application.

The key system characteristics of this project are:

  1. Server-side rendering: The system must be designed with a server-side rendering approach. This provides fast rendering of the page inside modern browsers and reduces the need for client-side JavaScript for rendering the page.
  2. Framework agnostic: The solution must work with a broad variety of JavaScript libraries available and not be bound or optimized to a specific framework.
  3. Use optimization best practices: Optimization is a key feature for server-side rendering applications. Many industries rely on these characteristics for increasing sales. This example encapsulates core web vitals metrics, progressive hydration, and different levels of caches to speed up the response times of the webpages.
  4. Team independence: Every micro-frontend must be developed with minimum external dependencies. Constant coordination across teams can be a sign of design-time coupling that invalidates the purpose behind a distributed system.
  5. Serverless infrastructure for frontend developers: The serverless paradigm helps developers focus on the business logic instead of infrastructure, using a “pay for value” model, which helps to reduce costs. You can cache micro-frontend responses and reduce the traffic on the origin and the need to scale every part of the system in the same way.

High-level architecture design

This is the high-level design to incorporate these architectural characteristics:

Architectural overview

  1. The application entry point is a content delivery network (CDN) that is used for caching, performance, and security reasons.
  2. The server-side rendering approach requires a place to store all the static files to hydrate the JavaScript code in the browser and for styling components.
  3. Page requests require a UI composer that retrieves the micro-frontends and stitches them together to provide the page consumed by a browser. It streams the final HTML page to the browser to enhance the largest contentful paint (LCP) metric from the core web vitals.
  4. Decoupling micro-frontends from the UI composer relies on two mechanisms: a micro-frontends discovery that acts like a service discovery in a microservice architecture, and an HTML template per page that describes where to inject the micro-frontends inside a page. The templates can live in the same repository where the other static files are present.
  5. The notification micro-frontend reacts to user interactions, providing a notification when a user adds a product to the cart.
  6. The product details micro-frontend has highly cacheable data that doesn’t require many changes over time.
  7. The reviews micro-frontend must retrieve user reviews of a specific product.

The key element for avoiding design-time coupling in this architecture is the micro-frontends discovery. The main advantages of this approach are discoverability, which simplifies multi-environment strategies, and a reduced blast radius through blue/green deployments or canary releases. This topic will be covered in depth in an upcoming post.

From high-level design into implementation

The framework-agnostic approach helps to enable control over system evolution. It achieves this by using HTML-over-the-wire, where every micro-frontend renders an HTML fragment and returns it to the UI composer.

When the UI composer gathers the HTML fragments, it composes the final page to render using transclusion. Every page is represented by a specific template hosted in static files. The UI composer retrieves the template and then retrieves placeholder references in the template that can be replaced with the micro-frontend fragments.
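
The transclusion step can be sketched as a simple placeholder substitution. This is a minimal illustration, not the implementation used in the series: the `<micro-frontend id="...">` placeholder syntax, the `composePage` function, and the sample fragments are all assumptions made for the example.

```javascript
// Hypothetical placeholder syntax: <micro-frontend id="..."></micro-frontend>.
// The series does not specify the real template format at this point.
const PLACEHOLDER = /<micro-frontend id="([^"]+)"><\/micro-frontend>/g;

// Replace each placeholder in the template with the matching HTML fragment.
// Unknown ids resolve to an empty string so one missing fragment does not
// break the rest of the page.
function composePage(template, fragments) {
  return template.replace(PLACEHOLDER, (_match, id) => fragments[id] ?? '');
}

const template = [
  '<html><body>',
  '<header>Example store</header>',
  '<micro-frontend id="product"></micro-frontend>',
  '<micro-frontend id="reviews"></micro-frontend>',
  '</body></html>',
].join('\n');

// HTML fragments as they might be returned by each micro-frontend.
const fragments = {
  product: '<section id="product">Product details</section>',
  reviews: '<section id="reviews">3 reviews</section>',
};

console.log(composePage(template, fragments));
```

Because the composer only handles HTML strings, each micro-frontend can render its fragment with any framework, which is the point of the HTML-over-the-wire approach.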

This is the architecture used:

Architecture diagram

  1. Amazon CloudFront provides a unique entry point to the application. The distribution has two origins: the first for static files and the second for the UI composer.
  2. Static files are hosted in an Amazon S3 bucket. They are consumed by the browser and the UI composer for HTML templates.
  3. The UI composer runs on a container cluster in AWS Fargate. Using a containerized solution allows you to use streaming capabilities and multithreaded rendering if needed.
  4. AWS Systems Manager Parameter Store is used as a basic micro-frontends discovery system. This service is a key-value store used by the UI composer for retrieving the micro-frontends endpoints to consume.
  5. The notifications micro-frontend stores the optimized JavaScript bundle in the S3 bucket. This renders on the client since it must react to user interactions.
  6. The reviews micro-frontend is composed of an AWS Lambda function with the user reviews stored in Amazon DynamoDB. It’s rendered fully server-side and it outputs an HTML fragment.
  7. The product details micro-frontend is a low-code micro-frontend using AWS Step Functions. The Express Workflow can be invoked synchronously and contains the logic for rendering the HTML fragment and a caching layer. This increases performance due to the native integration with over 200 AWS services.
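
As a rough sketch of how the UI composer might use a key-value discovery system, the following example resolves micro-frontend endpoints and gathers their fragments in parallel. It makes several assumptions: an in-memory `Map` stands in for Parameter Store, the parameter names and URLs are hypothetical, and `fetchFragment` is an injected stub rather than a real HTTP call.

```javascript
// In-memory stand-in for AWS Systems Manager Parameter Store. The parameter
// names and endpoint URLs below are hypothetical.
const parameterStore = new Map([
  ['/ecommerce/mfe/product', 'https://product.example.com/render'],
  ['/ecommerce/mfe/reviews', 'https://reviews.example.com/render'],
]);

// Resolve a micro-frontend id to its endpoint, or null when not registered.
function discover(store, id) {
  return store.get(`/ecommerce/mfe/${id}`) ?? null;
}

// Fetch every fragment in parallel. `fetchFragment` is injected so the sketch
// runs without a network. A failed or undiscovered micro-frontend yields an
// empty fragment instead of failing the whole page.
async function gatherFragments(store, ids, fetchFragment) {
  const results = await Promise.allSettled(
    ids.map(async (id) => {
      const endpoint = discover(store, id);
      if (!endpoint) throw new Error(`no endpoint for ${id}`);
      return fetchFragment(endpoint);
    })
  );
  return Object.fromEntries(
    results.map((result, i) => [
      ids[i],
      result.status === 'fulfilled' ? result.value : '',
    ])
  );
}

// Stub renderer standing in for an HTTP call to the micro-frontend.
const fakeFetch = async (endpoint) => `<div data-src="${endpoint}"></div>`;

gatherFragments(parameterStore, ['product', 'reviews', 'missing'], fakeFetch)
  .then((fragments) => console.log(fragments));
```

Keeping the endpoints in a store outside the composer means a team can repoint its micro-frontend without redeploying the composer.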

Using this approach, every team developing a micro-frontend can build and evolve their business domain independently. The main touchpoints with other teams are related to the initial integrations and the communication mechanism between micro-frontends present in the same page. Once these points are addressed, every team reduces external dependencies and can embrace the evolutionary nature of micro-frontends.

Conclusion

This first post starts the journey into micro-frontends, a distributed architecture for frontend applications. The next post will explore the UI composer and micro-frontends discovery implementations.

If you are interested in learning more about micro-frontends, see the micro-frontends decisions framework, a mental model created for the initial complexity of approaching micro-frontends design. When used as a north star, the decisions framework simplifies the development of micro-frontends applications.

In the AWS reference architectures section, you can find a complete diagram similar to the application described in this blog series with additional details.

For more serverless learning resources, visit Serverless Land.

Serverless and Application Integration sessions at AWS re:Invent 2022

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/serverless-and-application-integration-sessions-at-aws-reinvent-2022/

This post is written by Josh Kahn, Tech Leader, AWS Serverless.

AWS re:Invent 2022 is only a few weeks away, featuring an exciting slate of sessions on Serverless and Application Integration. This post highlights many of the sessions we are hosting on Serverless and Application Integration. It groups sessions by theme to help you quickly find the sessions most interesting to you.

AWS re:Invent 2022

As in past years, the conference offers a variety of session formats:

  • Breakout sessions: lecture-style presentations delivered by AWS experts, builders, and customers.
  • Builder’s sessions: smaller sessions led by AWS experts during which you will build a project on your own laptop.
  • Chalk talks: interactive sessions led by experts on a variety of topics. Share your own experiences and feedback.
  • Workshops: hands-on learning sessions designed to help you learn about new technologies. Bring your own laptop.

For detailed descriptions and schedule, visit the AWS re:Invent Session Catalog. If you are attending re:Invent, we would love to connect at our AWS Village and Serverlesspresso booths in the Expo or the Modern Applications Zone at the Venetian. You can also reach out to your AWS account team.

Don’t have a ticket yet? Join us in Las Vegas from November 28-December 2, 2022 by registering for re:Invent 2022.

Leadership session (SVS210)

Join Holly Mesrobian, Vice President of Serverless Compute at AWS, to learn how serverless technology empowers organizations to go to market faster while lowering cost across a wide range of applications. Learn about the innovations happening at all layers of the stack, across both serverless functions and serverless containers. Explore newly released innovations that enable more secure, reliable, and performant applications.

Getting started

Are you new to Serverless or taking your first steps? Hear from AWS experts and customers on best practices and strategies for building serverless workloads. Get hands-on with services by building the next great “to do” app or customer experience for a theme park:

We also offer a series of Builder’s Sessions where you can build the same serverless project using three different infrastructure as code frameworks (attend one or more). These sessions are an opportunity to test drive another IaC framework or understand how your framework of choice can be used with serverless:

Event-driven architectures

Event-driven architectures (EDA) are a popular approach to building modern applications. EDA utilizes events (a change in state) to communicate between decoupled services. This architectural approach lends itself well to a wide variety of use cases from ecommerce to order fulfillment, with individual components able to scale (and fail) independently.

Whether you are getting started with EDA, want to get hands-on, or dive into complex architectures, there is a session for you:

Building serverless architectures

Explore the range of tools available to build serverless architectures and cross-cutting concerns, such as security and observability. These sessions cover the brass tacks of building with serverless, going beyond “hello world” to help builders understand how to implement a serverless strategy:

Orchestration

AWS offers several options to orchestrate complex workflows. Whether you need to tightly control data processing workflows or user sign-ups, you can take advantage of these orchestration engines to simplify, become more agile, and modernize your workflows.

Integration patterns

Explore the variety of enterprise integration patterns available using AWS, including Amazon SNS, Amazon SQS, Amazon MQ, and more. These sessions explore the wide variety of patterns available using managed services:

Advanced topics

If you are already familiar with serverless, advanced sessions provide an opportunity to go deeper, including under the hood of the AWS Lambda service. Learn advanced design patterns, best practices, and how to build performant, reliable workloads:

Building serverless applications with Java

New this year, there are several sessions dedicated to building serverless applications with the Java runtime. These sessions dive deep on best practices for building performant Java-based applications:

Other talks

Serverless has become such a popular topic that you will also find related sessions in other tracks. This list is not exhaustive, but includes talks that you may want to explore:

If you are unable to join us in-person, Breakout Sessions will be available via our YouTube channel after the event. Contact your AWS Account Team if you are interested in learning more about any of these sessions or how to bring our experts to you.

We look forward to seeing you at re:Invent 2022! For more serverless learning resources, visit Serverless Land.

Propagating valid mTLS client certificate identity to downstream services using Amazon API Gateway

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/propagating-valid-mtls-client-certificate-identity-to-downstream-services-using-amazon-api-gateway/

This blog is written by Omkar Deshmane, Senior SA and Anton Aleksandrov, Principal SA, Serverless.

This blog shows how to use Amazon API Gateway with a custom authorizer to process incoming requests, validate the mTLS client certificate, extract the client certificate subject, and propagate it to the downstream application in a base64 encoded HTTP header.

This pattern allows you to terminate mTLS at the edge so downstream applications do not need to perform client certificate validation. With this approach, developers can focus on application logic and offload mTLS certificate management and validation to a dedicated service, such as API Gateway.

Overview

Authentication is one of the core security aspects that you must address when building a cloud application. Successful authentication proves you are who you are claiming to be. There are various common authentication patterns, such as cookie-based authentication, token-based authentication, or the topic of this blog, certificate-based authentication.

Transport Layer Security (TLS) certificates are at the core of a secure and safe internet. TLS certificates secure the connection between the client and server by encrypting data, ensuring private communication. When using the TLS protocol, the server must prove its identity to the client using a certificate signed by a certificate authority trusted by the client.

Mutual TLS (mTLS) introduces an additional layer of security, in which both the client and server must prove their identities to each other. Developers commonly use mTLS for application-to-application authentication, where digital certificates represent both the client and server apps. We highly recommend decoupling the mTLS implementation from the application business logic so that you do not have to update the application when changing the mTLS configuration. It is a common pattern to implement the mTLS authentication and termination in a network appliance at the edge, such as Amazon API Gateway.

In this solution, we show a pattern of using API Gateway with an authorizer implemented with AWS Lambda to validate the mTLS client certificate, extract the client certificate subject, and propagate it to the downstream application in a base64 encoded HTTP header.

While this blog describes how to implement this pattern for identities extracted from the mTLS client certificates, you can generalize it and apply it to propagating information obtained via any other means of authentication.

mTLS Sample Application

This blog includes a sample application implemented using the AWS Serverless Application Model (AWS SAM). It creates a demo environment containing resources like API Gateway, a Lambda authorizer, and an Amazon EC2 instance, which simulates the backend application.

The EC2 instance is used for the backend application to mimic common customer scenarios. You can use any other type of compute, such as Lambda functions or containerized applications with Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS), as a backend application layer as well.

The following diagram shows the solution architecture:

Example architecture diagram

  1. Store the client certificate in a trust store in an Amazon S3 bucket.
  2. The client makes a request to the API Gateway endpoint, supplying the client certificate to establish the mTLS session.
  3. API Gateway retrieves the trust store from the S3 bucket. It validates the client certificate, matches the trusted authorities, and terminates the mTLS connection.
  4. API Gateway invokes the Lambda authorizer, providing the request context and the client certificate information.
  5. The Lambda authorizer extracts the client certificate subject. It performs any necessary custom validation, and returns the extracted subject to API Gateway as a part of the authorization context.
  6. API Gateway injects the subject extracted in the previous step into the integration request HTTP header and sends the request to a downstream endpoint.
  7. The backend application receives the request, extracts the injected subject, and uses it with custom business logic.

Prerequisites and deployment

Some resources created as part of this sample architecture deployment have associated costs, both when running and idle. This includes resources like Amazon Virtual Private Cloud (Amazon VPC), VPC NAT Gateway, and EC2 instances. We recommend deleting the deployed stack after exploring the solution to avoid unexpected costs. See the Cleaning Up section for details.

Refer to the project code repository for instructions to deploy the solution using AWS SAM. The deployment provisions multiple resources, taking several minutes to complete.

Following the successful deployment, refer to the RestApiEndpoint variable in the Output section to locate the API Gateway endpoint. Note this value for testing later.

AWS CloudFormation output

Key areas in the sample code

There are two key areas in the sample project code.

In src/authorizer/index.js, the Lambda authorizer code extracts the subject from the client certificate. It returns the value as part of the context object to API Gateway. This allows API Gateway to use this value in the subsequent integration request.

const crypto = require('crypto');

exports.handler = async (event) => {
    console.log('> handler', JSON.stringify(event, null, 4));

    // API Gateway passes the validated client certificate in PEM format.
    const clientCertPem = event.requestContext.identity.clientCert.clientCertPem;
    const clientCert = new crypto.X509Certificate(clientCertPem);
    // The subject is newline-separated; join its parts into a single line.
    const clientCertSub = clientCert.subject.replaceAll('\n', ',');

    const response = {
        principalId: clientCertSub,
        // Values in context are available to the integration request mapping.
        context: { clientCertSub },
        policyDocument: {
            Version: '2012-10-17',
            Statement: [{
                Action: 'execute-api:Invoke',
                Effect: 'Allow',
                Resource: event.methodArn
            }]
        }
    };

    console.log('Authorizer Response', JSON.stringify(response, null, 4));
    return response;
};

In template.yaml, API Gateway injects the client certificate subject previously extracted by the Lambda authorizer into the integration request as the X-Client-Cert-Sub HTTP header. X-Client-Cert-Sub is a custom header name and you can choose any other custom header name instead.

SayHelloGetMethod:
    Type: AWS::ApiGateway::Method
    Properties:
      AuthorizationType: CUSTOM
      AuthorizerId: !Ref CustomAuthorizer
      HttpMethod: GET
      ResourceId: !Ref SayHelloResource
      RestApiId: !Ref RestApi
      Integration:
        Type: HTTP_PROXY
        ConnectionType: VPC_LINK
        ConnectionId: !Ref VpcLink
        IntegrationHttpMethod: GET
        Uri: !Sub 'http://${NetworkLoadBalancer.DNSName}:3000/'
        RequestParameters:
          'integration.request.header.X-Client-Cert-Sub': 'context.authorizer.clientCertSub' 

Testing the example

You create a client key and certificate during the deployment, which are stored in the /certificates directory. Use the curl command to make a request to the REST API endpoint using these files.

curl --cert certificates/client.pem --key certificates/client.key \
<use the RestApiEndpoint found in CloudFormation output>

Example flow diagram

The client request to API Gateway uses mTLS with a client certificate supplied for mutual TLS authentication. API Gateway uses the Lambda authorizer to extract the certificate subject, and injects it into the integration request.

The HTTP server runs on the EC2 instance, simulating the backend application. It accepts the incoming request and echoes it back, supplying request headers as part of the response body. The HTTP response received from the backend application contains a simple message and a copy of the request headers sent from API Gateway to the backend.

One header is x-client-cert-sub, which contains the Common Name value you provided for the client certificate during creation. Verify that the value matches the Common Name that you provided when generating the client certificate.
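
On the backend side, reading the propagated identity could look like the following sketch. The `parseSubject` and `clientCommonName` functions are hypothetical helpers, not part of the sample repository, and the sample subject value is invented. The parser assumes a comma-separated subject whose attribute values contain no commas or equals signs, which holds for simple test certificates but not for every subject.

```javascript
// Parse a subject string such as "C=US,O=Example Corp,CN=client.example.com"
// into an object of attribute names and values. Assumes values contain no
// commas or equals signs.
function parseSubject(subject) {
  const attributes = {};
  for (const part of subject.split(',')) {
    const separator = part.indexOf('=');
    if (separator === -1) continue; // skip malformed parts
    const name = part.slice(0, separator).trim();
    attributes[name] = part.slice(separator + 1).trim();
  }
  return attributes;
}

// Read the propagated subject from lowercased request headers, as exposed
// by Node.js HTTP servers, and return the Common Name if present.
function clientCommonName(headers) {
  const subject = headers['x-client-cert-sub'];
  return subject ? (parseSubject(subject).CN ?? null) : null;
}

// Hypothetical header value for illustration.
const headers = {
  'x-client-cert-sub': 'C=US,O=Example Corp,CN=client.example.com',
};

console.log(clientCommonName(headers));
```

A backend should only trust this header when it is reachable exclusively through API Gateway, since any caller that can reach the backend directly could set the header itself.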

Response example

API Gateway validated the mTLS client certificate, used the Lambda authorizer to extract the subject common name from the certificate, and forwarded it to the downstream application.

Cleaning up

Use the sam delete command in the api-gateway-certificate-propagation directory to delete the sample application infrastructure:

sam delete

You can also refer to the project code repository for the clean-up instructions.

Conclusion

This blog shows how to use API Gateway with a Lambda authorizer for mTLS client certificate validation, custom field extraction, and downstream propagation to backend systems. This pattern allows you to terminate mTLS at the edge so that downstream applications are not responsible for client certificate validation.

For additional documentation, refer to Using API Gateway with Lambda Authorizer. Download the sample code from the project code repository. For more serverless learning resources, visit Serverless Land.

Simplifying serverless permissions with AWS SAM Connectors

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/simplifying-serverless-permissions-with-aws-sam-connectors/

This post is written by Kurt Tometich, Senior Solutions Architect, AWS.

Developers have been using the AWS Serverless Application Model (AWS SAM) to streamline the development of serverless applications with AWS since late 2018. Besides making it easier to create, build, test, and deploy serverless applications, AWS SAM now further simplifies permission management between serverless components with AWS SAM Connectors.

Connectors allow the builder to focus on the relationships between components without expert knowledge of AWS Identity and Access Management (IAM) or direct creation of custom policies. AWS SAM connectors support AWS Step Functions, Amazon DynamoDB, AWS Lambda, Amazon SQS, Amazon SNS, Amazon API Gateway, Amazon EventBridge and Amazon S3, with more resources planned in the future.

AWS SAM policy templates are an existing feature that helps builders deploy serverless applications with minimally scoped IAM policies. Because there are a finite number of templates, they’re a good fit when a template exists for the services you’re using. Connectors are best for those getting started and who want to focus on modeling the flow of data and events within their applications. Connectors will take the desired relationship model and create the permissions for the relationship to exist and function as intended.

In this blog post, I show you how to speed up serverless development while maintaining security best practices using AWS SAM connectors. Defining a connector in an AWS SAM template requires a source, destination, and a permission (for example, read or write). From this definition, the connector automatically creates IAM policies with minimal privileges.

Usage

Within an AWS SAM template:

  1. Create serverless resource definitions.
  2. Define a connector.
  3. Add a source and destination ID of the resources to connect.
  4. Define the permissions (read, write) of the connection.

This example creates a Lambda function that requires write access to an Amazon DynamoDB table to keep track of orders created from a website.

AWS Lambda function needing write access to an Amazon DynamoDB table

The AWS SAM connector for the resources looks like the following:

LambdaDynamoDbWriteConnector:
  Type: AWS::Serverless::Connector
  Properties:
    Source:
      Id: CreateOrder
    Destination:
      Id: Orders
    Permissions:
      - Write

“LambdaDynamoDbWriteConnector” is the name of the connector, while the “Type” designates it as an AWS SAM connector. “Properties” contains the source and destination logical ID for our serverless resources found within our template. Finally, the “Permissions” property defines a read or write relationship between the components.

This basic example shows how easy it is to define permissions between components. No specific role or policy names are required, and this syntax is consistent across many other serverless components, enforcing standardization.
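
Because the connector syntax is uniform, it lends itself to being generated. The following sketch builds the same resource as a plain object, which could be emitted into a JSON-format template. The `connector` helper is a hypothetical function written for this illustration and is not part of AWS SAM. It only checks that the permissions are Read or Write and does not validate whether a given source and destination pair is supported.

```javascript
// Build an AWS::Serverless::Connector resource as a plain object.
// `connector` is a hypothetical helper, not part of AWS SAM.
function connector(sourceId, destinationId, permissions) {
  const allowed = ['Read', 'Write'];
  const invalid = permissions.filter((p) => !allowed.includes(p));
  if (permissions.length === 0 || invalid.length > 0) {
    throw new Error(`Permissions must be Read and/or Write, got: ${permissions}`);
  }
  return {
    Type: 'AWS::Serverless::Connector',
    Properties: {
      Source: { Id: sourceId },
      Destination: { Id: destinationId },
      Permissions: permissions,
    },
  };
}

// Same logical IDs as the YAML example above.
const resources = {
  LambdaDynamoDbWriteConnector: connector('CreateOrder', 'Orders', ['Write']),
};

console.log(JSON.stringify(resources, null, 2));
```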

Example

AWS SAM connectors save you time as your applications grow and connections between serverless components become more complex. Manual creation and management of permissions become error prone and difficult at scale. To highlight the breadth of support, we’ll use an AWS Step Functions state machine to operate with several other serverless components. AWS Step Functions is a serverless orchestration workflow service that integrates natively with other AWS services.

Solution overview

Architectural overview

This solution implements an image catalog moderation pipeline. Amazon Rekognition checks for inappropriate content, and detects objects and text in an image. It processes valid images and stores metadata in an Amazon DynamoDB table, otherwise emailing a notification for invalid images.

Prerequisites

  1. Git installed
  2. AWS SAM CLI version 1.58.0 or greater installed

Deploying the solution

  1. Clone the repository and navigate to the solution directory:
    git clone https://github.com/aws-samples/step-functions-workflows-collection
    cd step-functions-workflows-collection/moderated-image-catalog
  2. Open the template.yaml file located at step-functions-workflows-collection/moderated-image-catalog and replace the “ImageCatalogStateMachine:” section with the following snippet. Be sure to preserve the YAML formatting.
    ImageCatalogStateMachine:
        Type: AWS::Serverless::StateMachine
        Properties:
          Name: moderated-image-catalog-workflow
          DefinitionUri: statemachine/statemachine.asl.json
          DefinitionSubstitutions:
            CatalogTable: !Ref CatalogTable
            ModeratorSNSTopic: !Ref ModeratorSNSTopic
          Policies:
            - RekognitionDetectOnlyPolicy: {}
  3. Within the same template.yaml file, add the following after the ModeratorSNSTopic section and before the Outputs section:
    # Serverless connector permissions
    StepFunctionS3ReadConnector:
      Type: AWS::Serverless::Connector
      Properties:
        Source:
          Id: ImageCatalogStateMachine
        Destination:
          Id: IngestionBucket
        Permissions:
          - Read
    
    StepFunctionDynamoWriteConnector:
      Type: AWS::Serverless::Connector
      Properties:
        Source:
          Id: ImageCatalogStateMachine
        Destination:
          Id: CatalogTable
        Permissions:
          - Write
    
    StepFunctionSNSWriteConnector:
      Type: AWS::Serverless::Connector
      Properties:
        Source:
          Id: ImageCatalogStateMachine
        Destination:
          Id: ModeratorSNSTopic
        Permissions:
          - Write

    You have removed the existing inline policies for the state machine and replaced them with AWS SAM connector definitions, except for the Amazon Rekognition policy. At the time of publishing this blog, connectors do not support Amazon Rekognition. Take some time to review the syntax of each connector.

  4. Deploy the application using the following command:
    sam deploy --guided

    Provide a stack name, Region, and moderators’ email address. You can accept defaults for the remaining prompts.

Verifying permissions

Once the deployment has completed, you can verify the correct role and policies.

  1. Navigate to the Step Functions service page within the AWS Management Console and ensure you have the correct Region selected.
  2. Select State machines from the left menu and then the moderated-image-catalog-workflow state machine.
  3. Select the “IAM role ARN” link, which will take you to the IAM role and policies created.

You should see a list of policies that correspond to the AWS SAM connectors in the template.yaml file with the actions and resources.

Permissions list in console

You didn’t need to supply the specific policy actions: Use Read or Write as the permission and the service handles the rest. This results in improved readability, standardization, and productivity, while retaining security best practices.

Testing

  1. Upload a test image to the Amazon S3 bucket created during the deployment step. To find the name of the bucket, navigate to the AWS CloudFormation console. Select the CloudFormation stack via the name entered as part of “sam deploy --guided.” Select the Outputs tab and note the IngestionBucket name.
  2. After uploading the image, navigate to the AWS Step Functions console and select the “moderated-image-catalog-workflow” workflow.
  3. Select Start Execution and input an event:
    {
        "bucket": "<S3-bucket-name>",
        "key": "<image-name>.jpeg"
    }
  4. Select Start Execution and observe the execution of the workflow.
  5. Depending on the image selected, the workflow either adds it to the image catalog or sends a content moderation email to the email address provided. Find out more about content considered inappropriate by Amazon Rekognition.

Cleanup

To delete any images added to the Amazon S3 bucket, and the resources created by this template, use the following commands from the same project directory.

aws s3 rm s3://<bucket_name_here> --recursive
sam delete

Conclusion

This blog post shows how AWS SAM connectors simplify connecting serverless components. View the Developer Guide to find out more about AWS SAM connectors. For further sample serverless workflows like the one used in this blog, see Serverless Land.

Announcing server-side encryption with Amazon Simple Queue Service-managed encryption keys (SSE-SQS) by default

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/announcing-server-side-encryption-with-amazon-simple-queue-service-managed-encryption-keys-sse-sqs-by-default/

This post is written by Sofiya Muzychko (Sr Product Manager), Nipun Chagari (Principal Solutions Architect), and Hardik Vasa (Senior Solutions Architect).

Amazon Simple Queue Service (SQS) now provides server-side encryption (SSE) using SQS-owned encryption (SSE-SQS) by default. This feature further simplifies the security posture to encrypt the message body in SQS queues.

SQS is a fully managed message queuing service that enables you to decouple and scale microservices, distributed systems, and serverless applications. Customers are increasingly decoupling their monolithic applications to microservices and moving sensitive workloads to SQS, such as financial and healthcare applications, whose compliance regulations mandate data encryption.

SQS already supports server-side encryption with customer-provided encryption keys using the AWS Key Management Service (SSE-KMS) or using SQS-owned encryption keys (SSE-SQS). Both encryption options greatly reduce the operational burden and complexity involved in protecting data. Additionally, with the SSE-SQS encryption type, you do not need to create, manage, or pay for SQS-managed encryption keys.

Using the default encryption

With this feature, all newly created queues using HTTPS (TLS) and Signature Version 4 endpoints are encrypted using SQS-owned encryption (SSE-SQS) by default, enhancing the protection of your data against unauthorized access. New queues created using the non-TLS endpoint do not have SSE-SQS encryption enabled by default. We therefore encourage you to create SQS queues using HTTPS endpoints as a security best practice.

The SSE-SQS default encryption is available for both standard and FIFO queues. You do not need to make any code or application changes to encrypt new queues. This does not affect existing queues. You can, however, change the encryption option for existing queues at any time using the SQS console, AWS Command Line Interface, or API.

Create queue

The preceding image shows the SQS queue creation console wizard with configuration options for encryption. As you can see, server-side encryption is enabled by default with encryption key type SSE-SQS option selected.

Creating a new SQS queue with SSE-SQS encryption using AWS CloudFormation

Default SSE-SQS encryption is also supported in AWS CloudFormation. To learn more, see this documentation page.

Here is the sample CloudFormation template to create an SQS standard queue with SQS owned Server Side Encryption (SSE-SQS) explicitly enabled.

AWSTemplateFormatVersion: "2010-09-09"
Description: SSE-SQS Cloudformation template
Resources:
  SQSEncryptionQueue:
    Type: AWS::SQS::Queue
    Properties: 
      MaximumMessageSize: 262144
      MessageRetentionPeriod: 86400
      QueueName: SSESQSQueue
      SqsManagedSseEnabled: true
      KmsDataKeyReusePeriodSeconds: 900
      VisibilityTimeout: 30

Note that SSE-SQS is enabled by default even if the SqsManagedSseEnabled: true property is not specified.

Configuring SSE-SQS encryption for existing queues via the AWS Management Console

To configure SSE-SQS encryption for an existing queue using the SQS console:

  1. Navigate to the SQS console at https://console.aws.amazon.com/sqs/.
  2. In the navigation pane, choose Queues.
  3. Select a queue, and then choose Edit.
  4. Under the Encryption dialog box, for Server-side encryption, choose Enabled.
  5. Select Amazon SQS key (SSE-SQS).
  6. Choose Save.

Edit standard queue

To configure SSE-SQS encryption for an existing queue using the AWS CLI

To enable SSE-SQS on an existing queue with no encryption, use the following AWS CLI command:

aws sqs set-queue-attributes --queue-url <queueURL> --attributes SqsManagedSseEnabled=true

Replace <queueURL> with the URL of your SQS queue.

To disable SSE-SQS for an existing queue using the AWS CLI, run:

aws sqs set-queue-attributes --queue-url <queueURL> --attributes SqsManagedSseEnabled=false
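
The same toggle can be scripted. The following is a minimal Python sketch, assuming the boto3 SDK is installed and AWS credentials are configured. The helper names sse_sqs_attributes and set_sse_sqs are hypothetical and chosen only for illustration; set_queue_attributes is the boto3 counterpart of the CLI command above.

```python
def sse_sqs_attributes(enabled: bool) -> dict:
    """Build the Attributes payload that turns SSE-SQS on or off."""
    # SQS expects attribute values as strings, so the boolean is spelled out.
    return {"SqsManagedSseEnabled": "true" if enabled else "false"}


def set_sse_sqs(queue_url: str, enabled: bool) -> None:
    """Apply the attribute to an existing queue (needs boto3 and credentials)."""
    import boto3  # imported lazily so the helper above works without the SDK

    sqs = boto3.client("sqs")
    sqs.set_queue_attributes(
        QueueUrl=queue_url,
        Attributes=sse_sqs_attributes(enabled),
    )
```

Calling set_sse_sqs("<queueURL>", True) mirrors the first CLI command, and passing False mirrors the second.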

Testing the queue with the SSE-SQS encryption enabled

To test sending a message to the SQS queue with SSE-SQS enabled, run:

aws sqs send-message --queue-url <queueURL> --message-body test-message

Replace <queueURL> with the URL of your SQS queue. You see the following response, which means the message is successfully sent to the queue:

{
    "MD5OfMessageBody": "beaa0032306f083e847cbf86a09ba9b2",
    "MessageId": "6e53de76-7865-4c45-a640-f058c24a619b"
}
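
The MD5OfMessageBody field is an MD5 digest of the message body, so a client can use it to check that SQS received the message intact. A minimal sketch of that check follows; the function name body_matches_md5 is hypothetical and only for illustration.

```python
import hashlib


def body_matches_md5(message_body: str, md5_of_body: str) -> bool:
    """Return True if the digest of the body equals the digest SQS returned."""
    digest = hashlib.md5(message_body.encode("utf-8")).hexdigest()
    # Compare case-insensitively in case the caller passes an uppercase digest.
    return digest == md5_of_body.lower()
```

Pass the body you sent and the MD5OfMessageBody value from the response; a False result suggests the body was altered or truncated in transit.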

Default SSE-SQS encryption key rotation

You can choose how often the keys will be rotated by configuring the KmsDataKeyReusePeriodSeconds queue attribute. The value must be an integer between 60 (1 minute) and 86,400 (24 hours). The default is 300 (5 minutes).

To update the KMS data key reuse period for an existing SQS queue, run:

aws sqs set-queue-attributes --queue-url <queueURL> --attributes KmsDataKeyReusePeriodSeconds=900

This sets the key rotation interval for the queue to 900 seconds (15 minutes).
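
Because the allowed range is fixed, it can be worth validating the value before sending it to SQS. The following sketch encodes the limits stated above; the constant and function names are hypothetical.

```python
MIN_REUSE_SECONDS = 60        # 1 minute
MAX_REUSE_SECONDS = 86_400    # 24 hours
DEFAULT_REUSE_SECONDS = 300   # 5 minutes


def reuse_period_attribute(seconds: int = DEFAULT_REUSE_SECONDS) -> dict:
    """Validate the reuse period and return it as an SQS attribute payload."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(seconds, bool) or not isinstance(seconds, int):
        raise TypeError("KmsDataKeyReusePeriodSeconds must be an integer")
    if not MIN_REUSE_SECONDS <= seconds <= MAX_REUSE_SECONDS:
        raise ValueError(
            f"value must be between {MIN_REUSE_SECONDS} and {MAX_REUSE_SECONDS}"
        )
    # SQS attribute values are passed as strings.
    return {"KmsDataKeyReusePeriodSeconds": str(seconds)}
```

The returned dictionary has the shape expected by the Attributes parameter of set-queue-attributes.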

Default SSE-SQS and encrypted messages

Encrypting a message makes its contents unavailable to unauthorized or anonymous users. Anonymous requests are requests made to a queue that is open to a public network without any authentication. Note, if you are using anonymous SendMessage and ReceiveMessage requests to the newly created queues, the requests will now be rejected with SSE-SQS enabled by default.

Making anonymous requests to SQS queues does not follow SQS security best practices. We strongly recommend updating your policy to make signed requests to SQS queues using AWS SDK or AWS CLI and to continue using SSE-SQS enabled by default.

Consider how the SQS service responds to anonymous requests when SSE-SQS encryption is enabled. For an existing queue, you can change the queue policy to grant all users (anonymous users) SendMessage permission for a queue named EncryptionQueue:

{
  "Version": "2012-10-17",
  "Id": "Queue1_Policy_UUID",
  "Statement": [
    {
      "Sid": "Queue1_SendMessage",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "sqs:SendMessage",
      "Resource": "<queueARN>"
    }
  ]
}

You can then make an anonymous request against the queue:

curl <queueURL> -d 'Action=SendMessage&MessageBody=Hello'

You get an error message similar to the following:

<?xml version="1.0"?>
<ErrorResponse
	xmlns="http://queue.amazonaws.com/doc/2012-11-05/">
	<Error>
		<Type>Sender</Type>
		<Code>AccessDenied</Code>
		<Message>Access to the resource The specified queue does not exist or you do not have access to it. is denied.</Message>
		<Detail/>
	</Error>
	<RequestId> RequestID </RequestId>
</ErrorResponse>
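
If a client needs to react to this error programmatically, the XML body can be parsed with the Python standard library. This is a minimal sketch that assumes the response has the shape shown above, including the same XML namespace; parse_sqs_error is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the xmlns attribute of the error response.
SQS_NS = {"sqs": "http://queue.amazonaws.com/doc/2012-11-05/"}


def parse_sqs_error(xml_text: str) -> dict:
    """Extract the type, code, and message from an SQS ErrorResponse body."""
    root = ET.fromstring(xml_text)
    error = root.find("sqs:Error", SQS_NS)
    if error is None:
        raise ValueError("no <Error> element found")
    return {
        "type": error.findtext("sqs:Type", default="", namespaces=SQS_NS),
        "code": error.findtext("sqs:Code", default="", namespaces=SQS_NS),
        "message": error.findtext("sqs:Message", default="", namespaces=SQS_NS),
    }
```

A caller could, for example, check whether the returned code is AccessDenied and switch to a signed request.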

However, if for any reason you want to continue using anonymous requests to newly created queues, you must create or update the queue with SSE-SQS encryption disabled.

SqsManagedSseEnabled=false

You can also disable the SSE-SQS using the Amazon SQS console.
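
To find queue policies that still allow anonymous access, such as the example policy in this section, you can scan the policy document for Allow statements whose principal is a wildcard. The sketch below is a simplified check under stated assumptions: it ignores Condition elements, which can narrow a wildcard principal, and the function name anonymous_allow_sids is hypothetical.

```python
import json


def anonymous_allow_sids(policy_json: str) -> list:
    """Return the Sids of Allow statements that grant access to any principal."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    # A policy may hold a single statement object instead of a list.
    if isinstance(statements, dict):
        statements = [statements]
    sids = []
    for statement in statements:
        principal = statement.get("Principal")
        is_anonymous = principal == "*" or (
            isinstance(principal, dict) and principal.get("AWS") == "*"
        )
        if statement.get("Effect") == "Allow" and is_anonymous:
            sids.append(statement.get("Sid", "<no Sid>"))
    return sids
```

For the example policy in this section, the function returns the single Sid Queue1_SendMessage.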

Encrypting SQS queues with your own encryption keys

You can always change the default of SSE-SQS queues encryption and use your own keys. To encrypt SQS queues with your own encryption keys using the AWS Key Management Service (SSE-KMS), the default encryption with SSE-SQS can be overwritten to SSE-KMS during the queue creation process or afterwards.

You can update the SQS queue Server-side encryption key type using the Amazon SQS console, AWS Command Line Interface, or API.

Benefits of SQS owned encryption (SSE-SQS)

There are a number of significant benefits to encrypting your data with SQS owned encryption (SSE-SQS):

  • SSE-SQS lets you transmit data more securely and improve your security posture commonly required for compliance and regulations with no additional overhead, as you do not need to create and manage encryption keys.
  • Encryption at rest using the default SSE-SQS is provided at no additional charge.
  • The encryption and decryption of your data are handled transparently and continue to deliver the same performance you expect.
  • Data is encrypted using the 256-bit Advanced Encryption Standard (AES-256 GCM algorithm), so that only authorized roles and services can access data.

In addition, customers can enable CloudWatch Alarms to alarm on activities such as authorization failures, AWS Identity and Access Management (IAM) policy changes, or tampering with CloudTrail logs to help detect and stay on top of security incidents in the customer application (to learn more, see Amazon CloudWatch User Guide).

Conclusion

SQS now provides server-side encryption (SSE) using SQS-owned encryption (SSE-SQS) by default. This enhancement makes it easier to create SQS queues, while greatly reducing the operational burden and complexity involved in protecting data.

Encryption at rest using the default SSE-SQS is provided at no additional charge and is supported for both Standard and FIFO SQS queues using HTTPS endpoints. The default SSE-SQS encryption is available now.

To learn more about Amazon Simple Queue Service (SQS), see Getting Started with Amazon SQS and Amazon Simple Queue Service Developer Guide.

For more serverless learning resources, visit Serverless Land.

Introducing message data protection for Amazon SNS

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/introducing-message-data-protection-for-amazon-sns/

This post is written by Otavio Ferreira, Senior Software Development Manager, Marc Pinaud, Senior Product Manager, Usman Nisar, Senior Software Engineer, Hardik Vasa, Senior Solutions Architect, and Mithun Mallick, Senior Specialist Solution Architect.

Today, we are announcing the public preview release of new data protection capabilities for Amazon Simple Notification Service (SNS), message data protection. This is a new way to discover and protect sensitive data in motion at scale, without writing custom code.

SNS is a fully managed serverless messaging service. It provides topics for push-based, many-to-many pub/sub messaging for decoupling distributed systems, microservices, and event-driven serverless applications. As applications grow, so does the amount of data transmitted and the number of systems sending and receiving data. When moving data between different applications, guardrails can help you comply with data privacy regulations that require you to safeguard sensitive personally identifiable information (PII) or protected health information (PHI).

With message data protection for SNS, you can scan messages in real time for PII/PHI data and receive audit reports containing scan results. You can also prevent applications from receiving sensitive data by blocking inbound messages to an SNS topic or outbound messages to an SNS subscription. Message data protection for SNS supports a repository of over 25 unique PII/PHI data identifiers. These include people’s names, addresses, social security numbers, credit card numbers, and prescription drug codes.

These capabilities can help you adhere to a variety of compliance regulations, including HIPAA, FedRAMP, GDPR, and PCI. For more information, including the complete list of supported data identifiers, see message data protection in the SNS Developer Guide.

Overview

SNS topics enable you to integrate distributed applications more easily. As applications become more complex, it can become challenging for topic owners to manage the data flowing through their topics. Developers that publish messages to a topic may inadvertently send sensitive data, increasing regulatory risk. Message data protection enables SNS topic owners to protect sensitive application data with built-in, no-code, scalable capabilities.

To discover and protect data flowing through SNS topics with message data protection, topic owners associate data protection policies to their topics. Within these policies, you can write statements that define which types of sensitive data you want to discover and protect. As part of this, you can define whether you want to act on data flowing inbound to a topic or outbound to a subscription, which AWS accounts or specific AWS Identity and Access Management (AWS IAM) principals the policy is applicable to, and the actions you want to take on the data.

Message data protection provides two actions to help you protect your data: auditing, to report on the amount of PII/PHI found, and blocking, to prevent the publishing or delivery of payloads that contain PII/PHI data. Once the data protection policy is set, message data protection uses pattern matching and machine learning models to scan your messages in real time for PII/PHI data identifiers and enforce the data protection policy.

For auditing, you can choose to send audit reports to Amazon Simple Storage Service (S3) for archival, Amazon Kinesis Data Firehose for analytics, or Amazon CloudWatch for logging and alarming. Message data protection does not interfere with the topic owner’s ability to use message data encryption at rest, nor with the subscriber’s ability to filter out unwanted messages using message filtering.

Applying message data protection in a use case

Consider an application that processes a variety of transactions for a set of health clinics, an organization that operates in a regulated environment. Compliance frameworks require that the organization take measures to protect both sensitive health records and financial information.

Reference architecture

The application is based on an event-driven serverless architecture. It has a data protection policy attached to the topic to audit for sensitive data and prevent downstream systems from processing certain data types.

The application publishes an event to an SNS topic every time a patient schedules a visit or sees a doctor at a clinic. The SNS topic fans out the event to two subscribed systems, billing and scheduling. Each system stores events in an Amazon SQS queue, which is processed using an AWS Lambda function.

Setting a data protection policy to an SNS topic

You can apply a data protection policy to an SNS topic using the AWS Management Console, the AWS CLI, or the AWS SDKs. You can also use AWS CloudFormation to automate the provisioning of the data protection policy.

This example uses CloudFormation to provision the infrastructure. You have two options for deploying the resources:

  • Deploy the resources by using the message data protection deploy script within the aws-sns-samples repository in GitHub.
  • Alternatively, use the following four CloudFormation templates in order. Allow time for each stack to complete before deploying the next stack, to create the following resources:

1. Prerequisites template

  • Two IAM roles with a managed policy that allows access to receive messages from the SNS topic, one for the billing system and one for the scheduling system.

2. Topic owner template

  • SNS topic that delivers events to two distinct systems.
  • A data protection policy that defines both auditing and blocking actions for specific types of PII and PHI.
  • S3 bucket to archive audit findings.
  • CloudWatch log group to monitor audit findings.
  • Kinesis Data Firehose to deliver audit findings to other destinations.

3. Scheduling subscriber template

  • SQS queue for the Scheduling system.
  • Lambda function for the Scheduling system.

4. Billing subscriber template

  • SQS queue for the Billing system.
  • Lambda function for the Billing system.

CloudFormation creates the following data protection policy as part of the topic owner template:

  ClinicSNSTopic:
    Type: 'AWS::SNS::Topic'
    Properties:
      TopicName: SampleClinic
      DataProtectionPolicy:
        Name: data-protection-example-policy
        Description: Policy Description
        Version: 2021-06-01
        Statement:
          - Sid: audit
            DataDirection: Inbound
            Principal:
              - '*'
            DataIdentifier:
              - 'arn:aws:dataprotection::aws:data-identifier/Address'
              - 'arn:aws:dataprotection::aws:data-identifier/AwsSecretKey'
              - 'arn:aws:dataprotection::aws:data-identifier/DriversLicense-US'
              - 'arn:aws:dataprotection::aws:data-identifier/EmailAddress'
              - 'arn:aws:dataprotection::aws:data-identifier/IpAddress'
              - 'arn:aws:dataprotection::aws:data-identifier/NationalDrugCode-US'
              - 'arn:aws:dataprotection::aws:data-identifier/PassportNumber-US'
              - 'arn:aws:dataprotection::aws:data-identifier/Ssn-US'
            Operation:
              Audit:
                SampleRate: 99
                FindingsDestination:
                  CloudWatchLogs:
                    LogGroup: !Ref AuditCWLLogs
                  Firehose:
                    DeliveryStream: !Ref AuditFirehose
                NoFindingsDestination:
                  S3:
                    Bucket: !Ref AuditS3Bucket
          - Sid: deny-inbound
            DataDirection: Inbound
            Principal:
              - '*'
            DataIdentifier:
              - 'arn:aws:dataprotection::aws:data-identifier/PassportNumber-US'
              - 'arn:aws:dataprotection::aws:data-identifier/Ssn-US'
            Operation:
              Deny: {}
          - Sid: deny-outbound-billing
            DataDirection: Outbound
            Principal:
              - !ImportValue "BillingRoleExportDataProtectionDemo"
            DataIdentifier:
              - 'arn:aws:dataprotection::aws:data-identifier/NationalDrugCode-US'
            Operation:
              Deny: {}
          - Sid: deny-outbound-scheduling
            DataDirection: Outbound
            Principal:
              - !ImportValue "SchedulingRoleExportDataProtectionDemo"
            DataIdentifier:
              - 'arn:aws:dataprotection::aws:data-identifier/Address'
              - 'arn:aws:dataprotection::aws:data-identifier/CreditCardNumber'
            Operation:
              Deny: {}

This data protection policy defines:

  • Metadata about the data protection policy, for example name, description, version, and statement IDs (sid).
  • The first statement (sid: audit) scans inbound messages from all principals for addresses, social security numbers, driver’s license numbers, email addresses, IP addresses, national drug codes, passport numbers, and AWS secret keys.
    • The sampling rate is set to 99% so almost all messages are scanned for the defined PII/PHI.
    • Audit results with findings are delivered to CloudWatch Logs and Kinesis Data Firehose for analytics. Audit results without findings are archived to S3.
  • The second statement (sid: deny-inbound) blocks inbound messages to the topic coming from any principal, if the payload includes either a social security number or passport number.
  • The third statement (sid: deny-outbound-billing) blocks the delivery of messages to subscriptions created by the BillingRole, if the messages include any national drug codes.
  • The fourth statement (sid: deny-outbound-scheduling) blocks the delivery of messages to subscriptions created by the SchedulingRole, if the messages include any credit card numbers or addresses.
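
To make the effect of the three deny statements concrete, the following toy model reproduces their matching logic in Python. It is only a sketch under stated assumptions: SNS performs the actual detection and enforcement, the role names stand in for the imported role ARNs, and the names DenyStatement, POLICY, and denying_sids are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DenyStatement:
    sid: str
    direction: str          # "Inbound" or "Outbound"
    principal: str          # "*" or a role name (stand-in for the role ARN)
    identifiers: frozenset  # short names of the data identifiers


# The three deny statements from the policy above, in simplified form.
POLICY = (
    DenyStatement("deny-inbound", "Inbound", "*",
                  frozenset({"PassportNumber-US", "Ssn-US"})),
    DenyStatement("deny-outbound-billing", "Outbound", "BillingRole",
                  frozenset({"NationalDrugCode-US"})),
    DenyStatement("deny-outbound-scheduling", "Outbound", "SchedulingRole",
                  frozenset({"Address", "CreditCardNumber"})),
)


def denying_sids(direction: str, principal: str, detected) -> list:
    """Return the Sids of statements that would block the message."""
    found = set(detected)
    return [
        s.sid
        for s in POLICY
        if s.direction == direction
        and s.principal in ("*", principal)
        and s.identifiers & found  # at least one identifier was detected
    ]
```

For example, a message containing an address that is delivered to a subscription created by the SchedulingRole matches deny-outbound-scheduling, while the same message delivered to the BillingRole subscription matches no statement.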

Testing the capabilities

Test the message data protection capabilities using the following steps:

  1. Publish a message without PII/PHI data to the Clinic Topic. In the CloudWatch console, navigate to the log streams of the respective Lambda functions to confirm that the message is delivered to both subscribers. Both messages are delivered because the payload contains no sensitive data for the data protection policy to deny. The log message looks as follows:
    "This is a demo! received from queue arn:aws:sqs:us-east-1:111222333444:Scheduling-SchedulingQueue"
  2. Publish a message with a social security number (try ‘SSN: 123-12-1234’) to the Clinic Topic. The request is denied, and an audit log is delivered to your CloudWatch Logs log group and Firehose delivery stream.
  3. Navigate to the CloudWatch log console and confirm that the audit log is visible in the /aws/vendedlogs/clinicaudit CloudWatch log group. The following example shows that the data protection policy (sid: deny-inbound) denied the inbound message as the payload contains a US social security number (SSN) between the 5th and the 15th character.
    {
        "messageId": "77ec5f0c-5129-5429-b01d-0457b965c0ac",
        "auditTimestamp": "2022-07-28T01:27:40Z",
        "callerPrincipal": "arn:aws:iam::111222333444:role/Admin",
        "resourceArn": "arn:aws:sns:us-east-1:111222333444:SampleClinic",
        "dataIdentifiers": [
            {
                "name": "Ssn-US",
                "count": 1,
                "detections": [
                    {
                        "start": 5,
                        "end": 15
                    }
                ]
            }
        ]
    }
    
  4. You can use the CloudWatch metrics, MessageWithFindings and MessageWithNoFindings, to track how frequently PII/PHI data is published to an SNS topic. Here’s an example of what the CloudWatch metric graph looks like as the amount of sensitive data published to a topic varies over time:
    CloudWatch metric graph
  5. Publish a message with an address (try ‘410 Terry Ave N, Seattle 98109, WA’). The request is only delivered to the Billing subscription. The data protection policy (sid: deny-outbound-scheduling) denies the outbound message to the Scheduling subscription as the payload contains an address.
  6. Confirm that the message is only delivered to the Billing Lambda function by navigating to the CloudWatch console and inspecting the logs of the two respective Lambda functions. The CloudWatch log of the Billing Lambda function contains the sensitive message that was delivered to it as it was an authorized subscriber. Here’s an example of what the log contains:
    410 Terry Ave N, Seattle 98109, WA received from queue arn:aws:sqs:us-east-1:111222333444:Billing-BillingQueue
  7. Publish a message with a drug code (try ‘NDC: 0777-3105-02’). The request is only delivered to the Scheduling subscription. The data protection policy (sid: deny-outbound-billing) denies the outbound message to the Billing subscription as the payload contains a drug code.
  8. Confirm that the message is only delivered to the Scheduling Lambda function by navigating to the CloudWatch console and inspecting the logs of the two respective Lambda functions. The CloudWatch log of the Scheduling Lambda function contains the sensitive message that was delivered to it as it was an authorized subscriber. Here’s an example of what the log contains:
    NDC: 0777-3105-02 received from queue arn:aws:sqs:us-east-1:111222333444:Scheduling-SchedulingQueue
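
The audit log shown in step 3 is plain JSON, so its findings are straightforward to process. The sketch below lists each detection and masks the detected span in a copy of the message. It assumes that start and end are zero-based offsets with an inclusive end, which is consistent with the sample message 'SSN: 123-12-1234' but is not confirmed by this post; the function names are hypothetical.

```python
import json


def summarize_findings(audit_json: str) -> list:
    """Return (identifier name, start, end) for every detection in the log."""
    record = json.loads(audit_json)
    rows = []
    for identifier in record.get("dataIdentifiers", []):
        for detection in identifier.get("detections", []):
            rows.append((identifier["name"], detection["start"], detection["end"]))
    return rows


def redact(message: str, start: int, end: int, mask: str = "*") -> str:
    """Mask one detected span of the message."""
    # Assumption: offsets are zero-based and the end offset is inclusive.
    return message[:start] + mask * (end - start + 1) + message[end + 1:]
```

Under that assumption, redacting offsets 5 to 15 of the sample message masks the 11 characters of the social security number and leaves the 'SSN: ' prefix in place.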

Cleaning up

After testing, avoid incurring usage charges by deleting the resources that you created. Navigate to the CloudFormation console and delete the four CloudFormation stacks that you created during the walkthrough. Remember, you must delete all the objects from the S3 bucket before deleting the stack.

Conclusion

This post shows how message data protection enables a topic owner to discover and protect sensitive data that is exchanged through SNS topics. The example shows how to create a data protection policy that generates audit reports for sensitive data and blocks messages from delivery to specific subscribers if the payload contains sensitive data.

Get started with SNS and message data protection by using the AWS Management Console, AWS Command Line Interface (CLI), AWS SDKs, or CloudFormation.

For more details, see message data protection in the SNS Developer Guide. For information on pricing, see SNS pricing.

For more serverless learning resources, visit Serverless Land.

Deploying AWS Lambda functions using AWS Controllers for Kubernetes (ACK)

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/deploying-aws-lambda-functions-using-aws-controllers-for-kubernetes-ack/

This post is written by Rajdeep Saha, Sr. SSA, Containers/Serverless.

AWS Controllers for Kubernetes (ACK) allows you to manage AWS services directly from Kubernetes. With the ACK service controller for AWS Lambda, you can provision and manage Lambda functions with kubectl and custom resources. With ACK, you can have a single consolidated approach to managing container workloads and other AWS services, such as Lambda, directly from Kubernetes without needing additional infrastructure automation tools.

This post walks you through deploying a sample Lambda function from a Kubernetes cluster provided by Amazon EKS.

Use cases

Some of the use cases for provisioning Lambda functions from ACK include:

  • Your organization already has a DevOps process to deploy resources into the Amazon EKS cluster using Kubernetes declarative YAMLs (known as manifest files). With ACK for AWS Lambda, you can now use manifest files to provision Lambda functions without creating separate infrastructure as code templates.
  • Your project has implemented GitOps with Kubernetes. With GitOps, git becomes the single source of truth, and all changes are made through the git repo. In this model, Kubernetes continuously reconciles the git repo (desired state) with the resources running inside the cluster (current state). If any differences are found, the GitOps process automatically applies the changes from the git repo to the cluster. Because ACK for AWS Lambda creates the Lambda function from a Kubernetes custom resource, the GitOps model applies to Lambda as well.
  • Your organization has established permissions boundaries for different users and groups using role-based access control (RBAC) and IAM roles for service accounts (IRSA). You can reuse this security model for Lambda without having to create new users and policies.

How ACK for AWS Lambda works

  1. The ‘Ops’ team deploys the ACK service controller for Lambda. This controller runs as a pod within the Amazon EKS cluster.
  2. The controller pod needs permission to read the Lambda function code and create the Lambda function. The Lambda function code is stored as a zip file in an S3 bucket for this example. The permissions are granted to the pod using IRSA.
  3. Each AWS service has separate ACK service controllers. This specific controller for AWS Lambda can act on the custom resource type ‘Function’.
  4. The ‘Dev’ team deploys a Kubernetes manifest file with custom resource type ‘Function’. This manifest file defines the fields required to create the function, such as the S3 bucket name, zip file name, and Lambda function IAM role.
  5. The ACK service controller creates the Lambda function using the values from the manifest file.
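
The manifest described in step 4 can also be generated programmatically. The following Python sketch builds the ‘Function’ custom resource as a dictionary, using the same field names as the manifest created later in this walkthrough. The helper name function_manifest and the argument values in any call are assumptions for illustration only.

```python
import json


def function_manifest(name, bucket, key, role_arn, region,
                      runtime="python3.9",
                      handler="lambda_function.lambda_handler"):
    """Build an ACK 'Function' custom resource as a plain dictionary."""
    return {
        "apiVersion": "lambda.services.k8s.aws/v1alpha1",
        "kind": "Function",
        "metadata": {
            "name": name,
            # Region annotation read by the ACK controller.
            "annotations": {"services.k8s.aws/region": region},
        },
        "spec": {
            "name": name,
            "code": {"s3Bucket": bucket, "s3Key": key},
            "role": role_arn,
            "runtime": runtime,
            "handler": handler,
        },
    }
```

Because kubectl accepts JSON manifests as well as YAML, the result of json.dumps(function_manifest(...), indent=2) can be written to a file and submitted with kubectl create -f.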

Prerequisites

You need a few tools before deploying the sample application. Ensure that you have each of the following, which are used by the commands in this walkthrough, in your working environment:

  • AWS CLI
  • eksctl
  • kubectl
  • Helm

This post uses shell variables to make it easier to substitute the actual names for your deployment. When you see placeholders like NAME=<your xyz name>, substitute in the name for your environment.

Setting up the Amazon EKS cluster

  1. Run the following single command to create a two-node Amazon EKS cluster with a unique name:
    eksctl create cluster
  2. It may take 15–30 minutes to provision the Amazon EKS cluster. When the cluster is ready, run:
    kubectl get nodes
  3. The output shows the following:
    Output
  4. To get the Amazon EKS cluster name to use throughout the walkthrough, run:
    eksctl get cluster
    
    export EKS_CLUSTER_NAME=<provide the name from the previous command>

Setting up the ACK Controller for Lambda

To set up the ACK Controller for Lambda:

  1. Install an ACK Controller with Helm by following these instructions:
    – Change ‘export SERVICE=s3’ to ‘export SERVICE=lambda’.
    – Change ‘export AWS_REGION=us-west-2’ to reflect your Region appropriately.
  2. To configure IAM permissions for the pod running the Lambda ACK Controller to permit it to create Lambda functions, follow these instructions.
    – Replace ‘SERVICE="s3"’ with ‘SERVICE="lambda"’.
  3. Validate that the ACK Lambda controller is running:
    kubectl get pods -n ack-system
  4. The output shows the running ACK Lambda controller pod:
    Output

Provisioning a Lambda function from the Kubernetes cluster

In this section, you write a sample “Hello world” Lambda function. You zip up the code and upload the zip file to an S3 bucket. Finally, you deploy that zip file to a Lambda function using the ACK Controller from the EKS cluster you created earlier. This example uses Python 3.9 as the language runtime.

To provision the Lambda function:

  1. Run the following to create the sample “Hello world” Lambda function code, and then zip it up:
    mkdir my-helloworld-function
    cd my-helloworld-function
    cat << EOF > lambda_function.py 
    import json
    
    def lambda_handler(event, context):
        # TODO implement
        return {
            'statusCode': 200,
            'body': json.dumps('Hello from Lambda!')
        }
    EOF
    zip my-deployment-package.zip lambda_function.py
    
  2. Create an S3 bucket following the instructions here. Alternatively, you can use an existing S3 bucket in the same Region as the Amazon EKS cluster.
  3. Run the following to upload the zip file into the S3 bucket from the previous step:
    export BUCKET_NAME=<provide the bucket name from step 2>
    aws s3 cp  my-deployment-package.zip s3://${BUCKET_NAME}
  4. The output shows:
    upload: ./my-deployment-package.zip to s3://<BUCKET_NAME>/my-deployment-package.zip
  5. Create your Lambda function using the ACK Controller. The full spec with all the available fields is listed here. First, provide a name for the function:
    export FUNCTION_NAME=hello-world-s3-ack
  6. Create and deploy the Kubernetes manifest file. The command at the end, kubectl create -f lambdamanifest.yaml, submits the manifest file, with kind as ‘Function’. The ACK Controller for Lambda identifies this custom ‘Function’ object and deploys the Lambda function based on the manifest file.
    export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query "Account" --output text)
    export LAMBDA_ROLE="arn:aws:iam::${AWS_ACCOUNT_ID}:role/lambda_basic_execution"
    
    cat << EOF > lambdamanifest.yaml 
    apiVersion: lambda.services.k8s.aws/v1alpha1
    kind: Function
    metadata:
     name: $FUNCTION_NAME
     annotations:
       services.k8s.aws/region: $AWS_REGION
    spec:
     name: $FUNCTION_NAME
     code:
       s3Bucket: $BUCKET_NAME
       s3Key: my-deployment-package.zip
     role: $LAMBDA_ROLE
     runtime: python3.9
     handler: lambda_function.lambda_handler
     description: function created by ACK lambda-controller e2e tests
    EOF
    kubectl create -f lambdamanifest.yaml
    
  7. The output shows:
    function.lambda.services.k8s.aws/<FUNCTION_NAME> created
  8. To retrieve the details of the function using a Kubernetes command, run:
    kubectl describe function/$FUNCTION_NAME
  9. This Lambda function returns a “Hello from Lambda!” message. To invoke the function, run:
    aws lambda invoke --function-name $FUNCTION_NAME  response.json
    cat response.json
    
  10. The Lambda function returns the following output:
    {"statusCode": 200, "body": "\"Hello from Lambda!\""}
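The body field in this output is itself a JSON-encoded string, which is why its quotes appear escaped. As a minimal sketch of reading the result programmatically (the helper name parse_lambda_response is illustrative and not part of the walkthrough), the response can be decoded in two steps:

```python
import json


def parse_lambda_response(raw: str) -> tuple[int, str]:
    """Decode the outer response, then the JSON-encoded body inside it."""
    outer = json.loads(raw)
    # The handler called json.dumps() on the body, so it needs a second decode.
    body = json.loads(outer["body"])
    return outer["statusCode"], body


# Sample taken from the output shown in step 10.
sample = r'{"statusCode": 200, "body": "\"Hello from Lambda!\""}'
status, message = parse_lambda_response(sample)
print(status, message)
```

To use it on the real invocation result, pass the contents of response.json to the function instead of the sample string.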

Congratulations! You created a Lambda function from your Kubernetes cluster.

To learn how to provision the Lambda function using the ACK controller from an OCI container image instead of a zip file in an S3 bucket, follow these instructions.

Cleaning up

This section cleans up all the resources that you have created. To clean up:

  1. Delete the Lambda function:
    kubectl delete function $FUNCTION_NAME
  2. If you have created a new S3 bucket, delete it by running:
    aws s3 rm s3://${BUCKET_NAME} --recursive
    aws s3api delete-bucket --bucket ${BUCKET_NAME}
  3. Delete the EKS cluster:
    eksctl delete cluster --name $EKS_CLUSTER_NAME
  4. Delete the IAM role created for the ACK Controller. Get the IAM role name by running the following command, then delete the role from the IAM console:
    echo $ACK_CONTROLLER_IAM_ROLE

Conclusion

This blog post shows how AWS Controllers for Kubernetes enables you to deploy a Lambda function directly from your Amazon EKS environment. AWS Controllers for Kubernetes provides a convenient way to connect your Kubernetes applications to AWS services directly from Kubernetes.

ACK is open source: you can request new features and report issues on the ACK community GitHub repository.

For more serverless learning resources, visit Serverless Land.

Speeding up incremental changes with AWS SAM Accelerate and nested stacks

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/speeding-up-incremental-changes-with-aws-sam-accelerate-and-nested-stacks/

This blog written by Jeff Marcinko, Sr. Technical Account Manager, Health Care & Life Sciences, and Brian Zambrano, Sr. Specialist Solutions Architect, Serverless.

Developers and operators have been using the AWS Serverless Application Model (AWS SAM) to author, build, test, and deploy serverless applications in AWS for over three years. Since its inception, the AWS SAM team has focused on developer productivity, simplicity, and best practices.

As good as AWS SAM is at making your serverless development experience easier and faster, building non-trivial cloud applications remains a challenge. Developers and operators want a development experience that provides high-fidelity and fast feedback on incremental changes. With serverless development, local emulation of an application composed of many AWS resources and managed services can be incomplete and inaccurate. We recommend developing serverless applications in the AWS Cloud against live AWS services to increase developer confidence. However, the latency of deploying an entire AWS CloudFormation stack for every code change is a challenge that developers face with this approach.

In this blog post, I show how to increase development velocity by using AWS SAM Accelerate with AWS CloudFormation nested stacks. Nested stacks are an application lifecycle management best practice at AWS. We recommend nested stacks for deploying complex serverless applications, which aligns to the Serverless Application Lens of the AWS Well-Architected Framework. AWS SAM Accelerate speeds up deployment from your local system by bypassing AWS CloudFormation to deploy code and resource updates when possible.

AWS CloudFormation nested stacks and AWS SAM

A nested stack is a CloudFormation resource that is part of another stack, referred to as the parent, or root stack.

Nested stack architecture


The best practice for modeling complex applications is to author a root stack template and declare related resources in their own nested stack templates. This partitioning improves maintainability and encourages reuse of common template patterns. It is easier to reason about the configuration of the AWS resources in the example application because they are described in nested templates for each application component.

With AWS SAM, developers create nested stacks using the AWS::Serverless::Application resource type. The following example shows a snippet from a template.yaml file, which is the root stack for an AWS SAM application.

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Resources:
  DynamoDB:
    Type: AWS::Serverless::Application
    Properties:
      Location: db/template.yaml

  OrderWorkflow:
    Type: AWS::Serverless::Application
    Properties:
      Location: workflow/template.yaml

  ApiIntegrations:
    Type: AWS::Serverless::Application
    Properties:
      Location: api-integrations/template.yaml

  Api:
    Type: AWS::Serverless::Application
    Properties:
      Location: api/template.yaml

Each AWS::Serverless::Application resource type references a child stack, which is an independent AWS SAM template. The Location property tells AWS SAM where to find the stack definition.
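To make the parent and child relationship concrete, the following sketch lists the child templates referenced by a root stack. It assumes the template has already been loaded into a Python dictionary; ROOT_TEMPLATE and nested_stack_locations are hypothetical names used only for illustration.

```python
# Hypothetical in-memory form of the root template shown above.
ROOT_TEMPLATE = {
    "Resources": {
        "DynamoDB": {
            "Type": "AWS::Serverless::Application",
            "Properties": {"Location": "db/template.yaml"},
        },
        "OrderWorkflow": {
            "Type": "AWS::Serverless::Application",
            "Properties": {"Location": "workflow/template.yaml"},
        },
        "ApiIntegrations": {
            "Type": "AWS::Serverless::Application",
            "Properties": {"Location": "api-integrations/template.yaml"},
        },
        "Api": {
            "Type": "AWS::Serverless::Application",
            "Properties": {"Location": "api/template.yaml"},
        },
    }
}


def nested_stack_locations(template: dict) -> dict[str, str]:
    """Map each nested application's logical ID to its child template path."""
    return {
        name: resource["Properties"]["Location"]
        for name, resource in template.get("Resources", {}).items()
        if resource.get("Type") == "AWS::Serverless::Application"
    }


for logical_id, location in nested_stack_locations(ROOT_TEMPLATE).items():
    print(f"{logical_id} -> {location}")
```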

Solution overview

The sample application exposes an API via Amazon API Gateway. One API endpoint (#2) forwards POST requests to Amazon SQS. An AWS Lambda function polls the SQS queue (#3) and starts an AWS Step Functions workflow execution (#4) for each message.

Sample application architecture


Prerequisites

  1. AWS SAM CLI, version 1.53.0 or higher
  2. Python 3.9

Deploy the application

To deploy the application:

  1. Clone the repository:
    git clone https://github.com/aws-samples/sam-accelerate-nested-stacks-demo.git
  2. Change to the root directory of the project and run the following AWS SAM CLI commands:
    cd sam-accelerate-nested-stacks-demo
    sam build
    sam deploy --guided --capabilities CAPABILITY_IAM CAPABILITY_AUTO_EXPAND

    You must include the CAPABILITY_IAM and CAPABILITY_AUTO_EXPAND capabilities to support nested stacks and the creation of permissions.

  3. Use orders-app as the stack name during guided deployment. During the deploy process, enter your email for the SubscriptionEmail value. This requires confirmation later. Accept the defaults for the rest of the values.

    SAM deploy example


  4. After the CloudFormation deployment completes, save the API endpoint URL from the outputs.

Confirming the notifications subscription

After the deployment finishes, you receive an Amazon SNS subscription confirmation email at the email address provided during the deployment. Choose the Confirm Subscription link to receive notifications.

You have chosen to subscribe to the topic: 
arn:aws:sns:us-east-1:123456789012:order-topic-xxxxxxxxxxxxxxxxxx

To confirm this subscription, click or visit the link below (If this was in error no action is necessary): 
Confirm subscription

Testing the orders application

To test the application, use the curl command to create a new Order request with the following JSON payload:

{
    "quantity": 1,
    "name": "Pizza",
    "restaurantId": "House of Pizza"
}

curl -s --header "Content-Type: application/json" \
  --request POST \
  --data '{"quantity":1,"name":"Pizza","restaurantId":"House of Pizza"}' \
  https://xxxxxxxxxx.execute-api.us-east-1.amazonaws.com/Dev/orders  | python -m json.tool

API Gateway responds with the following message, showing it successfully sent the request to the SQS queue:

API Gateway response


The application sends an order notification once the Step Functions workflow completes processing. The workflow intentionally randomizes the SUCCESS or FAILURE status message.
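If you prefer to send test orders from a script rather than curl, the request can be sketched with the Python standard library. The function names below are illustrative, and API_URL is the placeholder endpoint used in this post; replace it with the URL from your stack outputs.

```python
import json
import urllib.request

# Placeholder endpoint; replace with the API endpoint URL from your deployment.
API_URL = "https://xxxxxxxxxx.execute-api.us-east-1.amazonaws.com/Dev/orders"


def build_order_request(url: str, order: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a new order."""
    return urllib.request.Request(
        url,
        data=json.dumps(order).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_order(url: str, order: dict) -> dict:
    """Send the order and return the decoded JSON response from the API."""
    with urllib.request.urlopen(build_order_request(url, order)) as resp:
        return json.loads(resp.read())


order = {"quantity": 1, "name": "Pizza", "restaurantId": "House of Pizza"}
request = build_order_request(API_URL, order)
print(request.get_method(), request.full_url)
```

Calling send_order(API_URL, order) performs the actual request once API_URL points at a deployed endpoint.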

Accelerating development with AWS SAM sync

AWS SAM Accelerate enhances the development experience. It automatically observes local code changes and synchronizes them to AWS without building and deploying every function in my project.

However, synchronizing code changes directly into the AWS Cloud can introduce drift between your CloudFormation stacks and their deployed resources. For this reason, you should only use AWS SAM Accelerate to publish changes in a development stack.

In your terminal, change to the root directory of the project folder and run the sam sync command. This runs in the foreground while you make code changes:

cd sam-accelerate-nested-stacks-demo
sam sync --watch --stack-name orders-app

The --watch option causes AWS SAM to perform an initial CloudFormation deployment. After the deployment is complete, AWS SAM watches for local changes and synchronizes them to AWS. This feature allows you to make rapid iterative code changes and sync to the Cloud automatically in seconds.

Making a code change

In the editor, update the Subject argument in the send_order_notification function in workflow/src/complete_order/app.py.

def send_order_notification(message):
    topic_arn = TOPIC_ARN
    response = sns.publish(
        TopicArn=topic_arn,
        Message=json.dumps(message),
        Subject=f'Orders-App: Update for order {message["order_id"]}'
        #Subject='Orders-App: SAM Accelerate for the win!'
    )

On save, AWS SAM notices the local code change, and updates the CompleteOrder Lambda function. AWS SAM does not trigger updates to other AWS resources across the different stacks, since they are unchanged. This can result in increased development velocity.

SAM sync output


Validate the change by sending a new order request and review the notification email subject.

curl -s --header "Content-Type: application/json" \
  --request POST \
  --data '{"quantity":1,"name":"Pizza","restaurantId":"House of Pizza"}' \
  https://xxxxxxxxxx.execute-api.us-east-1.amazonaws.com/Dev/orders  | python -m json.tool

In this example, AWS SAM Accelerate is 10–15 times faster than the CloudFormation deployment workflow (sam deploy) for single function code changes.

Deployment speed comparison between SAM accelerate and CloudFormation


Deployment times vary based on the size and complexity of your Lambda functions and the number of resources in your project.

Making a configuration change

Next, make an infrastructure change to show how sam sync --watch handles configuration updates.

Update ReadCapacityUnits and WriteCapacityUnits in the DynamoDB table definition by changing the values from five to six in db/template.yaml.

Resources:
  OrderTable:
    Type: AWS::DynamoDB::Table
    Properties:
      TableName: order-table-test
      AttributeDefinitions:
        - AttributeName: user_id
          AttributeType: S
        - AttributeName: id
          AttributeType: S
      KeySchema:
        - AttributeName: user_id
          KeyType: HASH
        - AttributeName: id
          KeyType: RANGE
      ProvisionedThroughput:
        ReadCapacityUnits: 5
        WriteCapacityUnits: 5

The sam sync --watch command recognizes that the configuration change requires a CloudFormation deployment to update the db nested stack. Nested stacks reflect an UPDATE_COMPLETE status because CloudFormation starts an update to every nested stack to determine if changes must be applied.
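The two examples in this post suggest a simple mental model: edits to function source are synced directly, while edits to a template go through CloudFormation. The sketch below encodes only that simplified model. It is not how the AWS SAM CLI decides what to do, and the file-name set and function name are assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Assumption for this sketch: stack definitions are files named "template.*",
# and every other file in the project is function source code.
TEMPLATE_NAMES = {"template.yaml", "template.yml", "template.json"}


def sync_action(changed_path: str) -> str:
    """Return the kind of update a changed file would need in this model."""
    if PurePosixPath(changed_path).name in TEMPLATE_NAMES:
        return "cloudformation-deploy"
    return "code-sync"


# The two changes made in this post.
print(sync_action("workflow/src/complete_order/app.py"))
print(sync_action("db/template.yaml"))
```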

SAM sync infrastructure update


Cleaning up

Delete the nested stack resources to make sure that you don’t continue to incur charges. After stopping the sam sync --watch command, run the following command to delete your resources:

sam delete --stack-name orders-app

You can also delete the CloudFormation root stack from the console by following these steps.

Conclusion

Local emulation of complex serverless applications, built with nested stacks, can be challenging. AWS SAM Accelerate helps builders achieve a high-fidelity development experience by rapidly synchronizing code changes into the AWS Cloud.

This post shows AWS SAM Accelerate features that push code changes in near real time to a development environment in the Cloud. I use a non-trivial sample application to show how developers can push code changes to a live environment in seconds while using CloudFormation nested stacks to achieve the isolation and maintenance benefits.

For more serverless learning resources, visit Serverless Land.

Using custom consumer group ID support for the AWS Lambda event sources for MSK and self-managed Kafka

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/using-custom-consumer-group-id-support-for-the-aws-lambda-event-sources-for-msk-and-self-managed-kafka/

This post is written by Adam Wagner, Principal Serverless Specialist SA.

AWS Lambda already supports Amazon Managed Streaming for Apache Kafka (MSK) and self-managed Apache Kafka clusters as event sources. Today, AWS adds support for specifying a custom consumer group ID for the Lambda event source mappings (ESMs) for MSK and self-managed Kafka event sources.

With this feature, you can create a Lambda ESM that uses a consumer group that has already been created. This enables you to use Lambda as a Kafka consumer for topics that are replicated with MirrorMaker v2 or with consumer groups you create to start consuming at a particular offset or timestamp.

Overview

This blog post shows how to use this feature to enable Lambda to consume a Kafka topic starting at a specific timestamp. This can be useful if you must reprocess some data but don’t want to reprocess all of the data in the topic.

In this example application, a client application writes to a topic on the MSK cluster. It creates a consumer group that points to a specific timestamp within that topic as the starting point for consuming messages. A Lambda ESM is created using that existing consumer group that triggers a Lambda function. This processes and writes the messages to an Amazon DynamoDB table.

Reference architecture

  1. A Kafka client writes messages to a topic in the MSK cluster.
  2. A Kafka consumer group is created with a starting point of a specific timestamp.
  3. The Lambda ESM polls the MSK topic using the existing consumer group and triggers the Lambda function with batches of messages.
  4. The Lambda function writes the messages to DynamoDB.
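The function code is not shown in this post. As a rough sketch of step 4, the handler below decodes a batch delivered by the ESM, relying on the Kafka event shape in which records are grouped under topic-partition keys and each value is base64 encoded. The helper name decode_records and the sample event are assumptions for illustration, and the DynamoDB write is left as a comment.

```python
import base64
import json


def decode_records(event: dict) -> list[dict]:
    """Flatten a Kafka event batch into a list of decoded JSON messages."""
    messages = []
    # Lambda groups records under "topic-partition" keys, e.g. "demoTopic01-0".
    for records in event.get("records", {}).values():
        for record in records:
            payload = base64.b64decode(record["value"])
            messages.append(json.loads(payload))
    return messages


def lambda_handler(event, context):
    messages = decode_records(event)
    # A real function would write each message to the DynamoDB table here.
    return {"processed": len(messages)}


# Hypothetical one-record batch for local testing.
sample_value = base64.b64encode(
    json.dumps({"id": 1, "random_number": 1234567}).encode("utf-8")
).decode("ascii")
sample_event = {
    "eventSource": "aws:kafka",
    "records": {
        "demoTopic01-0": [
            {"topic": "demoTopic01", "partition": 0, "offset": 0, "value": sample_value}
        ]
    },
}
print(lambda_handler(sample_event, None))
```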

Step-by-step instructions

To get started, create an MSK cluster and a client Amazon EC2 instance from which to create topics and publish messages. If you don’t already have an MSK cluster, follow this blog on setting up an MSK cluster and using it as an event source for Lambda.

  1. On the client instance, set an environment variable to the MSK cluster bootstrap servers to make it easier to reference them in future commands:
    export MSKBOOTSTRAP='b-1.mskcluster.oy1hqd.c23.kafka.us-east-1.amazonaws.com:9094,b-2.mskcluster.oy1hqd.c23.kafka.us-east-1.amazonaws.com:9094,b-3.mskcluster.oy1hqd.c23.kafka.us-east-1.amazonaws.com:9094'
  2. Create the topic. This example has a three-node MSK cluster so the replication factor is also set to three. The partition count is set to three in this example. In your applications, set this according to throughput and parallelization needs.
    ./bin/kafka-topics.sh --create --bootstrap-server $MSKBOOTSTRAP --replication-factor 3 --partitions 3 --topic demoTopic01
  3. Write messages to the topic using this Python script:
    #!/usr/bin/env python3
    import json
    import time
    from random import randint
    from uuid import uuid4
    from kafka import KafkaProducer
    
    BROKERS = ['b-1.mskcluster.oy1hqd.c23.kafka.us-east-1.amazonaws.com:9094', 
            'b-2.mskcluster.oy1hqd.c23.kafka.us-east-1.amazonaws.com:9094',
            'b-3.mskcluster.oy1hqd.c23.kafka.us-east-1.amazonaws.com:9094']
    TOPIC = 'demoTopic01'
    
    producer = KafkaProducer(bootstrap_servers=BROKERS, security_protocol='SSL',
            value_serializer=lambda x: json.dumps(x).encode('utf-8'))
    
    def create_record(sequence_num):
        number = randint(1000000,10000000)
        record = {"id": sequence_num, "record_timestamp": int(time.time()), "random_number": number, "producer_id": str(uuid4()) }
        print(record)
        return record
    
    def publish_rec(seq):
        data = create_record(seq)
        producer.send(TOPIC, value=data).add_callback(on_send_success).add_errback(on_send_error)
        producer.flush()
    
    def on_send_success(record_metadata):
        print(record_metadata.topic, record_metadata.partition, record_metadata.offset)
    
    def on_send_error(excp):
        print('error writing to kafka:', excp)
    
    for num in range(1,10000000):
        publish_rec(num)
        time.sleep(0.5) 
    
  4. Copy the script into a file on the client instance named producer.py. The script uses the kafka-python library, so first create a virtual environment and install the library.
    python3 -m venv venv
    source venv/bin/activate
    pip3 install kafka-python
    
  5. Start the script. Leave it running for a few minutes to accumulate some messages in the topic.
    Output
  6. Previously, a Lambda function would choose between consuming messages starting at the beginning of the topic or starting with the latest messages. In this example, it starts consuming messages from a few hours ago at 16:00 UTC. To do this, first create a new consumer group on the client instance:
    ./bin/kafka-consumer-groups.sh --command-config client.properties --bootstrap-server $MSKBOOTSTRAP --topic demoTopic01 --group specificTimeCG --to-datetime 2022-08-10T16:00:00.000 --reset-offsets --execute
  7. In this case, specificTimeCG is the consumer group ID used when creating the Lambda ESM. Listing the consumer groups on the cluster shows the new group:
    ./bin/kafka-consumer-groups.sh --list --command-config client.properties --bootstrap-server $MSKBOOTSTRAP

    Output

  8. With the consumer group created, create the Lambda function along with the Event Source Mapping that uses this new consumer group. In this case, the Lambda function and DynamoDB table are already created. Create the ESM with the following AWS CLI Command:
    aws lambda create-event-source-mapping --region us-east-1 --event-source-arn arn:aws:kafka:us-east-1:0123456789:cluster/demo-us-east-1/78a8d1c1-fa31-4f59-9de3-aacdd77b79bb-23 --function-name msk-consumer-demo-ProcessMSKfunction-IrUhEoDY6X9N --batch-size 3 --amazon-managed-kafka-event-source-config '{"ConsumerGroupId":"specificTimeCG"}' --topics demoTopic01

    The event source in the Lambda console or CLI shows the starting position set to TRIM_HORIZON. However, if you specify a custom consumer group ID that already has existing offsets, those offsets take precedence.

  9. With the event source created, navigate to the DynamoDB console. Locate the DynamoDB table to see the records written by the Lambda function.
    DynamoDB table

Converting the record timestamp of the earliest record in DynamoDB, 1660147212, to a human-readable date shows that the first record was created on 2022-08-10T16:00:12.
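The conversion can be checked with a few lines of Python. This sketch treats the record timestamp as seconds since the Unix epoch in UTC, which matches how the producer script generates it; the helper name is illustrative.

```python
from datetime import datetime, timezone


def to_utc_iso(epoch_seconds: int) -> str:
    """Format a Unix timestamp (seconds) as an ISO-style UTC date and time."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%S")


# Earliest record written to DynamoDB.
print(to_utc_iso(1660147212))  # 2022-08-10T16:00:12

# Epoch value of the --to-datetime argument used for the consumer group.
reset_point = datetime(2022, 8, 10, 16, 0, 0, tzinfo=timezone.utc)
print(int(reset_point.timestamp()))  # 1660147200
```

The first record lands 12 seconds after the reset point, which is consistent with the consumer group starting at the requested timestamp.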

In this example, the consumer group is created before the Lambda ESM so that you can specify the timestamp to start from.

If you create an ESM and specify a custom consumer group ID that does not exist, it is created. This is a convenient way to create a new consumer group for an ESM with an ID of your choosing.

Deleting an ESM does not delete the consumer group, regardless of whether it is created before, or during, the ESM creation.
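The same event source mapping can be created from Python with boto3. The following is a sketch that mirrors the CLI flags used in step 8; build_esm_request and create_esm are hypothetical helper names, and the ARN and function name are the placeholder values from this post. It requires the boto3 package and AWS credentials only when create_esm is called.

```python
def build_esm_request(cluster_arn: str, function_name: str, topic: str,
                      consumer_group_id: str, batch_size: int = 3) -> dict:
    """Assemble the arguments for create_event_source_mapping."""
    return {
        "EventSourceArn": cluster_arn,
        "FunctionName": function_name,
        "Topics": [topic],
        "BatchSize": batch_size,
        # Custom consumer group ID for an MSK event source.
        "AmazonManagedKafkaEventSourceConfig": {"ConsumerGroupId": consumer_group_id},
    }


def create_esm(request: dict, region: str = "us-east-1") -> dict:
    """Create the mapping. Needs boto3 installed and valid AWS credentials."""
    import boto3  # imported here so the sketch loads without boto3 installed

    client = boto3.client("lambda", region_name=region)
    return client.create_event_source_mapping(**request)


# Placeholder values taken from this post.
esm_request = build_esm_request(
    cluster_arn="arn:aws:kafka:us-east-1:012345678901:cluster/demo-us-east-1/78a8d1c1-fa31-4f59-9de3-aacdd77b79bb-23",
    function_name="msk-consumer-demo-ProcessMSKfunction-IrUhEoDY6X9N",
    topic="demoTopic01",
    consumer_group_id="specificTimeCG",
)
print(esm_request["AmazonManagedKafkaEventSourceConfig"])
```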

Using the AWS Serverless Application Model (AWS SAM)

To create the event source mapping with a custom consumer group using an AWS Serverless Application Model (AWS SAM) template, use the following snippet:

Events:
  MyMskEvent:
    Type: MSK
    Properties:
      Stream: !Sub arn:aws:kafka:${AWS::Region}:012345678901:cluster/demo-us-east-1/78a8d1c1-fa31-4f59-9de3-aacdd77b79bb-23
      Topics:
        - "demoTopic01"
      ConsumerGroupId: specificTimeCG

Other types of Kafka clusters

This example uses the custom consumer group ID feature when consuming a Kafka topic from an MSK cluster. In addition to MSK clusters, this feature also supports self-managed Kafka clusters. These could be clusters running on EC2 instances or managed Kafka clusters from a partner such as Confluent.

Conclusion

This post shows how to use the new custom consumer group ID feature of the Lambda event source mapping for Amazon MSK and self-managed Kafka. This feature can be used to consume messages with Lambda starting at a specific timestamp or offset within a Kafka topic. It can also be used to consume messages from a consumer group that is replicated from another Kafka cluster using MirrorMaker v2.

For more serverless learning resources, visit Serverless Land.