Tag Archives: AWS Lambda

Perform continuous vulnerability scanning of AWS Lambda functions with Amazon Inspector

Post Syndicated from Manjunath Arakere original https://aws.amazon.com/blogs/security/perform-continuous-vulnerability-scanning-of-aws-lambda-functions-with-amazon-inspector/

This blog post demonstrates how you can activate Amazon Inspector within one or more AWS accounts and be notified when a vulnerability is detected in an AWS Lambda function.

Amazon Inspector is an automated vulnerability management service that continually scans workloads for software vulnerabilities and unintended network exposure. Amazon Inspector scans mixed workloads like Amazon Elastic Compute Cloud (Amazon EC2) instances and container images located in Amazon Elastic Container Registry (Amazon ECR). At re:Invent 2022, we announced Amazon Inspector support for Lambda functions and Lambda layers to provide a consolidated solution for compute types.

Only scanning your functions for vulnerabilities before deployment might not be enough since vulnerabilities can appear at any time, like the widespread Apache Log4j vulnerability. So it’s essential that workloads are continuously monitored and rescanned in near real time as new vulnerabilities are published or workloads are changed.

Amazon Inspector scans are intelligently initiated based on the updates to Lambda functions or when new Common Vulnerabilities and Exposures (CVEs) are published that are relevant to your function. No agents are needed for Amazon Inspector to work, which means you don’t need to install a library or agent in your Lambda functions or layers. When Amazon Inspector discovers a software vulnerability or network configuration issue, it creates a finding which describes the vulnerability, identifies the affected resource, rates the severity of the vulnerability, and provides remediation guidance.

In addition, Amazon Inspector integrates with several AWS services, such as Amazon EventBridge and AWS Security Hub. You can use EventBridge to build automation workflows like getting notified for a specific vulnerability finding or performing an automatic remediation with the help of Lambda or AWS Systems Manager.

In this blog post, you will learn how to do the following:

  1. Activate Amazon Inspector in a single AWS account and AWS Region.
  2. See how Amazon Inspector automated discovery and continuous vulnerability scanning works by deploying a new Lambda function with a vulnerable package dependency.
  3. Receive a near real-time notification when a vulnerability with a specific severity is detected in a Lambda function with the help of EventBridge and Amazon Simple Notification Service (Amazon SNS).
  4. Remediate the vulnerability by using the recommendation provided in the Amazon Inspector dashboard.
  5. Activate Amazon Inspector in multiple accounts or Regions through AWS Organizations.

Solution architecture

Figure 1 shows the AWS services used in the solution and how they are integrated.

Figure 1: Solution architecture overview

Figure 1: Solution architecture overview

The workflow for the solution is as follows:

  1. Deploy a new Lambda function by using the AWS Serverless Application Model (AWS SAM).
  2. Amazon Inspector scans when a new vulnerability is published or when an update to an existing Lambda function or a new Lambda function is deployed. Vulnerabilities are identified in the deployed Lambda function.
  3. Amazon EventBridge receives the events from Amazon Inspector and checks against the rules for specific events or filter conditions.
  4. In this case, an EventBridge rule exists for the Amazon Inspector findings, and the target is defined as an SNS topic to send an email to the system operations team.
  5. The EventBridge rule invokes the target SNS topic with the event data, and an email is sent to the confirmed subscribers in the SNS topic.
  6. The system operations team receives an email with detailed information on the vulnerability, the fixed package versions, the Amazon Inspector score to prioritize, and the impacted Lambda functions. By using the remediation information from Amazon Inspector, the team can now prioritize actions and remediate.

Prerequisites

To follow along with this demo, we recommend that you have the following in place:

  • An AWS account.
  • A command line interface: AWS CloudShell or AWS CLI. In this post, we recommend the use of CloudShell because it already has Python and AWS SAM. However, you can also use your CLI with AWS CLI, SAM, and Python.
  • An AWS Region where Amazon Inspector Lambda code scanning is available.
  • An IAM role in that account with administrator privileges.

The solution in this post includes the following AWS services: Amazon Inspector, AWS Lambda, Amazon EventBridge, AWS Identity and Access Management (IAM), Amazon SNS, AWS CloudShell and AWS Organizations for activating Amazon Inspector at scale (multi-accounts).

Step 1: Activate Amazon Inspector in a single account in the Region

The first step is to activate Amazon Inspector in your account in the Region you are using.

To activate Amazon Inspector

  1. Sign in to the AWS Management Console.
  2. Open AWS CloudShell. CloudShell inherits the credentials and permissions of the IAM principal who is signed in to the AWS Management Console. CloudShell comes with the CLIs and runtimes that are needed for this demo (AWS CLI, AWS SAM, and Python).
  3. Use the following command in CloudShell to get the status of the Amazon Inspector activation.
    aws inspector2 batch-get-account-status

  4. Use the following command to activate Inspector in the default Region for resource type LAMBDA. Other allowed values for resource types are EC2, ECR and LAMDA_CODE.
    aws inspector2 enable --resource-types '["LAMBDA"]'

  5. Use the following command to verify the status of the Amazon Inspector activation.
    aws inspector2 batch-get-account-status

You should see a response that shows that Amazon Inspector is enabled for Lambda resources, as shown in Figure 2.

Figure 2: Amazon Inspector status after you enable Lambda scanning

Figure 2: Amazon Inspector status after you enable Lambda scanning

Step 2: Create an SNS topic and subscription for notification

Next, create the SNS topic and the subscription so that you will be notified of each new Amazon Inspector finding.

To create the SNS topic and subscription

  1. Use the following command in CloudShell to create the SNS topic and its subscription and replace <REGION_NAME>, <AWS_ACCOUNTID> and <[email protected]> by the relevant values.
    aws sns create-topic --name amazon-inspector-findings-notifier; 
    
    aws sns subscribe \
    --topic-arn arn:aws:sns:<REGION_NAME>:<AWS_ACCOUNTID>:amazon-inspector-findings-notifier \
    --protocol email --notification-endpoint <[email protected]>

  2. Check the email inbox you entered for <[email protected]>, and in the email from Amazon SNS, choose Confirm subscription.
  3. In the CloudShell console, use the following command to list the subscriptions, to verify the topic and email subscription.
    aws sns list-subscriptions

    You should see a response that shows subscription details like the email address and ARN, as shown in Figure 3.

    Figure 3: Subscribed email address and SNS topic

    Figure 3: Subscribed email address and SNS topic

  4. Use the following command to send a test message to your subscribed email and verify that you receive the message by replacing <REGION_NAME> and <AWS_ACCOUNTID>.
    aws sns publish \
        --topic-arn "arn:aws:sns:<REGION_NAME>:<AWS_ACCOUNTID>:amazon-inspector-findings-notifier" \
        --message "Hello from Amazon Inspector2"

Step 3: Set up Amazon EventBridge with a custom rule and the SNS topic as target

Create an EventBridge rule that will invoke your previously created SNS topic whenever Amazon Inspector finds a new vulnerability with a critical severity.

To set up the EventBridge custom rule

  1. In the CloudShell console, use the following command to create an EventBridge rule named amazon-inspector-findings with filters InspectorScore greater than 8 and severity state set to CRITICAL.
    aws events put-rule \
        --name "amazon-inspector-findings" \
        --event-pattern "{\"source\": [\"aws.inspector2\"],\"detail-type\": [\"Inspector2 Finding\"],\"detail\": {\"inspectorScore\": [ { \"numeric\": [ \">\", 8] } ],\"severity\": [\"CRITICAL\"]}}"

    Refer to the topic Amazon EventBridge event schema for Amazon Inspector events to customize the event pattern for your application needs.

  2. To verify the rule creation, go to the EventBridge console and in the left navigation bar, choose Rules.
  3. Choose the rule with the name amazon-inspector-findings. You should see the event pattern as shown in Figure 4.
    Figure 4: Event pattern for the EventBridge rule to filter on CRITICAL vulnerabilities.

    Figure 4: Event pattern for the EventBridge rule to filter on CRITICAL vulnerabilities.

  4. Add the SNS topic you previously created as the target to the EventBridge rule. Replace <REGION_NAME>, <AWS_ACCOUNTID>, and <RANDOM-UNIQUE-IDENTIFIER-VALUE> with the relevant values. For RANDOM-UNIQUE-IDENTIFIER-VALUE, create a memorable and unique string.
    aws events put-targets \
        --rule amazon-inspector-findings \
        --targets "Id"="<RANDOM-UNIQUE-IDENTIFIER-VALUE>","Arn"="arn:aws:sns:<REGION_NAME>:<AWS_ACCOUNTID>:amazon-inspector-findings-notifier"

    Important: Save the target ID. You will need this in order to delete the target in the last step.

  5. Provide permission to enable Amazon EventBridge to publish to SNS topic amazon-inspector-findings-notifier
    aws sns set-topic-attributes --topic-arn "arn:aws:sns:<REGION_NAME>:<AWS_ACCOUNTID>:amazon-inspector-findings-notifier" \
    --attribute-name Policy \
    --attribute-value "{\"Version\":\"2012-10-17\",\"Id\":\"__default_policy_ID\",\"Statement\":[{\"Sid\":\"PublishEventsToMyTopic\",\"Effect\":\"Allow\",\"Principal\":{\"Service\":\"events.amazonaws.com\"},\"Action\":\"sns:Publish\",\"Resource\":\"arn:aws:sns:<REGION_NAME>:<AWS_ACCOUNTID>:amazon-inspector-findings-notifier\"}]}"

Step 4: Deploy the Lambda function to the AWS account by using AWS SAM

In this step, you will use Serverless Application Manager (SAM) quick state templates to build and deploy a Lambda function with a vulnerable library, in order to generate findings. Learn more about AWS SAM.

To deploy the Lambda function with a vulnerable library

  1. In the CloudShell console, use a prebuilt “hello-world” AWS SAM template to deploy the Lambda function.
    sam init --runtime python3.7 --dependency-manager pip --app-template hello-world --name sam-app

  2. Use the following command to add the vulnerable package python-jwt==3.3.3 to the Lambda function.
    cd sam-app;
    echo -e 'requests\npython-jwt==3.3.3' > hello_world/requirements.txt

  3. Use the following command to build the application.
    sam build

  4. Use the following command to deploy the application with the guided option.
    sam deploy --guided

    This command packages and deploys the application to your AWS account. It provides a series of prompts. You may respond to the prompts with the:

    1. Stack name you want
    2. Set the default options, except for the
      1. HelloWorldFunction may not have authorization defined, Is this okay? [y/N]: prompt. Here, input y and press Enter and
      2. Deploy this changeset? [y/N]: prompt. Here, input y and press Enter.

Step 5: View Amazon Inspector findings

Amazon Inspector will automatically generate findings when scanning the Lambda function previously deployed. To view those findings, follow the steps below.

To view Amazon Inspector findings for the vulnerability

  1. Navigate to the Amazon Inspector console.
  2. In the left navigation menu, choose All findings to see all of the Active findings, as shown in Figure 5.

    Due to the custom event pattern rule in Amazon EventBridge, even though there are multiple findings for the vulnerable package python-jwt==3.3.3, you will be notified only for the finding that has InspectorScore greater than 8 and severity CRITICAL.

  3. Choose the title of each finding to see detailed information about the vulnerability.
    Figure 5: Example of findings from the Amazon Inspector console

    Figure 5: Example of findings from the Amazon Inspector console

Step 6: Remediate the vulnerability by applying the fixed package version

Now you can remediate the vulnerability by updating the package version as suggested by Amazon Inspector.

To remediate the vulnerability

  1. In the Amazon Inspector console, in the left navigation menu, choose All Findings.
  2. Choose the title of the vulnerability to see the finding details and the remediation recommendations.
    Figure 6: Amazon Inspector finding for python-jwt, with the associated remediation

    Figure 6: Amazon Inspector finding for python-jwt, with the associated remediation

  3. To remediate, use the following command to update the package version to the fixed version as suggested by Amazon Inspector.
    cd /home/cloudshell-user/sam-app;
    echo -e "requests\npython-jwt==3.3.4" > hello_world/requirements.txt

  4. Use the following command to build the application.
    sam build

  5. Use the following command to deploy the application with the guided option.
    sam deploy --guided

    This command packages and deploys the application to your AWS account. It provides a series of prompts. You may respond to the prompts with the

    1. Stack name you want
    2. Set the default options, except for the
      1. HelloWorldFunction may not have authorization defined, Is this okay? [y/N]: prompt. Here, input y and press Enter and
      2. Deploy this changeset? [y/N]: prompt. Here, input y and press Enter.
  6. Amazon Inspector automatically rescans the function after its deployment and reevaluates the findings. At this point, you can navigate back to the Amazon Inspector console, and in the left navigation menu, choose All findings. In the Findings area, you can see that the vulnerabilities are moved from Active to Closed status.

    Due to the custom event pattern rule in Amazon EventBridge, you will be notified by email with finding status as CLOSED.

    Figure 7: Inspector rescan results, showing no open findings after remediation

    Figure 7: Inspector rescan results, showing no open findings after remediation

(Optional) Step 7: Activate Amazon Inspector in multiple accounts and Regions

To benefit from Amazon Inspector scanning capabilities across the accounts that you have in AWS Organizations and in your selected Regions, use the following steps:

To activate Amazon Inspector in multiple accounts and Regions

  1. In the CloudShell console, use the following command to clone the code from the aws-samples inspector2-enablement-with-cli GitHub repo.
    cd /home/cloudshell-user;
    git clone https://github.com/aws-samples/inspector2-enablement-with-cli.git;
    cd inspector2-enablement-with-cli

  2. Follow the instructions from the README.md file.
  3. Configure the file param_inspector2.json with the relevant values, as follows:
    • inspector2_da: The delegated administrator account ID for Amazon Inspector to manage member accounts.
    • scanning_type: The resource types (EC2, ECR, LAMBDA) to be enabled by Amazon Inspector.
    • auto_enable: The resource types to be enabled on every account that is newly attached to the delegated administrator.
    • regions: Because Amazon Inspector is a regional service, provide the list of AWS Regions to enable.
  4. Select the AWS account that would be used as the delegated administrator account (<DA_ACCOUNT_ID>).
  5. Delegate an account as the admin for Amazon Inspector by using the following command.
    ./inspector2_enablement_with_awscli.sh -a delegate_admin -da <DA_ACCOUNT_ID>

  6. Activate the delegated admin by using the following command:
    ./inspector2_enablement_with_awscli.sh -a activate -t <DA_ACCOUNT_ID> -s all

  7. Associate the member accounts by using the following command:
    ./inspector2_enablement_with_awscli.sh -a associate -t members

  8. Wait five minutes.
  9. Enable the resource types (EC2, ECR, LAMBDA) on your member accounts by using the following command:
    ./inspector2_enablement_with_awscli.sh -a activate -t members

  10. Enable Amazon Inspector on the new member accounts that are associated with the organization by using the following command:
    ./inspector2_enablement_with_awscli.sh -auto_enable

  11. Check the Amazon Inspector status in your accounts and in multiple selected Regions by using the following command:
    ./inspector2_enablement_with_awscli.sh -a get_status

There are other options you can use to enable Amazon Inspector in multiple accounts, like AWS Control Tower and Terraform. For the reference architecture for Control Tower, see the AWS Security Reference Architecture Examples on GitHub. For more information on the Terraform option, see the Terraform aws_inspector2_enabler resource page.

Step 8: Delete the resources created in the previous steps

AWS offers a 15-day free trial for Amazon Inspector so that you can evaluate the service and estimate its cost.

To avoid potential charges, delete the AWS resources that you created in the previous steps of this solution (Lambda function, EventBridge target, EventBridge rule, and SNS topic), and deactivate Amazon Inspector.

To delete resources

  1. In the CloudShell console, enter the sam-app folder.
    cd /home/cloudshell-user/sam-app

  2. Delete the Lambda function and confirm by typing “y” when prompted for confirmation.
    sam delete

  3. Remove the SNS target from the Amazon EventBridge rule.
    aws events remove-targets --rule "amazon-inspector-findings" --ids <RANDOM-UNIQUE-IDENTIFIER-VALUE>

    Note: If you don’t remember the target ID, navigate to the Amazon EventBridge console, and in the left navigation menu, choose Rules. Select the rule that you want to delete. Choose CloudFormation, and copy the ID.

  4. Delete the EventBridge rule.
    aws events delete-rule --name amazon-inspector-findings

  5. Delete the SNS topic.
    aws sns delete-topic --topic-arn arn:aws:sns:<REGION_NAME>:<AWS_ACCOUNTID>:amazon-inspector-findings-notifier

  6. Disable Amazon Inspector.
    aws inspector2 disable --resource-types '["LAMBDA"]'

    Follow the new few steps to roll back changes only if you have performed the activities listed in Step 7: Activate Amazon Inspector in multiple accounts and Regions.

  7. In the CloudShell console, enter the folder inspector2-enablement-with-cli.
    cd /home/cloudshell-user/inspector2-enablement-with-cli

  8. Deactivate the resource types (EC2, ECR, LAMBDA) on your member accounts.
    ./inspector2_enablement_with_awscli.sh -a deactivate -t members -s all

  9. Disassociate the member accounts.
    ./inspector2_enablement_with_awscli.sh -a disassociate -t members

  10. Deactivate the delegated admin account.
    ./inspector2_enablement_with_awscli.sh -a deactivate -t <DA_ACCOUNT_ID> -s all

  11. Remove the delegated account as the admin for Amazon Inspector.
    ./inspector2_enablement_with_awscli.sh -a remove_admin -da <DA_ACCOUNT_ID>

Conclusion

In this blog post, we discussed how you can use Amazon Inspector to continuously scan your Lambda functions, and how to configure an Amazon EventBridge rule and SNS to send out notification of Lambda function vulnerabilities in near real time. You can then perform remediation activities by using AWS Lambda or AWS Systems Manager. We also showed how to enable Amazon Inspector at scale, activating in both single and multiple accounts, in default and multiple Regions.

As of the writing this post, a new feature to perform code scans for Lambda functions is available. Amazon Inspector can now also scan the custom application code within a Lambda function for code security vulnerabilities such as injection flaws, data leaks, weak cryptography, or missing encryption, based on AWS security best practices. You can use this additional scanning functionality to further protect your workloads.

If you have feedback about this blog post, submit comments in the Comments section below. If you have question about this blog post, start a new thread on the Amazon Inspector forum or contact AWS Support.

 
Want more AWS Security news? Follow us on Twitter.

Manjunath Arakere

Manjunath Arakere

Manjunath is a Senior Solutions Architect in the Worldwide Public Sector team at AWS. He works with Public Sector partners to design and scale well-architected solutions, and he supports their cloud migrations and application modernization initiatives. Manjunath specializes in migration, modernization and serverless technology.

Stéphanie Mbappe

Stéphanie Mbappe

Stéphanie is a Security Consultant with Amazon Web Services. She delights in assisting her customers at every step of their security journey. Stéphanie enjoys learning, designing new solutions, and sharing her knowledge with others.

Python 3.11 runtime now available in AWS Lambda

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/python-3-11-runtime-now-available-in-aws-lambda/

This post is written by Ramesh Mathikumar, Senior DevOps Consultant and Francesco Vergona, Solutions Architect.

AWS Lambda now supports Python 3.11 as both a managed runtime and container base image. Python 3.11 contains significant performance enhancements over Python 3.10. Features like reduced startup time, streamlined stack frames and CPython specialization adaptive interpreter help many workloads using Python 3.11 run faster and cheaper, thanks to Lambda’s per-millisecond billing model. With this release, Python developers can now take advantage of new features and improvements introduced in Python 3.11 when creating serverless applications on Lambda.

You can use Python 3.11 with Lambda Powertools for Python, a developer toolkit to implement Serverless best practices and increase developer velocity. Lambda Powertools includes proven libraries to support common patterns such as observability, parameter store integration, idempotency, batch processing, feature flags, and more. Learn more about PowerTools for AWS Lambda for Python in the documentation.

You can also use Python 3.11 with Lambda@Edge, allowing you to customize low-latency content delivered through Amazon CloudFront.

Python is a popular language for building serverless applications. The Python 3.11 release includes both performance improvements and new language features. For customers who deploy their Lambda functions using container image, the base image for Python 3.11 also includes changes to make managing installed packages easier.

This blog post reviews these changes in turn, followed by an overview of how you can get started with Python 3.11 in Lambda.

Performance improvements

Optimizations to CPython introduced by Python 3.11 brings significant performance enhancements, making it an average of 25% faster than Python 3.10, based on Python community benchmark tests using the Python Performance Benchmark Suite.

This release focuses on two key areas:

  • Faster startup: core modules essential for Python are now “frozen,” with statically allocated code objects, resulting in a 10–15% faster interpreter start up relative to Python 3.10.
  • Faster function execution: improvements include streamlined frame creation, in-lined Python function calls for reduced C stack usage, and the implementation of a Specializing Adaptive Interpreter, which specializes the interpreter for “hot code” (code that’s executed multiple times) and reducing the overhead during execution.

These optimizations can improve performance by 10-60% depending on the workload. In the context of a Lambda function execution, this results in performance improvements for both ”cold start“ and ”warm start“ invocations

In addition to faster CPython performance improvements, Python 3.11 also provides performance improvements across other areas. For example:

  • String formatting with printf-style% codes is now as fast as f-string expressions.
  • Integer division is around 20% faster on x86-64 for certain scenarios.
  • Operations like sum() and list resizing have seen notable speed enhancements.
  • Dictionaries save memory by not storing hash values when keys are Unicode objects.
  • Improvements to asyncio. DatagramProtocol introduce significantly faster large file transfers over UDP.
  • Math functions, statistics functions, and unicodedata.normalize() also benefit from substantial speed improvements.

Language features

Thanks to its simplicity, readability, and extensive community support, Python is a popular language for building serverless applications. The Python 3.11 release includes several new language features, including:

  • Variadic generics (PEP 646): Python 3.11 introduces TypeVarTuple, enabling parameterization with an arbitrary number of types.
  • Marking individual TypedDict items as required or not-required (PEP 655): The introduction of Required and NotRequired in TypedDict allows for explicit marking of individual item requirements, eliminating the need for inheritance.
  • Self type (PEP 673): The Self annotation simplifies the annotation of methods returning an instance of their class, similarly to TypeVar in PEP 484
  • Arbitrary literal string type (PEP 675): The LiteralString annotation allows a function parameter to accept any literal string type, including strings created from literals.
  • Data class transforms (PEP 681): The @dataclass_transform() decorator enables objects to utilize runtime transformations for dataclass-like functionalities.

For the full list of Python 3.11 changes, see the Python 3.11 release notes.

Change in pre-installed modules location and search path

Previously, Lambda base container images for Python included the /var/runtime directory before the /var/lang/lib/python3.x directory in the search path. This meant that packages in /var/runtime are loaded in preference to packages installed via pip into /var/lang/lib/python3.x. Since the AWS SDK for Python (boto3/botocore) was pre-installed into /var/runtime, this made it harder for base container images customers to upgrade the SDK version.

With the Python 3.11 runtime, the AWS SDK and its dependencies are now pre-installed into the /var/lang/lib/python3.11 directory, and the search path has been modified so this directory has precedence over /var/runtime. This change means customers who build and deploy Lambda functions using the Python 3.11 base container image can now override the SDK simply by running pip install on a newer version. This change also enables pip to verify and track that the pre-installed SDK and its dependencies are compatible with any customer-installed packages.

This is the default sys.path before Python 3.11 (where X.Y is the Python major.minor version):

  • /var/task/: User Function
  • /opt/python/lib/pythonX.Y/site-packages/: User Layer
  • /opt/python/: User Layer
  • /var/runtime/: Pre-installed modules
  • /var/lang/lib/pythonX.Y/site-packages/: Default pip install location

Here is the default sys.path starting from Python 3.11:

  • /var/task/: User Function
  • /opt/python/lib/pythonX.Y/site-packages/: User Layer
  • /opt/python/: User Layer
  • /var/lang/lib/pythonX.Y/site-packages/: Pre-installed modules and default pip install location
  • /var/runtime/: No pre-installed modules

Using Python 3.11 in Lambda

AWS Management Console

To use the Python 3.11 runtime to develop your Lambda functions, specify a runtime parameter value Python 3.11 when creating or updating a function. Python 3.11 version is now available in the Runtime dropdown in the Create function page.

Create function

To update an existing Lambda function to Python 3.11, navigate to the function in the Lambda console, then choose Edit in the Runtime settings panel. The new version of Python is available in the Runtime dropdown:

Edit function

AWS Lambda – Container Image

Change the Python base image version by modifying FROM statement in the Dockerfile:

FROM public.ecr.aws/lambda/python:3.11
# Copy function code
COPY lambda_handler.py ${LAMBDA_TASK_ROOT}

To learn more, refer to the usage tab on building functions as container images.

AWS Serverless Application Model (AWS SAM)

In AWS SAM, set the Runtime attribute to python3.11 to use this version.

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Simple Lambda Function
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      Description: My Python Lambda Function
      CodeUri: my_function/
      Handler: lambda_function.lambda_handler
      Runtime: python3.11

AWS SAM supports generating this template with Python 3.11 out of the box for new serverless applications using the sam init command. Refer to the AWS SAM documentation here.

AWS Cloud Development Kit (AWS CDK)

In the AWS CDK, set the runtime attribute to Runtime.PYTHON_3_11 to use this version. In Python:

from constructs import Construct 
from aws_cdk import ( App, Stack, aws_lambda as _lambda )

class SampleLambdaStack(Stack):
    def __init__(self, scope: Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)
        
        base_lambda = _lambda.Function(self, 'SampleLambda', 
                                       handler='lambda_handler.handler', 
                                    runtime=_lambda.Runtime.PYTHON_3_11, 
                                 code=_lambda.Code.from_asset('lambda'))

In TypeScript:

import * as cdk from 'aws-cdk-lib';
import * as lambda from 'aws-cdk-lib/aws-lambda'
import * as path from 'path';
import { Construct } from 'constructs';

export class CdkStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // The code that defines your stack goes here

    // The python3.11 enabled Lambda Function
    const lambdaFunction = new lambda.Function(this, 'python311LambdaFunction', {
      runtime: lambda.Runtime.PYTHON_3_11,
      memorySize: 512,
      code: lambda.Code.fromAsset(path.join(__dirname, '/../lambda')),
      handler: 'lambda_handler.handler'
    })
  }
}

Conclusion

You can build and deploy functions using Python 3.11 using the AWS Management Console, AWS CLI, AWS SDK, AWS SAM, AWS CDK, or your choice of Infrastructure as Code (IaC). You can also use the Python 3.11 container base image if you prefer to build and deploy your functions using container images.

We are excited to bring Python 3.11 runtime support to Lambda and empower developers to build more efficient, powerful, and scalable serverless applications. Try Python 3.11 runtime in Lambda today and experience the benefits of this updated language version and take advantage of improved performance and new language features.
For more serverless learning resources, visit Serverless Land

Migrating AWS Lambda functions from the Go1.x runtime to the custom runtime on Amazon Linux 2

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/migrating-aws-lambda-functions-from-the-go1-x-runtime-to-the-custom-runtime-on-amazon-linux-2/

This post is written by Micah Walter, Senior Solutions Architect, Yanko Bolanos, Senior Solutions Architect, and Ramesh Mathikumar, Senior DevOps Consultant.

This blog post describes our plans to improve performance and streamline the user experience for customers writing AWS Lambda functions using Go.

Today, customers using Go with Lambda can either use the go1.x runtime, or use the provided.al2 runtime. Going forward, we plan to deprecate the go1.x runtime in line with the end-of-life of Amazon Linux 1, currently scheduled for December 31, 2023.

Customers using the go1.x runtime should migrate their functions to the provided.al2 runtime to continue to benefit from the latest runtime updates and security patches. Customers who deploy Go functions using container images who are currently using the go1.x base container image should similarly migrate to the provided.al2 base image.

Using the provided.al2 runtime offers several benefits over the go1.x runtime. First, it supports running Lambda functions on AWS Graviton2 processors, offering up to 34% better price-performance compared to functions running on x86_64 processors. Second, it offers a streamlined implementation with a smaller deployment package and faster function invoke path. Finally, this change aligns Go with other languages that also compile to native code such as Rust or C++, which also run on the provided.al2 runtime.

This migration does not require any code changes. The only changes relate to how you build your deployment package and configure your function. This blog post outlines the steps required to update your build scripts and tooling to use the provided.al2 runtime for your Go functions.

There is a difference in Lambda billing between the go1.x runtime and the provided.al2 runtime. With the go1.x runtime, Lambda does not bill for time spent during function initialization (cold start), whereas with the provided.al2 runtime Lambda includes function initialization time in the billed function duration. Since Go functions typically initialize very quickly, and since Lambda reduces the number of initializations by re-using function execution environments for multiple function invokes, in practice the difference in your Lambda bill should be very small.

Compiling for the provided.al2 runtime

In order to run a compiled Go application on Lambda, you must compile your code for Linux. While the go1.x runtime allows you to use any executable name, the provided.al2 runtime requires you to use bootstrap as the executable name. On macOS and Linux, here’s the simplest form of the build command:

GOARCH=amd64 GOOS=linux go build -o bootstrap main.go

This build command creates a Go binary file called bootstrap compatible with the x86_64 instruction set for Lambda. To compile for AWS Graviton processors, set GOARCH=arm64 in the preceding command.

The final step is to compress this binary into a ZIP file deployment package, ready to deploy to Lambda:

zip myFunction.zip bootstrap

For users compiling on Windows, Go supports compiling for Linux without using a Linux virtual machine or build container. However, Lambda uses POSIX file permissions, which must be set correctly. Lambda provides a helper tool which builds a deployment package that is valid for Lambda—see the Lambda documentation for details. Existing users of this tool should update to the latest version to make sure their build scripts are up-to-date.

Removing the RPC dependency

The go1.x runtime uses two processes within the Lambda execution environment to route requests to your handler function. The first process, which is included in the runtime, retrieves function invocation requests from the Lambda runtime API, and uses RPC to pass the invoke to the second process. This second process runs the executable which you deploy, and comprises the aws-lambda-go package and your function code. The aws-lambda-go package receives the RPC request and executes your function.

The following runtime architecture diagram for the go1.x runtime shows the runtime client process calling the runtime API to retrieve a function invocation and using RPC to call a separate process containing the function code.

Execution environment

Go functions deployed to the provided.al2 runtime use a simpler, single-process architecture. When building the Go executable, you include the same aws-lambda-go package as before. However, in this case the aws-lambda-go package acts as the runtime client, retrieving invocation requests from the runtime API directly.

The following runtime architecture diagram shows a Go function running on the provided.al2 runtime. A single process retrieves the function invocation from the runtime API and executes the function code.

Go running on provided.al2

Removing the additional process and RPC hop streamlines the function execution path, resulting in faster invokes. You can also remove the RPC component from the aws-lambda-go package, giving a smaller binary size and faster code loading during cold starts. To remove the RPC dependency, add the lambda.norpc tag to your build command:

GOARCH=amd64 GOOS=linux go build -tags lambda.norpc -o bootstrap main.go

Creating a new Lambda function

Once your deployment package is ready, you can create a new Lambda function using the provided.al2 runtime using the Lambda console:

Creating in the Lambda console

Migrating existing functions

If you have existing Lambda functions that use the go1.x runtime, you can migrate these functions by following these steps:

  1. Recompile your binary using the preceding commands, making sure to name your binary bootstrap.
  2. If you are using the same instruction set architecture, open the runtime settings and switch the runtime to “Provide your own bootstrap on Amazon Linux 2”.
  3. Upload the new version of your binary as a zip file.
    Edit runtime settings

Note: The handler value is not used by the provided.al2 runtime, nor the aws-lambda-go library, and may be set to any value. We recommend the setting the value to bootstrap to help with migrating between go1.x and provided.al2.

To switch instruction set architecture to Graviton (arm64), save your changes, and then re-open the runtimes settings to make the architecture change.

Migrating Go1.x Lambda container images

Lambda allows you to run your Go code as a container image. Customers that are using the go1.x base image for Lambda containers must migrate to the provided.al2 base image. The Lambda documentation includes instructions on how to build and deploy Go functions using the provided.al2 base image.

The following Dockerfile uses a two-stage build to avoid unnecessary layers and files in your final image. The first stage of the process builds the application. This stage installs Go, downloads the dependencies for the code, and compiles the binary. The second stage copies the executable to a new container without the dependencies of the build process.

  1. Create a Dockerfile in your project directory:
    FROM public.ecr.aws/lambda/provided:al2 as build
    # install compiler
    RUN yum install -y golang
    RUN go env -w GOPROXY=direct
    # cache dependencies
    ADD go.mod go.sum ./
    RUN go mod download
    # build
    ADD . .
    RUN go build -tags lambda.norpc -o /main
    # copy artifacts to a clean image
    FROM public.ecr.aws/lambda/provided:al2
    COPY --from=build /main /main
    ENTRYPOINT [ "/main" ]           
    
  2. Build your Docker image with the Docker build command:
    docker build -t hello-world .
  3. Authenticate the Docker CLI to your Amazon ECR registry:
    aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com 
  4. Tag and push your image to the Amazon ECR registry:
    docker tag hello-world:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/hello-world:latest
    
    docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/hello-world:latest

You can now create or update your Go Lambda function to use the new container image.

Changes to tooling

To migrate your functions from the go1.x runtime to the provided.al2 runtime, you must make configuration changes to your build scripts or CI/CD configurations. Here are some common examples.

Makefiles and build scripts

If you use Makefiles or custom build scripts to build Go functions, you must modify to ensure the executable file is named bootstrap when deploying to the provided.al2 runtime.

Here is an example Makefile which compiles the main.go file into an executable called bootstrap in the bin folder. It also creates a zip file, which you can deploy to Lambda using the console or via the AWS CLI.

GOARCH=arm64 GOOS=linux go build -tags lambda.norpc -o ./bin/bootstrap
(cd bin && zip -FS bootstrap.zip bootstrap)

CloudFormation

If you deploy your Lambda functions using AWS CloudFormation templates, change the Handler and Runtime settings under Properties:

Resources:
  Function:
    Type: AWS::Serverless::Function
    Properties:
      Handler: bootstrap
      Runtime: provided.al2
      ... # Other required properties

AWS Serverless Application Model

If you use the AWS Serverless Application Model (AWS SAM) to build and deploy your Go functions, make the same changes to the Handler and Runtime settings as for CloudFormation. You must also add the BuildMethod:

Resources:
  HelloWorldFunction:
    Type: AWS::Serverless::Function
    Metadata:
      BuildMethod: go1.x 
    Properties:
      CodeUri: hello-world/ # folder where your main program resides
      Handler: bootstrap
      Runtime: provided.al2
      Architectures:
        - x86_64

Cloud Development Kit (CDK)

If you use the AWS Cloud Development Kit (AWS CDK), you can compile your Go executable and place it under a folder in your project. Next, specify the location by using awslambda.Code_FromAsset, and AWS CDK packages the binary into a zip file and uploads it.

// Go CDK
awslambda.NewFunction(stack, jsii.String("HelloHandler"), &awslambda.FunctionProps{
    Code:         awslambda.Code_FromAsset(jsii.String("lambda"), nil), //folder where bootstrap executable is located
    Runtime:      awslambda.Runtime_PROVIDED_AL2(),
    Handler:      jsii.String("bootstrap"), // Handler named bootstrap
    Architecture: awslambda.Architecture_ARM_64(),
})

Taking this further, AWS CDK can perform build commands as part of your AWS CDK build process by using the native AWS CDK bundling functionality. With the bundling parameter, AWS CDK can perform steps before staging the files in the cloud assembly. Instead of placing the binary file, place the Go code in a folder and use the Bundling option to compile the code in a Docker container.

This example uses the golang:1.20.1 Docker image. After compilation, AWS CDK creates a zip file with the binary and creates the Lambda function:

// Go CDK
awslambda.NewFunction(stack, jsii.String("HelloHandler"), &awslambda.FunctionProps{
        Code: awslambda.Code_FromAsset(jsii.String("go-lambda"), &awss3assets.AssetOptions{
                Bundling: &awscdk.BundlingOptions{
                        Image: awscdk.DockerImage_FromRegistry(jsii.String("golang:1.20.1")),
                        Command: &[]*string{
                                jsii.String("bash"),
                                jsii.String("-c"),
                                jsii.String("GOCACHE=/tmp go mod tidy && GOCACHE=/tmp GOARCH=arm64 GOOS=linux go build -tags lambda.norpc -o /asset-output/bootstrap"),
                        },
                },
        }),
        Runtime:      awslambda.Runtime_PROVIDED_AL2(),
        Handler:      jsii.String("bootstrap"),
        Architecture: awslambda.Architecture_ARM_64(),
})

Conclusion

Lambda is deprecating the go1.x runtime in line with Amazon Linux 1 end-of-life, scheduled for December 31, 2023. Customers using Go with Lambda should migrate their functions to the provided.al2 runtime. Benefits include support for AWS Graviton2 processors with better price-performance, and a streamlined invoke path with faster performance.

For more serverless learning resources, visit Serverless Land.

Content Repository for Unstructured Data with Multilingual Semantic Search: Part 2

Post Syndicated from Patrik Nagel original https://aws.amazon.com/blogs/architecture/content-repository-for-unstructured-data-with-multilingual-semantic-search-part-2/

Leveraging vast unstructured data poses challenges, particularly for global businesses needing cross-language data search. In Part 1 of this blog series, we built the architectural foundation for the content repository. The key component of Part 1 was the dynamic access control-based logic with a web UI to upload documents.

In Part 2, we extend the content repository with multilingual semantic search capabilities while maintaining the access control logic from Part 1. This allows users to ingest documents in content repository across multiple languages and then run search queries to get reference to semantically similar documents.

Solution overview

Building on the architectural foundation from Part 1, we introduce four new building blocks to extend the search functionality.

Optical character recognition (OCR) workflow: To automatically identify, understand, and extract text from ingested documents, we use Amazon Textract and a sample review dataset of .png format documents (Figure 1). We use Amazon Textract synchronous application programming interfaces (APIs) to capture key-value pairs for the reviewid and reviewBody attributes. Based on your specific requirements, you can choose to capture either the complete extracted text or parts the text.

Sample document for ingestion

Figure 1. Sample document for ingestion

Embedding generation: To capture the semantic relationship between the text, we use a machine learning (ML) model that maps words and sentences to high-dimensional vector embeddings. You can use Amazon SageMaker, a fully-managed ML service, to build, train, and deploy your ML models to production-ready hosted environments. You can also deploy ready-to-use pre-trained models from multiple avenues such as SageMaker JumpStart. For this blog post, we use the open-source pre-trained universal-sentence-encoder-multilingual model from TensorFlow Hub. The model inference endpoint deployed to a SageMaker endpoint generates embeddings for the document text and the search query. Figure 2 is an example of n-dimensional vector that is generated as the output of the reviewBody attribute text provided to the embeddings model.

Sample embedding representation of the value of reviewBody

Figure 2. Sample embedding representation of the value of reviewBody

Embedding ingestion: To make the embeddings searchable for the content repository users, you can use the k-Nearest Neighbor (k-NN) search feature of Amazon OpenSearch Service. The OpenSearch k-NN plugin provides different methods. For this blog post, we use the Approximate k-NN search approach, based on the Hierarchical Navigable Small World (HNSW) algorithm. HNSW uses a hierarchical set of proximity graphs in multiple layers to improve performance when searching large datasets to find the “nearest neighbors” for the search query text embeddings.

Semantic search: We make the search service accessible as an additional backend logic on Amazon API Gateway. Authenticated content repository users send their search query using the frontend to receive the matching documents. The solution maintains end-to-end access control logic by using the user’s enriched Amazon Cognito provided identity (ID) token claim with the department attribute to compare it with the ingested documents.

Technical architecture

The technical architecture includes two parts:

  1. Implementing multilingual semantic search functionality: Describes the processing workflow for the document that the user uploads; makes the document searchable.
  2. Running input search query: Covers the search workflow for the input query; finds and returns the nearest neighbors of the input text query to the user.

Part 1. Implementing multilingual semantic search functionality

Our previous blog post discussed blocks A through D (Figure 3), including user authentication, ID token enrichment, Amazon Simple Storage Service (Amazon S3) object tags for dynamic access control, and document upload to the source S3 bucket. In the following section, we cover blocks E through H. The overall workflow describes how an unstructured document is ingested in the content repository, run through the backend OCR and embeddings generation process and finally the resulting vector embedding are stored in OpenSearch service.

Technical architecture for implementing multi-lingual semantic search functionality

Figure 3. Technical architecture for implementing multilingual semantic search functionality

  1. The OCR workflow extracts text from your uploaded documents.
    • The source S3 bucket sends an event notification to Amazon Simple Queue Service (Amazon SQS).
    • The document transformation AWS Lambda function subscribed to the Amazon SQS queue invokes an Amazon Textract API call to extract the text.
  2. The document transformation Lambda function makes an inference request to the encoder model hosted on SageMaker. In this example, the Lambda function submits the reviewBody attribute to the encoder model to generate the embedding.
  3. The document transformation Lambda function writes an output file in the transformed S3 bucket. The text file consists of:
    • The reviewid and reviewBody attributes extracted from Step 1
    • An additional reviewBody_embeddings attribute from Step 2
      Note: The workflow tags the output file with the same S3 object tags as the source document for downstream access control.
  4. The transformed S3 bucket sends an event notification to invoke the indexing Lambda function.
  5. The indexing Lambda function reads the text file content. Then indexing Lambda function makes an OpenSearch index API call along with source document tag as one of the indexing attributes for access control.

Part 2. Running user-initiated search query

Next, we describe how the user’s request produces query results (Figure 4).

Search query lifecycle

Figure 4. Search query lifecycle

  1. The user enters a search string in the web UI to retrieve relevant documents.
  2. Based on the active sign-in session, the UI passes the user’s ID token to the search endpoint of the API Gateway.
  3. The API Gateway uses Amazon Cognito integration to authorize the search API request.
  4. Once validated, the search API endpoint request invokes the search document Lambda function.
  5. The search document function sends the search query string as the inference request to the encoder model to receive the embedding as the inference response.
  6. The search document function uses the embedding response to build an OpenSearch k-NN search query. The HNSW algorithm is configured with the Lucene engine and its filter option to maintain the access control logic based on the custom department claim from the user’s ID token. The OpenSearch query returns the following to the query embeddings:
    • Top three Approximate k-NN
    • Other attributes, such as reviewid and reviewBody
  7. The workflow sends the relevant query result attributes back to the UI.

Prerequisites

You must have the following prerequisites for this solution:

Walkthrough

Setup

The following steps deploy two AWS CDK stacks into your AWS account:

  • content-repo-search-stack (blog-content-repo-search-stack.ts) creates the environment detailed in Figure 3, except for the SageMaker endpoint, which you create in a spearate step.
  • demo-data-stack (userpool-demo-data-stack.ts) deploys sample users, groups, and role mappings.

To continue setup, use the following commands:

  1. Clone the project Git repository:
    git clone https://github.com/aws-samples/content-repository-with-multilingual-search content-repository
  2. Install the necessary dependencies:
    cd content-repository/backend-cdk 
    npm install
  3. Configure environment variables:
    export CDK_DEFAULT_ACCOUNT=$(aws sts get-caller-identity --query 'Account' --output text)
    export CDK_DEFAULT_REGION=$(aws configure get region)
  4. Bootstrap your account for AWS CDK usage:
    cdk bootstrap aws://$CDK_DEFAULT_ACCOUNT/$CDK_DEFAULT_REGION
  5. Deploy the code to your AWS account:
    cdk deploy --all

The complete stack set-up may take up to 20 minutes.

Creation of SageMaker endpoint

Follow below steps to create the SageMaker endpoint in the same AWS Region where you deployed the AWS CDK stack.

    1. Sign in to the SageMaker console.
    2. In the navigation menu, select Notebook, then Notebook instances.
    3. Choose Create notebook instance.
    4. Under the Notebook instance settings, enter content-repo-notebook as the notebook instance name, and leave other defaults as-is.
    5. Under the Permissions and encryption section (Figure 5), you need to set the IAM role section to the role with the prefix content-repo-search-stack. In case you don’t see this role automatically populated, select it from the drop-down. Leave the rest of the defaults, and choose Create notebook instance.

      Notebook permissions

      Figure 5. Notebook permissions

    6. The notebook creation status changes to Pending before it’s available for use within 3-4 minutes.
    7. Once the notebook is in the Available status, choose Open Jupyter.
    8. Choose the Upload button and upload the create-sagemaker-endpoint.ipynb file in the backend-cdk folder of the root of the blog repository.
    9. Open the create-sagemaker-endpoint.ipynb notebook. Select the option Run All from the Cell menu (Figure 6). This might take up to 10 minutes.

      Run create-sagemaker-endpoint notebook cells

      Figure 6. Run create-sagemaker-endpoint notebook cells

    10. After all the cells have successfully run, verify that the AWS Systems Manager parameter sagemaker-endpoint is updated with the value of the SageMaker endpoint name. An example of value as the output of the cell is in Figure 7. In case you don’t see the output, check if the preceding steps were run correctly.

      SSM parameter updated with SageMaker endpoint

      Figure 7. SSM parameter updated with SageMaker endpoint

    11. Verify in the SageMaker console that the inference endpoint with the prefix tensorflow-inference has been deployed and is set to status InService.
    12. Upload sample data to the content repository:
      • Update the S3_BUCKET_NAME variable in the upload_documents_to_S3.sh script in the root folder of the blog repository with the s3SourceBucketName from the AWS CDK output of the content-repo-search-stack.
      • Run upload_documents_to_S3.sh script to upload 150 sample documents to the content repository. This takes 5-6 minutes. During this process, the uploaded document triggers the workflow described in the Implementing multilingual semantic search functionality.

Using the search service

At this stage, you have deployed all the building blocks for the content repository in your AWS account. Next, as part of the upload sample data to the content repository, you pushed a limited corpus of 150 sample documents (.png format). Each document is in one of the four different languages – English, German, Spanish and French. With the added multilingual search capability, you can query in one language and receive semantically similar results across different languages while maintaining the access control logic.

  1. Access the frontend application:
    • Copy the amplifyHostedAppUrl value of the AWS CDK output from the content-repo-search-stack shown in the terminal.
    • Enter the URL in your web browser to access the frontend application.
    • A temporary page displays until the automated build and deployment of the React application completes after 4-5 minutes.
  2. Sign into the application:
    • The content repository provides two demo users with credentials as part of the demo-data-stack in the AWS CDK output. Copy the password from the terminal associated with the sales-user, which belongs to the sales department.
    • Follow the prompts from the React webpage to sign in with the sales-user and change the temporary password.
  3. Enter search queries and verify results. The search action invokes the workflow described in Running input search query. For example:
    • Enter works well as the search query. Note the multilingual output and the semantically similar results (Figure 8).

      Positive sentiment multi-lingual search result for the sales-user

        Figure 8. Positive sentiment multilingual search result for the sales-user

    • Enter bad quality as the search query. Note the multilingual output and the semantically similar results (Figure 9).

      Negative sentiment multi-lingual search result for the sales-user

      Figure 9. Negative sentiment multi-lingual search result for the sales-user

  4. Sign out as the sales-user with the Log Out button on the webpage.
  5. Sign in using the marketing-user credentials to verify access control:
    • Follow the sign in procedure in step 2 but with the marketing-user.
    • This time with works well as search query, you find different output. This is because the access control only allows marketing-user to search for the documents that belong to the marketing department (Figure 10).

      Positive sentiment multi-lingual search result for the marketing-user

      Figure 10. Positive sentiment multilingual search result for the marketing-user

Cleanup

In the backend-cdk subdirectory of the cloned repository, delete the deployed resources: cdk destroy --all.

Additionally, you need to access the Amazon SageMaker console to delete the SageMaker endpoint and notebook instance created as part of the Walkthrough setup section.

Conclusion

In this blog, we enriched the content repository with multi-lingual semantic search features while maintaining the access control fundamentals that we implemented in Part 1. The building blocks of the semantic search for unstructured documents—Amazon Textract, Amazon SageMaker, and Amazon OpenSearch Service—set a foundation for you to customize and enhance the search capabilities for your specific use case. For example, you can leverage the fast developments in Large Language Models (LLM) to enhance the semantic search experience. You can replace the encoder model with an LLM capable of generating multilingual embeddings while still maintaining the OpenSearch service to store and index data and perform vector search.

Temporal data lake architecture for benchmark and indices analytics

Post Syndicated from Krishna Gogineni original https://aws.amazon.com/blogs/architecture/temporal-data-lake-architecture-for-benchmark-and-indices-analytics/

Financial trading houses and stock exchanges generate enormous volumes of data in near real-time, making it difficult to perform bi-temporal calculations that yield accurate results. Achieving this requires a processing architecture that can handle large volumes of data during peak bursts, meet strict latency requirements, and scale according to incoming volumes.

In this post, we’ll describe a scenario for an industry leader in the financial services sector and explain how AWS services are used for bi-temporal processing with state management and scale based on variable workloads during the day, all while meeting strict service-level agreement (SLA) requirements.

Problem statement

To design and implement a fully temporal transactional data lake with the repeatable read isolation level for queries is a challenge, particularly with burst events that need the overall architecture to scale accordingly. The data store in the overall architecture needs to record the value history of data at different times, which is especially important for financial data. Financial data can include corporate actions, annual or quarterly reports, or fixed-income securities, like bonds that have variable rates. It’s crucial to be able to correct data inaccuracies during the reporting period.

The example customer seeks a data processing platform architecture to dynamically scale based on the workloads with a capacity of processing 150 million records under 5 minutes. Their platform should be capable of meeting the end-to-end SLA of 15 minutes, from ingestion to reporting, with lowest total cost of ownership. Additionally, managing bi-temporal data requires a database that has critical features, such as ACID (atomicity, consistency, isolation, durability) compliance, time-travel capability, full-schema evolution, partition layout and evolution, rollback to prior versions, and SQL-like query experience.

Solution overview

The solution architecture key building blocks are Amazon Kinesis Data Streams for streaming data, Amazon Kinesis Data Analytics with Apache Flink as processing engine, Flink’s RocksDB for state management, and Apache Iceberg on Amazon Simple Storage Service (Amazon S3) as the storage engine (Figure 1).

End-to-end data-processing architecture

Figure 1. End-to-end data-processing architecture

Data processing

Here’s how it works:

  • A publisher application receives the data from the source systems and publishes data into Kinesis Data Streams using a well-defined JSON format structure.
  • Kinesis Data Streams holds the data for a duration that is configurable so data is not lost and can auto scale based on the data volume ingested.
  • Kinesis Data Analytics runs an Apache Flink application, with state management (RocksDB), to handle bi-temporal calculations. The Apache Flink application consumes data from Kinesis Data Streams and performs the following computations:
    • Transforms the JSON stream into a row-type record, compatible with a SQL table-like structure, resolving nesting and parent–child relationships present within the stream
    • Checks whether the record has already an existing state in in-memory RocksDB or disk attached to Kinesis Data Analytics computational node to avoid read latency from the database, which is critical for meeting the performance requirements
    • Performs bi-temporal calculations and creates the resultant records in an in-memory data structure before invoking the Apache Iceberg sink operator
    • The Apache Flink application sink operator appends the temporal states, expressed as records into existing Apache Iceberg data store. This will comply with key principles of time series data, which is immutable, and the ability to time-travel along with ACID compliance, schema evolution, and partition evolution
  • Kinesis Data Analytics is resilient and provides a no-data-loss capability, with features like periodic checkpoints and savepoints. They are used to store the state management in a secure Amazon S3 location that can be accessed outside of Kinesis Data Analytics. This savepoints mechanism can be used to programmatically to scale the cluster size based on the workloads using time-driven scheduling and AWS Lambda functions.
  • If the time-to-live feature of RocksDB is implemented, old records are stored in Apache Iceberg on Amazon S3. When performing temporal calculations, if the state is not found in memory, data is read from Apache Iceberg into RocksDB and the processing is completed. However, this step is optional and can be circumvented if the Kinesis Data Analytics cluster is initialized with right number of Kinesis processing units to hold the historical information, as per requirements.
  • Because the data is stored in an Apache Iceberg table format in Amazon S3, data is queried using Trino, which supports Apache Iceberg table format.
  • The end user queries data using any SQL tool that supports the Trino query engine.

Apache Iceberg maintenance jobs, such as data compaction, expire snapshot, delete orphan files, can be launched using Amazon Athena to optimize performance out of Apache Iceberg data store. Details of each processing step performed in Apache Flink application are captured using Amazon CloudWatch, which logs all the events.

Scalability

Amazon EventBridge scheduler invokes a Lambda function to scale the Kinesis Data Analytics. Kinesis Data Analytics has a short outage during rescaling that is proportional to the amount of data stored in RocksDB, which is why a state management strategy is necessary for the proper operation of the system.

Figure 2 shows the scaling process, which depicts:

  • Before peak load: The Kinesis Data Analytics cluster is processing off-peak records with minimum configuration before the peak load. A scheduled event is launched from EventBridge that invokes a Lambda function, which shuts down the cluster using the savepoint mechanism and scales up the Kinesis Data Analytics cluster to required Kinesis processing units.
  • During peak load: When the peak data burst happens, the Kinesis Data Analytics cluster is ready to handle the volume of data from Kinesis Data Stream, and processes it within the SLA of 5 minutes.
  • After peak load: A scheduled event from EventBridge invokes a Lambda function to scale down the Kinesis Data Analytics cluster to the minimum configuration that holds the required state for the entire volume of records.
Cluster scaling before, during, and after peak data volume processing

Figure 2. Cluster scaling before, during, and after peak data volume processing

Performance insights

With the discussed architecture, we want to demonstrate that the we are able to meet the SLAs, in terms of performance and processing times. We have taken a subset of benchmarks and indices data and processed the same with the end-to-end architecture. During the process, we observed some very interesting findings, which we would like to share.

Processing time for Apache Iceberg Upsert vs Append operations: During our tests, we expected Upsert operation to be faster than append. But on the contrary, we noticed that Append operations were faster compared to Upsert even though more computations are performed in the Apache Flink application. In our test with 3,500,000 records, Append operation took 1556 seconds while Upsert took 1675 seconds to process the data (Figure 3).

Processing times for Upsert vs. Append

Figure 3. Processing times for Upsert vs. Append

Compute consumption for Apache Iceberg Upsert vs. Append operations: Comparing the compute consumption for 10,000,000 records, we noticed that Append operation was able to process the data in the same amount of time as Upsert operation but with less compute resources. In our tests, we have noted that Append operation only consumed 64 Kinesis processing units, whereas Upsert consumed 78 Kinesis processing units (Figure 4).

Comparing consumption for Upsert vs. Append

Figure 4. Comparing consumption for Upsert vs. Append

Scalability vs performance: To achieve the desired data processing performance, we need a specific configuration of Kinesis processing units, Kinesis Data Streams, and Iceberg parallelism. In our test with the data that we chose, we started with four Kinesis processing units and four Kinesis data streams for data processing. We observed an 80% performance improvement in data processing with 16 Kinesis data processing units. An additional 6% performance improvement was demonstrated when we scaled to 32 Kinesis processing units. When we increased the Kinesis data streams to 16, we observed an additional 2% performance improvement (Figure 5).

Scalability vs. performance

Figure 5. Scalability vs. performance

Data volume processing times for Upsert vs. Append: For this test, we started with 350,000 records of data. When we increased data volume to 3.5M records, we observed that Append performing better than Upsert, demonstrating a five-fold increase in processing time (Figure 6).

Data volume processing times for Upsert vs. Append

Figure 6. Data volume processing times for Upsert vs. Append

Conclusion

The architecture we explored today scales based on the data-volume requirements of the customer and is capable of meeting the end-to-end SLA of 15 minutes, with a potential lowered total cost of ownership. Additionally, the solution is capable of handling high-volume, bi-temporal computations with ACID compliance, time travel, full-schema evolution, partition layout evolution, rollback to prior versions and SQL-like query experience.

Further reading

Automate secure access to Amazon MWAA environments using existing OpenID Connect single-sign-on authentication and authorization

Post Syndicated from Ajay Vohra original https://aws.amazon.com/blogs/big-data/automate-secure-access-to-amazon-mwaa-environments-using-existing-openid-connect-single-sign-on-authentication-and-authorization/

Customers use Amazon Managed Workflows for Apache Airflow (Amazon MWAA) to run Apache Airflow at scale in the cloud. They want to use their existing login solutions developed using OpenID Connect (OIDC) providers with Amazon MWAA; this allows them to provide a uniform authentication and single sign-on (SSO) experience using their adopted identity providers (IdP) across AWS services. For ease of use for end-users of Amazon MWAA, organizations configure a custom domain endpoint to their Apache Airflow UI endpoint. For teams operating and managing multiple Amazon MWAA environments, securing and customizing each environment is a repetitive but necessary task. Automation through infrastructure as code (IaC) can alleviate this heavy lifting to achieve consistency at scale.

This post describes how you can integrate your organization’s existing OIDC-based IdPs with Amazon MWAA to grant secure access to your existing Amazon MWAA environments. Furthermore, you can use the solution to provision new Amazon MWAA environments with the built-in OIDC-based IdP integrations. This approach allows you to securely provide access to your new or existing Amazon MWAA environments without requiring AWS credentials for end-users.

Overview of Amazon MWAA environments

Managing multiple user names and passwords can be difficult—this is where SSO authentication and authorization comes in. OIDC is a widely used standard for SSO, and it’s possible to use OIDC SSO authentication and authorization to access Apache Airflow UI across multiple Amazon MWAA environments.

When you provision an Amazon MWAA environment, you can choose public or private Apache Airflow UI access mode. Private access mode is typically used by customers that require restricting access from only within their virtual private cloud (VPC). When you use public access mode, the access to the Apache Airflow UI is available from the internet, in the same way as an AWS Management Console page. Internet access is needed when access is required outside of a corporate network.

Regardless of the access mode, authorization to the Apache Airflow UI in Amazon MWAA is integrated with AWS Identity and Access Management (IAM). All requests made to the Apache Airflow UI need to have valid AWS session credentials with an assumed IAM role that has permissions to access the corresponding Apache Airflow environment. For more details on the permissions policies needed to access the Apache Airflow UI, refer to Apache Airflow UI access policy: AmazonMWAAWebServerAccess.

Different user personas such as developers, data scientists, system operators, or architects in your organization may need access to the Apache Airflow UI. In some organizations, not all employees have access to the AWS console. It’s fairly common that employees who don’t have AWS credentials may also need access to the Apache Airflow UI that Amazon MWAA exposes.

In addition, many organizations have multiple Amazon MWAA environments. It’s common to have an Amazon MWAA environment setup per application or team. Each of these Amazon MWAA environments can be run in different deployment environments like development, staging, and production. For large organizations, you can easily envision a scenario where there is a need to manage multiple Amazon MWAA environments. Organizations need to provide secure access to all of their Amazon MWAA environments using their existing OIDC provider.

Solution Overview

The solution architecture integrates an existing OIDC provider to provide authentication for accessing the Amazon MWAA Apache Airflow UI. This allows users to log in to the Apache Airflow UI using their OIDC credentials. From a system perspective, this means that Amazon MWAA can integrate with an existing OIDC provider rather than having to create and manage an isolated user authentication and authorization through IAM internally.

The solution architecture relies on an Application Load Balancer (ALB) setup with a fully qualified domain name (FQDN) with public (internet) or private access. This ALB provides SSO access to multiple Amazon MWAA environments. The user-agent (web browser) call flow for accessing an Apache Airflow UI console to the target Amazon MWAA environment includes the following steps:

  1. The user-agent resolves the ALB domain name from the Domain Name System (DNS) resolver.
  2. The user-agent sends a login request to the ALB path /aws_mwaa/aws-console-sso with a set of query parameters populated. The request uses the required parameters mwaa_env and rbac_role as placeholders for the target Amazon MWAA environment and the Apache Airflow role-based access control (RBAC) role, respectively.
  3. Once it receives the request, the ALB redirects the user-agent to the OIDC IdP authentication endpoint. The user-agent authenticates with the OIDC IdP with the existing user name and password.
  4. If user authentication is successful, the OIDC IdP redirects the user-agent back to the configured ALB with a redirect_url with the authorization code included in the URL.
  5. The ALB uses the authorization code received to obtain the access_token and OpenID JWT token with openid email scope from the OIDC IdP. It then forwards the login request to the Amazon MWAA authenticator AWS Lambda function with the JWT token included in the request header in the x-amzn-oidc-data parameter.
  6. The Lambda function verifies the JWT token found in the request header using ALB public keys. The function subsequently authorizes the authenticated user for the requested mwaa_env and rbac_role stored in an Amazon DynamoDB table. The use of DynamoDB for authorization here is optional; the Lambda code function is_allowed can be customized to use other authorization mechanisms.
  7. The Amazon MWAA authenticator Lambda function redirects the user-agent to the Apache Airflow UI console in the requested Amazon MWAA environment with the login token in the redirect URL. Additionally, the function provides the logout functionality.

Amazon MWAA public network access mode

For the Amazon MWAA environments configured with public access mode, the user agent uses public routing over the internet to connect to the ALB hosted in a public subnet.

The following diagram illustrates the solution architecture with a numbered call flow sequence for internet network reachability.

Amazon MWAA public network access mode architecture diagram

Amazon MWAA private network access mode

For Amazon MWAA environments configured with private access mode, the user agent uses private routing over a dedicated AWS Direct Connect or AWS Client VPN to connect to the ALB hosted in a private subnet.

The following diagram shows the solution architecture for Client VPN network reachability.

Amazon MWAA private network access mode architecture diagram

Automation through infrastructure as code

To make setting up this solution easier, we have released a pre-built solution that automates the tasks involved. The solution has been built using the AWS Cloud Development Kit (AWS CDK) using the Python programming language. The solution is available in our GitHub repository and helps you achieve the following:

  • Set up a secure ALB to provide OIDC-based SSO to your existing Amazon MWAA environment with default Apache Airflow Admin role-based access.
  • Create new Amazon MWAA environments along with an ALB and an authenticator Lambda function that provides OIDC-based SSO support. With the customization provided, you can define the number of Amazon MWAA environments to create. Additionally, you can customize the type of Amazon MWAA environments created, including defining the hosting VPC configuration, environment name, Apache Airflow UI access mode, environment class, auto scaling, and logging configurations.

The solution offers a number of customization options, which can be specified in the cdk.context.json file. Follow the setup instructions to complete the integration to your existing Amazon MWAA environments or create new Amazon MWAA environments with SSO enabled. The setup process creates an ALB with an HTTPS listener that provides the user access endpoint. You have the option to define the type of ALB that you need. You can define whether your ALB will be public facing (internet accessible) or private facing (only accessible within the VPC). It is recommended to use a private ALB with your new or existing Amazon MWAA environments configured using private UI access mode.

The following sections describe the specific implementation steps and customization options for each use case.

Prerequisites

Before you continue with the installation steps, make sure you have completed all prerequisites and run the setup-venv script as outlined within the README.md file of the GitHub repository.

Integrate to a single existing Amazon MWAA environment

If you’re integrating with a single existing Amazon MWAA environment, follow the guides in the Quick start section. You must specify the same ALB VPC as that of your existing Amazon MWAA VPC. You can specify the default Apache Airflow RBAC role that all users will assume. The ALB with an HTTPS listener is configured within your existing Amazon MWAA VPC.

Integrate to multiple existing Amazon MWAA environments

To connect to multiple existing Amazon MWAA environments, specify only the Amazon MWAA environment name in the JSON file. The setup process will create a new VPC with subnets hosting the ALB and the listener. You must define the CIDR range for this ALB VPC such that it doesn’t overlap with the VPC CIDR range of your existing Amazon MWAA VPCs.

When the setup steps are complete, implement the post-deployment configuration steps. This includes adding the ALB CNAME record to the Amazon Route 53 DNS domain.

For integrating with Amazon MWAA environments configured using private access mode, there are additional steps that need to be configured. These include configuring VPC peering and subnet routes between the new ALB VPC and the existing Amazon MWAA VPC. Additionally, you need to configure network connectivity from your user-agent to the private ALB endpoint resolved by your DNS domain.

Create new Amazon MWAA environments

You can configure the new Amazon MWAA environments you want to provision through this solution. The cdk.context.json file defines a dictionary entry in the MwaaEnvironments array. Configure the details that you need for each of the Amazon MWAA environments. The setup process creates an ALB VPC, ALB with an HTTPS listener, Lambda authorizer function, DynamoDB table, and respective Amazon MWAA VPCs and Amazon MWAA environments in them. Furthermore, it creates the VPC peering connection between the ALB VPC and the Amazon MWAA VPC.

If you want to create Amazon MWAA environments with private access mode, the ALB VPC CIDR range specified must not overlap with the Amazon MWAA VPC CIDR range. This is required for the automatic peering connection to succeed. It can take between 20–30 minutes for each Amazon MWAA environment to finish creating.

When the environment creation processes are complete, run the post-deployment configuration steps. One of the steps here is to add authorization records to the created DynamoDB table for your users. You need to define the Apache Airflow rbac_role for each of your end-users, which the Lambda authorizer function matches to provide the requisite access.

Verify access

Once you’ve completed with the post-deployment steps, you can log in to the URL using your ALB FQDN. For example, If your ALB FQDN is alb-sso-mwaa.example.com, you can log in to your target Amazon MWAA environment, named Env1, assuming a specific Apache Airflow RBAC role (such as Admin), using the following URL: https://alb-sso-mwaa.example.com/aws_mwaa/aws-console-sso?mwaa_env=Env1&rbac_role=Admin. For the Amazon MWAA environments that this solution created, you need to have appropriate Apache Airflow rbac_role entries in your DynamoDB table.

The solution also provides a logout feature. To log out from an Apache Airflow console, use the normal Apache Airflow console logout. To log out from the ALB, you can, for example, use the URL https://alb-sso-mwaa.example.com/logout.

Clean up

Follow the readme documented steps in the section Destroy CDK stacks in the GitHub repo, which shows how to clean up the artifacts created via the AWS CDK deployments. Remember to revert any manual configurations, like VPC peering connections, that you might have made after the deployments.

Conclusion

This post provided a solution to integrate your organization’s OIDC-based IdPs with Amazon MWAA to grant secure access to multiple Amazon MWAA environments. We walked through the solution that solves this problem using infrastructure as code. This solution allows different end-user personas in your organization to access the Amazon MWAA Apache Airflow UI using OIDC SSO.

To use the solution for your own environments, refer to Application load balancer single-sign-on for Amazon MWAA. For additional code examples on Amazon MWAA, refer to Amazon MWAA code examples.


About the Authors

Ajay Vohra is a Principal Prototyping Architect specializing in perception machine learning for autonomous vehicle development. Prior to Amazon, Ajay worked in the area of massively parallel grid-computing for financial risk modeling.

Jaswanth Kumar is a customer-obsessed Cloud Application Architect at AWS in NY. Jaswanth excels in application refactoring and migration, with expertise in containers and serverless solutions, coupled with a Masters Degree in Applied Computer Science.

Aneel Murari is a Sr. Serverless Specialist Solution Architect at AWS based in the Washington, D.C. area. He has over 18 years of software development and architecture experience and holds a graduate degree in Computer Science. Aneel helps AWS customers orchestrate their workflows on Amazon Managed Apache Airflow (MWAA) in a secure, cost effective and performance optimized manner.

Parnab Basak is a Solutions Architect and a Serverless Specialist at AWS. He specializes in creating new solutions that are cloud native using modern software development practices like serverless, DevOps, and analytics. Parnab works closely in the analytics and integration services space helping customers adopt AWS services for their workflow orchestration needs.

AWS Week in Review – Updates on Amazon FSx for NetApp ONTAP, AWS Lambda, eksctl, Karpetner, and More – July 17, 2023

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/aws-week-in-review-updates-on-amazon-fsx-for-netapp-ontap-aws-lambda-eksctl-karpetner-and-more-july-17-2023/

The Data Centered: Eastern Oregon, a five-part mini-documentary series looking at the real-life impact of the more than $15 billion investment AWS has made in the local community, and how the company supports jobs, generates economic growth, provides skills training and education, and unlocks opportunities for local businesses suppliers.

Last week, I watched a new episode introducing the Data Center Technician training program offered by AWS to train people with little or no previous technical experience in the skills they need to work in data centers and other information technology (IT) roles. This video reminded me of my first days of cabling and transporting servers in data centers. Remember, there are still people behind cloud computing.

Last Week’s Launches
Here are some launches that got my attention:

Amazon FSx for NetApp ONTAP Updates – Jeff Barr introduced Amazon FSx for NetApp ONTAP support for SnapLock, an ONTAP feature that gives you the power to create volumes that provide write once read many (WORM) functionality for regulatory compliance and ransomware protection. In addition, FSx for NetApp ONTAP now supports IPSec encryption of data in transit and two additional monitoring and troubleshooting capabilities that you can use to monitor file system events and diagnose network connectivity.

AWS Lambda detects and stops recursive loops in Lambda functions – In certain scenarios, due to resource misconfiguration or code defects, a processed event might be sent back to the same service or resource that invoked the Lambda function. This can cause an unintended recursive loop and result in unintended usage and costs for customers. With this launch, Lambda will stop recursive invocations between Amazon SQS, Lambda, and Amazon SNS after 16 recursive calls. For more information, refer to our documentation or the launch blog post.

Email notification

Amazon CloudFront supports for 3072-bit RSA certificates – You can now associate their 3072-bit RSA certificates with CloudFront distributions to enhance communication security between clients and CloudFront edge locations. To get started, associate a 3072-bit RSA certificate with your CloudFront distribution using console or APIs. There are no additional fees associated with this feature. For more information, please refer to the CloudFront Developer Guide.

Running GitHub Actions with AWS CodeBuild – Two weeks ago, AWS CodeBuild started to support GitHub Actions. You can now define GitHub Actions steps directly in the BuildSpec and run them alongside CodeBuild commands. Last week, the AWS DevOps Blog published the blog post about using the Liquibase GitHub Action for deploying changes to an Amazon Aurora database in a private subnet. You can learn how to integrate AWS CodeBuild and nearly 20,000 GitHub Actions developed by the open source community.

CodeBuild configuration showing the GitHub repository URL

Amazon DynamoDB local version 2.0 – You can develop and test applications by running Amazon DynamoDB local in your local development environment without incurring any additional costs. The new 2.0 version allows Java developers to use DynamoDB local to work with Spring Boot 3 and frameworks such as Spring Framework 6 and Micronaut Framework 4 to build modernized, simplified, and lightweight cloud-native applications.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Open Source Updates
Last week, we introduced new open source projects and significant roadmap contributions to the Jupyter community.

New joint maintainership between Weaveworks and AWS for eksctl – Now the eksctl open source project has been moved from the Weaveworks GitHub organization to a new top level GitHub organization—eksctl-io—that will be jointly maintained by Weaveworks and AWS moving forward. The eksctl project can now be found on GitHub.

Karpenter now supports Windows containers – Karpenter is an open source flexible, high-performance Kubernetes node provisioning and management solution that you can use to quickly scale Amazon EKS clusters. With the launch of version 0.29.0, Karpenter extends the automated node provisioning support to Windows containers running on EKS. Read this blog post for a step-by-step guide on how to get started with Karpenter for Windows node groups.

Updates in Amazon Aurora and Amazon OpenSearch Service – Following the announcement of updates to the PostgreSQL database in May by the open source community, we’ve updated Amazon Aurora PostgreSQL-Compatible Edition to support PostgreSQL 15.3, 14.8, 13.11, 12.15, and 11.20. These releases contain product improvements and bug fixes made by the PostgreSQL community, along with Aurora-specific improvements. You can also run OpenSearch version 2.7 in Amazon OpenSearch Service. With OpenSearch 2.7 (also released in May), we’ve made several improvements to observability, security analytics, index management, and geospatial capabilities in OpenSearch Service.

To learn about weekly updates for open source at AWS, check out the latest AWS open source newsletter by Ricardo.

Upcoming AWS Events
Check your calendars and sign up for these AWS events:

AWS Storage Day on August 9 – Join a one-day virtual event that will help you to better understand AWS storage services and make the most of your data. Register today.

AWS Global Summits – Sign up for the AWS Summit closest to your city: Hong Kong (July 20), New York City (July 26), Taiwan (August 2-3), São Paulo (August 3), and Mexico City (August 30).

AWS Community Days – Join a community-led conference run by AWS user group leaders in your region: Malaysia (July 22), Philippines (July 29-30), Colombia (August 12), and West Africa (August 19).

AWS re:Invent 2023 – Join us to hear the latest from AWS, learn from experts, and connect with the global cloud community. Registration is now open.

You can browse all upcoming AWS-led in-person and virtual events, and developer-focused events such as AWS DevDay.

Take the AWS Blog Customer Survey
We’re focused on improving our content to provide a better customer experience, and we need your feedback to do so. Take our survey to share insights regarding your experience on the AWS Blog.

This survey is hosted by an external company. AWS handles your information as described in the AWS Privacy Notice. AWS will own the data gathered via this survey and will not share the information collected with survey respondents.

That’s all for this week. Check back next Monday for another Week in Review!

Channy

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

IBM Consulting creates innovative AWS solutions in French Hackathon

Post Syndicated from Diego Colombatto original https://aws.amazon.com/blogs/architecture/ibm-consulting-creates-innovative-aws-solutions-in-french-hackathon/

In March 2023, IBM Consulting delivered an Innovation Hackathon in France, aimed at designing and building new innovative solutions for real customer use cases using the AWS Cloud.

In this post, we briefly explore six of the solutions considered and demonstrate the AWS architectures created and implemented during the Hackathon.

Hackathon solutions

Solution 1: Optimize digital channels monitoring and management for Marketing

Monitoring Marketing campaign impact can require a lot of effort, such as customers and competitors’ reactions on digital media channels. Digital campaign managers need this data to evaluate customer segment penetration and overall campaign effectiveness. Information can be collected via digital-channel API integrations or on the digital channel user interface (UI): digital-channel API integrations require frequent maintenance, while UI data collection can be labor-intensive.

On the AWS Cloud, IBM designed an augmented digital campaign manager solution, to assist digital campaign managers with digital-channel monitoring and management. This solution monitors social media APIs and, when APIs change, automatically updates the API integration, ensuring accurate information collection (Figure 1).

Optimize digital channels monitoring and management for Marketing

Figure 1. Optimize digital channels monitoring and management for Marketing

  1. Amazon Simple Storage Service (Amazon S3) and AWS Lambda are used to garner new digital estates, such as new social media APIs, and assess data quality.
  2. Amazon Kinesis Data Streams is used to decouple data ingestion from data query and storage.
  3. Lambda retrieves the required information from Amazon DynamoDB, like the most relevant brands; natural language processing (NLP) is applied to retrieved data, like URL, bio, about, verification status.
  4. Amazon S3 and Amazon CloudFront are used to present a dashboard where end-users can check, enrich, and validate collected data.
  5. When graph API calls detect an error/change, Lambda checks API documentation to update/correct the API call.
  6. A new Lambda function is generated, with updated API call.

Solution 2: 4th party logistics consulting service for a greener supply chain

Logistics companies have a wealth of trip data, both first- and third-party, and can leverage these data to provide new customer services, such as options for trips booking with optimized carbon footprint, duration, or costs.

IBM designed an AWS solution (Figure 2) enabling the customer to book goods transport by selecting from different route options, combining transport modes, selecting departure-location, arrival, cargo weight and carbon emissions. Proposed options include the greenest, fastest, and cheapest routes. Additionally, the user can provide financial and time constraints.

Optimized transport booking architecture

Figure 2. Optimized transport booking architecture

  1. User connects to web-app UI, hosted on Amazon S3.
  2. Amazon API Gateway receives user requests from web app; requests are forwarded to Lambda.
  3. Lambda calculates the best trip options based on the user’s prerequisites, such as carbon emissions.
  4. Lambda estimates carbon emissions; estimates are combined with trip options at Step 3.
  5. Amazon Neptune graph database is used to efficiently store and query trip data.
  6. Different Lambda instances are used to ingest data from on-premises data sources and send customer bookings through the customer ordering system.

Solution 3: Purchase order as a service

In the context of vendor-managed inventory and vendor-managed replenishment, inventory and logistics companies want to check on warehouse stock levels to identify the best available options for goods transport. Their objective is to optimize the availability of warehouse stock for order fulfillment; therefore, when a purchase order (PO) is received, required goods are identified as available in the correct warehouse, enabling swift delivery with minimal lead time and costs.

IBM designed an AWS PO as a service solution (Figure 3), using warehouse data to forecast future customer’s POs. Based on this forecast, the solution plans and optimizes warehouse goods availability and, hence, logistics required for the PO fulfillment.

Purchase order as a service AWS solution

Figure 3. Purchase order as a service AWS solution

  1. AWS Amplify provides web-mobile UI where users can set constraints (such as warehouse capacity, minimum/maximum capacity) and check: warehouses’ states, POs in progress. Additionally, UI proposes possible optimized POs, which are automatically generated by the solution. If the user accepts one of these solution-generated POs, the user will benefit from optimized delivery time, costs and carbon-footprint.
  2. Lambda receives Amazon Forecast inferences and reads/writes PO information on Amazon DynamoDB.
  3. Forecast provides inferences regarding the most probable future POs. Forecast uses POs, warehouse data, and goods delivery data to automatically train a machine learning (ML) model that is used to generate forecast inferences.
  4. Amazon DynamoDB stores PO and warehouse information.
  5. Lambda pushes PO, warehouse, and goods delivery data from Amazon DynamoDB into Amazon S3. These data are used in the Forecast ML-model re-train, to ensure high quality forecasting inferences.

Solution 4: Optimize environmental impact associated with engineers’ interventions for customer fiber connections

Telco companies that provide end-users’ internet connections need engineers executing field tasks, like deploying, activating, and repairing subscribers’ lines. In this scenario, it’s important to identify the most efficient engineers’ itinerary.

IBM designed an AWS solution that automatically generates engineers’ itineraries that consider criteria such as mileage, carbon-emission generation, and electric-/thermal-vehicle availability.

The solution (Figure 4) provides:

  • Customer management teams with a mobile dashboard showing carbon-emissions estimates for all engineers’ journeys, both in-progress and planned
  • Engineers with a mobile application including an optimized itinerary, trip updates based on real time traffic, and unexpected events
AWS telco solution for greener customer service

Figure 4. AWS telco solution for greener customer service

  1. Management team and engineers connect to web/mobile application, respectively. Amazon Cognito provides authentication and authorization, Amazon S3 stores application static content, and API Gateway receives and forwards API requests.
  2. AWS Step Functions implements different workflows. Application logic is implemented in Lambda, which connects to DynamoDB to get trip data (current route and driver location); Amazon Location Service provides itineraries, and Amazon SageMaker ML model implements itinerary optimization engine.
  3. Independently from online users, trip data are periodically sent to API Gateway and stored in Amazon S3.
  4. SageMaker notebook periodically uses Amazon S3 data to re-train the trip optimization ML model with updated data.

Solution 5: Improve the effectiveness of customer SAP level 1 support by reducing response times for common information requests

Companies using SAP usually provide first-level support to their internal SAP users. SAP users engage the support (usually via ticketing system) to ask for help when facing SAP issues or to request additional information. A high number of information requests requires significant effort to retrieve and provide the available information on resources like SAP notes/documentation or similar support requests.

IBM designed an AWS solution (Figure 5), based on support request information, that can automatically provide a short list of most probable solutions with a confidence score.

SAP customer support solution

Figure 5. SAP customer support solution

  1. Lambda receives ticket information, such as ticket number, business service, and description.
  2. Lambda processes ticket data and Amazon Translate translates text into country native-language and English.
  3. SageMaker ML model receives the question and provides the inference.
  4. If the inference has a high confidence score, Lambda provides it immediately as output.
  5. If the inference has a low confidence score, Amazon Kendra receives the question, searches automatically through indexed company information and provides the best answer available. Lambda then provides the answer as output.

Solution 6: Improve contact center customer experience providing faster and more accurate customer support

Insured customers often interact with insurer companies using contact centers, requesting information and services regarding their insurance policies.

IBM designed an AWS solution improving end-customer experience and contact center agent efficiency by providing automated customer-agent call/chat summarization. This enables:

  • The agent to quickly recall the customer need in following interactions
  • Contact center supervisor to quickly understand the objective of each case (intervening if necessary)
  • Insured customers to quickly have the information required, without repeating information already provided
Improving contact center customer experience

Figure 6. Improving contact center customer experience

Summarization capability is provided by generative AI, leveraging large language models (LLM) on SageMaker.

  1. Pretrained LLM model from Hugging Face is stored on Amazon S3.
  2. LLM model is fine-tuned and trained using Amazon SageMaker.
  3. LLM model is made available as SageMaker API endpoint, ready to provide inferences.
  4. Insured user contact customer support; the user request goes through voice/chatbot, then reaches Amazon Connect.
  5. Lambda queries the LLM model. The inference is provided by LLM and it’s sent to an Amazon Connect instance, where inference is enriched with knowledge-based search, using Amazon Connect Wisdom.
  6. If the user–agent conversation was a voice interaction (like a phone call), then the call recording is transcribed using Amazon Transcribe. Then, Lambda is called for summarization.

Conclusion

In this blog post, we have explored how IBM Consulting delivered an Innovation Hackathon in France. During the Hackathon, IBM worked backward from real customer use cases, designing and building innovative solutions using AWS services.

Level up your React app with Amazon QuickSight: How to embed your dashboard for anonymous access

Post Syndicated from Adrianna Kurzela original https://aws.amazon.com/blogs/big-data/level-up-your-react-app-with-amazon-quicksight-how-to-embed-your-dashboard-for-anonymous-access/

Using embedded analytics from Amazon QuickSight can simplify the process of equipping your application with functional visualizations without any complex development. There are multiple ways to embed QuickSight dashboards into application. In this post, we look at how it can be done using React and the Amazon QuickSight Embedding SDK.

Dashboard consumers often don’t have a user assigned to their AWS account and therefore lack access to the dashboard. To enable them to consume data, the dashboard needs to be accessible for anonymous users. Let’s look at the steps required to enable an unauthenticated user to view your QuickSight dashboard in your React application.

Solution overview

Our solution uses the following key services:

After loading the web page on the browser, the browser makes a call to API Gateway, which invokes a Lambda function that calls the QuickSight API to generate a dashboard URL for an anonymous user. The Lambda function needs to assume an IAM role with the required permissions. The following diagram shows an overview of the architecture.

Architecture

Prerequisites

You must have the following prerequisites:

Set up permissions for unauthenticated viewers

In your account, create an IAM policy that your application will assume on behalf of the viewer:

  1. On the IAM console, choose Policies in the navigation pane.
  2. Choose Create policy.
  3. On the JSON tab, enter the following policy code:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "quicksight:GenerateEmbedUrlForAnonymousUser"
            ],
            "Resource": [
                "arn:aws:quicksight:*:*:namespace/default",
				"arn:aws:quicksight:*:*:dashboard/<YOUR_DASHBOARD_ID1>",
				"arn:aws:quicksight:*:*:dashboard/<YOUR_DASHBOARD_ID2>"
            ],
            "Effect": "Allow"
        },
        {
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": "*",
            "Effect": "Allow"
        }
    ]
}

Make sure to change the value of <YOUR_DASHBOARD_ID> to the value of the dashboard ID. Note this ID to use in a later step as well.

For the second statement object with logs, permissions are optional. It allows you to create a log group with the specified name, create a log stream for the specified log group, and upload a batch of log events to the specified log stream.

In this policy, we allow the user to perform the GenerateEmbedUrlForAnonymousUser action on the dashboard ID within the list of dashboard IDs inserted in the placeholder.

  1. Enter a name for your policy (for example, AnonymousEmbedPolicy) and choose Create policy.

Next, we create a role and attach this policy to the role.

  1. Choose Roles in the navigation pane, then choose Create role.

Identity and access console

  1. Choose Lambda for the trusted entity.
  2. Search for and select AnonymousEmbedPolicy, then choose Next.
  3. Enter a name for your role, such as AnonymousEmbedRole.
  4. Make sure the policy name is included in the Add permissions section.
  5. Finish creating your role.

You have just created the AnonymousEmbedRole execution role. You can now move to the next step.

Generate an anonymous embed URL Lambda function

In this step, we create a Lambda function that interacts with QuickSight to generate an embed URL for an anonymous user. Our domain needs to be allowed. There are two ways to achieve the integration of Amazon QuickSight:

  1. By adding the URL to the list of allowed domains in the Amazon QuickSight admin console (explained later in [Optional] Add your domain in QuickSight section).
  2. [Recommended] By adding the embed URL request during runtime in the API call. Option 1 is recommended when you need to persist the allowed domains. Otherwise, the domains will be removed after 30 minutes, which is equivalent to the session duration. For other use cases, it is recommended to use the second option (described and implemented below).

On the Lambda console, create a new function.

  1. Select Author from scratch.
  2. For Function name, enter a name, such as AnonymousEmbedFunction.
  3. For Runtime¸ choose Python 3.9.
  4. For Execution role¸ choose Use an existing role.
  5. Choose the role AnonymousEmbedRole.
  6. Choose Create function.
  7. On the function details page, navigate to the Code tab and enter the following code:

import json, boto3, os, re, base64

def lambda_handler(event, context):
print(event)
try:
def getQuickSightDashboardUrl(awsAccountId, dashboardIdList, dashboardRegion, event):
#Create QuickSight client
quickSight = boto3.client('quicksight', region_name=dashboardRegion);

#Construct dashboardArnList from dashboardIdList
dashboardArnList=[ 'arn:aws:quicksight:'+dashboardRegion+':'+awsAccountId+':dashboard/'+dashboardId for dashboardId in dashboardIdList]
#Generate Anonymous Embed url
response = quickSight.generate_embed_url_for_anonymous_user(
AwsAccountId = awsAccountId,
Namespace = 'default',
ExperienceConfiguration = {'Dashboard':{'InitialDashboardId': dashboardIdList[0]}},
AuthorizedResourceArns = dashboardArnList,
SessionLifetimeInMinutes = 60,
AllowedDomains = ['http://localhost:3000']
)
return response

#Get AWS Account Id
awsAccountId = context.invoked_function_arn.split(':')[4]

#Read in the environment variables
dashboardIdList = re.sub(' ','',os.environ['DashboardIdList']).split(',')
dashboardNameList = os.environ['DashboardNameList'].split(',')
dashboardRegion = os.environ['DashboardRegion']

response={}

response = getQuickSightDashboardUrl(awsAccountId, dashboardIdList, dashboardRegion, event)

return {'statusCode':200,
'headers': {"Access-Control-Allow-Origin": "http://localhost:3000",
"Content-Type":"text/plain"},
'body':json.dumps(response)
}

except Exception as e: #catch all
return {'statusCode':400,
'headers': {"Access-Control-Allow-Origin": "http://localhost:3000",
"Content-Type":"text/plain"},
'body':json.dumps('Error: ' + str(e))
}

If you don’t use localhost, replace http://localhost:3000 in the returns with the hostname of your application. To move to production, don’t forget to replace http://localhost:3000 with your domain.

  1. On the Configuration tab, under General configuration, choose Edit.
  2. Increase the timeout from 3 seconds to 30 seconds, then choose Save.
  3. Under Environment variables, choose Edit.
  4. Add the following variables:
    1. Add DashboardIdList and list your dashboard IDs.
    2. Add DashboardRegion and enter the Region of your dashboard.
  5. Choose Save.

Your configuration should look similar to the following screenshot.

  1. On the Code tab, choose Deploy to deploy the function.

Environment variables console

Set up API Gateway to invoke the Lambda function

To set up API Gateway to invoke the function you created, complete the following steps:

  1. On the API Gateway console, navigate to the REST API section and choose Build.
  2. Under Create new API, select New API.
  3. For API name, enter a name (for example, QuicksightAnonymousEmbed).
  4. Choose Create API.

API gateway console

  1. On the Actions menu, choose Create resource.
  2. For Resource name, enter a name (for example, anonymous-embed).

Now, let’s create a method.

  1. Choose the anonymous-embed resource and on the Actions menu, choose Create method.
  2. Choose GET under the resource name.
  3. For Integration type, select Lambda.
  4. Select Use Lambda Proxy Integration.
  5. For Lambda function, enter the name of the function you created.
  6. Choose Save, then choose OK.

API gateway console

Now we’re ready to deploy the API.

  1. On the Actions menu, choose Deploy API.
  2. For Deployment stage, select New stage.
  3. Enter a name for your stage, such as embed.
  4. Choose Deploy.

[Optional] Add your domain in QuickSight

If you added Allowed domains in Generate an anonymous embed URL Lambda function part, feel free to move to Turn on capacity pricing section.

To add your domain to the allowed domains in QuickSight, complete the following steps:

  1. On the QuickSight console, choose the user menu, then choose Manage QuickSight.

Quicksight dropdown menu

  1. Choose Domains and Embedding in the navigation pane.
  2. For Domain, enter your domain (http://localhost:<PortNumber>).

Make sure to replace <PortNumber> to match your local setup.

  1. Choose Add.

Quicksight admin console

Make sure to replace the localhost domain with the one you will use after testing.

Turn on capacity pricing

If you don’t have session capacity pricing enabled, follow the steps in this section. It’s mandatory to have this function enabled to proceed further.

Capacity pricing allows QuickSight customers to purchase reader sessions in bulk without having to provision individual readers in QuickSight. Capacity pricing is ideal for embedded applications or large-scale business intelligence (BI) deployments. For more information, visit Amazon QuickSight Pricing.

To turn on capacity pricing, complete the following steps:

  1. On the Manage QuickSight page, choose Your Subscriptions in the navigation pane.
  2. In the Capacity pricing section, select Get monthly subscription.
  3. Choose Confirm subscription.

To learn more about capacity pricing, see New in Amazon QuickSight – session capacity pricing for large scale deployments, embedding in public websites, and developer portal for embedded analytics.

Set up your React application

To set up your React application, complete the following steps:

  1. In your React project folder, go to your root directory and run npm i amazon-quicksight-embedding-sdk to install the amazon-quicksight-embedding-sdk package.
  2. In your App.js file, replace the following:
    1. Replace YOUR_API_GATEWAY_INVOKE_URL/RESOURCE_NAME with your API Gateway invoke URL and your resource name (for example, https://xxxxxxxx.execute-api.xx-xxx-x.amazonaws.com/embed/anonymous-embed).
    2. Replace YOUR_DASHBOARD1_ID with the first dashboardId from your DashboardIdList. This is the dashboard that will be shown on the initial render.
    3. Replace YOUR_DASHBOARD2_ID with the second dashboardId from your DashboardIdList.

The following code snippet shows an example of the App.js file in your React project. The code is a React component that embeds a QuickSight dashboard based on the selected dashboard ID. The code contains the following key components:

  • State hooks – Two state hooks are defined using the useState() hook from React:
    • dashboard – Holds the currently selected dashboard ID.
    • quickSightEmbedding – Holds the QuickSight embedding object returned by the embedDashboard() function.
  • Ref hook – A ref hook is defined using the useRef() hook from React. It’s used to hold a reference to the DOM element where the QuickSight dashboard will be embedded.
  • useEffect() hook – The useEffect() hook is used to trigger the embedding of the QuickSight dashboard whenever the selected dashboard ID changes. It first fetches the dashboard URL for the selected ID from the QuickSight API using the fetch() method. After it retrieves the URL, it calls the embed() function with the URL as the argument.
  • Change handler – The changeDashboard() function is a simple event handler that updates the dashboard state whenever the user selects a different dashboard from the drop-down menu. As soon as new dashboard ID is set, the useEffect hook is triggered.
  • 10-millisecond timeout – The purpose of using the timeout is to introduce a small delay of 10 milliseconds before making the API call. This delay can be useful in scenarios where you want to avoid immediate API calls or prevent excessive requests when the component renders frequently. The timeout gives the component some time to settle before initiating the API request. Because we’re building the application in development mode, the timeout helps avoid errors caused by the double run of useEffect within StrictMode. For more information, refer to Updates to Strict Mode.

See the following code:

import './App.css';
import * as React from 'react';
import { useEffect, useRef, useState } from 'react';
import { createEmbeddingContext } from 'amazon-quicksight-embedding-sdk';

 function App() {
  const dashboardRef = useRef([]);
  const [dashboardId, setDashboardId] = useState('YOUR_DASHBOARD1_ID');
  const [embeddedDashboard, setEmbeddedDashboard] = useState(null);
  const [dashboardUrl, setDashboardUrl] = useState(null);
  const [embeddingContext, setEmbeddingContext] = useState(null);

  useEffect(() => {
    const timeout = setTimeout(() => {
      fetch("YOUR_API_GATEWAY_INVOKE_URL/RESOURCE_NAME"
      ).then((response) => response.json()
      ).then((response) => {
        setDashboardUrl(response.EmbedUrl)
      })
    }, 10);
    return () => clearTimeout(timeout);
  }, []);

  const createContext = async () => {
    const context = await createEmbeddingContext();
    setEmbeddingContext(context);
  }

  useEffect(() => {
    if (dashboardUrl) { createContext() }
  }, [dashboardUrl])

  useEffect(() => {
    if (embeddingContext) { embed(); }
  }, [embeddingContext])

  const embed = async () => {

    const options = {
      url: dashboardUrl,
      container: dashboardRef.current,
      height: "500px",
      width: "600px",
    };

    const newEmbeddedDashboard = await embeddingContext.embedDashboard(options);
    setEmbeddedDashboard(newEmbeddedDashboard);
  };

  useEffect(() => {
    if (embeddedDashboard) {
      embeddedDashboard.navigateToDashboard(dashboardId, {})
    }
  }, [dashboardId])

  const changeDashboard = async (e) => {
    const dashboardId = e.target.value
    setDashboardId(dashboardId)
  }

  return (
    <>
      <header>
        <h1>Embedded <i>QuickSight</i>: Build Powerful Dashboards in React</h1>
      </header>
      <main>
        <p>Welcome to the QuickSight dashboard embedding sample page</p>
        <p>Please pick a dashboard you want to render</p>
        <select id='dashboard' value={dashboardId} onChange={changeDashboard}>
          <option value="YOUR_DASHBOARD1_ID">YOUR_DASHBOARD1_NAME</option>
          <option value="YOUR_DASHBOARD2_ID">YOUR_DASHBOARD2_NAME</option>
        </select>
        <div ref={dashboardRef} />
      </main>
    </>
  );
};

export default App

Next, replace the contents of your App.css file, which is used to style and layout your web page, with the content from the following code snippet:

body {
  background-color: #ffffff;
  font-family: Arial, sans-serif;
  margin: 0;
  padding: 0;
}

header {
  background-color: #f1f1f1;
  padding: 20px;
  text-align: center;
}

h1 {
  margin: 0;
}

main {
  margin: 20px;
  text-align: center;
}

p {
  margin-bottom: 20px;
}

a {
  color: #000000;
  text-decoration: none;
}

a:hover {
  text-decoration: underline;
}

i {
  color: orange;
  font-style: normal;
}

Now it’s time to test your app. Start your application by running npm start in your terminal. The following screenshots show examples of your app as well as the dashboards it can display.

Example application page with the visualisation

Example application page with the visualisation

Conclusion

In this post, we showed you how to embed a QuickSight dashboard into a React application using the AWS SDK. Sharing your dashboard with anonymous users allows them to access your dashboard without granting them access to your AWS account. There are also other ways to share your dashboard anonymously, such as using 1-click public embedding.

Join the Quicksight Community to ask, answer and learn with others and explore additional resources.


About the Author

author headshot

Adrianna is a Solutions Architect at AWS Global Financial Services. Having been a part of Amazon since August 2018, she has had the chance to be involved both in the operations as well as the cloud business of the company. Currently, she builds software assets which demonstrate innovative use of AWS services, tailored to a specific customer use cases. On a daily basis, she actively engages with various aspects of technology, but her true passion lies in combination of web development and analytics.

Detecting and stopping recursive loops in AWS Lambda functions

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/detecting-and-stopping-recursive-loops-in-aws-lambda-functions/

This post is written by Pawan Puthran, Principal Serverless Specialist TAM, Aneel Murari, Senior Serverless Specialist Solution Architect, and Shree Shrikhande, Senior AWS Lambda Product Manager.

AWS Lambda is announcing a recursion control to detect and stop Lambda functions running in a recursive or infinite loop.

At launch, this feature is available for Lambda integrations with Amazon Simple Queue Service (Amazon SQS), Amazon SNS, or when invoking functions directly using the Lambda invoke API. Lambda now detects functions that appear to be running in a recursive loop and drops requests after exceeding 16 invocations.

This can help reduce costs from unexpected Lambda function invocations because of recursion. You receive notifications about this action through the AWS Health Dashboard, email, or by configuring Amazon CloudWatch Alarms.

Overview

You can invoke Lambda functions in multiple ways. AWS services generate events that invoke Lambda functions, and Lambda functions can send messages to other AWS services. In most architectures, the service or resource that invokes a Lambda function should be different from the service or resource that the function outputs to. Because of misconfiguration or coding bugs, a function can send a processed event to the same service or resource that invokes the Lambda function, causing a recursive loop.

Lambda now detects the function running in a recursive loop between supported services, after exceeding 16 invocations. It returns a RecursiveInvocationException to the caller. There is no additional charge for this feature. For asynchronous invokes, Lambda sends the event to a dead-letter queue or on-failure destination, if one is configured.

The following is an example of an order processing system.

Image processing system

Order processing system

  1. A new order information message is sent to the source SQS queue.
  2. Lambda consumes the message from the source queue using an ESM.
  3. The Lambda function processes the message and sends the updated orders message to a destination SQS queue using the SQS SendMessage API.
  4. The source queue has a dead-letter queue(DLQ) configured for handling any failed or unprocessed messages.
  5. Because of a misconfiguration, the Lambda function sends the message to the source SQS queue instead of the destination queue. This causes a recursive loop of Lambda function invocations.

To explore sample code for this example, see the GitHub repo.

In the preceding example, after 16 invocations, Lambda throws a RecursiveInvocationException to the ESM. The ESM stops invoking the Lambda function and, once the maxReceiveCount is exceeded, SQS moves the message to the source queues configured DLQ.

You receive an AWS Health Dashboard notification with steps to troubleshoot the function.

AWS Health Dashboard notification

AWS Health Dashboard notification

You also receive an email notification to the registered email address on the account.

Email notification

Email notification

Lambda emits a RecursiveInvocationsDropped CloudWatch metric, which you can view in the CloudWatch console.

RecursiveInvocationsDropped CloudWatch metric

RecursiveInvocationsDropped CloudWatch metric

How does Lambda detect recursion?

For Lambda to detect recursive loops, your function must use one of the supported AWS SDK versions or higher.

Lambda uses an AWS X-Ray trace header primitive called “Lineage” to track the number of times a function has been invoked with an event. When your function code sends an event using a supported AWS SDK version, Lambda increments the counter in the lineage header. If your function is then invoked with the same triggering event more than 16 times, Lambda stops the next invocation for that event. You do not need to configure active X-Ray tracing for this feature to work.

An example lineage header looks like:

X-Amzn-Trace-Id:Root=1-645f7998-4b1e232810b0bb733dba2eab;Parent=5be88d12eefc1fc0;Sampled=1;Lineage=43e12f0f:5

43e12f0f is the hash of a resource, in this case a Lambda function. 5 is the number of times this function has been invoked with the same event. The logic of hash generation, encoding, and size of the lineage header may change in the future. You should not design any application functionality based on this.

When using an ESM to consume messages from SQS, after the maxReceiveCount value is exceeded, the message is sent to the source queue’s configured DLQ. When Lambda detects a recursive loop and drops subsequent invocations, it returns a RecursiveInvocationException to the ESM. This increments the maxReceiveCount value. When the ESM auto retries to process events, based on the error handling configuration, these retries are not considered recursive invocations.

When using SQS, you can also batch multiple messages into one Lambda event. Where the message batch size is greater than 1, Lambda uses the maximum lineage value within the batch of messages. It drops the entire batch if the value exceeds 16.

Recursion detection in action

You can deploy a sample application example in the GitHub repo to test Lambda recursive loop detection. The application includes a Lambda function that reads from an SQS queue and writes messages back to the same SQS queue.

As prerequisites, you must install:

To deploy the application:

    1. Set your AWS Region:
export REGION=<your AWS region>
    1. Clone the GitHub repository
git clone https://github.com/aws-samples/aws-lambda-recursion-detection-sample.git
cd aws-lambda-recursion-detection-sample
    1. Use AWS SAM to build and deploy the resources to your AWS account. Enter a stack name, such as lambda-recursion, when prompted. Accept the remaining default values.
sam build –-use-container
sam deploy --guided --region $REGION

To test the application:

    1. Save the name of the SQS queue in a local environment variable:
SOURCE_SQS_URL=$(aws cloudformation describe-stacks \ --region $REGION \ --stack-name lambda-recursion \ --query 'Stacks[0].Outputs[?OutputKey==`SourceSQSqueueURL`].OutputValue' --output text)
  1. Publish a message to the source SQS queue:
aws sqs send-message --queue-url $SOURCE_SQS_URL --message-body '{"orderId":"111","productName":"Bolt","orderStatus":"Submitted"}' --region $REGION

This invokes the Lambda function, which writes the message back to the queue.

To verify that Lambda has detected the recursion:

  1. Navigate to the CloudWatch console. Choose All Metrics under Metrics in the left-hand panel and search for RecursiveInvocationsDropped.

    Find RecursiveInvocationsDropped.

    Find RecursiveInvocationsDropped.

  2. Choose Lambda > By Function Name and choose RecursiveInvocationsDropped for the function you created. Under Graphed metrics, change the statistic to sum and Period to 1 minute. You see one record. Refresh if you don’t see the metric after a few seconds.
Metrics sum view

Metrics sum view

Actions to take when Lambda stops a recursive loop

When you receive a notification regarding recursion in your account, the following steps can help address the issue.

  • To stop further invoke attempts while you fix the underlying configuration issue, set the function concurrency to 0. This acts as an off switch for the Lambda function. You can choose the “Throttle” button in the Lambda console or use the PutFunctionConcurrency API to set the function concurrency to 0.
  • You can also disable or delete the event source mapping or trigger for the Lambda function.
  • Check your Lambda function code and configuration for any code defects that create loops. For example, check your environment variables to ensure you are not using the same SQS queue or SNS topic as source and target.
  • If an SQS Queue is the event source for your Lambda function, configure a DLQ on the source queue.
  • If an SNS topic is the event source, configure an On-Failure Destination for the Lambda function.

Disabling recursion detection

You may have valid use-cases where Lambda recursion is intentional as part of your design. In this case, use caution and implement suitable guardrails to prevent unexpected charges to your account. To learn more about best practices for using recursive invocation patterns, see Recursive patterns that cause run-away Lambda functions in the AWS Lambda Operator Guide.

This feature is turned on by default to stop recursive loops. To request turning it off for your account, reach out to AWS Support.

Conclusion

Lambda recursion control for SQS and SNS automatically detects and stops functions running in a recursive or infinite loop. This can be due to misconfiguration or coding errors. Recursion control helps reduce unexpected usage with Lambda and downstream services. The post also explains how Lambda detects and stops recursive loops and notifies you through AWS Health Dashboard to troubleshoot the function.

To learn more about the feature, visit the AWS Lambda Developer Guide.

For more serverless learning resources, visit Serverless Land

Understanding AWS Lambda’s invoke throttling limits

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/understanding-aws-lambdas-invoke-throttle-limits/

This post is written by Archana Srikanta, Principal Engineer, AWS Lambda.

When you call AWS Lambda’s Invoke API, a series of throttle limits are evaluated to decide if your call is let through or throttled with a 429 “Too Many Requests” exception. This blog post explains the most common invoke throttle limits and the relationship between them, so you can better understand scaling workloads on Lambda.

Overview

Invoke call flow

The throttle limits exist to protect the following components of Lambda’s internal service architecture, and your workload, from noisy neighbors:

  • Execution environment: An execution environment is a Firecracker microVM where your function code runs. A given execution environment only hosts one invocation at a time, but it can be reused for subsequent invocations of the same function version.
  • Invoke data plane: These are a series of internal web services that, on an invoke, select (or create) a sandbox and route your request to it. This is also responsible for enforcing the throttle limits.

When you make an Invoke API call, it transits through some or all of the Invoke Data Plane services, before reaching an execution environment where your function code is downloaded and executed.

There are three distinct but related throttle limits which together decide if your invoke request is accepted by the data plane or throttled.

Concurrency

Concurrent means “existing, happening, or done at the same time”. Accordingly, the Lambda concurrency limit is a limit on the simultaneous in-flight invocations allowed at any given time. It is not a rate or transactions per second (TPS) limit in and of itself, but instead a limit on how many invocations can be inflight at the same time. This documentation visually explains the concept of concurrency.

Under the hood, the concurrency limit roughly translates to a limit on the maximum number of execution environments (and thus Firecracker microVMs) that your account can claim at any given point in time. Lambda runs a fleet of multi-tenant bare metal instances, on which Firecracker microVMs are carved out to serve as execution environments for your functions. AWS constantly monitors and scales this fleet based on incoming demand and shares the available capacity fairly among customers.

The concurrency limit helps protect Lambda from a single customer exhausting all the available capacity and causing a denial of service to other customers.

Transactions per second (TPS)

Customers often ask how their concurrency limit translates to TPS. The answer depends on how long your function invocations last.

Relation between concurrency and TPS

The diagram above considers three cases, each with a different function invocation duration, but a fixed concurrency limit of 1000. In the first case, invocations have a constant duration of 1 second. This means you can initiate 1000 invokes and claim all 1000 execution environments permitted by your concurrency limit. These execution environments remain busy for the entire second, and you cannot start any more invokes in that second because your concurrency limit prevents you from claiming any more execution environments. So, the TPS you can achieve with a concurrency limit of 1000 and a function duration of 1 second is 1000 TPS.

In case 2, the invocation duration is halved to 500ms, with the same concurrency limit of 1000. You can initiate 1000 concurrent invokes at the start of the second as before. These invokes keep the execution environments busy for the first half of the second. Once finished, you can start an additional 1000 invokes against the same execution environments while still being within your concurrency limit. So, by halving the function duration, you doubled your TPS to 2000.

Similarly, in case 3, if your function duration is 100ms, you can initiate 10 rounds of 1000 invokes each in a second, achieving a TPS of 10K.

Codifying this as an equation, the TPS you can achieve given a concurrency limit is:

TPS = concurrency / function duration in seconds

Taken to an extreme, for a function duration of only 1ms and at a concurrency limit of 1000 (the default limit), an account can drive an invoke TPS of one million. For every additional unit of concurrency granted via a limit increase, it implicitly grants an additional 1000 TPS per unit of concurrency increased. The high TPS doesn’t require any additional execution environments (Firecracker microVMs), so it’s not problematic from a fleet capacity perspective. However, driving over a million TPS from a single account puts stress on the Invoke Data Plane services. They must be protected from noisy neighbor impact as well so all customers have a fair share of the services’ bandwidth. A concurrency limit alone isn’t sufficient to protect against this – the TPS limit provides this protection.

As of this writing, the invoke TPS is capped at 10 times your concurrency. Added to the previous equation:

TPS = min( 10 x concurrency, concurrency / function duration in seconds)

The concurrency factor is common across both terms in the min function, so the key comparison is:

min(10, 1 / function duration in seconds)

If the function duration is exactly 100ms (or 1/10th of a second), both terms in the min function are equal. If the function duration is over 100ms, the second term is lower and TPS is limited as per concurrency/function duration. If the function duration is under 100ms, the first term is lower and TPS is limited as per 10 x concurrency.

Limits for functions less than 100ms

To summarize, the TPS limit exists to protect the Invoke Data Plane from the high churn of short-lived invocations, for which the concurrency limit alone affords too high of a TPS. If you drive short invocations of under 100ms, your throughput is capped as though the function duration is 100ms (at 10 x concurrency) as shown in the diagram above. This implies that short lived invocations may be TPS limited, rather than concurrency limited. However, if your function duration is over 100ms you can effectively ignore the 10 x concurrency TPS limit and calculate your available TPS as concurrency/function duration.

Burst

The third throttle limit is the burst limit. Lambda does not keep execution environments provisioned for your entire concurrency limit at all times. That would be wasteful, especially if usage peaks are transient, as is the case with many workloads. Instead, the service spins up execution environments just-in-time as the invoke arrives, if one doesn’t already exist. Once an execution environment is spun up, it remains “warm” for some period of time and is available to host subsequent invocations of the same function version.

However, if an invoke doesn’t find a warm execution environment, it experiences a “cold start” while we provision a new execution environment. Cold starts involve certain additional operations over and above the warm invoke path, such as downloading your code or container and initializing your application within the execution environment. These initialization operations are typically computationally heavy and so have a lower throughput compared to the warm invoke path. If there are sudden and steep spikes in the number of cold starts, it can put pressure on the invoke services that handle these cold start operations, and also cause undesirable side effects for your application such as increased latencies, reduced cache efficiency and increased fan out on downstream dependencies. The burst limit exists to protect against such surges of cold starts, especially for accounts that have a high concurrency limit. It ensures that the climb up to a high concurrency limit is gradual so as to smooth out the number of cold starts in a burst.

The algorithm used to enforce the burst limit is the Token Bucket rate-limiting algorithm. Consider a bucket that holds tokens. The bucket has a maximum capacity of B tokens (burst). The bucket starts full. Each time you send an invoke request that requires an additional unit of concurrency, it costs a token from the bucket. If the token exists, you are granted the additional concurrency and the token is removed from the bucket. The bucket is refilled at a constant rate of r tokens per minute (rate) until it reaches its maximum capacity.

Token bucket

What this means is that the rate of climb of concurrency is limited to r tokens per minute. Even though the algorithm allows you to collect up to B tokens and burst, you must wait for the bucket to refill before you can burst again, effectively limiting your average rate to r per minute.

Concurrency burst limit chart

The chart above shows the burst limit in action with a maximum concurrency limit of 3000, a maximum burst(B) of 1000 and a refill rate(r) of 500/minute. The token bucket starts full with 1000 tokens, as is the available burst headroom.

There is a burst activity between minute one and two, which consumes all tokens in the bucket and claims all 1000 concurrent execution environments allowed by the burst limit. At this point the bucket is empty and any attempt to claim additional concurrent execution environments is burst throttled, in spite of max concurrency not being reached yet.

The token bucket and the burst headroom are replenished at minutes two and three with 500 tokens each minute to bring it back up to its maximum capacity of 1000. At minute four, there is no additional refill because the bucket is at maximum capacity. Between minutes four and five, there is a second burst activity which empties the bucket again and claims an additional 1000 execution environments, bringing the total number of active execution environments to 2000.

The bucket continues to replenish at a rate of 500/minute at minutes five and six. At this point, sufficient tokens have been accumulated to cover the entire concurrency limit of 3000, and so the bucket isn’t refilled anymore even when you have the third burst activity at minute seven. At minute ten, when all the usage ramps down, the available burst headroom slowly stair steps back down to the maximum initial burst of 1K.

The actual numbers for maximum burst and refill rate vary by Region and are subject to change, please visit the Lambda burst limits page for specific values.

It is important to distinguish that the burst limit isn’t a rate limit on the invoke itself, but a rate limit on how quickly concurrency can rise. However, since invoke TPS is a function of concurrency, it also clamps how quickly TPS can rise (a rate limit for a rate limit). The following chart shows how the TPS burst headroom follows a similar stair step pattern as the concurrency burst headroom, only with a multiplier.

Burst limits chart

Conclusion

This blog explains three key throttle limits applied on Lambda invokes: the concurrency limit, TPS limit and burst limit. It outlines the relationship between these limits and how each one protects the system and your workload from noisy neighbors. Equipped with this knowledge you can better interpret any 429 throttling exceptions you may receive while scaling your applications on Lambda. For more information on getting started with Lambda visit the Developer Guide.

For more serverless learning resources, visit Serverless Land.

Reduce archive cost with serverless data archiving

Post Syndicated from Rostislav Markov original https://aws.amazon.com/blogs/architecture/reduce-archive-cost-with-serverless-data-archiving/

For regulatory reasons, decommissioning core business systems in financial services and insurance (FSI) markets requires data to remain accessible years after the application is retired. Traditionally, FSI companies either outsourced data archiving to third-party service providers, which maintained application replicas, or purchased vendor software to query and visualize archival data.

In this blog post, we present a more cost-efficient option with serverless data archiving on Amazon Web Services (AWS). In our experience, you can build your own cloud-native solution on Amazon Simple Storage Service (Amazon S3) at one-fifth of the price of third-party alternatives. If you are retiring legacy core business systems, consider serverless data archiving for cost-savings while keeping regulatory compliance.

Serverless data archiving and retrieval

Modern archiving solutions follow the principles of modern applications:

  • Serverless-first development, to reduce management overhead.
  • Cloud-native, to leverage native capabilities of AWS services, such as backup or disaster recovery, to avoid custom build.
  • Consumption-based pricing, since data archival is consumed irregularly.
  • Speed of delivery, as both implementation and archive operations need to be performed quickly to fulfill regulatory compliance.
  • Flexible data retention policies can be enforced in an automated manner.

AWS Storage and Analytics services offer the necessary building blocks for a modern serverless archiving and retrieval solution.

Data archiving can be implemented on top of Amazon S3) and AWS Glue.

  1. Amazon S3 storage tiers enable different data retention policies and retrieval service level agreements (SLAs). You can migrate data to Amazon S3 using AWS Database Migration Service; otherwise, consider another data transfer service, such as AWS DataSync or AWS Snowball.
  2. AWS Glue crawlers automatically infer both database and table schemas from your data in Amazon S3 and store the associated metadata in the AWS Glue Data Catalog.
  3. Amazon CloudWatch monitors the execution of AWS Glue crawlers and notifies of failures.

Figure 1 provides an overview of the solution.

Serverless data archiving and retrieval

Figure 1. Serverless data archiving and retrieval

Once the archival data is catalogued, Amazon Athena can be used for serverless data query operations using standard SQL.

  1. Amazon API Gateway receives the data retrieval requests and eases integration with other systems via REST, HTTPS, or WebSocket.
  2. AWS Lambda reads parametrization data/templates from Amazon S3 in order to construct the SQL queries. Alternatively, query templates can be stored as key-value entries in a NoSQL store, such as Amazon DynamoDB.
  3. Lambda functions trigger Athena with the constructed SQL query.
  4. Athena uses the AWS Glue Data Catalog to retrieve table metadata for the Amazon S3 (archival) data and to return the SQL query results.

How we built serverless data archiving

An early build-or-buy assessment compared vendor products with a custom-built solution using Amazon S3, AWS Glue, and a user frontend for data retrieval and visualization.

The total cost of ownership over a 10-year period for one insurance core system (Policy Admin System) was $0.25M to build and run the custom solution on AWS compared with >$1.1M for third-party alternatives. The implementation cost advantage of the custom-built solution was due to development efficiencies using AWS services. The lower run cost resulted from a decreased frequency of archival usage and paying only for what you use.

The data archiving solution was implemented with AWS services (Figure 2):

  1. Amazon S3 is used to persist archival data in Parquet format (optimized for analytics and compressed to reduce storage space) that is loaded from the legacy insurance core system. The archival data source was AS400/DB2 and moved with Informatica Cloud to Amazon S3.
  2. AWS Glue crawlers infer the database schema from objects in Amazon S3 and create tables in AWS Glue for the decommissioned application data.
  3. Lambda functions (Python) remove data records based on retention policies configured for each domain, such as customers, policies, claims, and receipts. A daily job (Control-M) initiates the retention process.
Exemplary implementation of serverless data archiving and retrieval for insurance core system

Figure 2. Exemplary implementation of serverless data archiving and retrieval for insurance core system

Retrieval operations are formulated and executed via Python functions in Lambda. The following AWS resources implement the retrieval logic:

  1. Athena is used to run SQL queries over the AWS Glue tables for the decommissioned application.
  2. Lambda functions (Python) build and execute queries for data retrieval. The functions render HMTL snippets using Jinja templating engine and Athena query results, returning the selected template filled with the requested archive data. Using Jinja as templating engine improved the speed of delivery and reduced the heavy lifting of frontend and backend changes when modeling retrieval operations by ~30% due to the decoupling between application layers. As a result, engineers only need to build an Athena query with the linked Jinja template.
  3. Amazon S3 stores templating configuration and queries (JSON files) used for query parametrization.
  4. Amazon API Gateway serves as single point of entry for API calls.

The user frontend for data retrieval and visualization is implemented as web application using React JavaScript library (with static content on Amazon S3) and Amazon CloudFront used for web content delivery.

The archiving solution enabled 80 use cases with 60 queries and reduced storage from three terabytes on source to only 35 gigabytes on Amazon S3. The success of the implementation depended on the following key factors:

  • Appropriate sponsorship from business across all areas (claims, actuarial, compliance, etc.)
  • Definition of SLAs for responding to courts, regulators, etc.
  • Minimum viable and mandatory approach
  • Prototype visualizations early on (fail fast)

Conclusion

Traditionally, FSI companies relied on vendor products for data archiving. In this post, we explored how to build a scalable solution on Amazon S3 and discussed key implementation considerations. We have demonstrated that AWS services enable FSI companies to build a serverless archiving solution while reaching and keeping regulatory compliance at a lower cost.

Learn more about some of the AWS services covered in this blog:

Extract time series from satellite weather data with AWS Lambda

Post Syndicated from Lior Perez original https://aws.amazon.com/blogs/big-data/extract-time-series-from-satellite-weather-data-with-aws-lambda/

Extracting time series on given geographical coordinates from satellite or Numerical Weather Prediction data can be challenging because of the volume of data and of its multidimensional nature (time, latitude, longitude, height, multiple parameters). This type of processing can be found in weather and climate research, but also in applications like photovoltaic and wind power. For instance, time series describing the quantity of solar energy reaching specific geographical points can help in designing photovoltaic power plants, monitoring their operation, and detecting yield loss.

A generalization of the problem could be stated as follows: how can we extract data along a dimension that is not the partition key from a large volume of multidimensional data? For tabular data, this problem can be easily solved with AWS Glue, which you can use to create a job to filter and repartition the data, as shown at the end of this post. But what if the data is multidimensional and provided in a domain-specific format, like in the use case that we want to tackle?

AWS Lambda is a serverless compute service that lets you run code without provisioning or managing servers. With AWS Step Functions, you can launch parallel runs of Lambda functions. This post shows how you can use these services to run parallel tasks, with the example of time series extraction from a large volume of satellite weather data stored on Amazon Simple Storage Service (Amazon S3). You also use AWS Glue to consolidate the files produced by the parallel tasks.

Note that Lambda is a general purpose serverless engine. It has not been specifically designed for heavy data transformation tasks. We are using it here after having confirmed the following:

  • Task duration is predictable and is less than 15 minutes, which is the maximum timeout for Lambda functions
  • The use case is simple, with low compute requirements and no external dependencies that could slow down the process

We work on a dataset provided by EUMESAT: the MSG Total and Diffuse Downward Surface Shortwave Flux (MDSSFTD). This dataset contains satellite data at 15-minute intervals, in netcdf format, which represents approximately 100 GB for 1 year.

We process the year 2018 to extract time series on 100 geographical points.

Solution overview

To achieve our goal, we use parallel Lambda functions. Each Lambda function processes 1 day of data: 96 files representing a volume of approximately 240 MB. We then have 365 files containing the extracted data for each day, and we use AWS Glue to concatenate them for the full year and split them across the 100 geographical points. This workflow is shown in the following architecture diagram.

Deployment of this solution: In this post, we provide step-by-step instructions to deploy each part of the architecture manually. If you prefer an automatic deployment, we have prepared for you a Github repository containing the required infrastructure as code template.

The dataset is partitioned by day, with YYYY/MM/DD/ prefixes. Each partition contains 96 files that will be processed by one Lambda function.

We use Step Functions to launch the parallel processing of the 365 days of the year 2018. Step Functions helps developers use AWS services to build distributed applications, automate processes, orchestrate microservices, and create data and machine learning (ML) pipelines.

But before starting, we need to download the dataset and upload it to an S3 bucket.

Prerequisites

Create an S3 bucket to store the input dataset, the intermediate outputs, and the final outputs of the data extraction.

Download the dataset and upload it to Amazon S3

A free registration on the data provider website is required to download the dataset. To download the dataset, you can use the following command from a Linux terminal. Provide the credentials that you obtained at registration. Your Linux terminal could be on your local machine, but you can also use an AWS Cloud9 instance. Make sure that you have at least 100 GB of free storage to handle the entire dataset.

wget -c --no-check-certificate -r -np -nH --user=[YOUR_USERNAME] --password=[YOUR_PASSWORD] \
     -R "*.html, *.tmp" \
     https://datalsasaf.lsasvcs.ipma.pt/PRODUCTS/MSG/MDSSFTD/NETCDF/2018/

Because the dataset is quite large, this download could take a long time. In the meantime, you can prepare the next steps.

When the download is complete, you can upload the dataset to an S3 bucket with the following command:

aws s3 cp ./PRODUCTS/ s3://[YOUR_BUCKET_NAME]/ --recursive

If you use temporary credentials, they might expire before the copy is complete. In this case, you can resume by using the aws s3 sync command.

Now that the data is on Amazon S3, you can delete the directory that has been downloaded from your Linux machine.

Create the Lambda functions

For step-by-step instructions on how to create a Lambda function, refer to Getting started with Lambda.

The first Lambda function in the workflow generates the list of days that we want to process:

from datetime import datetime
from datetime import timedelta

def lambda_handler(event, context):
    '''
    Generate a list of dates (string format)
    '''
    
    begin_date_str = "20180101"
    end_date_str = "20181231"
    
    # carry out conversion between string 
    # to datetime object
    current_date = datetime.strptime(begin_date_str, "%Y%m%d")
    end_date = datetime.strptime(end_date_str, "%Y%m%d")

    result = []

    while current_date <= end_date:
        current_date_str = current_date.strftime("%Y%m%d")

        result.append(current_date_str)
            
        # adding 1 day
        current_date += timedelta(days=1)
      
    return result

We then use the Map state of Step Functions to process each day. The Map state will launch one Lambda function for each element returned by the previous function, and will pass this element as an input. These Lambda functions will be launched simultaneously for all the elements in the list. The processing time for the full year will therefore be identical to the time needed to process 1 single day, allowing scalability for long time series and large volumes of input data.

The following is an example of code for the Lambda function that processes each day:

import boto3
import netCDF4 as nc
import numpy as np
import pandas as pd
from datetime import datetime
import time
import os
import random

# Bucket containing input data
INPUT_BUCKET_NAME = "[INPUT_BUCKET_NAME]" # example: "my-bucket-name"
LOCATION = "[PREFIX_OF_INPUT_DATA_WITH_TRAILING_SLASH]" # example: "MSG/MDSSFTD/NETCDF/"

# Local output files
TMP_FILE_NAME = "/tmp/tmp.nc"
LOCAL_OUTPUT_FILE = "/tmp/dataframe.parquet"

# Bucket for output data
OUTPUT_BUCKET = "[OUTPUT_BUCKET_NAME]"
OUTPUT_PREFIX = "[PREFIX_OF_OUTPUT_DATA_WITH_TRAILING_SLASH]" # example: "output/intermediate/"

# Create 100 random coordinates
random.seed(10)
coords = [(random.randint(1000,2500), random.randint(1000,2500)) for _ in range(100)]

client = boto3.resource('s3')
bucket = client.Bucket(INPUT_BUCKET_NAME)

def date_to_partition_name(date):
    '''
    Transform a date like "20180302" to partition like "2018/03/02/"
    '''
    d = datetime.strptime(date, "%Y%m%d")
    return d.strftime("%Y/%m/%d/")

def lambda_handler(event, context):
    # Get date from input    
    date = str(event)
    print("Processing date: ", date)
    
    # Initialize output dataframe
    COLUMNS_NAME = ['time', 'point_id', 'DSSF_TOT', 'FRACTION_DIFFUSE']
    df = pd.DataFrame(columns = COLUMNS_NAME)
    
    prefix = LOCATION + date_to_partition_name(date)
    print("Loading files from prefix: ", prefix)
    
    # List input files (weather files)
    objects = bucket.objects.filter(Prefix=prefix)    
    keys = [obj.key for obj in objects]
           
    # For each file
    for key in keys:
        # Download input file from S3
        bucket.download_file(key, TMP_FILE_NAME)
        
        print("Processing: ", key)    
    
        try:
            # Load the dataset with netcdf library
            dataset = nc.Dataset(TMP_FILE_NAME)
            
            # Get values from the dataset for our list of geographical coordinates
            lats, lons = zip(*coords)
            data_1 = dataset['DSSF_TOT'][0][lats, lons]
            data_2 = dataset['FRACTION_DIFFUSE'][0][lats, lons]
    
            # Prepare data to add it into the output dataframe
            nb_points = len(lats)
            data_time = dataset.__dict__['time_coverage_start']
            time_list = [data_time for _ in range(nb_points)]
            point_id_list = [i for i in range(nb_points)]
            tuple_list = list(zip(time_list, point_id_list, data_1, data_2))
            
            # Add data to the output dataframe
            new_data = pd.DataFrame(tuple_list, columns = COLUMNS_NAME)
            df = pd.concat ([df, new_data])
        except OSError:
            print("Error processing file: ", key)
        
    # Replace masked by NaN (otherwise we cannot save to parquet)
    df = df.applymap(lambda x: np.NaN if type(x) == np.ma.core.MaskedConstant else x)    
        
    
    # Save to parquet
    print("Writing result to tmp parquet file: ", LOCAL_OUTPUT_FILE)
    df.to_parquet(LOCAL_OUTPUT_FILE)
    
    # Copy result to S3
    s3_output_name = OUTPUT_PREFIX + date + '.parquet'
    s3_client = boto3.client('s3')
    s3_client.upload_file(LOCAL_OUTPUT_FILE, OUTPUT_BUCKET, s3_output_name)

You need to associate a role to the Lambda function to authorize it to access the S3 buckets. Because the runtime is about a minute, you also have to configure the timeout of the Lambda function accordingly. Let’s set it to 5 minutes. We also increase the memory allocated to the Lambda function to 2048 MB, which is needed by the netcdf4 library for extracting several points at a time from satellite data.

This Lambda function depends on the pandas and netcdf4 libraries. They can be installed as Lambda layers. The pandas library is provided as an AWS managed layer. The netcdf4 library will have to be packaged in a custom layer.

Configure the Step Functions workflow

After you create the two Lambda functions, you can design the Step Functions workflow in the visual editor by using the Lambda Invoke and Map blocks, as shown in the following diagram.

In the Map state block, choose Distributed processing mode and increase concurrency limit to 365 in Runtime settings. This will enable parallel processing of all the days.

The number of Lambda functions that can run concurrently is limited for each account. Your account may have insufficient quota. You can request a quota increase.

Launch the state machine

You can now launch the state machine. On the Step Functions console, navigate to your state machine and choose Start execution to run your workflow.

This will trigger a popup in which you can enter optional input for your state machine. For this post, you can leave the defaults and choose Start execution.

The state machine should take 1–2 minutes to run, during which time you will be able to monitor the progress of your workflow. You can select one of the blocks in the diagram and inspect its input, output, and other information in real time, as shown in the following screenshot. This can be very useful for debugging purposes.

When all the blocks turn green, the state machine is complete. At this step, we have extracted the data for 100 geographical points for a whole year of satellite data.

In the S3 bucket configured as output for the processing Lambda function, we can check that we have one file per day, containing the data for all the 100 points.

Transform data per day to data per geographical point with AWS Glue

For now, we have one file per day. However, our goal is to get time series for every geographical point. This transformation involves changing the way the data is partitioned. From a day partition, we have to go to a geographical point partition.

Fortunately, this operation can be done very simply with AWS Glue.

  1. On the AWS Glue Studio console, create a new job and choose Visual with a blank canvas.

For this example, we create a simple job with a source and target block.

  1. Add a data source block.
  2. On the Data source properties tab, select S3 location for S3 source type.
  3. For S3 URL, enter the location where you created your files in the previous step.
  4. For Data format, keep the default as Parquet.
  5. Choose Infer schema and view the Output schema tab to confirm the schema has been correctly detected.

  1. Add a data target block.
  2. On the Data target properties tab, for Format, choose Parquet.
  3. For Compression type, choose Snappy.
  4. For S3 Target Location, enter the S3 target location for your output files.

We now have to configure the magic!

  1. Add a partition key, and choose point_id.

This tells AWS Glue how you want your output data to be partitioned. AWS Glue will automatically partition the output data according to the point_id column, and therefore we’ll get one folder for each geographical point, containing the whole time series for this point as requested.

To finish the configuration, we need to assign an AWS Identity and Access Management (IAM) role to the AWS Glue job.

  1. Choose Job details, and for IAM role¸ choose a role that has permissions to read from the input S3 bucket and to write to the output S3 bucket.

You may have to create the role on the IAM console if you don’t already have an appropriate one.

  1. Enter a name for our AWS Glue job, save it, and run it.

We can monitor the run by choosing Run details. It should take 1–2 minutes to complete.

Final results

After the AWS Glue job succeeds, we can check in the output S3 bucket that we have one folder for each geographical point, containing some Parquet files with the whole year of data, as expected.

To load the time series for a specific point into a pandas data frame, you can use the awswrangler library from your Python code:

import awswrangler as wr
import pandas as pd

# Retrieving the data directly from Amazon S3
df = wr.s3.read_parquet("s3://[BUCKET]/[PREFIX]/", dataset=True)

If you want to test this code now, you can create a notebook instance in Amazon SageMaker, and then open a Jupyter notebook. The following screenshot illustrates running the preceding code in a Jupyter notebook.

As we can see, we have successfully extracted the time series for specific geographical points!

Clean up

To avoid incurring future charges, delete the resources that you have created:

  • The S3 bucket
  • The AWS Glue job
  • The Step Functions state machine
  • The two Lambda functions
  • The SageMaker notebook instance

Conclusion

In this post, we showed how to use Lambda, Step Functions, and AWS Glue for serverless ETL (extract, transform, and load) on a large volume of weather data. The proposed architecture enables extraction and repartitioning of the data in just a few minutes. It’s scalable and cost-effective, and can be adapted to other ETL and data processing use cases.

Interested in learning more about the services presented in this post? You can find hands-on labs to improve your knowledge with AWS Workshops. Additionally, check out the official documentation of AWS Glue, Lambda, and Step Functions. You can also discover more architectural patterns and best practices at AWS Whitepapers & Guides.


About the Author

Lior Perez is a Principal Solutions Architect on the Enterprise team based in Toulouse, France. He enjoys supporting customers in their digital transformation journey, using big data and machine learning to help solve their business challenges. He is also personally passionate about robotics and IoT, and constantly looks for new ways to leverage technologies for innovation.

Implementing AWS Lambda error handling patterns

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/implementing-aws-lambda-error-handling-patterns/

This post is written by Jeff Chen, Principal Cloud Application Architect, and Jeff Li, Senior Cloud Application Architect

Event-driven architectures are an architecture style that can help you boost agility and build reliable, scalable applications. Splitting an application into loosely coupled services can help each service scale independently. A distributed, loosely coupled application depends on events to communicate application change states. Each service consumes events from other services and emits events to notify other services of state changes.

Handling errors becomes even more important when designing distributed applications. A service may fail if it cannot handle an invalid payload, dependent resources may be unavailable, or the service may time out. There may be permission errors that can cause failures. AWS services provide many features to handle error conditions, which you can use to improve the resiliency of your applications.

This post explores three use-cases and design patterns for handling failures.

Overview

AWS Lambda, Amazon Simple Queue Service (Amazon SQS), Amazon Simple Notification Service (Amazon SNS), and Amazon EventBridge are core building blocks for building serverless event-driven applications.

The post Understanding the Different Ways to Invoke Lambda Functions lists the three different ways of invoking a Lambda function: synchronous, asynchronous, and poll-based invocation. For a list of services and which invocation method they use, see the documentation.

Lambda’s integration with Amazon API Gateway is an example of a synchronous invocation. A client makes a request to API Gateway, which sends the request to Lambda. API Gateway waits for the function response and returns the response to the client. There are no built-in retries or error handling. If the request fails, the client attempts the request again.

Lambda’s integration with SNS and EventBridge are examples of asynchronous invocations. SNS, for example, sends an event to Lambda for processing. When Lambda receives the event, it places it on an internal event queue and returns an acknowledgment to SNS that it has received the message. Another Lambda process reads events from the internal queue and invokes your Lambda function. If SNS cannot deliver an event to your Lambda function, the service automatically retries the same operation based on a retry policy.

Lambda’s integration with SQS uses poll-based invocations. Lambda runs a fleet of pollers that poll your SQS queue for messages. The pollers read the messages in batches and invoke your Lambda function once per batch.

You can apply this pattern in many scenarios. For example, your operational application can add sales orders to an operational data store. You may then want to load the sales orders to your data warehouse periodically so that the information is available for forecasting and analysis. The operational application can batch completed sales as events and place them on an SQS queue. A Lambda function can then process the events and load the completed sale records into your data warehouse.

If your function processes the batch successfully, the pollers delete the messages from the SQS queue. If the batch is not successfully processed, the pollers do not delete the messages from the queue. Once the visibility timeout expires, the messages are available again to be reprocessed. If the message retention period expires, SQS deletes the message from the queue.

The following table shows the invocation types and retry behavior of the AWS services mentioned.

AWS service example Invocation type Retry behavior
Amazon API Gateway Synchronous No built-in retry, client attempts retries.

Amazon SNS

Amazon EventBridge

Asynchronous Built-in retries with exponential backoff.
Amazon SQS Poll-based Retries after visibility timeout expires until message retention period expires.

There are a number of design patterns to use for poll-based and asynchronous invocation types to retain failed messages for additional processing. These patterns can help you recover from delivery or processing failures.

You can explore the patterns and test the scenarios by deploying the code from this repository which uses the AWS Cloud Development Kit (AWS CDK) using Python.

Lambda poll-based invocation pattern

When using Lambda with SQS, if Lambda isn’t able to process the message and the message retention period expires, SQS drops the message. Failure to process the message can be due to function processing failures, including time-outs or invalid payloads. Processing failures can also occur when the destination function does not exist, or has incorrect permissions.

You can configure a separate dead-letter queue (DLQ) on the source queue for SQS to retain the dropped message. A DLQ preserves the original message and is useful for analyzing root causes, handling error conditions properly, or sending notifications that require manual interventions. In the poll-based invocation scenario, the Lambda function itself does not maintain a DLQ. It relies on the external DLQ configured in SQS. For more information, see Using Lambda with Amazon SQS.

The following shows the design pattern when you configure Lambda to poll events from an SQS queue and invoke a Lambda function.

Lambda synchronously polling catches of messages from SQS

Lambda synchronously polling batches of messages from SQS

To explore this pattern, deploy the code in this repository. Once deployed, you can use this instruction to test the pattern with the happy and unhappy paths.

Lambda asynchronous invocation pattern

With asynchronous invokes, there are two failure aspects to consider when using Lambda. The event source cannot deliver the message to Lambda and the Lambda function errors when processing the event.

Event sources vary in how they handle failures delivering messages to Lambda. If SNS or EventBridge cannot send the event to Lambda after exhausting all their retry attempts, the service drops the event. You can configure a DLQ on an SNS topic or EventBridge event bus to hold the dropped event. This works in the same way as the poll-based invocation pattern with SQS.

Lambda functions may then error due to input payload syntax errors, duration time-outs, or the function throws an exception such as a data resource not available.

For asynchronous invokes, you can configure how long Lambda retains an event in its internal queue, up to 6 hours. You can also configure how many times Lambda retries when the function errors, between 0 and 2. Lambda discards the event when the maximum age passes or all retry attempts fail. To retain a copy of discarded events, you can configure either a DLQ or, preferably, a failed-event destination as part of your Lambda function configuration.

A Lambda destination enables you to specify what to do next if an asynchronous invocation succeeds or fails. You can configure a destination to send invocation records to SQS, SNS, EventBridge, or another Lambda function. Destinations are preferred for failure processing as they support additional targets and include additional information. A DLQ holds the original failed event. With a destination, Lambda also passes details of the function’s response in the invocation record. This includes stack traces, which can be useful for analyzing the root cause.

Using both a DLQ and Lambda destinations

You can apply this pattern in many scenarios. For example, many of your applications may contain customer records. To comply with the California Consumer Privacy Act (CCPA), different organizations may need to delete records for a particular customer. You can set up a consumer delete SNS topic. Each organization creates a Lambda function, which processes the events published by the SNS topic and deletes customer records in its managed applications.

The following shows the design pattern when you configure an SNS topic as the event source for a Lambda function, which uses destination queues for success and failure process.

SNS topic as event source for Lambda

SNS topic as event source for Lambda

You configure a DLQ on the SNS topic to capture messages that SNS cannot deliver to Lambda. When Lambda invokes the function, it sends details of the successfully processed messages to an on-success SQS destination. You can use this pattern to route an event to multiple services for simpler use cases. For orchestrating multiple services, AWS Step Functions is a better design choice.

Lambda can also send details of unsuccessfully processed messages to an on-failure SQS destination.

A variant of this pattern is to replace an SQS destination with an EventBridge destination so that multiple consumers can process an event based on the destination.

To explore how to use an SQS DLQ and Lambda destinations, deploy the code in this repository. Once deployed, you can use this instruction to test the pattern with the happy and unhappy paths.

Using a DLQ

Although destinations is the preferred method to handle function failures, you can explore using DLQs.

The following shows the design pattern when you configure an SNS topic as the event source for a Lambda function, which uses SQS queues for failure process.

Lambda invoked asynchonously

Lambda invoked asynchonously

You configure a DLQ on the SNS topic to capture the messages that SNS cannot deliver to the Lambda function. You also configure a separate DLQ for the Lambda function. Lambda saves an unsuccessful event to this DLQ after Lambda cannot process the event after maximum retry attempts.

To explore how to use a Lambda DLQ, deploy the code in this repository. Once deployed, you can use this instruction to test the pattern with happy and unhappy paths.

Conclusion

This post explains three patterns that you can use to design resilient event-driven serverless applications. Error handling during event processing is an important part of designing serverless cloud applications.

You can deploy the code from the repository to explore how to use poll-based and asynchronous invocations. See how poll-based invocations can send failed messages to a DLQ. See how to use DLQs and Lambda destinations to route and handle unsuccessful events.

Learn more about event-driven architecture on Serverless Land.

Serverless ICYMI Q2 2023

Post Syndicated from Benjamin Smith original https://aws.amazon.com/blogs/compute/serverless-icymi-q2-2023/

Welcome to the 22nd edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share all the most recent product launches, feature enhancements, blog posts, webinars, live streams, and other interesting things that you might have missed!

In case you missed our last ICYMI, check out what happened last quarter here.

Serverless Innovation Day

AWS recently hosted the Serverless Innovation Day, a day of live streams that showcased AWS serverless technologies such as AWS Lambda, Amazon ECS with AWS Fargate, Amazon EventBridge, and AWS Step Functions. The event included insights from AWS leaders such as Holly Mesrobian, Ajay Nair, and Usman Khalid, as well as prominent customers and our serverless Developer Advocate team. It provided insights into serverless modernization success stories, use cases, and best practices. If you missed the event, you can catch up on the recorded sessions here.

Serverless Land, your go-to resource for all things serverless, expanded to include a new Serverless Testing section. This provides valuable insights, patterns, and best practices for testing integrations using AWS SAM and CDK templates.

Serverless Land also launched a new learning page featuring a collection of resources, including blog posts, videos, workshops, and training materials, allowing users to choose a learning path from a variety of topics. “EventBridge Visuals“, small, easily digestible visuals focused on EventBridge have also been added.

AWS Lambda

Lambda introduced support for response payload streaming allowing functions to progressively stream response data to clients. This feature significantly improves performance by reducing the time to first byte (TTFB) latency, benefiting web and mobile applications.

Response streaming is particularly useful for applications with large payloads such as images, videos, documents, or database results. It eliminates the need to buffer the entire payload in memory and enables the transfer of responses larger than Lambda’s 6 MB limit, up to a soft limit of 20 MB.

By configuring the Function URL to use the InvokeWithResponseStream API, streaming responses can be accessed through an HTTP client that supports incremental response data. This enhancement expands Lambda’s capabilities, allowing developers to handle larger payloads more efficiently and enhance the overall performance and user experience of their web and mobile applications.

Lambda now supports Java 17 with Amazon Corretto distribution, providing long-term support and improved performance. Java 17 introduces new language features like records, sealed classes, and multi-line strings. The runtime uses ZGC and Shenandoah garbage collectors to reduce latency. Default JVM configuration changes optimize tiered compilation for reduced startup latency. Developers can use Java 17 in Lambda through AWS Management Console, AWS SAM, and AWS CDK. Popular frameworks like Spring Boot 3 and Micronaut 4 require Java 17 as a minimum. Micronaut provides a web service to generate example projects using Java 17 and AWS CDK infrastructure.

Lambda now supports the Ruby 3.2 runtime, enabling you to write serverless functions using the latest version of the Ruby programming language. This update enhances developer productivity and brings new features and improvements to your Ruby-based Lambda functions.

Lambda introduced support for Kafka and Amazon MQ event sources in four additional Regions. This expanded availability allows developers to build event-driven architectures using these messaging systems in more regions around the world, providing greater flexibility and scalability. It also supports Kafka and Amazon MQ event sources in AWS GovCloud (US) Regions, allowing government organizations to leverage the benefits of event-driven architectures in their cloud environments.

Lambda also added support for starting from a specific timestamp for Kafka event sources, allowing for precise message processing and useful scenarios like Disaster Recovery, without any additional charges.

Serverless Land has launched new learning paths for Lambda to help you level up your serverless skills:

  • The Java Replatforming learning path guides Java developers through the process of migrating existing Java applications to a serverless architecture.
  • The Lift and Shift to Serverless learning path provides guidance on migrating traditional applications to a serverless environment.
  • Lambda Fundamentals is a 23-part video series providing practical examples and tips to help you get started with serverless development using Lambda.

The new AWS Tech Talk, Best practices for building interactive applications with AWS Lambda, helps you learn best practices and architectural patterns for building web and mobile backends as well as API-driven microservices on Lambda. Explore how to take advantage of features in Lambda, Amazon API Gateway, Amazon DynamoDB, and more to easily build highly scalable serverless web applications.

AWS Step Functions

The latest update to AWS Step Functions introduces versions and aliases, allows users to run specific state machine revisions, ensuring reliable deployments, reducing risks, and providing version visibility. Appending version numbers to the state machine ARN enables selection of desired versions, even after updates. Aliases distribute execution requests based on weights, supporting incremental deployment patterns.

This enhances confidence in state machine updates, improves observability, auditing, and can be managed through the Step Functions console or AWS CloudFormation. Versions and aliases are available in all supported AWS Regions at no extra cost.

AWS SAM

AWS SAM CLI has introduced a new feature called remote invoke that allows developers to test Lambda functions in the AWS Cloud. This feature enables developers to invoke Lambda functions from their local development environment and provides options for event payloads, output formats, and logging.

It can be used with or without AWS SAM and can be combined with AWS SAM Accelerate for streamlined development and testing. Overall, the remote invoke feature simplifies serverless application testing in the AWS Cloud.

Amazon EventBridge

EventBridge announced an open-source connector for Kafka Connect, providing seamless integration between EventBridge and Kafka Connect. This connector simplifies the process of streaming events from Kafka topics to EventBridge, enabling you to build event-driven architectures with ease.

EventBridge has improved end-to-end latencies for event buses, delivering events up to 80% faster. This enables broader use in latency-sensitive applications such as industrial and medical applications, with the lower latencies applied by default across all AWS Regions at no extra cost.

Amazon Aurora Serverless v2

Amazon Aurora Serverless v2 is now available in four additional Regions, expanding the reach of this scalable and cost-effective serverless database option. With Aurora Serverless v2, you can benefit from automatic scaling, pause-and-resume capability, and pay-per-use pricing, enabling you to optimize costs and manage your databases more efficiently.

Amazon SNS

Amazon SNS now supports message data protection in five additional Regions, ensuring the security and integrity of your message payloads. With this feature, you can encrypt sensitive message data at rest and in transit, meeting compliance requirements and safeguarding your data.

Serverless Blog Posts

April 2023

Apr 27 – AWS Lambda now supports Java 17

Apr 27 – Optimizing Amazon EC2 Spot Instances with Spot Placement Scores

Apr 26 – Building private serverless APIs with AWS Lambda and Amazon VPC Lattice

Apr 25 – Implementing error handling for AWS Lambda asynchronous invocations

Apr 20 – Understanding techniques to reduce AWS Lambda costs in serverless applications

Apr 18 – Python 3.10 runtime now available in AWS Lambda

Apr 13 – Optimizing AWS Lambda extensions in C# and Rust

Apr 7 – Introducing AWS Lambda response streaming

May 2023

May 24 – Developing a serverless Slack app using AWS Step Functions and AWS Lambda

May 11 – Automating stopping and starting Amazon MWAA environments to reduce cost

May 10 – Monitor Amazon SNS-based applications end-to-end with AWS X-Ray active tracing

May 10 – Debugging SnapStart-enabled Lambda functions made easy with AWS X-Ray

May 10 – Implementing cross-account CI/CD with AWS SAM for container-based Lambda functions

May 3 – Extending a serverless, event-driven architecture to existing container workloads

May 3 – Patterns for building an API to upload files to Amazon S3

June 2023

Jun 7 – Ruby 3.2 runtime now available in AWS Lambda

Jun 5 – Implementing custom domain names for Amazon API Gateway private endpoints using a reverse proxy

June 22 – Deploying state machines incrementally with versions and aliases in AWS Step Functions

June 22 – Testing AWS Lambda functions with AWS SAM remote invoke

Videos

Serverless Office Hours – Tues 10AM PT

Weekly live virtual office hours. In each session we talk about a specific topic or technology related to serverless and open it up to helping you with your real serverless challenges and issues.

YouTube: youtube.com/serverlessland
Twitch: twitch.tv/aws

LinkedIn:  linkedin.com/company/serverlessland

April 2023

Apr 4 – Serverless AI with ChatGPT and DALL-E

Apr 11 – Building Java apps with AWS SAM

Apr 18 – Managing EventBridge with Kubernetes

Apr 25 – Lambda response streaming

May 2023

May 2 – Automating your life with serverless 

May 9 – Building real-life asynchronous architectures

May 16 – Testing Serverless Applications

May 23 – Build faster with Amazon CodeCatalyst 

May 30 – Serverless networking with VPC Lattice

June 2023

June 6 – AWS AppSync: Private APIs and Merged APIs 

June 13 – Integrating EventBridge and Kafka

June 20 – AWS Copilot for serverless containers

June 27 – Serverless high performance modeling

FooBar Serverless YouTube channel

April 2023

Apr 6 – Designing a DynamoDB Table in 4 Steps: From Entities to Access Patterns

Apr 14 – Amazon CodeWhisperer – Improve developer productivity using machine learning (ML)

Apr 20 – Beginner’s Guide to DynamoDB with AWS CDK: Step-by-Step Tutorial for provisioning NoSQL Databases

Apr 27 – Build a WebApp that uses DynamoDB in 6 steps | DynamoDB Expressions

May 2023

May 4 – How to Migrate Data to DynamoDB?

May 11 – Load Testing DynamoDB: Observability and Performance tuning

May 18 – DynamoDB Streams – THE most powerful feature from DynamoDB for event-driven applications

May 25 – Track Application Events with DynamoDB streams and Email Notifications using EventBridge Pipes

June 2023

Jun 1 – How to filter messages based on the payload using Amazon SNS

June 8 – Getting started with Amazon Kinesis

Still looking for more?

The Serverless landing page has more information. The Lambda resources page contains case studies, webinars, whitepapers, customer stories, reference architectures, and even more Getting Started tutorials.

You can also follow the Serverless Developer Advocacy team on Twitter to see the latest news, follow conversations, and interact with the team.

Centralize near-real-time governance through alerts on Amazon Redshift data warehouses for sensitive queries

Post Syndicated from Rajdip Chaudhuri original https://aws.amazon.com/blogs/big-data/centralize-near-real-time-governance-through-alerts-on-amazon-redshift-data-warehouses-for-sensitive-queries/

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud that delivers powerful and secure insights on all your data with the best price-performance. With Amazon Redshift, you can analyze your data to derive holistic insights about your business and your customers. In many organizations, one or multiple Amazon Redshift data warehouses run daily for data and analytics purposes. Therefore, over time, multiple Data Definition Language (DDL) or Data Control Language (DCL) queries, such as CREATE, ALTER, DROP, GRANT, or REVOKE SQL queries, are run on the Amazon Redshift data warehouse, which are sensitive in nature because they could lead to dropping tables or deleting data, causing disruptions or outages. Tracking such user queries as part of the centralized governance of the data warehouse helps stakeholders understand potential risks and take prompt action to mitigate them following the operational excellence pillar of the AWS Data Analytics Lens. Therefore, for a robust governance mechanism, it’s crucial to alert or notify the database and security administrators on the kind of sensitive queries that are run on the data warehouse, so that prompt remediation actions can be taken if needed.

To address this, in this post we show you how you can automate near-real-time notifications over a Slack channel when certain queries are run on the data warehouse. We also create a simple governance dashboard using a combination of Amazon DynamoDB, Amazon Athena, and Amazon QuickSight.

Solution overview

An Amazon Redshift data warehouse logs information about connections and user activities taking place in databases, which helps monitor the database for security and troubleshooting purposes. These logs can be stored in Amazon Simple Storage Service (Amazon S3) buckets or Amazon CloudWatch. Amazon Redshift logs information in the following log files, and this solution is based on using an Amazon Redshift audit log to CloudWatch as a destination:

  • Connection log – Logs authentication attempts, connections, and disconnections
  • User log – Logs information about changes to database user definitions
  • User activity log – Logs each query before it’s run on the database

The following diagram illustrates the solution architecture.

Solution Architecture

The solution workflow consists of the following steps:

  1. Audit logging is enabled in each Amazon Redshift data warehouse to capture the user activity log in CloudWatch.
  2. Subscription filters on CloudWatch capture the required DDL and DCL commands by providing filter criteria.
  3. The subscription filter triggers an AWS Lambda function for pattern matching.
  4. The Lambda function processes the event data and sends the notification over a Slack channel using a webhook.
  5. The Lambda function stores the data in a DynamoDB table over which a simple dashboard is built using Athena and QuickSight.

Prerequisites

Before starting the implementation, make sure the following requirements are met:

  • You have an AWS account.
  • The AWS Region used for this post is us-east-1. However, this solution is relevant in any other Region where the necessary AWS services are available.
  • Permissions to create Slack a workspace.

Create and configure an Amazon Redshift cluster

To set up your cluster, complete the following steps:

  1. Create a provisioned Amazon Redshift data warehouse.

For this post, we use three Amazon Redshift data warehouses: demo-cluster-ou1, demo-cluster-ou2, and demo-cluster-ou3. In this post, all the Amazon Redshift data warehouses are provisioned clusters. However, the same solution applies for Amazon Redshift Serverless.

  1. To enable audit logging with CloudWatch as the log delivery destination, open an Amazon Redshift cluster and go to the Properties tab.
  2. On the Edit menu, choose Edit audit logging.

Redshift edit audit logging

  1. Select Turn on under Configure audit logging.
  2. Select CloudWatch for Log export type.
  3. Select all three options for User log, Connection log, and User activity log.
  4. Choose Save changes.

  1. Create a parameter group for the clusters with enable_user_activity_logging set as true for each of the clusters.
  2. Modify the cluster to attach the new parameter group to the Amazon Redshift cluster.

For this post, we create three custom parameter groups: custom-param-grp-1, custom-param-grp-2, and custom-param-grp-3 for three clusters.

Note, if you enable only the audit logging feature, but not the associated parameter, the database audit logs log information for only the connection log and user log, but not for the user activity log.

  1. On the CloudWatch console, choose Log groups under Logs in the navigation pane.
  2. Search for /aws/redshift/cluster/demo.

This will show all the log groups created for the Amazon Redshift clusters.

Create a DynamoDB audit table

To create your audit table, complete the following steps:

  1. On the DynamoDB console, choose Tables in the navigation pane.
  2. Choose Create table.
  3. For Table name, enter demo_redshift_audit_logs.
  4. For Partition key, enter partKey with the data type as String.

  1. Keep the table settings as default.
  2. Choose Create table.

Create Slack resources

Slack Incoming Webhooks expect a JSON request with a message string corresponding to a "text" key. They also support message customization, such as adding a user name and icon, or overriding the webhook’s default channel. For more information, see Sending messages using Incoming Webhooks on the Slack website.

The following resources are created for this post:

  • A Slack workspace named demo_rc
  • A channel named #blog-demo in the newly created Slack workspace
  • A new Slack app in the Slack workspace named demo_redshift_ntfn (using the From Scratch option)
  • Note down the Incoming Webhook URL, which will be used in this post for sending the notifications

Create an IAM role and policy

In this section, we create an AWS Identity and Access Management (IAM) policy that will be attached to an IAM role. The role is then used to grant a Lambda function access to a DynamoDB table. The policy also includes permissions to allow the Lambda function to write log files to Amazon CloudWatch Logs.

  1. On the IAM console, choose Policies in navigation pane.
  2. Choose Create policy.
  3. In the Create policy section, choose the JSON tab and enter the following IAM policy. Make sure you replace your AWS account ID in the policy (replace XXXXXXXX with your AWS account ID).
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadWriteTable",
            "Effect": "Allow",
            "Action": [
                "dynamodb:BatchGetItem",
                "dynamodb:BatchWriteItem",
                "dynamodb:PutItem",
                "dynamodb:GetItem",
                "dynamodb:Scan",
                "dynamodb:Query",
                "dynamodb:UpdateItem",
                "logs:PutLogEvents"
            ],
            "Resource": [
                "arn:aws:dynamodb:us-east-1:XXXXXXXX:table/demo_redshift_audit_logs",
                "arn:aws:logs:*:XXXXXXXX:log-group:*:log-stream:*"
            ]
        },
        {
            "Sid": "WriteLogStreamsAndGroups",
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogStream",
                "logs:CreateLogGroup"
            ],
            "Resource": "arn:aws:logs:*:XXXXXXXX:log-group:*"
        }
    ]
}
  1. Choose Next: Tags, then choose Next: Review.
  2. Provide the policy name demo_post_policy and choose Create policy.

To apply demo_post_policy to a Lambda function, you first have to attach the policy to an IAM role.

  1. On the IAM console, choose Roles in the navigation pane.
  2. Choose Create role.
  3. Select AWS service and then select Lambda.
  4. Choose Next.

  1. On the Add permissions page, search for demo_post_policy.
  2. Select demo_post_policy from the list of returned search results, then choose Next.

  1. On the Review page, enter demo_post_role for the role and an appropriate description, then choose Create role.

Create a Lambda function

We create a Lambda function with Python 3.9. In the following code, replace the slack_hook parameter with the Slack webhook you copied earlier:

import gzip
import base64
import json
import boto3
import uuid
import re
import urllib3

http = urllib3.PoolManager()
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table("demo_redshift_audit_logs")
slack_hook = "https://hooks.slack.com/services/xxxxxxx"

def exe_wrapper(data):
    cluster_name = (data['logStream'])
    for event in data['logEvents']:
        message = event['message']
        reg = re.match(r"'(?P<ts>\d{4}-\d\d-\d\dT\d\d:\d\d:\d\dZ).*?\bdb=(?P<db>\S*).*?\buser=(?P<user>\S*).*?LOG:\s+(?P<query>.*?);?$", message)
        if reg is not None:
            filter = reg.groupdict()
            ts = filter['ts']
            db = filter['db']
            user = filter['user']
            query = filter['query']
            query_type = ' '.join((query.split(" "))[0 : 2]).upper()
            object = query.split(" ")[2]
            put_dynamodb(ts,cluster_name,db,user,query,query_type,object,message)
            slack_api(cluster_name,db,user,query,query_type,object)
            
def put_dynamodb(timestamp,cluster,database,user,sql,query_type,object,event):
    table.put_item(Item = {
        'partKey': str(uuid.uuid4()),
        'redshiftCluster': cluster,
        'sqlTimestamp' : timestamp,
        'databaseName' : database,
        'userName': user,
        'sqlQuery': sql,
        'queryType' : query_type,
        'objectName': object,
        'rawData': event
        })
        
def slack_api(cluster,database,user,sql,query_type,object):
    payload = {
	'channel': '#blog-demo',
	'username': 'demo_redshift_ntfn',
	'blocks': [{
			'type': 'section',
			'text': {
				'type': 'mrkdwn',
				'text': 'Detected *{}* command\n *Affected Object*: `{}`'.format(query_type, object)
			}
		},
		{
			'type': 'divider'
		},
		{
			'type': 'section',
			'fields': [{
					'type': 'mrkdwn',
					'text': ':desktop_computer: *Cluster Name:*\n{}'.format(cluster)
				},
				{
					'type': 'mrkdwn',
					'text': ':label: *Query Type:*\n{}'.format(query_type)
				},
				{
					'type': 'mrkdwn',
					'text': ':card_index_dividers: *Database Name:*\n{}'.format(database)
				},
				{
					'type': 'mrkdwn',
					'text': ':technologist: *User Name:*\n{}'.format(user)
				}
			]
		},
		{
			'type': 'section',
			'text': {
				'type': 'mrkdwn',
				'text': ':page_facing_up: *SQL Query*\n ```{}```'.format(sql)
			}
		}
	]
	}
    encoded_msg = json.dumps(payload).encode('utf-8')
    resp = http.request('POST',slack_hook, body=encoded_msg)
    print(encoded_msg) 

def lambda_handler(event, context):
    print(f'Logging Event: {event}')
    print(f"Awslog: {event['awslogs']}")
    encoded_zipped_data = event['awslogs']['data']
    print(f'data: {encoded_zipped_data}')
    print(f'type: {type(encoded_zipped_data)}')
    zipped_data = base64.b64decode(encoded_zipped_data)
    data = json.loads(gzip.decompress(zipped_data))
    exe_wrapper(data)

Create your function with the following steps:

  1. On the Lambda console, choose Create function.
  2. Select Author from scratch and for Function name, enter demo_function.
  3. For Runtime, choose Python 3.9.
  4. For Execution role, select Use an existing role and choose demo_post_role as the IAM role.
  5. Choose Create function.

  1. On the Code tab, enter the preceding Lambda function and replace the Slack webhook URL.
  2. Choose Deploy.

Create a CloudWatch subscription filter

We need to create the CloudWatch subscription filter on the useractivitylog log group created by the Amazon Redshift clusters.

  1. On the CloudWatch console, navigate to the log group /aws/redshift/cluster/demo-cluster-ou1/useractivitylog.
  2. On the Subscription filters tab, on the Create menu, choose Create Lambda subscription filter.

  1. Choose demo_function as the Lambda function.
  2. For Log format, choose Other.
  3. Provide the subscription filter pattern as ?create ?alter ?drop ?grant ?revoke.
  4. Provide the filter name as Sensitive Queries demo-cluster-ou1.
  5. Test the filter by selecting the actual log stream. If it has any queries with a match pattern, then you can see some results. For testing, use the following pattern and choose Test pattern.
'2023-04-02T04:18:43Z UTC [ db=dev user=awsuser pid=100 userid=100 xid=100 ]' LOG: alter table my_table alter column string type varchar(16);
'2023-04-02T04:06:08Z UTC [ db=dev user=awsuser pid=100 userid=100 xid=200 ]' LOG: create user rs_user with password '***';

  1. Choose Start streaming.

  1. Repeat the same steps for /aws/redshift/cluster/demo-cluster-ou2/useractivitylog and /aws/redshift/cluster/demo-cluster-ou3/useractivitylog by giving unique subscription filter names.
  2. Complete the preceding steps to create a second subscription filter for each of the Amazon Redshift data warehouses with the filter pattern ?CREATE ?ALTER ?DROP ?GRANT ?REVOKE, ensuring uppercase SQL commands are also captured through this solution.

Test the solution

In this section, we test the solution in the three Amazon Redshift clusters that we created in the previous steps and check for the notifications of the commands on the Slack channel as per the CloudWatch subscription filters as well as data getting ingested in the DynamoDB table. We use the following commands to test the solution; however, this is not restricted to these commands only. You can check with other DDL commands as per the filter criteria in your Amazon Redshift cluster.

create schema sales;
create schema marketing;
create table dev.public.demo_test_table_1  (id int, string varchar(10));
create table dev.public.demo_test_table_2  (empid int, empname varchar(100));
alter table dev.public.category alter column catdesc type varchar(65);
drop table dev.public.demo_test_table_1;
drop table dev.public.demo_test_table_2;

In the Slack channel, details of the notifications look like the following screenshot.

To get the results in DynamoDB, complete the following steps:

  1. On the DynamoDB console, choose Explore items under Tables in the navigation pane.
  2. In the Tables pane, select demo_redshift_audit_logs.
  3. Select Scan and Run to get the results in the table.

Athena federation over the DynamoDB table

The Athena DynamoDB connector enables Athena to communicate with DynamoDB so that you can query your tables with SQL. As part of the prerequisites for this, deploy the connector to your AWS account using the Athena console or the AWS Serverless Application Repository. For more details, refer to Deploying a data source connector or Using the AWS Serverless Application Repository to deploy a data source connector. For this post, we use the Athena console.

  1. On the Athena console, under Administration in the navigation pane, choose Data sources.
  2. Choose Create data source.

  1. Select the data source as Amazon DynamoDB, then choose Next.

  1. For Data source name, enter dynamo_db.
  2. For Lambda function, choose Create Lambda function to open a new window with the Lambda console.

  1. Under Application settings, enter the following information:
    • For Application name, enter AthenaDynamoDBConnector.
    • For SpillBucket, enter the name of an S3 bucket.
    • For AthenaCatalogName, enter dynamo.
    • For DisableSpillEncryption, enter false.
    • For LambdaMemory, enter 3008.
    • For LambdaTimeout, enter 900.
    • For SpillPrefix, enter athena-spill-dynamo.

  1. Select I acknowledge that this app creates custom IAM roles and choose Deploy.
  2. Wait for the function to deploy, then return to the Athena window and choose the refresh icon next to Lambda function.
  3. Select the newly deployed Lambda function and choose Next.

  1. Review the information and choose Create data source.
  2. Navigate back to the query editor, then choose dynamo_db for Data source and default for Database.
  3. Run the following query in the editor to check the sample data:
SELECT partkey,
       redshiftcluster,
       databasename,
       objectname,
       username,
       querytype,
       sqltimestamp,
       sqlquery,
       rawdata
FROM dynamo_db.default.demo_redshift_audit_logs limit 10;

Visualize the data in QuickSight

In this section, we create a simple governance dashboard in QuickSight using Athena in direct query mode to query the record set, which is persistently stored in a DynamoDB table.

  1. Sign up for QuickSight on the QuickSight console.
  2. Select Amazon Athena as a resource.
  3. Choose Lambda and select the Lambda function created for DynamoDB federation.

  1. Create a new dataset in QuickSight with Athena as the source.
  2. Provide the name of the data source name as demo_blog.
  3. Choose dynamo_db for Catalog, default for Database, and demo_redshift_audit_logs for Table.
  4. Choose Edit/Preview data.

  1. Choose String in the sqlTimestamp column and choose Date.

  1. In the dialog box that appears, enter the data format yyyy-MM-dd'T'HH:mm:ssZZ.
  2. Choose Validate and Update.

  1. Choose PUBLISH & VISUALIZE.
  2. Choose Interactive sheet and choose CREATE.

This will take you to the visualization page to create the analysis on QuickSight.

  1. Create a governance dashboard with the appropriate visualization type.

Refer to the Amazon QuickSight learning videos in QuickSight community for basic to advanced level of authoring. The following screenshot is a sample visualization created on this data.

Clean up

Clean up your resources with the following steps:

  1. Delete all the Amazon Redshift clusters.
  2. Delete the Lambda function.
  3. Delete the CloudWatch log groups for Amazon Redshift and Lambda.
  4. Delete the Athena data source for DynamoDB.
  5. Delete the DynamoDB table.

Conclusion

Amazon Redshift is a powerful, fully managed data warehouse that can offer significantly increased performance and lower cost in the cloud. In this post, we discussed a pattern to implement a governance mechanism to identify and notify sensitive DDL/DCL queries on an Amazon Redshift data warehouse, and created a quick dashboard to enable the DBA and security team to take timely and prompt action as required. Additionally, you can extend this solution to include DDL commands used for Amazon Redshift data sharing across clusters.

Operational excellence is a critical part of the overall data governance on creating a modern data architecture, as it’s a great enabler to drive our customers’ business. Ideally, any data governance implementation is a combination of people, process, and technology that organizations use to ensure the quality and security of their data throughout its lifecycle. Use these instructions to set up your automated notification mechanism as sensitive queries are detected as well as create a quick dashboard on QuickSight to track the activities over time.


About the Authors

Rajdip Chaudhuri is a Senior Solutions Architect with Amazon Web Services specializing in data and analytics. He enjoys working with AWS customers and partners on data and analytics requirements. In his spare time, he enjoys soccer and movies.

Dhiraj Thakur is a Solutions Architect with Amazon Web Services. He works with AWS customers and partners to provide guidance on enterprise cloud adoption, migration, and strategy. He is passionate about technology and enjoys building and experimenting in the analytics and AI/ML space.

Retrieving parameters and secrets with Powertools for AWS Lambda (TypeScript)

Post Syndicated from Pascal Vogel original https://aws.amazon.com/blogs/compute/retrieving-parameters-and-secrets-with-powertools-for-aws-lambda-typescript/

This post is written by Andrea Amorosi, Senior Solutions Architect and Pascal Vogel, Solutions Architect.

When building serverless applications using AWS Lambda, you often need to retrieve parameters, such as database connection details, API secrets, or global configuration values at runtime. You can make these parameters available to your Lambda functions via secure, scalable, and highly available parameter stores, such as AWS Systems Manager Parameter Store or AWS Secrets Manager.

The Parameters utility for Powertools for AWS Lambda (TypeScript) simplifies the integration of these parameter stores inside your Lambda functions. The utility provides high-level functions for retrieving secrets and parameters, integrates caching and transformations, and reduces the amount of boilerplate code you must write.

The Parameters utility supports the following parameter stores:

The Parameters utility is part of the Powertools for AWS Lambda (TypeScript), which you can use in both JavaScript and TypeScript code bases. Implementing guidance from the Serverless Applications Lens of the AWS Well-Architected Framework, Powertools provides utilities to ease the adoption of best practices such as distributed tracing, structured logging, and asynchronous business and application metrics.

For more details, see the Powertools for AWS Lambda (TypeScript) documentation on GitHub and the introduction blog post.

This blog post shows how to use the new Parameters utility to retrieve parameters and secrets in your JavaScript and TypeScript Lambda functions securely.

Getting started with the Parameters utility

Initial setup

The Powertools toolkit is modular, meaning that you can install the Parameters utility independently from the Logger, Tracing, or Metrics packages. Install the Parameters utility library in your project via npm:

npm install @aws-lambda-powertools/parameters

In addition, you must add the AWS SDK client for the parameter store you are planning to use. The Parameters utility supports AWS SDK v3 for JavaScript only, which allows the utility to be modular. You install only the needed SDK packages to keep your bundle size small.

Next, assign appropriate AWS Identity and Access Management (IAM) permissions to the Lambda function execution role of your Lambda function that allow retrieving parameters from the parameter store.

The following sections illustrate how to perform the previously mentioned steps for some typical parameter retrieval scenarios.

Retrieving a single parameter from SSM Parameter Store

To retrieve parameters from SSM Parameter Store, install the AWS SDK client for SSM in addition to the Parameters utility:

npm install @aws-sdk/client-ssm

To retrieve an individual parameter, the Parameters utility provides the getParameter function:

import { getParameter } from '@aws-lambda-powertools/parameters/ssm';

export const handler = async (): Promise<void> => {
  // Retrieve a single parameter
  const parameter = await getParameter('/my/parameter');
  console.log(parameter);
};

Finally, you need to assign an IAM policy with the ssm:GetParameter permission to your Lambda function execution role. Apply the principle of least privilege by scoping the permission to the specific parameter resource as shown in the following policy example:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssm:GetParameter"
      ],
      "Resource": [
        "arn:aws:ssm:AWS_REGION:AWS_ACCOUNT_ID:my/parameter"
      ]
    }
  ]
}

Adjusting cache TTL

By default, the retrieved parameters are cached in-memory for 5 seconds. This cached value is used for further invocations of the Lambda function until it expires. If your application requires a different behavior, the Parameters utility allows you to adjust the time-to-live (TTL) via the maxAge argument.

Building on the previous example, if you want to cache your retrieved parameter for 30 instead of 5 seconds, you can adapt your function code as follows:

import { getParameter } from '@aws-lambda-powertools/parameters/ssm';

export const handler = async (): Promise<void> => {
  // Retrieve a single parameter with a 30 seconds cache TTL
  const parameter = await getParameter('/my/parameter', { maxAge: 30 });
  console.log(parameter);
};

In other cases, you may want to always retrieve the latest value from the parameter store and ignore any cached value. To achieve this, set the forceFetch parameter to true:

import { getParameter } from '@aws-lambda-powertools/parameters/ssm';

export const handler = async (): Promise<void> => {
  // Always retrieve the latest value of a single parameter
  const parameter = await getParameter('/my/parameter', { forceFetch: true });
  console.log(parameter);
};

For details, see Always fetching the latest in the Powertools for AWS Lambda (TypeScript) documentation.

Decoding parameters stored in JSON or base64 format

If some of your parameters are stored in base64 or JSON, you can deserialize them via the Parameters utility’s transform argument.

Considering a parameter stored in SSM as JSON, it can be retrieved and deserialized as follows:

import { Transform } from '@aws-lambda-powertools/parameters';
import { getParameter } from '@aws-lambda-powertools/parameters/ssm';

export const handler = async (): Promise => {
  // Retrieve and deserialize a single JSON parameter
  const valueFromJson = await getParameter('/my/json/parameter', { transform: Transform.JSON });
  console.log(valueFromJson);
};

The Parameters utility supports the transform argument for all parameter store providers and high-level functions. For details, see Deserializing values with transform parameters.

Working with encrypted parameters in SSM Parameter Store

SSM Parameter Store supports encrypted secure string parameters via the AWS Key Management Service (AWS KMS). The Parameters utility allows you to retrieve these encrypted parameters by adding the decrypt argument to your request.

For example, you could retrieve an encrypted parameter as follows:

import { getParameter } from '@aws-lambda-powertools/parameters/ssm';

export const handler = async (): Promise<void> => {
  // Decrypt the parameter
  const decryptedParameter = await getParameter('/my/encrypted/parameter', { decrypt: true });
  console.log(decryptedParameter);
};

In this case, the Lambda function execution role needs to have the kms:Decrypt IAM permission in addition to ssm:GetParameter.

Retrieving multiple parameters from SSM Parameter Store

Besides retrieving a single parameter using getParameter, you can also use getParameters to recursively retrieve multiple parameters under a SSM Parameter Store path, or getParametersByName to retrieve multiple distinct parameters by their full name.

You can also apply custom caching, transform, or decrypt configurations per parameter when using getParametersByName. The following example retrieves three distinct parameters from SSM Parameter Store with different caching and transform configurations:

import { getParametersByName } from '@aws-lambda-powertools/parameters/ssm';
import type {
  SSMGetParametersByNameOptionsInterface
} from '@aws-lambda-powertools/parameters/ssm/types';

const props: Record<string, SSMGetParametersByNameOptionsInterface> = {
  '/develop/service/commons/telemetry/config': { maxAge: 300, transform: 'json' },
  '/no_cache_param': { maxAge: 0 },
  '/develop/service/payment/api/capture/url': {}, // When empty or undefined, it uses default values
};

export const handler = async (): Promise<void> => {
  // This returns an object with the parameter name as key
  const parameters = await getParametersByName(props);
  for (const [ key, value ] of Object.entries(parameters)) {
    console.log(`${key}: ${value}`);
  }
};

Retrieving multiple parameters requires the GetParameter and GetParameters permissions to be present in the Lambda function execution role.

Retrieving secrets from Secrets Manager

To securely store sensitive parameters such as passwords or API keys for external services, Secrets Manager is a suitable option. To retrieve secrets from Secrets Manager using the Parameters utility, install the AWS SDK client for Secrets Manager in addition to the Parameters utility:

npm install @aws-sdk/client-secrets-manager

Now you can access a secret using its key as follows:

import { getSecret } from '@aws-lambda-powertools/parameters/secrets';

export const handler = async (): Promise<void> => {
  // Retrieve a single secret
  const secret = await getSecret('my-secret');
  console.log(secret);
};

Getting a secret from Secrets Manager requires you to add the secretsmanager:GetSecretValue IAM permission to your Lambda function execution role.

Retrieving an application configuration from AppConfig

If you plan to leverage feature flags or dynamic application configurations in your applications built on Lambda, AppConfig is a suitable option. The Parameters utility makes it easy to fetch configurations from AppConfig while benefitting from utility features such as caching and transformations.

For example, considering an AppConfig application called my-app with an environment called my-env, you can retrieve its configuration profile my-configuration as follows:

import { getAppConfig } from '@aws-lambda-powertools/parameters/appconfig';

export const handler = async (): Promise<void> => {
  // Retrieve a configuration, latest version
  const config = await getAppConfig('my-configuration', {
    environment: 'my-env',
    application: 'my-app'
  });
  console.log(config);
};

Retrieving a configuration requires both the appconfig:GetLatestConfiguration and appconfig:StartConfigurationSession IAM permissions to be attached to the Lambda function execution role.

Retrieving a parameter from a DynamoDB table

DynamoDB’s low latency and high flexibility make it a great option for storing parameters. To use DynamoDB as a parameter store via the Parameters utility, install the DynamoDB AWS SDK client and utility package in addition to the Parameters utility.

npm install @aws-sdk/client-dynamodb @aws-sdk/util-dynamodb

By default, the Parameters utility expects the DynamoDB table containing the parameters to have a partition key of id and an attribute called value. For example, assuming an item with an id of my-parameter and a value of my-value stored in an DynamoDB table called my-table, you can retrieve it as follows:

import { DynamoDBProvider } from '@aws-lambda-powertools/parameters/dynamodb';

const dynamoDBProvider = new DynamoDBProvider({ tableName: 'my-table' });

export const handler = async (): Promise<void> => {
  // Retrieve a value from DynamoDB
  const value = await dynamoDBProvider.get('my-parameter');
  console.log(value);
};

In case of retrieving a single parameter from DynamoDB, the Lambda function execution role needs to have the dynamodb:GetItem IAM permission.

The Parameters utility DynamoDB provider can also retrieve multiple parameters from a table with a single request via a DynamoDB query. See DynamoDB provider in the Powertools for AWS Lambda (TypeScript) documentation for details.

Conclusion

This blog post introduces the Powertools for AWS Lambda (TypeScript) Parameters utility and demonstrates how it is used with different parameter stores. The Parameters utility allows you to retrieve secrets and parameters in your Lambda function from SSM Parameter Store, Secrets Manager, AppConfig, DynamoDB, and custom parameter stores. By using the utility, you get access to functionality such as caching and transformation, and reduce the amount of boilerplate code you need to write for your Lambda functions.

To learn more about the Parameters utility and its full set of functionality, take a look at the Powertools for AWS Lambda (TypeScript) documentation.

Share your feedback for Powertools for AWS Lambda (TypeScript) by opening a GitHub issue.

For more serverless learning resources, visit Serverless Land.

Testing AWS Lambda functions with AWS SAM remote invoke

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/testing-aws-lambda-functions-with-aws-sam-remote-invoke/

Developers are taking advantage of event driven architecture (EDA) to build large distributed applications. To build these applications, developers are using managed service like AWS Lambda, AWS Step Functions, and Amazon EventBridge to handle compute, orchestration, and choreography. Since these services run in the cloud, developers are also looking for ways to test in the cloud. With this in mind, AWS SAM is adding a new feature to the AWS SAM CLI called remote invoke.

AWS SAM remote invoke enables developers to invoke a Lambda function in the AWS Cloud from their development environment. The feature has several options for identifying the Lambda function to invoke, the payload event, and the output type.

Using remote invoke

To test the remote invoke feature, there is a small AWS SAM application that comprises two AWS Lambda functions. The TranslateFunction takes a text string and translates it to the target language using the AI/ML service Amazon Translate. The StreamFunction generates data in a streaming format. To run these demonstrations, be sure to install the latest AWS SAM CLI.

To deploy the application, follow these steps:

  1. Clone the repository:
    git clone https://github.com/aws-samples/aws-sam-remote-invoke-example
  2. Change to the root directory of the repository:
    cd aws-sam-remote-invoke-example
  3. Build the AWS Lambda artifacts (use the –use-container option to ensure Python 3.10 and Node 18 are present. If these are both set up on your machine, you can ignore this flag):
    sam build --use-container
  4. Deploy the application to your AWS account:
    sam deploy --guided
  5. Name the application “remote-test” and choose all defaults.

AWS SAM can now remotely invoke the Lambda functions deployed with this application. Use the following command to test the TranslateFunction:

sam remote invoke --stack-name remote-test --event '{"message":"I am testing the power of remote invocation", "target-language":"es"}' TranslateFunction

This is a quick way to test a small event. However, developers often deal with large complex payloads. The AWS SAM remote invoke function also allows an event to be passed as a file. Use the following command to test:

sam remote invoke --stack-name remote-test --event-file './events/translate-event.json' TranslateFunction

With either of these methods, AWS SAM returns the response from the Lambda function as if it were called from a service like Amazon API Gateway. However, AWS SAM also offers the ability to get the raw response as returned from the Python software development kit (SDK), boto3. This format provides additional information such as the version that you invoked, if any retries were attempted, and more. To retrieve this output, run the invocation with the additional –output parameter with the value of json.

sam remote invoke --stack-name remote-test --event '{"message": "I am testing the power of remote invocation", "target-language": "es"}' --output json TranslateFunction
Full output from SDK

Full output from SDK

It is also possible to invoke Lambda functions that are not created in AWS SAM. Using the name of a Lambda function, AWS SAM can remotely invoke any Lambda function that you have permission to invoke. When you deployed the sample application, AWS SAM prints the name of the Lambda function in the console. Use the following command to print the output again:

sam list stack-outputs --stack-name remote-test

Using the output for the TranslateFunctionName, run:

sam remote invoke --event '{"message": "Testing direct access of the function", "target-language": "fr"}' <TranslateFunctionName>

Lambda recently added support from streaming responses from Lambda functions. Streaming functions do not wait until the entire response is available before they respond to the client. To show this, the StreamFunction generates multiple chunks of text and sends them over a period of time.

To invoke the function, run:

sam remote invoke --stack-name remote-test StreamFunction

Extending remote invoke

The AWS SDKs offer different options when invoking Lambda functions via the Lambda service. Behind the scenes, AWS SAM is using boto3 to power the remote invoke functionality. To make full use of the SDK options for Lambda function invocation, the AWS SAM offers a —parameter flag that can be used multiple times.

For example, you may want to run an invocation as a dry run only. This type of invocation tests Lambda’s ability to invoke the function based on factors like variable values and proper permissions. The command looks like the following:

sam remote invoke --stack-name remote-test --event '{"message": "I am testing the power of remote invocation", "target-language": "es"}' --parameter InvocationType=DryRun --output json TranslateFunction

In a second example, I want to invoke a specific version of the Lambda function:

sam remote invoke --stack-name remote-test --event '{"message": "I am testing the power of remote invocation", "target-language": "es"}' --parameter Qualifier='$LATEST' TranslateFunction

If you need both options:

sam remote invoke --stack-name remote-test --event '{"message": "I am testing the power of remote invocation", "target-language": "es"}' --parameter InvocationType=DryRun --parameter Qualifier='$LATEST' --output json TranslateFunction

Logging

When developing distributed applications, logging is a critical tool to trace the state of a request across decoupled microservices. AWS SAM offers the sam logs functionality to help view aggregated logs and traces from Amazon CloudWatch and AWS X-Ray, respectively. However, when testing individual functions, developers want contextual logs pinpointed to a specific invocation. The new remote invoke function provides these logs by default. Returning to the TranslateFunction, run the following command again:

sam remote invoke --stack-name remote-test --event '{"message": "I am testing the power of remote invocation", "target-language": "es"}' TranslateFunction
Logging response from remote invoke

Logging response from remote invoke

The remote invocation returns the response from the Lambda function, any logging from within the Lambda function, followed by the final report from the Lambda service about the invocation itself.

Combining remote invoke with AWS SAM Accelerate

Developers are constantly striving to remove complexity and friction and improve speed and agility in the development pipeline. To help serverless developers towards this goal, the AWS SAM team released a feature called AWS SAM Accelerate. AWS SAM Accelerate is a series of features that move debugging and testing from the local machine to the cloud.

To show how AWS SAM Accelerate and remote invoke can work together, follow these steps:

  1. In a separate terminal, start the AWS SAM sync process with the watch option:
    sam sync --stack-name remote-test --use-container --watch
  2. In a second window or tab, run the remote invoke function:
    sam remote invoke --stack-name remote-test --event-file './events/translate-event.json' TranslateFunction

The combination of these two options provides a robust auto-deployment and testing environment. During iterations of code in the Lambda function, each time you save the file, AWS SAM syncs the code and any dependencies to the cloud. As needed, the remote invoke is then run to verify the code works as expected, with logging provided for each execution.

Conclusion

Serverless developers are looking for the most efficient way to test their applications in the AWS Cloud. They want to invoke an AWS Lambda function quickly without having to mock security, external services, or other environment variables. This blog shows how to use the new AWS SAM remote invoke feature to do just that.

This post shows how to invoke the Lambda function, change the payload type and location, and change the output format. It explains using this feature in conjunction with the AWS SAM Accelerate features to streamline the serverless development and testing process.

For more serverless learning resources, visit Serverless Land.

Let’s Architect! Open-source technologies on AWS

Post Syndicated from Vittorio Denti original https://aws.amazon.com/blogs/architecture/lets-architect-open-source-technologies-on-aws/

We brought you a Let’s Architect! blog post about open-source on AWS that covered some technologies with development led by AWS/Amazon, as well as well-known solutions available on managed AWS services. Today, we’re following the same approach to share more insights about the process itself for developing open-source. That’s why the first topic we discuss in this post is a re:Invent talk from Heitor Lessa, Principal Solutions Architect at AWS, explaining some interesting approaches for developing and scaling successful open-source projects.

This edition of Let’s Architect! also touches on observability with Open Telemetry, Apache Kafka on AWS, and Infrastructure as Code with an hands-on workshop on AWS Cloud Development Kit (AWS CDK).

Powertools for AWS Lambda: Lessons from the road to 10 million downloads

Powertools for AWS Lambda is an open-source library to help engineering teams implement serverless best practices. In two years, Powertools went from an initial prototype to a fast-growing project in the open-source world. Rapid growth along with support from a wide community led to challenges from balancing new features with operational excellence to triaging bug reports and RFCs and scaling and redesigning documentation.

In this session, you can learn about Powertools for AWS Lambda to understand what it is and the problems it solves. Moreover, there are many valuable lessons to learn how to create and scale a successful open-source project. From managing the trade-off between releasing new features and achieving operational stability to measuring the impact of the project, there are many challenges in open-source projects that require careful thought.

Take me to this video!

Heitor Lessa describing one the key lessons: development and releasing new features should be as important as the other activities (governance, operational excellence, and more)

Heitor Lessa describing one of the key lessons: development and releasing new features should be as important as the other activities (governance, operational excellence, and more).

Observability the open-source way

The recent blog post Let’s Architect! Monitoring production systems at scale talks about the importance of monitoring. Setting up observability is critical to maintain application and infrastructure health, but instrumenting applications to collect monitoring signals such as metrics and logs can be challenging when using vendor-specific SDKs.

This video introduces you to OpenTelemetry, an open-source observability framework. OpenTelemetry provides a flexible, single vendor-agnostic SDK based on open-source specifications that developers can use to instrument and collect signals from applications. This resource explains how it works in practice and how to monitor microservice-based applications with the OpenTelemetry SDK.

Take me to this video!

With AWS Distro for OpenTelemetry, you can collect data from your AWS resources.

With AWS Distro for OpenTelemetry, you can collect data from your AWS resources.

Best practices for right-sizing your Apache Kafka clusters to optimize performance and cost

Apache Kafka is an open-source streaming data store that decouples applications producing streaming data (producers) into its data store from applications consuming streaming data (consumers) from its data store. Amazon Managed Streaming for Apache Kafka (Amazon MSK) allows you to use the open-source version of Apache Kafka with the service managing infrastructure and operations for you.

This blog post explains how the underlying infrastructure configuration can affect Apache Kafka performance. You can learn strategies on how to size the clusters to meet the desired throughput, availability, and latency requirements. This resource helps you discover strategies to find the optimal sizing for your resources, and learn the mental models adopted to conduct the investigation and derive the conclusions.

Take me to this blog!

Comparisons of put latencies for three clusters with different broker sizes

Comparisons of put latencies for three clusters with different broker sizes

AWS Cloud Development Kit workshop

AWS Cloud Development Kit (AWS CDK) is an open-source software development framework that allows you to provision cloud resources programmatically (Infrastructure as Code or IaC) by using familiar programming languages such as Python, Typescript, Javascript, Java, Go, and C#/.Net.

CDK allows you to create reusable template and assets, test your infrastructure, make deployments repeatable, and make your cloud environment stable by removing manual (and error-prone) operations. This workshop introduces you to CDK, where you can learn how to provision an initial simple application as well as become familiar with more advanced concepts like CDK constructs.

Take me to this workshop!

This construct can be attached to any Lambda function that is used as an API Gateway backend. It counts how many requests were issued to each URL.

This construct can be attached to any Lambda function that is used as an API Gateway backend. It counts how many requests were issued to each URL.

See you next time!

Thanks for joining our conversation! To find all the blogs from this series, you can check out the Let’s Architect! list of content on the AWS Architecture Blog.

AWS Week in Review – Amazon EC2 Instance Connect Endpoint, Detective, Amazon S3 Dual Layer Encryption, Amazon Verified Permission – June 19, 2023

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/aws-week-in-review-amazon-ec2-instance-connect-endpoint-detective-amazon-s3-dual-layer-encryption-amazon-verified-permission-june-19-2023/

This week, I’ll meet you at AWS partner’s Jamf Nation Live in Amsterdam where we’re showing how to use Amazon EC2 Mac to deploy your remote developer workstations or configure your iOS CI/CD pipelines in the cloud.Mac in an instant

Last Week’s Launches
While I was traveling last week, I kept an eye on the AWS News. Here are some launches that got my attention.

Amazon EC2 Instance Connect Endpoint. Endpoint for EC2 Instance Connect allows you to securely access Amazon EC2 instances using their private IP addresses, making the use of bastion hosts obsolete. Endpoint for EC2 Instance Connect is by far my favorite launch from last week. With EC2 Instance Connect, you use AWS Identity and Access Management (IAM) policies and principals to control SSH access to your instances. This removes the need to share and manage SSH keys. We also updated the AWS Command Line Interface (AWS CLI) to allow you to easily connect or open a secured tunnel to an instance using only its instance ID. I read and contributed to a couple of threads on social media where you pointed out that AWS Systems Manager Session Manager already offered similar capabilities. You’re right. But the extra advantage of EC2 Instance Connect Endpoint is that it allows you to use your existing SSH-based tools and libraries, such as the scp command.

Amazon Inspector now supports code scanning of AWS Lambda functions. This expands the existing capability to scan Lambda functions and associated layers for software vulnerabilities in application package dependencies. Amazon Detective also extends finding groups to Amazon Inspector. Detective automatically collects findings from Amazon Inspector, GuardDuty, and other AWS security services, such as AWS Security Hub, to help increase situational awareness of related security events.

Amazon Verified Permissions is generally available. If you’re designing or developing business applications that need to enforce user-based permissions, you have a new option to centrally manage application permissions. Verified Permissions is a fine-grained permissions management and authorization service for your applications that can be used at any scale. Verified Permissions centralizes permissions in a policy store and helps developers use those permissions to authorize user actions within their applications. Similarly to the way an identity provider simplifies authentication, a policy store lets you manage authorization in a consistent and scalable way. Read Danilo’s post to discover the details.

Amazon S3 Dual-Layer Server-Side Encryption with keys stored in AWS Key Management Service (DSSE-KMS). Some heavily regulated industries require double encryption to store some type of data at rest. Amazon Simple Storage Service (Amazon S3) offers DSSE-KMS, a new free encryption option that provides two layers of data encryption, using different keys and different implementation of the 256-bit Advanced Encryption Standard with Galois Counter Mode (AES-GCM) algorithm. My colleague Irshad’s post has all the details.

AWS CloudTrail Lake Dashboards provide out-of-the-box visibility and top insights from your audit and security data directly within the CloudTrail Lake console. CloudTrail Lake features a number of AWS curated dashboards so you can get started right away – with no required detailed dashboard setup or SQL experience.

AWS IAM Identity Center now supports automated user provisioning from Google Workspace. You can now connect your Google Workspace to AWS IAM Identity Center (successor to AWS Single Sign-On) once and manage access to AWS accounts and applications centrally in IAM Identity Center.

AWS CloudShell is now available in 12 additional regions. AWS CloudShell is a browser-based shell that makes it easier to securely manage, explore, and interact with your AWS resources. The list of the 12 new Regions is detailed in the launch announcement.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
Here are some other updates and news that you might have missed:

  • AWS Extension for Stable Diffusion WebUI. WebUI is a popular open-source web interface that allows you to easily interact with Stable Diffusion generative AI. We built this extension to help you to migrate existing workloads (such as inference, train, and ckpt merge) from your local or standalone servers to the AWS Cloud.
  • GoDaddy developed a multi-Region, event-driven system. Their system handles 400 millions events per day. They plan to scale it to process 2 billion messages per day in a near future. My colleague Marcia explains the detail of their architecture in her post.
  • The Official AWS Podcast – Listen each week for updates on the latest AWS news and deep dives into exciting use cases. There are also official AWS podcasts in several languages. Check out the podcasts in FrenchGermanItalian, and Spanish.
  • AWS Open Source News and Updates – This is a newsletter curated by my colleague Ricardo to bring you the latest open source projects, posts, events, and more.

Upcoming AWS Events
Check your calendars and sign up for these AWS events:

  • AWS Silicon Innovation Day (June 21) – A one-day virtual event that will allow you to better understand AWS Silicon and how you can use the Amazon EC2 chip offerings to your benefit. My colleague Irshad shared the details in this post. Register today.
  • AWS Global Summits – There are many AWS Summits going on right now around the world: Milano (June 22), Hong Kong (July 20), New York (July 26), Taiwan (Aug 2 & 3), and Sao Paulo (Aug 3).
  • AWS Community Day – Join a community-led conference run by AWS user group leaders in your region: Manila (June 29–30), Chile (July 1), and Munich (September 14).
  • AWS User Group Perú Conf 2023 (September 2023). Some of the AWS News blog writer team will be present: Marcia, Jeff, myself, and our colleague Startup Developer Advocate Mark. Save the date and register today.
  • CDK Day CDK Day is happening again this year on September 29. The call for papers for this event is open, and this year we’re also accepting talks in Spanish. Submit your talk here.

That’s all for this week. Check back next Monday for another Week in Review!

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!
— seb