Tag Archives: serverless

Debugging SnapStart-enabled Lambda functions made easy with AWS X-Ray

Post Syndicated from Marcia Villalba original https://aws.amazon.com/blogs/compute/debugging-snapstart-enabled-lambda-functions-made-easy-with-aws-x-ray/

This post is written by Rahul Popat (Senior Solutions Architect) and Aneel Murari (Senior Solutions Architect) 

Today, AWS X-Ray is announcing support for SnapStart-enabled AWS Lambda functions. Lambda SnapStart is a performance optimization that significantly improves the cold startup times for your functions. Announced at AWS re:Invent 2022, this feature delivers up to 10 times faster function startup times for latency-sensitive Java applications at no extra cost, and with minimal or no code changes.

X-Ray is a distributed tracing system that provides an end-to-end view of how an application is performing. X-Ray collects data about requests that your application serves and provides tools you can use to gain insight into opportunities for optimizations. Now you can use X-Ray to gain insights into the performance improvements of your SnapStart-enabled Lambda function.

With today’s feature launch, when you turn on X-Ray tracing for a SnapStart-enabled Lambda function, you see separate subsegments corresponding to the Restore and Invoke phases of your Lambda function’s execution.

How does Lambda SnapStart work?

With SnapStart, the function’s initialization is done ahead of time when you publish a function version. Lambda takes an encrypted snapshot of the initialized execution environment and persists the snapshot in a tiered cache for low latency access.

When the function is first invoked or scaled, Lambda restores the cached execution environment from the persisted snapshot instead of initializing anew. This results in reduced startup times.

X-Ray tracing before this feature launch

Consider an example of a Hello World application written in Java, where a Lambda function is configured with SnapStart and fronted by Amazon API Gateway:

Before today’s launch, X-Ray was not supported for SnapStart-enabled Lambda functions. So if you had enabled X-Ray tracing for API Gateway, the X-Ray trace for the sample application would look like:

The trace only shows the overall duration of the Lambda service call. You do not have insight into your function’s execution or the breakdown of the different phases of the Lambda function lifecycle.

Next, enable X-Ray for your Lambda function and see how you can view a breakdown of your function’s total execution duration.

Prerequisites for enabling X-Ray for SnapStart-enabled Lambda functions

SnapStart is only supported for Lambda functions with Java 11 and newly launched Java 17 managed runtimes. You can only enable SnapStart for the published versions of your Lambda function. Once you’ve enabled SnapStart, Lambda publishes all subsequent versions with snapshots. You may also create a Lambda function alias, which points to the published version of your Lambda function.

Make sure that the Lambda function’s execution role has appropriate permissions to write to X-Ray.

Enabling AWS X-Ray for your Lambda function with SnapStart

You can enable X-Ray tracing for your Lambda function using the AWS Management Console, AWS Command Line Interface (AWS CLI), AWS Serverless Application Model (AWS SAM), an AWS CloudFormation template, or the AWS Cloud Development Kit (AWS CDK).

This blog shows how you can achieve this via the AWS Management Console and AWS SAM. For more information on enabling SnapStart and X-Ray using other methods, refer to the AWS Lambda Developer Guide.

Enabling SnapStart and X-Ray via AWS Management Console

To enable SnapStart and X-Ray for a Lambda function via the AWS Management Console:

  1. Navigate to your Lambda Function.
  2. On the Configuration tab, choose Edit and change the SnapStart attribute value from None to PublishedVersions.
  3. Choose Save.

To enable X-Ray via the AWS Management Console:

  1. Navigate to your Lambda Function.
  2. On the Configuration tab, scroll down to the Monitoring and operations tools card and choose Edit.
  3. Under AWS X-Ray, enable Active tracing.
  4. Choose Save.

To publish a new version of the Lambda function via the AWS Management Console:

  1. Navigate to your Lambda Function.
  2. On the Versions tab, choose Publish new version.
  3. Verify that PublishedVersions is shown below SnapStart.
  4. Choose Publish.

To create an alias for a published version of your Lambda function via the AWS Management Console:

  1. Navigate to your Lambda Function.
  2. On the Aliases tab, choose Create alias.
  3. Provide a Name for an alias and select a Version of your Lambda function to point the alias to.
  4. Choose Save.

Enabling SnapStart and X-Ray via AWS SAM

To enable SnapStart and X-Ray for a Lambda function via AWS SAM:

    1. Enable Lambda function versions and create an alias by adding an AutoPublishAlias property in the template.yaml file. AWS SAM automatically publishes a new version for each new deployment and assigns the alias to the newly published version.
      Resources:
        MyFunction:
          Type: AWS::Serverless::Function
          Properties:
            […]
            AutoPublishAlias: live
    2. Enable SnapStart on the Lambda function by adding the SnapStart property in the template.yaml file.
      Resources:
        MyFunction:
          Type: AWS::Serverless::Function
          Properties:
            […]
            SnapStart:
              ApplyOn: PublishedVersions
    3. Enable X-Ray for the Lambda function by adding the Tracing property in the template.yaml file.
      Resources:
        MyFunction:
          Type: AWS::Serverless::Function
          Properties:
            […]
            Tracing: Active

You can find the complete AWS SAM template for the preceding example in this GitHub repository.
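
If you manage your infrastructure with the AWS CDK (one of the other methods mentioned earlier), the following TypeScript sketch shows a roughly equivalent configuration. This is a minimal, hedged example rather than the template from the repository: the handler name and asset path are placeholders, and SnapStart is set through a property override on the underlying CloudFormation resource.

import { Stack, StackProps } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as lambda from 'aws-cdk-lib/aws-lambda';

export class SnapStartTracingStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // Java function with X-Ray active tracing enabled
    const fn = new lambda.Function(this, 'HelloWorldFunction', {
      runtime: lambda.Runtime.JAVA_17,
      handler: 'helloworld.App::handleRequest',            // placeholder handler
      code: lambda.Code.fromAsset('target/function.zip'),  // placeholder artifact path
      memorySize: 1024,
      tracing: lambda.Tracing.ACTIVE,                      // equivalent of Tracing: Active
    });

    // SnapStart applies to published versions; set it on the L1 (CloudFormation) resource
    const cfnFunction = fn.node.defaultChild as lambda.CfnFunction;
    cfnFunction.addPropertyOverride('SnapStart.ApplyOn', 'PublishedVersions');

    // Publish a version and point a 'live' alias at it, similar to AutoPublishAlias: live
    new lambda.Alias(this, 'LiveAlias', {
      aliasName: 'live',
      version: fn.currentVersion,
    });
  }
}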

Using X-Ray to gain insights into SnapStart-enabled Lambda function’s performance

To demonstrate X-Ray integration for your Lambda function with SnapStart, you can build, deploy, and test the sample Hello World application using AWS SAM CLI. To do this, follow the instructions in the README file of the GitHub project.

The build and deployment output with AWS SAM looks like this:

Once your application is deployed to your AWS account, note that SnapStart and X-Ray tracing are enabled for your Lambda function. You should also see an alias `live` created against the published version of your Lambda function.

You should also have an API deployed via API Gateway, which is pointing to the `live` alias of your Lambda function as the backend integration.

Now, invoke your API via the `curl` command or any other HTTP client. Make sure to replace the URL with your own API’s URL.

$ curl --location --request GET https://{rest-api-id}.execute-api.{region}.amazonaws.com/{stage}/hello

Navigate to Amazon CloudWatch. Under the X-Ray service map, you see a visual representation of the trace data generated by your application.

Under Traces, you can see the individual traces, Response code, Response time, Duration, and other useful metrics.

Select a trace ID to see the breakdown of the total duration of your API call.

You can now see the complete trace for the Lambda function’s invocation with breakdown of time taken during each phase. You can see the Restore duration and actual Invocation duration separately.

Restore duration shown in the trace includes the time it takes for Lambda to restore a snapshot on the microVM, load the runtime (JVM), and run any afterRestore hooks if specified in your code. Note that the process of restoring snapshots can include time spent on activities outside the microVM. This time is not reported in the Restore subsegment, but is part of the AWS::Lambda segment in X-Ray traces.

This helps you better understand the latency of your Lambda function’s execution, and enables you to identify and troubleshoot the performance issues and errors.
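
The console is the primary way to explore these traces, but you can also retrieve the same trace summaries programmatically. The following is a minimal sketch using the AWS SDK for JavaScript v3; the five-minute window and the absence of a filter expression are arbitrary choices for illustration.

import { XRayClient, GetTraceSummariesCommand } from '@aws-sdk/client-xray';

async function listRecentTraces(): Promise<void> {
  const xray = new XRayClient({});

  // Look at traces recorded over the last five minutes
  const endTime = new Date();
  const startTime = new Date(endTime.getTime() - 5 * 60 * 1000);

  const { TraceSummaries } = await xray.send(
    new GetTraceSummariesCommand({ StartTime: startTime, EndTime: endTime })
  );

  for (const summary of TraceSummaries ?? []) {
    // Duration covers the full trace; ResponseTime is the time taken to respond to the caller
    console.log(summary.Id, summary.Duration, summary.ResponseTime, summary.HasError);
  }
}

listRecentTraces().catch(console.error);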

Conclusion

This blog post shows how you can enable AWS X-Ray for your Lambda function enabled with SnapStart, and measure the end-to-end performance of such functions using the X-Ray console. You can now see a complete breakdown of your Lambda function’s execution time. This includes the Restore duration along with the Invocation duration, which can help you to understand your application’s startup times (cold starts), diagnose slowdowns, or troubleshoot any errors and timeouts.

To learn more about the Lambda SnapStart feature, visit the AWS Lambda Developer Guide.

For more serverless learning resources, visit Serverless Land.

Implementing cross-account CI/CD with AWS SAM for container-based Lambda functions

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/implementing-cross-account-cicd-with-aws-sam-for-container-based-lambda/

This post is written by Chetan Makvana, Sr. Solutions Architect.

Customers use modular architectural patterns and serverless operational models to build more sustainable, scalable, and resilient applications in modern application development. AWS Lambda is a popular choice for building these applications.

If customers have invested in container tooling for their development workflows, they deploy to Lambda using the container image packaging format for workloads like machine learning inference or data-intensive applications. Using functions deployed as container images, customers benefit from the same operational simplicity, automatic scaling, high availability, and native integration with many services.

Containerized applications often have several distinct environments and accounts, such as dev, test, and prod. An application has to go through a process of deployment and testing in these environments. One common pattern for deploying containerized applications is to have a central AWS account create a single container image, and carry out deployment across other AWS accounts. To achieve automated deployment of the application across different environments, customers use CI/CD pipelines with familiar container tooling.

This blog post explores how to use AWS Serverless Application Model (AWS SAM) Pipelines to create a CI/CD deployment pipeline and deploy a container-based Lambda function across multiple accounts.

Solution overview

This example comprises three accounts: tooling, test, and prod. The tooling account is a central account where you provision the pipeline and build the container. The pipeline deploys the container into Lambda in the test and prod accounts using AWS CodeBuild. It also requires the necessary resources in the test and prod accounts. These consist of an AWS Identity and Access Management (IAM) role that trusts the tooling account and provides the required deployment-specific permissions. AWS CodeBuild in the tooling account assumes this IAM role to carry out deployment.

The solution uses AWS SAM Pipelines to create CI/CD deployment pipeline resources. It provides commands to generate the required AWS infrastructure resources and a pipeline configuration file that your CI/CD system can use to deploy using AWS SAM. Find the example code for this solution in the GitHub repository.

Full solution architecture

AWS CodePipeline goes through these steps to deploy the container-based Lambda function in the test and prod accounts:

  1. The developer commits the code of the Lambda function into AWS CodeCommit or another source control repository, which triggers the CI/CD workflow.
  2. AWS CodeBuild builds the code, creates a container image, and pushes the image to the Amazon Elastic Container Registry (ECR) repository using AWS SAM.
  3. AWS CodeBuild assumes a cross-account role for the test account.
  4. AWS CodeBuild uses AWS SAM to deploy the Lambda function by pulling the image from Amazon ECR.
  5. If the deployment is successful, AWS CodeBuild deploys the same image in the prod account using AWS SAM.

Deploying the example

Prerequisites

  • An AWS account. The IAM user that you use must have sufficient permissions to make necessary AWS service calls and manage AWS resources.
  • AWS CLI installed and configured.
  • Git installed.
  • AWS SAM installed.
  • Set up .aws/credentials named profiles for the tooling, test, and prod accounts so you can run CLI and AWS SAM commands against them.
  • Set the TOOLS_ACCOUNT_ID, TEST_ACCOUNT_ID, and PROD_ACCOUNT_ID environment variables:
    export TOOLS_ACCOUNT_ID=<Tooling Account Id>
    export TEST_ACCOUNT_ID=<Test Account Id>
    export PROD_ACCOUNT_ID=<Prod Account Id>

Creating a Git Repository and pushing the code

Run the following command in the tooling account from your terminal to create a new CodeCommit repository:

aws codecommit create-repository --repository-name lambda-container-repo --profile tooling

Initialize the Git repository and push the code.

cd ~/environment/cicd-lambda-container
git init -b main
git add .
git commit -m "Initial commit"
git remote add origin codecommit://lambda-container-repo
git push -u origin main

Creating cross-account roles in test and prod accounts

For the pipeline to gain access to the test and production environments, it must assume an IAM role. In a cross-account scenario, the IAM role for the pipeline must be created in the test and production accounts.
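
To make the mechanics concrete, the following TypeScript sketch shows the kind of role assumption that happens under the hood. The sample pipeline handles this inside its CodeBuild buildspec with AWS SAM and the AWS CLI; this sketch only illustrates the credential flow, and the role name is a placeholder.

import { STSClient, AssumeRoleCommand } from '@aws-sdk/client-sts';
import { CloudFormationClient, DescribeStacksCommand } from '@aws-sdk/client-cloudformation';

async function deployWithCrossAccountRole(testAccountId: string): Promise<void> {
  // Assume the deployment role created in the test account
  const sts = new STSClient({});
  const { Credentials } = await sts.send(new AssumeRoleCommand({
    RoleArn: `arn:aws:iam::${testAccountId}:role/CodePipelineCrossAccountRole`, // placeholder role name
    RoleSessionName: 'cross-account-deploy',
  }));

  if (!Credentials) {
    throw new Error('AssumeRole returned no credentials');
  }

  // Use the temporary credentials for subsequent calls against the test account
  const cloudFormation = new CloudFormationClient({
    credentials: {
      accessKeyId: Credentials.AccessKeyId!,
      secretAccessKey: Credentials.SecretAccessKey!,
      sessionToken: Credentials.SessionToken,
    },
  });

  const stacks = await cloudFormation.send(new DescribeStacksCommand({}));
  console.log(stacks.Stacks?.map((s) => s.StackName));
}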

Change to the templates directory and run the following commands to deploy the roles to the test and prod accounts using their respective named profiles.

Test Profile

cd ~/environment/cicd-lambda-container/templates
aws cloudformation deploy --template-file crossaccount_pipeline_roles.yml --stack-name codepipeline-crossaccount-roles --capabilities CAPABILITY_NAMED_IAM --profile test --parameter-overrides ToolAccountID=${TOOLS_ACCOUNT_ID}
aws cloudformation describe-stacks --stack-name codepipeline-crossaccount-roles --query "Stacks[0].Outputs" --output json --profile test

Open the codepipeline_parameters.json file from the root directory. Replace the value of TestCodePipelineCrossAccountRoleArn and TestCloudFormationCrossAccountRoleArn with the CloudFormation output value of CodePipelineCrossAccountRole and CloudFormationCrossAccountRole respectively.

Prod Profile

aws cloudformation deploy --template-file crossaccount_pipeline_roles.yml --stack-name codepipeline-crossaccount-roles --capabilities CAPABILITY_NAMED_IAM --profile prod --parameter-overrides ToolAccountID=${TOOLS_ACCOUNT_ID}
aws cloudformation describe-stacks --stack-name codepipeline-crossaccount-roles --query "Stacks[0].Outputs" --output json --profile prod

Open the codepipeline_parameters.json file from the root directory. Replace the value of ProdCodePipelineCrossAccountRoleArn and ProdCloudFormationCrossAccountRoleArn with the CloudFormation output value of CodePipelineCrossAccountRole and CloudFormationCrossAccountRole respectively.

Creating the required IAM roles and infrastructure in the tooling account

Change to the templates directory and run the following command using the tooling named profile:

aws cloudformation deploy --template-file tooling_resources.yml --stack-name tooling-resources --capabilities CAPABILITY_NAMED_IAM --parameter-overrides TestAccountID=${TEST_ACCOUNT_ID} ProdAccountID=${PROD_ACCOUNT_ID} --profile tooling
aws cloudformation describe-stacks --stack-name tooling-resources --query "Stacks[0].Outputs" --output json --profile tooling

Open the codepipeline_parameters.json file from the root directory. Replace the values of ImageRepositoryURI, ArtifactsBucket, ToolingCodePipelineExecutionRoleArn, and ToolingCloudFormationExecutionRoleArn with the corresponding CloudFormation output values.

Updating cross-account IAM roles

The cross-account IAM roles on the test and production account require permission to access artifacts that contain application code (S3 bucket and ECR repository). Note that the cross-account roles are deployed twice. This is because there is a circular dependency on the roles in the test and prod accounts and the pipeline artifact resources provisioned in the tooling account.

The pipeline must reference and resolve the ARNs of the roles it needs to assume to deploy the application to the test and prod accounts, so the roles must be deployed before the pipeline is provisioned. However, the policies attached to the roles need to reference the S3 bucket and ECR repository, and those resources don’t exist until the preceding step deploys them. Deploying the roles twice resolves this circular dependency: the first pass creates the roles without a policy so that their ARNs resolve, and the second pass attaches policies to the existing roles that reference the resources in the tooling account.
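
If you were expressing the same two-pass pattern in code rather than with CloudFormation parameters, it might look like the following CDK (TypeScript) sketch. This is illustrative only: the construct, property names, and IAM actions are assumptions, not the contents of crossaccount_pipeline_roles.yml.

import { Construct } from 'constructs';
import * as iam from 'aws-cdk-lib/aws-iam';

export interface CrossAccountRoleProps {
  readonly toolingAccountId: string;
  // Left undefined on the first deployment, supplied on the second
  readonly artifactsBucketArn?: string;
  readonly imageRepositoryArn?: string;
}

export class CrossAccountPipelineRole extends Construct {
  constructor(scope: Construct, id: string, props: CrossAccountRoleProps) {
    super(scope, id);

    // Pass 1: create the role so its ARN can be referenced by the pipeline
    const role = new iam.Role(this, 'Role', {
      assumedBy: new iam.AccountPrincipal(props.toolingAccountId),
    });

    // Pass 2: attach permissions once the artifact resources exist in the tooling account
    if (props.artifactsBucketArn && props.imageRepositoryArn) {
      role.addToPolicy(new iam.PolicyStatement({
        actions: ['s3:GetObject*', 's3:GetBucket*', 's3:List*'], // assumed artifact permissions
        resources: [props.artifactsBucketArn, `${props.artifactsBucketArn}/*`],
      }));
      role.addToPolicy(new iam.PolicyStatement({
        actions: ['ecr:GetDownloadUrlForLayer', 'ecr:BatchGetImage'], // assumed image pull permissions
        resources: [props.imageRepositoryArn],
      }));
    }
  }
}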

Replace ImageRepositoryArn and ArtifactsBucketArn with the output values from the preceding step in the following commands, and run them from the templates directory using the test and prod named profiles.

Test Profile

aws cloudformation deploy --template-file crossaccount_pipeline_roles.yml --stack-name codepipeline-crossaccount-roles --capabilities CAPABILITY_NAMED_IAM --profile test --parameter-overrides ToolAccountID=${TOOLS_ACCOUNT_ID} ImageRepositoryArn=<ImageRepositoryArn value> ArtifactsBucketArn=<ArtifactsBucketArn value>

Prod Profile

aws cloudformation deploy --template-file crossaccount_pipeline_roles.yml --stack-name codepipeline-crossaccount-roles --capabilities CAPABILITY_NAMED_IAM --profile prod --parameter-overrides ToolAccountID=${TOOLS_ACCOUNT_ID} ImageRepositoryArn=<ImageRepositoryArn value> ArtifactsBucketArn=<ArtifactsBucketArn value>

Deploying the pipeline

Replace the DeploymentRegion value with the current Region and the CodeCommitRepositoryName value with the CodeCommit repository name in the codepipeline_parameters.json file.

Push the changes to the CodeCommit repository using Git commands.

Replace the CodeCommitRepositoryName value with the CodeCommit repository name created in the first step and run the following command from the root directory of the project using the tooling named profile.

sam deploy -t codepipeline.yaml --stack-name cicd-lambda-container-pipeline --capabilities=CAPABILITY_IAM --parameter-overrides CodeCommitRepositoryName=<CodeCommit Repository Name> --profile tooling

Cleaning Up

  1. Run the following command in the root directory of the project to delete the pipeline:
    sam delete --stack-name cicd-lambda-container-pipeline --profile tooling
  2. Empty the artifacts bucket. Replace the artifacts bucket name with the output value from the preceding step:
    aws s3 rm s3://<Artifacts bucket name> --recursive --profile tooling
  3. Delete the Lambda functions from the test and prod accounts:
    aws cloudformation delete-stack --stack-name lambda-container-app-test --profile test
    aws cloudformation delete-stack --stack-name lambda-container-app-prod --profile prod
  4. Delete cross-account roles from the test and prod accounts:
    aws cloudformation delete-stack --stack-name codepipeline-crossaccount-roles --profile test
    aws cloudformation delete-stack --stack-name codepipeline-crossaccount-roles --profile prod
  5. Delete the ECR repository:
    aws ecr delete-repository --repository-name image-repository --profile tooling --force
  6. Delete resources from the tooling account:
    aws cloudformation delete-stack --stack-name tooling-resources --profile tooling

Conclusion

This blog post discusses how to automate the deployment of container-based Lambda functions across multiple accounts using AWS SAM Pipelines.

Navigate to the GitHub repository and review the implementation to see how CodePipeline pushes the container image to Amazon ECR and deploys the image to Lambda using a cross-account role. Examine the codepipeline.yaml file to see how AWS SAM Pipelines creates CI/CD resources using this template.

For more serverless learning resources, visit Serverless Land.

Let’s Architect! Designing serverless solutions

Post Syndicated from Luca Mezzalira original https://aws.amazon.com/blogs/architecture/lets-architect-designing-serverless-solutions/

During his re:Invent 2022 keynote, Werner Vogels, AWS Vice President and Chief Technology Officer, emphasized the asynchronous nature of our world and the challenges associated with incorporating asynchronicity into our architectures. AWS serverless services can help users concentrate on the asynchronous aspects of their workloads, easing the execution of event-driven architectures and enabling the adoption of effective integration patterns for communication both within and beyond a bounded context.

In this edition of Let’s Architect!, we offer an in-depth exploration of the architecture of serverless AWS services, such as AWS Lambda. We also present a new workshop centered on design patterns employing serverless AWS services, which ultimately delivers valuable insights on implementing event-driven architectures within systems.

A closer look at AWS Lambda

This video is the perfect companion for those seeking to learn about and master Lambda’s architecture, empowering you to effectively leverage its capabilities in your workloads.

With the knowledge gained from this video, you will be well-equipped to design your functions’ code in a highly optimized manner, ensuring efficient performance and resource utilization. Furthermore, a comprehensive understanding of Lambda functions can help identify and apply the most suitable approach to cloud workloads, resulting in an agile and robust cloud infrastructure that meets a project’s unique requirements.

Take me to this video!

Discover how AWS Lambda functions work under the hood

Implementing an event-driven serverless story generation application with ChatGPT and DALL-E

This example of an event-driven serverless architecture showcases the power of leveraging AWS services and AI technologies to develop innovative solutions. Built upon a foundation of serverless services, including Amazon EventBridge, Amazon DynamoDB, Lambda, Amazon Simple Storage Service, and managed artificial intelligence (AI) services like Amazon Polly, this architecture demonstrates the seamless capacity to create daily stories with a scheduled launch. By utilizing EventBridge Scheduler, a Lambda function is initiated every night to generate new content. The integration of AI services, like ChatGPT and DALL-E, further elevates the solution, as their compatibility with the serverless model enables efficient and dynamic content creation. This case serves as a testament to the potential of combining event-driven serverless architectures with cutting-edge AI technologies for inventive and impactful applications.

Take me to this Compute Blog post!

How to build an event-driven architecture with serverless AWS services integrating ChatGPT and DALL-E

AWS Workshop Studio: Serverless Patterns

The AWS Serverless Patterns workshop offers a comprehensive learning experience to enhance your understanding of architectural patterns applicable to serverless projects. Throughout the workshop, participants will delve into various patterns, such as synchronous and asynchronous implementations, tailored to meet the demands of modern serverless applications. This hands-on approach ensures a production-ready understanding, encompassing crucial topics like testing serverless workloads, establishing automation pipelines, and more. Take this workshop to elevate your serverless architecture knowledge!

Take me to the serverless workshop!

The high-level architecture of the workshop’s modules

Building Serverlesspresso: Creating event-driven architectures

Serverlesspresso is an event-driven, serverless workload that uses EventBridge and AWS Step Functions to coordinate events across microservices and support thousands of orders per day. This comprehensive session delves into design considerations, development processes, and valuable lessons learned from creating a production-ready solution. Discover practical patterns and extensibility options that contribute to a robust, scalable, and cost-effective application. Gain insights into combining EventBridge and Step Functions to address complex architectural challenges in larger applications.

Take me to this video!

How to leverage AWS Step Functions for orchestrating your workflows

See you next time!

Thanks for joining our conversation on serverless solutions! We’ll see you next time when we talk about AWS microservices.

Can’t get enough of the Let’s Architect! series? Visit the Let’s Architect! page of the AWS Architecture Blog!

AWS Week in Review – AWS Notifications, Serverless event, and More – May 8, 2023

Post Syndicated from Marcia Villalba original https://aws.amazon.com/blogs/aws/aws-week-in-review-aws-notifications-serverless-event-and-more-may-8-2023/

At the end of this week, I’m flying to Seattle to take part in the AWS Serverless Innovation Day. Along with many customers and colleagues from AWS, we are going to be live on May 17 at a free virtual event. During the AWS Serverless Innovation Day we will share best practices related to building event-driven applications and using serverless functions and containers. Get a calendar reminder and check the full agenda at the event site.

Serverless innovation day

Last Week’s Launches
Here are some launches that got my attention during the previous week.

New Local Zones in Auckland – AWS Local Zones allow you to deliver applications that require single-digit millisecond latency or local data processing. Starting last week, AWS Local Zones is available in Auckland, New Zealand.

All AWS Local Zones

AWS Notifications – Channy wrote an article explaining how you can view and configure notifications for your AWS account. In addition to the AWS Management Console notifications, the AWS Console Mobile Application now allows you to create and receive actionable push notifications when a resource requires your attention.

AWS SimSpace Weaver – Last re:Invent, we launched AWS SimSpace Weaver, a fully managed compute service that helps you deploy large spatial simulations in the cloud. Starting last week, AWS SimSpace Weaver allows you to save the state of the simulations at a specific point in time.

AWS Security Hub – AWS Security Hub added four new integration partners to help customers with their cloud security posture monitoring, and it now provides detailed tracking of finding changes with the finding history feature. This feature provides an immutable trail of changes to give you more visibility into the changes made to your findings.

AWS Compute Optimizer – AWS Compute Optimizer supports inferred workload type filtering on Amazon EC2 instance recommendations and automatically detects the applications that might run on your AWS resources. Now AWS Compute Optimizer supports filtering your rightsizing recommendation by tags and identifies and filters Microsoft SQL Server workloads as an inferred workload type.

AWS AppSync – Now AWS AppSync GraphQL APIs support private APIs. With private APIs, you can create GraphQL APIs that can only be accessed from your Amazon Virtual Private Cloud (Amazon VPC).

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
Some other updates and news that you may have missed:

  • Responsible AI in the Generative Era – Amazon Science published a very interesting blog post this week about the special challenges raised by building responsible generative AI and the different things builders of applications can do to address these challenges.
  • Patterns for Building an API to Upload Files to Amazon S3 – Amazon S3 is one of the most used services by our customers, and applications often require a way for users to upload files. In this article, Thomas Moore shows different ways to do this in a secure way.
  • The Official AWS Podcast – Listen each week for updates on the latest AWS news and deep dives into exciting use cases. There are also official AWS podcasts in your local languages. Check out the ones in French, German, Italian, and Spanish.
  • AWS Open-Source News and Updates – This is a newsletter curated by my colleague Ricardo to bring you the latest open-source projects, posts, events, and more.

Upcoming AWS Events
Check your calendars and sign up for these AWS events:

  • AWS Serverless Innovation Day – Join us on May 17 for a free virtual event about AWS serverless. We will have talks and fireside chats with customers related to AWS Lambda, Amazon ECS with Fargate, AWS Step Functions, and Amazon EventBridge.
  • AWS re:Inforce 2023 – You can now register for AWS re:Inforce, happening in Anaheim, California, on June 13–14.
  • AWS Global Summits – There are many summits going on right now around the world: Stockholm (May 11), Hong Kong (May 23), India (May 25), Amsterdam (June 1), London (June 7), Washington, DC (June 7–8), Toronto (June 14), Madrid (June 15), and Milano (June 22).
  • AWS Community Day – Join a community-led conference run by AWS user group leaders in your region: Warsaw (June 1), Chicago (June 15), Manila (June 29–30), and Munich (September 14).
  • AWS User Group Peru Conference – The local AWS User Group announced a one-day cloud event in Spanish and English in Lima on September 23. Seb, Jeff, and I will be attending the event from the AWS News blog team. Register today!

That’s all for this week. Check back next Monday for another Week in Review!

— Marcia

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Single sign-on with Amazon Redshift Serverless with Okta using Amazon Redshift Query Editor v2 and third-party SQL clients

Post Syndicated from Maneesh Sharma original https://aws.amazon.com/blogs/big-data/single-sign-on-with-amazon-redshift-serverless-with-okta-using-amazon-redshift-query-editor-v2-and-third-party-sql-clients/

Amazon Redshift Serverless makes it easy to run and scale analytics in seconds without the need to set up and manage data warehouse clusters. With Redshift Serverless, users such as data analysts, developers, business professionals, and data scientists can get insights from data by simply loading and querying data in the data warehouse.

Customers use their preferred SQL clients to analyze their data in Redshift Serverless. They want to use an identity provider (IdP) or single sign-on (SSO) credentials to connect to Redshift Serverless so they can reuse existing credentials and avoid additional user setup and configuration. When you use AWS Identity and Access Management (IAM) or IdP-based credentials to connect to a serverless data warehouse, Amazon Redshift automatically creates a database user for the end-user. You can simplify managing user privileges by using role-based access control. Admins can use a database-role mapping for SSO with the IAM roles that users are assigned to get their database privileges automatically. With this integration, organizations can simplify user management because they no longer need to create users and map them to database roles manually. You can define the mapped database roles as a principal tag for the IdP groups or IAM role, so that users who are members of those IdP groups are granted the matching Amazon Redshift database roles automatically.

In this post, we focus on Okta as the IdP and provide step-by-step guidance to integrate Redshift Serverless with Okta using the Amazon Redshift Query Editor V2 and with SQL clients like SQL Workbench/J. You can use this mechanism with other IdPs such as Azure Active Directory or Ping, with any applications or tools that use the Amazon Redshift JDBC, ODBC, or Python drivers.

Solution overview

The following diagram illustrates the authentication flow of Okta with Redshift Serverless using federated IAM roles and automatic database-role mapping.

The workflow contains the following steps:

  1. Either the user chooses an IdP app in their browser, or the SQL client initiates a user authentication request to the IdP (Okta).
  2. Upon a successful authentication, Okta submits a request to the AWS federation endpoint with a SAML assertion containing the PrincipalTags.
  3. The AWS federation endpoint validates the SAML assertion and invokes the AWS Security Token Service (AWS STS) API AssumeRoleWithSAML. The SAML assertion contains the IdP user and group information that is stored in the RedshiftDbUser and RedshiftDbRoles principal tags, respectively. Temporary IAM credentials are returned to the SQL client or, if using the Query Editor v2, the user’s browser is redirected to the Query Editor v2 console using the temporary IAM credentials.
  4. The temporary IAM credentials are used by the SQL client or Query Editor v2 to call the Redshift Serverless GetCredentials API. The API uses the principal tags to determine the user and database roles that the user belongs to. An associated database user is created if the user is signing in for the first time and is granted the matching database roles automatically. A temporary password is returned to the SQL client (a minimal sketch of this call appears after this list).
  5. Using the database user and temporary password, the SQL client or Query Editor v2 connects to Redshift Serverless. Upon login, the user is authorized based on the Amazon Redshift database roles that were assigned in Step 4.
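
As referenced in step 4, the following TypeScript sketch shows what the GetCredentials call looks like with the AWS SDK for JavaScript v3, assuming the temporary IAM credentials from the SAML federation are already available in the environment. The workgroup and database names are placeholders.

import {
  RedshiftServerlessClient,
  GetCredentialsCommand,
} from '@aws-sdk/client-redshift-serverless';

async function getTemporaryDbCredentials(): Promise<void> {
  // The client picks up the temporary credentials obtained via AssumeRoleWithSAML
  const client = new RedshiftServerlessClient({ region: 'us-west-2' });

  const response = await client.send(new GetCredentialsCommand({
    workgroupName: 'my-serverless-workgroup', // placeholder workgroup name
    dbName: 'dev',                            // placeholder database name
    durationSeconds: 900,
  }));

  // dbUser reflects the RedshiftDbUser principal tag; dbPassword is temporary
  console.log(response.dbUser, response.expiration);
}

getTemporaryDbCredentials().catch(console.error);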

To set up the solution, we complete the following steps:

  1. Set up your Okta application:
    • Create Okta users.
    • Create groups and assign groups to users.
    • Create the Okta SAML application.
    • Collect Okta information.
  2. Set up AWS configuration:
    • Create the IAM IdP.
    • Create the IAM role and policy.
  3. Configure Redshift Serverless role-based access.
  4. Federate to Redshift Serverless using the Query Editor V2.
  5. Configure the SQL client (for this post, we use SQL Workbench/J).
  6. Optionally, implement MFA with SQL Client and Query Editor V2.

Prerequisites

You need the following prerequisites to set up this solution:

Set up Okta application

In this section, we provide the steps to configure your Okta application.

Create Okta users

To create your Okta users, complete the following steps:

  1. Sign in to your Okta organization as a user with administrative privileges.
  2. On the admin console, under Directory in the navigation pane, choose People.
  3. Choose Add person.
  4. For First Name, enter the user’s first name.
  5. For Last Name, enter the user’s last name.
  6. For Username, enter the user’s user name in email format.
  7. Select I will set password and enter a password.
  8. Optionally, deselect User must change password on first login if you don’t want the user to change their password when they first sign in. Choose Save.

Create groups and assign groups to users

To create your groups and assign them to users, complete the following steps:

  1. Sign in to your Okta organization as a user with administrative privileges.
  2. On the admin console, under Directory in the navigation pane, choose Groups.
  3. Choose Add group.
  4. Enter a group name and choose Save.
  5. Choose the recently created group and then choose Assign people.
  6. Choose the plus sign and then choose Done.
  7. Repeat Steps 1–6 to add more groups.

In this post, we create two groups: sales and finance.

Create an Okta SAML application

To create your Okta SAML application, complete the following steps:

  1. Sign in to your Okta organization as a user with administrative privileges.
  2. On the admin console, under Applications in the navigation pane, choose Applications.
  3. Choose Create App Integration.
  4. Select SAML 2.0 as the sign-in method and choose Next.
  5. Enter a name for your app integration (for example, redshift_app) and choose Next.
  6. Enter the following values in the app and leave the rest as is:
    • For Single Sign On URL, enter https://signin.aws.amazon.com/saml.
    • For Audience URI (SP Entity ID), enter urn:amazon:webservices.
    • For Name ID format, enter EmailAddress.
  7. Choose Next.
  8. Choose I’m an Okta customer adding an internal app followed by This is an internal app that we have created.
  9. Choose Finish.
  10. Choose Assignments and then choose Assign.
  11. Choose Assign to groups and then select Assign next to the groups that you want to add.
  12. Choose Done.

Set up Okta advanced configuration

After you create the custom SAML app, complete the following steps:

  1. On the admin console, navigate to General and choose Edit under SAML settings.
  2. Choose Next.
  3. Set Default Relay State to the Query Editor V2 URL, using the format https://<region>.console.aws.amazon.com/sqlworkbench/home. For this post, we use https://us-west-2.console.aws.amazon.com/sqlworkbench/home.
  4. Under Attribute Statements (optional), add the following properties:
    • Provide the IAM role and IdP in comma-separated format using the Role attribute. You’ll create this same IAM role and IdP in a later step when setting up AWS configuration.
    • Set user.login for RoleSessionName. This is used as an identifier for the temporary credentials that are issued when the role is assumed.
    • Set the DB roles using PrincipalTag:RedshiftDbRoles. This uses the Okta groups to fill the principal tags and map them automatically with the Amazon Redshift database roles. Its value must be a colon-separated list in the format role1:role2.
    • Set user.login for PrincipalTag:RedshiftDbUser. This uses the user name in the directory. This is a required tag and defines the database user that is used by Query Editor V2.
    • Set the transitive keys using TransitiveTagKeys. This prevents users from changing the session tags in case of role chaining.

The preceding tags are forwarded to the GetCredentials API to get temporary credentials for your Redshift Serverless instance and map automatically with Amazon Redshift database roles. The following list summarizes the attribute statements configuration; the Name format for each attribute is Unspecified.

Name: https://aws.amazon.com/SAML/Attributes/Role
Value: arn:aws:iam::<yourAWSAccountID>:role/role-name,arn:aws:iam::<yourAWSAccountID>:saml-provider/provider-name
Example: arn:aws:iam::112034567890:role/oktarole,arn:aws:iam::112034567890:saml-provider/oktaidp

Name: https://aws.amazon.com/SAML/Attributes/RoleSessionName
Value: user.login
Example: user.login

Name: https://aws.amazon.com/SAML/Attributes/PrincipalTag:RedshiftDbRoles
Value: String.join(":", isMemberOfGroupName("group1") ? 'group1' : '', isMemberOfGroupName("group2") ? 'group2' : '')
Example: String.join(":", isMemberOfGroupName("sales") ? 'sales' : '', isMemberOfGroupName("finance") ? 'finance' : '')

Name: https://aws.amazon.com/SAML/Attributes/PrincipalTag:RedshiftDbUser
Value: user.login
Example: user.login

Name: https://aws.amazon.com/SAML/Attributes/TransitiveTagKeys
Value: Arrays.flatten("RedshiftDbUser", "RedshiftDbRoles")
Example: Arrays.flatten("RedshiftDbUser", "RedshiftDbRoles")
  1. After you add the attribute claims, choose Next followed by Finish.

Your attributes should be in a similar format as shown in the following screenshot.

Collect Okta information

To gather your Okta information, complete the following steps:

  1. On the Sign On tab, choose View SAML setup instructions.
  2. Note the Identity Provider Single Sign-On URL. You use this URL when connecting with any third-party SQL client such as SQL Workbench/J.
  3. Use the IdP metadata in block 4 and save the metadata file in .xml format (for example, metadata.xml).

Set up AWS configuration

In this section, we provide the steps to configure your IAM resources.

Create the IAM IdP

To create your IAM IdP, complete the following steps:

  1. On the IAM console, under Access management in the navigation pane, choose Identity providers.
  2. Choose Add provider.
  3. For Provider type, select SAML.
  4. For Provider name, enter a name.
  5. Choose Choose file and upload the metadata file (.xml) you downloaded earlier.
  6. Choose Add provider.

Create the IAM Amazon Redshift access policy

To create your IAM policy, complete the following steps:

  1. On the IAM console, choose Policies.
  2. Choose Create policy.
  3. On the Create policy page, choose the JSON tab.
  4. For the policy, enter the JSON in the following format:
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": "redshift-serverless:GetCredentials",
                "Resource": "<Workgroup ARN>"
            },
            {
                "Sid": "VisualEditor1",
                "Effect": "Allow",
                "Action": "redshift-serverless:ListWorkgroups",
                "Resource": "*"
            }
        ]
    }

The workgroup ARN is available on the Redshift Serverless workgroup configuration page.

The following example policy includes only a single Redshift Serverless workgroup; you can modify the policy to include multiple workgroups in the Resource section:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "redshift-serverless:GetCredentials",
            "Resource": "arn:aws:redshift-serverless:us-west-2:123456789012:workgroup/4a4f12vc-123b-2d99-fd34-a12345a1e87f"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": "redshift-serverless:ListWorkgroups",
            "Resource": "*"
        }
    ]
}

  1. Choose Next: Tags.
  2. Choose Next: Review.
  3. In the Review policy section, for Name, enter the name of your policy; for example, OktaRedshiftPolicy.
  4. For Description, you can optionally enter a brief description of what the policy does.
  5. Choose Create policy.

Create the IAM role

To create your IAM role, complete the following steps:

  1. On the IAM console, choose Roles in the navigation pane.
  2. Choose Create role.
  3. For Trusted entity type, select SAML 2.0 federation.
  4. For SAML 2.0-based provider, choose the IdP you created earlier.
  5. Select Allow programmatic and AWS Management Console access.
  6. Choose Next.
  7. Choose the policy you created earlier.
  8. Also, add the policy AmazonRedshiftQueryEditorV2ReadSharing.
  9. Choose Next.
  10. In the Review section, for Role Name, enter the name of your role; for example, oktarole.
  11. For Description, you can optionally enter a brief description of what the role does.
  12. Choose Create role.
  13. Navigate to the role that you just created and choose Trust Relationships.
  14. Choose Edit trust policy and choose TagSession under Add actions for STS.

When using session tags, trust policies for all roles connected to the IdP passing tags must have the sts:TagSession permission. For roles without this permission in the trust policy, the AssumeRole operation fails.

  1. Choose Update policy.

The following screenshot shows the role permissions.

The following screenshot shows the trust relationships.
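
Since the screenshots aren’t reproduced here, the following TypeScript constant is a hedged sketch of what the resulting trust policy should look like; the account ID and provider name are placeholders.

// Trust policy after adding sts:TagSession (account ID and provider name are placeholders)
const oktaRoleTrustPolicy = {
  Version: '2012-10-17',
  Statement: [
    {
      Effect: 'Allow',
      Principal: {
        Federated: 'arn:aws:iam::112034567890:saml-provider/oktaidp',
      },
      Action: ['sts:AssumeRoleWithSAML', 'sts:TagSession'],
      Condition: {
        StringEquals: {
          'SAML:aud': 'https://signin.aws.amazon.com/saml',
        },
      },
    },
  ],
};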

Update the advanced Okta Role Attribute

Complete the following steps:

  1. Switch back to Okta.com.
  2. Navigate to the application that you created earlier.
  3. Navigate to General and choose Edit under SAML settings.
  4. Under Attribute Statements (optional), update the value of the https://aws.amazon.com/SAML/Attributes/Role attribute, using the actual role and identity provider ARN values from the preceding step. For example, arn:aws:iam::123456789012:role/oktarole,arn:aws:iam::123456789012:saml-provider/oktaidp.

Configure Redshift Serverless role-based access

In this step, we create database roles in Amazon Redshift based on the groups that you created in Okta. Make sure the role name matches the Okta group name.

Amazon Redshift roles simplify managing the privileges required for your end-users. In this post, we create two database roles, sales and finance, and grant them access to query tables with sales and finance data, respectively. You can download this sample SQL notebook and import it into Redshift Query Editor v2 to run all cells in the notebook used in this example. Alternatively, you can copy and enter the SQL into your SQL client.

The following is the syntax to create a role in Redshift Serverless:

create role <IdP groupname>;

For example:

create role sales;
create role finance;

Create the sales and finance database schemas:

create schema sales_schema;
create schema finance_schema;

Create the tables:

CREATE TABLE IF NOT EXISTS finance_schema.revenue
(
account INTEGER   ENCODE az64
,customer VARCHAR(20)   ENCODE lzo
,salesamt NUMERIC(18,0)   ENCODE az64
)
DISTSTYLE AUTO
;

insert into finance_schema.revenue values (10001, 'ABC Company', 12000);
insert into finance_schema.revenue values (10002, 'Tech Logistics', 175400);
insert into finance_schema.revenue values (10003, 'XYZ Industry', 24355);
insert into finance_schema.revenue values (10004, 'The tax experts', 186577);

CREATE TABLE IF NOT EXISTS sales_schema.store_sales
(
ID INTEGER   ENCODE az64,
Product varchar(20),
Sales_Amount INTEGER   ENCODE az64
)
DISTSTYLE AUTO
;

Insert into sales_schema.store_sales values (1,'product1',1000);
Insert into sales_schema.store_sales values (2,'product2',2000);
Insert into sales_schema.store_sales values (3,'product3',3000);
Insert into sales_schema.store_sales values (4,'product4',4000);

The following is the syntax to grant permission to the Redshift Serverless role:

GRANT { { SELECT | INSERT | UPDATE | DELETE | DROP | REFERENCES } [,...]| ALL [ PRIVILEGES ] } ON { [ TABLE ] table_name [, ...] | ALL TABLES IN SCHEMA schema_name [, ...] } TO role <IdP groupname>;

Grant the relevant permissions to each role as per your requirements. In the following example, we grant the role sales usage on sales_schema and select on all of its tables, and we grant the role finance usage on finance_schema and select on all of its tables:

grant usage on schema sales_schema to role sales;
grant select on all tables in schema sales_schema to role sales;

grant usage on schema finance_schema to role finance;
grant select on all tables in schema finance_schema to role finance;

Federate to Redshift Serverless using Query Editor V2

The RedshiftDbRoles principal tag and DBGroups are both mechanisms that can be used to integrate with an IdP. However, federating with the RedshiftDbRoles principal has some clear advantages when it comes to connecting with an IdP because it provides automatic mapping between IdP groups and Amazon Redshift database roles. Overall, RedshiftDbRoles is more flexible, easier to manage, and more secure, making it the better option for integrating Amazon Redshift with your IdP.

Now you’re ready to connect to Redshift Serverless using the Query Editor V2 and federated login:

  1. Use the SSO URL you collected earlier and log in to your Okta account with your user credentials. For this demo, we log in with user Ethan.
  2. In the Query Editor v2, choose your Redshift Serverless instance (right-click) and choose Create connection.
  3. For Authentication, select Federated user.
  4. For Database, enter the database name you want to connect to.
  5. Choose Create Connection.

User Ethan will be able to access sales_schema tables. If Ethan tries to access the tables in finance_schema, he will get a permission denied error.

Configure the SQL client (SQL Workbench/J)

To set up SQL Workbench/J, complete the following steps:

  1. Create a new connection in SQL Workbench/J and choose Redshift Serverless as the driver.
  2. Choose Manage drivers and add all the files from the downloaded AWS JDBC driver pack .zip file (remember to unzip the .zip file).
  3. For Username and Password, enter the values that you set in Okta.
  4. Capture the values for app_id, app_name, and idp_host from the Okta app embed link, which can be found on the General tab of your application.
  5. Set the following extended properties:
    • For app_id, enter the value from app embed link (for example, 0oa8p1o1RptSabT9abd0/avc8k7abc32lL4izh3b8).
    • For app_name, enter the value from app embed link (for example, dev-123456_redshift_app_2).
    • For idp_host, enter the value from app embed link (for example, dev-123456.okta.com).
    • For plugin_name, enter com.amazon.redshift.plugin.OktaCredentialsProvider. The following screenshot shows the SQL Workbench/J extended properties.

      1. Choose OK.
      2. Choose Test from SQL Workbench/J to test the connection.
      3. When the connection is successful, choose OK.
      4. Choose OK to sign in with the users created.

User Ethan will be able to access the sales_schema tables. If Ethan tries to access the tables in the finance_schema, he will get a permission denied error.

Congratulations! You have federated with Redshift Serverless and Okta with SQL Workbench/J using RedshiftDbRoles.

[Optional] Implement MFA with SQL Client and Query Editor V2

Implementing MFA poses an additional challenge because the nature of multi-factor authentication is an asynchronous process between initiating the login (the first factor) and completing the login (the second factor). The SAML response is returned to the appropriate listener in each scenario: the SQL client, or the AWS Management Console in the case of Query Editor V2. Depending on which login options you give your users, you may need an additional Okta application. The different scenarios are as follows:

  1. If you are only using Query Editor V2 and not using any other SQL client, you can use MFA with Query Editor V2 and the application created above. No changes are required in the custom SAML application.
  2. If you are not using Query Editor V2 and only using a third-party SQL client (such as SQL Workbench/J), modify the custom SAML app as described below.
  3. If you want to use Query Editor V2 and a third-party SQL client with MFA, create an additional custom SAML app as described below.

Prerequisites for MFA

Each identity provider (IdP) has its own steps for enabling and managing MFA for your users. In the case of Okta, see the guides on how to enable MFA using the Okta Verify application and by defining an authentication policy.

Steps to create or update a SAML application that supports MFA for a SQL client

  1. If creating a second app, follow all the steps described earlier under Create an Okta SAML application.
  2. Open the custom SAML app and select General.
  3. Select Edit under SAML settings.
  4. Choose Next in General Settings.
  5. Under General, update the Single sign-on URL to http://localhost:7890/redshift/.
  6. Select Next followed by Finish.

The following screenshot shows the MFA app after making the above changes:

Configure SQL Client for MFA

To set up SQL Workbench/J, complete the following steps:

  1. Follow all the steps described earlier under Configure the SQL client (SQL Workbench/J).
  2. Modify your connection by updating the extended properties:
    • login_url – Use the Single Sign-On URL collected in the section Collect Okta information (for example, https://dev-123456.okta.com/app/dev-123456_redshiftapp_2/abc8p6o5psS6xUhBJ517/sso/saml).
    • plugin_name – com.amazon.redshift.plugin.BrowserSamlCredentialsProvider
  3. Choose OK.
  4. Choose OK from SQL Workbench/J. You’re redirected to the browser to sign in with your Okta credentials.
  5. After that, you get a prompt for MFA. Choose either Enter a code or Get a push notification.
  6. Once authentication is successful, you are redirected to a page showing that the connection was successful.
  7. With this connection profile, run a query that returns the federated user name to verify the connection.

Troubleshooting

If your connection didn’t work, consider the following:

  • Enable logging in the driver. For instructions, see Configure logging.
  • Make sure to use the latest Amazon Redshift JDBC driver version.
  • If you’re getting errors while setting up the application on Okta, make sure you have admin access.
  • If you can authenticate via the SQL client but get a permission issue or can’t see objects, grant the relevant permission to the role, as detailed earlier in this post.

Clean up

When you’re done testing the solution, clean up the resources to avoid incurring future charges:

  1. Delete the Redshift Serverless instance by deleting both the workgroup and the namespace.
  2. Delete the IAM roles, IAM IdPs, and IAM policies.

Conclusion

In this post, we provided step-by-step instructions to integrate Redshift Serverless with Okta using the Amazon Redshift Query Editor V2 and SQL Workbench/J with the help of federated IAM roles and automatic database-role mapping. You can use a similar setup with any other SQL client (such as DBeaver or DataGrip) or business intelligence tool (such as Tableau Desktop). We also showed how Okta group membership is mapped automatically with Redshift Serverless roles to use role-based authentication seamlessly.

For more information about Redshift Serverless single sign-on using database roles, see Defining database roles to grant to federated users in Amazon Redshift Serverless.


About the Authors

Maneesh Sharma is a Senior Database Engineer at AWS with more than a decade of experience designing and implementing large-scale data warehouse and analytics solutions. He collaborates with various Amazon Redshift Partners and customers to drive better integration.

Debu Panda is a Senior Manager, Product Management at AWS. He is an industry leader in analytics, application platform, and database technologies, and has more than 25 years of experience in the IT world.

Mohamed Shaaban is a Senior Software Engineer in Amazon Redshift and is based in Berlin, Germany. He has over 12 years of experience in software engineering. He is passionate about cloud services and building solutions that delight customers. Outside of work, he is an amateur photographer who loves to explore and capture unique moments.

Rajiv Gupta is Sr. Manager of Analytics Specialist Solutions Architects based out of Irvine, CA. He has 20+ years of experience building and managing teams who build data warehouse and business intelligence solutions.

Amol Mhatre is a Database Engineer in Amazon Redshift and works on Customer & Partner engagements. Prior to Amazon, he has worked on multiple projects involving Database & ERP implementations.

Ning Di is a Software Development Engineer at Amazon Redshift, driven by a genuine passion for exploring all aspects of technology.

Harsha Kesapragada is a Software Development Engineer for Amazon Redshift with a passion to build scalable and secure systems. In the past few years, he has been working on Redshift Datasharing, Security and Redshift Serverless.

Extending a serverless, event-driven architecture to existing container workloads

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/extending-a-serverless-event-driven-architecture-to-existing-container-workloads/

This post is written by Dhiraj Mahapatro, Principal Specialist SA; Sascha Moellering, Principal Specialist SA; and Emily Shea, WW Lead, Integration Services.

Many serverless services are a natural fit for event-driven architectures (EDA), as events invoke them and only run when there is an event to process. When building in the cloud, many services emit events by default and have built-in features for managing events. This combination allows customers to build event-driven architectures easier and faster than ever before.

The insurance claims processing sample application in this blog series uses event-driven architecture principles and serverless services like AWS Lambda, AWS Step Functions, Amazon API Gateway, Amazon EventBridge, and Amazon SQS.

When building an event-driven architecture, it’s likely that you have existing services to integrate with the new architecture, ideally without needing to make significant refactoring changes to those services. As services communicate via events, extending applications to new and existing microservices is a key benefit of building with EDA. You can write those microservices in different programming languages or run them on different compute options.

This blog post walks through a scenario of integrating an existing, containerized service (a settlement service) to the serverless, event-driven insurance claims processing application described in this blog post.

Overview of sample event-driven architecture

The sample application uses a front-end to sign up a new user and allow the user to upload images of their car and driver’s license. Once signed up, they can file a claim and upload images of their damaged car. Previously, the application did not integrate with a settlement service for completing the claims and settlement process.

In this scenario, the settlement service is a brownfield application that runs Spring Boot 3 on Amazon ECS with AWS Fargate. AWS Fargate is a serverless, pay-as-you-go compute engine that lets you focus on building container applications without managing servers.

The Spring Boot application exposes a REST endpoint, which accepts a POST request. It applies settlement business logic and creates a settlement record in the database for a car insurance claim. Your goal is to make settlement work with the new EDA application that is designed for claims processing without re-architecting or rewriting. Customer, claims, fraud, document, and notification are the other domains that are shown as blue-colored boxes in the following diagram:

Reference architecture

Project structure

The application uses AWS Cloud Development Kit (CDK) to build the stack. With CDK, you get the flexibility to create modular and reusable constructs imperatively using your language of choice. The sample application uses TypeScript for CDK.

The following project structure enables you to build different bounded contexts. Event-driven architecture relies on the choreography of events between domains. CDK’s object-oriented programming (OOP) model helps you provision infrastructure that separates domain concerns while loosely coupling them via events.

You break the higher-level CDK constructs down into these corresponding domains:

Comparing domains

Application and infrastructure code are present in each domain. This project structure creates a seamless way to add new domains like settlement with its application and infrastructure code without affecting other areas of the business.

With the preceding structure, you can use the settlement-service.ts CDK construct inside claims-processing-stack.ts:

const settlementService = new SettlementService(this, "SettlementService", {
  bus,
});

The only information the SettlementService construct needs to work is the EventBridge custom event bus resource that is created in the claims-processing-stack.ts.

To run the sample application, follow the setup steps in the sample application’s README file.

Existing container workload

The settlement domain provides a REST service to the rest of the organization. A Docker containerized Spring Boot application runs on Amazon ECS with AWS Fargate. The following sequence diagram shows the synchronous request-response flow from an external REST client to the service:

Settlement service

  1. External REST client makes POST /settlement call via an HTTP API present in front of an internal Application Load Balancer (ALB).
  2. SettlementController.java delegates to SettlementService.java.
  3. SettlementService applies business logic and calls SettlementRepository for data persistence.
  4. SettlementRepository persists the item in the Settlement DynamoDB table.

A request to the HTTP API endpoint looks like:

curl --location <settlement-api-endpoint-from-cloudformation-output> \
--header 'Content-Type: application/json' \
--data '{
  "customerId": "06987bc1-1234-1234-1234-2637edab1e57",
  "claimId": "60ccfe05-1234-1234-1234-a4c1ee6fcc29",
  "color": "green",
  "damage": "bumper_dent"
}'

The response from the API call is:

API response

You can learn more about optimizing Spring Boot applications on AWS Fargate.

Extending container workload for events

To integrate the settlement service, you must update the service to receive and emit events asynchronously. The core logic of the settlement service remains the same. When a customer files a claim, uploads damaged car images, and the application detects no document fraud, the settlement domain subscribes to the Fraud.Not.Detected event and applies its business logic. The settlement service then emits an event of its own after applying that logic.

The following sequence diagram shows a new interface in settlement to work with EDA. The settlement service subscribes to events that a producer emits. Here, the event producer is the fraud service that puts an event in an EventBridge custom event bus.

Sequence diagram

  1. Producer emits Fraud.Not.Detected event to EventBridge custom event bus.
  2. EventBridge evaluates the rules provided by the settlement domain and sends the event payload to the target SQS queue.
  3. SubscriberService.java polls for new messages in the SQS queue.
  4. On message, it transforms the message body to an input object that is accepted by SettlementService.
  5. It then delegates the call to SettlementService, similar to how SettlementController works in the REST implementation.
  6. SettlementService applies business logic. The flow is like the REST use case from 7 to 10.
  7. On receiving the response from the SettlementService, the SubscriberService transforms the response to publish an event back to the event bus with the event type as Settlement.Finalized.

The rest of the architecture consumes this Settlement.Finalized event.
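To make the event contract concrete, the following is a minimal Python (boto3) sketch of the consume-and-publish loop described in steps 3 to 7. The actual service is implemented in Java with Spring Boot; the queue URL, event bus name, and settle() helper here are illustrative assumptions, not code from the sample repository.

import json
import boto3

sqs = boto3.client("sqs")
events = boto3.client("events")

QUEUE_URL = "<settlement-queue-url>"   # assumed: supplied via configuration
EVENT_BUS = "<custom-event-bus-name>"  # assumed: the claims processing event bus

def settle(detail):
    # Placeholder for the settlement business logic
    return {"customerId": detail["customerId"], "claimId": detail["claimId"],
            "settlementId": "example", "settlementMessage": "example"}

def poll_once():
    # Long-poll the SQS queue that EventBridge targets for Fraud.Not.Detected events
    resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=10, WaitTimeSeconds=20)
    for msg in resp.get("Messages", []):
        envelope = json.loads(msg["Body"])      # the EventBridge event delivered to SQS
        result = settle(envelope["detail"])     # apply the settlement business logic
        # Publish the result back to the custom event bus as Settlement.Finalized
        events.put_events(Entries=[{
            "EventBusName": EVENT_BUS,
            "Source": "settlement.service",
            "DetailType": "Settlement.Finalized",
            "Detail": json.dumps(result),
        }])
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])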

Using EventBridge schema registry and discovery

Schema enforces a contract between a producer and a consumer. A consumer expects the exact structure of the event payload every time an event arrives. EventBridge provides schema registry and discovery to maintain this contract. The consumer (the settlement service) can download the code bindings and use them in the source code.

Enable schema discovery in EventBridge before downloading the code bindings and using them in your repository. The code bindings provide a marshaller that unmarshals the incoming event from the SQS queue into a plain old Java object (POJO), FraudNotDetected.java. You can download the code bindings from your IDE of choice; the AWS Toolkit for IntelliJ makes it convenient to download and use them.

Download code bindings

The final architecture for the settlement service with REST and event-driven architecture looks like:

Final architecture

Transition to become fully event-driven

With the new capability to handle events, the Spring Boot application now supports both the REST endpoint and the event-driven architecture by running the same business logic through different interfaces. In this example scenario, as the event-driven architecture matures and the rest of the organization adopts it, the need for the POST endpoint to save a settlement may diminish. In the future, you can deprecate the endpoint and fully rely on polling messages from the SQS queue.

You start with using an ALB and Fargate service CDK ECS pattern:

const loadBalancedFargateService = new ecs_patterns.ApplicationLoadBalancedFargateService(
  this,
  "settlement-service",
  {
    cluster: cluster,
    taskImageOptions: {
      image: ecs.ContainerImage.fromDockerImageAsset(asset),
      environment: {
        "DYNAMODB_TABLE_NAME": this.table.tableName
      },
      containerPort: 8080,
      logDriver: new ecs.AwsLogDriver({
        streamPrefix: "settlement-service",
        mode: ecs.AwsLogDriverMode.NON_BLOCKING,
        logRetention: RetentionDays.FIVE_DAYS,
      })
    },
    memoryLimitMiB: 2048,
    cpu: 1024,
    publicLoadBalancer: true,
    desiredCount: 2,
    listenerPort: 8080
  });

To adapt the service to EDA, you update the resources so that the service can receive messages from an SQS queue and put events on EventBridge. Add new environment variables to the ApplicationLoadBalancedFargateService resource:

environment: {
  "SQS_ENDPOINT_URL": queue.queueUrl,
  "EVENTBUS_NAME": props.bus.eventBusName,
  "DYNAMODB_TABLE_NAME": this.table.tableName
}

Grant the Fargate task permission to put events in the custom event bus and consume messages from the SQS queue:

props.bus.grantPutEventsTo(loadBalancedFargateService.taskDefinition.taskRole);
queue.grantConsumeMessages(loadBalancedFargateService.taskDefinition.taskRole);

When you transition the settlement service to become fully event-driven, you do not need the HTTP API endpoint and ALB anymore, as SQS is the source of events.

A better alternative is to use the QueueProcessingFargateService ECS pattern for the Fargate service. The pattern provides auto scaling based on the number of visible messages in the SQS queue, in addition to CPU utilization. In the following example, you can also add two capacity provider strategies while setting up the Fargate service: FARGATE_SPOT and FARGATE. This means that for every task that runs using FARGATE, two tasks use FARGATE_SPOT, which can help optimize cost.

const queueProcessingFargateService = new ecs_patterns.QueueProcessingFargateService(this, 'Service', {
  cluster,
  memoryLimitMiB: 1024,
  cpu: 512,
  queue: queue,
  image: ecs.ContainerImage.fromDockerImageAsset(asset),
  desiredTaskCount: 2,
  minScalingCapacity: 1,
  maxScalingCapacity: 5,
  maxHealthyPercent: 200,
  minHealthyPercent: 66,
  environment: {
    "SQS_ENDPOINT_URL": queueUrl,
    "EVENTBUS_NAME": props?.bus.eventBusName,
    "DYNAMODB_TABLE_NAME": tableName
  },
  capacityProviderStrategies: [
    {
      capacityProvider: 'FARGATE_SPOT',
      weight: 2,
    },
    {
      capacityProvider: 'FARGATE',
      weight: 1,
    },
  ],
});

This pattern abstracts the automatic scaling behavior of the Fargate service based on the queue depth.

Running the application

To test the application, follow How to use the Application after the initial setup. Once complete, you see that the browser receives a Settlement.Finalized event:

{
  "version": "0",
  "id": "e2a9c866-cb5b-728c-ce18-3b17477fa5ff",
  "detail-type": "Settlement.Finalized",
  "source": "settlement.service",
  "account": "123456789",
  "time": "2023-04-09T23:20:44Z",
  "region": "us-east-2",
  "resources": [],
  "detail": {
    "settlementId": "377d788b-9922-402a-a56c-c8460e34e36d",
    "customerId": "67cac76c-40b1-4d63-a8b5-ad20f6e2e6b9",
    "claimId": "b1192ba0-de7e-450f-ac13-991613c48041",
    "settlementMessage": "Based on our analysis on the damage of your car per claim id b1192ba0-de7e-450f-ac13-991613c48041, your out-of-pocket expense will be $100.00."
  }
}

Cleaning up

The stack creates a custom VPC and other related resources. Be sure to clean up resources after usage to avoid the ongoing cost of running these services. To clean up the infrastructure, follow the clean-up steps shown in the sample application.

Conclusion

The blog explains a way to integrate existing container workload running on AWS Fargate with a new event-driven architecture. You use EventBridge to decouple different services from each other that are built using different compute technologies, languages, and frameworks. Using AWS CDK, you gain the modularity of building services decoupled from each other.

This blog shows an evolutionary architecture that allows you to modernize existing container workloads with minimal changes that still give you the additional benefits of building with serverless and EDA on AWS.

The major difference between the event-driven approach and the REST approach is that the producer is unblocked as soon as it emits an event. The settlement domain, which subscribes to that event, remains loosely coupled from the producer. The business functionality stays intact, and no significant refactoring or re-architecting effort is required. With these agility gains, you may get to market faster.

The sample application shows the implementation details and steps to set up, run, and clean up the application. The app uses Amazon ECS with AWS Fargate for a domain service, but you are not limited to Fargate. You can bring container-based applications running on Amazon EKS into an event-driven architecture in a similar way.

Learn more about event-driven architecture on Serverless Land.

Patterns for building an API to upload files to Amazon S3

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/patterns-for-building-an-api-to-upload-files-to-amazon-s3/

This blog is written by Thomas Moore, Senior Solutions Architect and Josh Hart, Senior Solutions Architect.

Applications often require a way for users to upload files. The traditional approach is to use an SFTP service (such as the AWS Transfer Family), but this requires specific clients and management of SSH credentials. Modern applications instead need a way to upload to Amazon S3 via HTTPS. Typical file upload use cases include:

  • Sharing datasets between businesses as a direct replacement for traditional FTP workflows.
  • Uploading telemetry and logs from IoT devices and mobile applications.
  • Uploading media such as videos and images.
  • Submitting scanned documents and PDFs.

If you have control over the application that sends the uploads, then you can integrate with the AWS SDK from within the browser with a framework such as AWS Amplify. To learn more, read Allowing external users to securely and directly upload files to Amazon S3.

Often you must provide end users direct access to upload files via an endpoint. You could build a bespoke service for this purpose, but this results in more code to build, maintain, and secure.

This post explores three different approaches to securely upload content to an Amazon S3 bucket via HTTPS without the need to build a dedicated API or client application.

Using Amazon API Gateway as a direct proxy

The simplest option is to use API Gateway to proxy an S3 bucket. This allows you to expose S3 objects as REST APIs without additional infrastructure. Configuring an S3 integration in API Gateway also lets you manage authentication, authorization, caching, and rate limiting more easily.

This pattern allows you to implement an authorizer at the API Gateway level and requires no changes to the client application or caller. The limitation with this approach is that API Gateway has a maximum request payload size of 10 MB. For step-by-step instructions to implement this pattern, see this knowledge center article.

This is an example implementation (you can deploy this from Serverless Land):

Using Amazon API Gateway as a direct proxy

Using API Gateway with presigned URLs

The second pattern uses S3 presigned URLs, which allow you to grant access to S3 objects for a specific period, after which the URL expires. This time-bound access helps prevent unauthorized access to S3 objects and provides an additional layer of security.

They can be used to control access to specific versions or ranges of bytes within an object. This granularity allows you to fine-tune access permissions for different users or applications, and ensures that only authorized parties have access to the required data.

This avoids the 10 MB limit of API Gateway as the API is only used to generate the presigned URL, which is then used by the caller to upload directly to S3. Presigned URLs are straightforward to generate and use programmatically, but it does require the client to make two separate requests: one to generate the URL and one to upload the object. To learn more, read Uploading to Amazon S3 directly from a web or mobile application.
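As a rough illustration of the URL-generation step, the following is a minimal boto3 sketch of a Lambda handler that returns a presigned PUT URL. The bucket name and query string parameter are assumptions for illustration; the Serverless Land example defines its own resources.

import json
import boto3

s3 = boto3.client("s3")
BUCKET = "my-upload-bucket"  # assumed bucket name

def lambda_handler(event, context):
    # Generate a presigned PUT URL that is valid for five minutes
    key = event["queryStringParameters"]["key"]
    url = s3.generate_presigned_url(
        "put_object",
        Params={"Bucket": BUCKET, "Key": key},
        ExpiresIn=300,
    )
    return {"statusCode": 200, "body": json.dumps({"uploadUrl": url})}

The client then issues an HTTP PUT of the file body directly to the returned URL, bypassing API Gateway for the upload itself.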

Using API Gateway with presigned URLs

This pattern is limited by the 5GB maximum request size of the S3 Put Object API call. One way to work around this limit with this pattern is to leverage S3 multipart uploads. This requires that the client split the payload into multiple segments and send a separate request for each part.

This adds some complexity to the client and is used by libraries such as AWS Amplify that abstract away the multipart upload implementation. This allows you to upload objects up to 5TB in size. For more details, see uploading large objects to Amazon S3 using multipart upload and transfer acceleration.
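If you are building the part-by-part flow yourself rather than relying on a library, a backend could hand out one presigned URL per part. The following is a hedged boto3 sketch; the bucket, key, and part count are illustrative assumptions.

import boto3

s3 = boto3.client("s3")
BUCKET, KEY = "my-upload-bucket", "large-file.bin"  # assumed names

# 1. Start the multipart upload server-side
upload = s3.create_multipart_upload(Bucket=BUCKET, Key=KEY)
upload_id = upload["UploadId"]

# 2. Hand the client one presigned URL per part; the client PUTs each chunk to its URL
part_urls = [
    s3.generate_presigned_url(
        "upload_part",
        Params={"Bucket": BUCKET, "Key": KEY, "UploadId": upload_id, "PartNumber": n},
        ExpiresIn=3600,
    )
    for n in range(1, 4)  # e.g. three parts
]

# 3. After the client reports the ETag returned for each uploaded part, complete the upload:
# parts = [{"ETag": etag, "PartNumber": n}, ...]
# s3.complete_multipart_upload(Bucket=BUCKET, Key=KEY, UploadId=upload_id,
#                              MultipartUpload={"Parts": parts})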

An example of this pattern is available on Serverless Land.

Using Amazon CloudFront with Lambda@Edge

The final pattern leverages Amazon CloudFront instead of API Gateway. CloudFront is primarily a content delivery network (CDN) that caches and delivers content from an S3 bucket or other origin. However, CloudFront can also be used to upload data to an S3 bucket. Without any additional configuration, this would essentially make the S3 bucket publicly writable. To secure the solution so that only authenticated users can upload objects, you can use a Lambda@Edge function to verify the users’ permissions.

The maximum size of the object that you can upload with this pattern is 5GB. If you need to upload files larger than 5GB, then you must use multipart uploads. To implement this, deploy the example Serverless Land pattern:

Using Amazon CloudFront with Lambda@Edge

This pattern uses an origin access identity (OAI) to limit access to the S3 bucket so that requests can only come from CloudFront. The default OAI policy grants s3:GetObject permission; here it is changed to s3:PutObject to explicitly allow uploads and prevent read operations:

{
    "Version": "2008-10-17",
    "Id": "PolicyForCloudFrontPrivateContent",
    "Statement": [
        {
            "Sid": "1",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity <origin access identity ID>"
            },
            "Action": [
                "s3:PutObject"
            ],
            "Resource": "arn:aws:s3:::DOC-EXAMPLE-BUCKET/*"
        }
    ]
}

As CloudFront is not used to cache content, the managed cache policy is set to CachingDisabled.

There are multiple options for implementing the authorization in the Lambda@Edge function. The sample repository uses an Amazon Cognito authorizer that validates a JSON Web Token (JWT) sent as an HTTP authorization header.

Using a JWT is secure as it implies this token is dynamically vended by an Identity Provider, such as Amazon Cognito. This does mean that the caller needs a mechanism to obtain this JWT token. You are in control of this authorizer function, and the exact implementation depends on your use-case. You could instead use an API Key or integrate with an alternate identity provider such as Auth0 or Okta.

Lambda@Edge functions do not currently support environment variables. This means that the configuration parameters are dynamically resolved at runtime. In the example code, AWS Systems Manager Parameter Store is used to store the Amazon Cognito user pool ID and app client ID that is required for the token verification. For more details on how to choose where to store your configuration parameters, see Choosing the right solution for AWS Lambda external parameters.
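As a sketch of that approach, the following shows how a Lambda@Edge function written in Python might resolve its configuration from Parameter Store at module load, so later invocations in the same execution environment reuse the values. The parameter names are assumptions; the sample repository defines its own.

import boto3

# Lambda@Edge replicates the function to edge locations; this sketch reads the
# parameters from us-east-1, where the function and its configuration are deployed.
ssm = boto3.client("ssm", region_name="us-east-1")

def get_config():
    # Parameter names are illustrative placeholders
    user_pool_id = ssm.get_parameter(Name="/upload-api/user-pool-id")["Parameter"]["Value"]
    app_client_id = ssm.get_parameter(Name="/upload-api/app-client-id")["Parameter"]["Value"]
    return user_pool_id, app_client_id

# Resolve once at module load so subsequent invocations reuse the values
USER_POOL_ID, APP_CLIENT_ID = get_config()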

To verify the JWT token, the example code uses the aws-jwt-verify package. This supports JWTs issued by Amazon Cognito and third-party identity providers.

The Serverless Land pattern uses an Amazon Cognito identity provider to do authentication in the Lambda@Edge function. This code snippet shows an example using a pre-shared key for basic authorization:

import json

def lambda_handler(event, context):
    print(json.dumps(event))

    # The viewer request that CloudFront forwards to the function
    request = event["Records"][0]["cf"]["request"]
    headers = request["headers"]

    # CloudFront lower-cases header names and stores each value as a list of
    # {"key": ..., "value": ...} objects
    if "authorization" not in headers or not headers["authorization"]:
        return unauthorized()

    if headers["authorization"][0]["value"] == "my-secret-key":
        # Returning the request lets it continue to the S3 origin
        return request

    return unauthorized()

def unauthorized():
    # Returning a response object ends the request at the edge
    return {
        "status": "401",
        "statusDescription": "Unauthorized",
        "body": "Unauthorized"
    }

The Lambda function is associated with the CloudFront distribution by creating a Lambda trigger. The CloudFront event type is set to Viewer request, meaning the function is invoked for every request the client sends, including the PUT upload requests.

Add trigger

The solution can be tested with an API testing client, such as Postman. In Postman, issue a PUT request to https://<your-cloudfront-domain>/<object-name> with a binary payload as the body. You receive a 401 Unauthorized response.

Postman response

Next, add the Authorization header with a valid token and submit the request again. For more details on how to obtain a JWT from Amazon Cognito, see the README in the repository. Now the request works and you receive a 200 OK message.
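If you prefer to script the same test instead of using Postman, the following Python snippet with the requests library is equivalent. The domain, object name, and token values are placeholders.

import requests

url = "https://<your-cloudfront-domain>/test-image.jpg"  # placeholder domain and object name
token = "<jwt-from-amazon-cognito>"                      # placeholder token

with open("test-image.jpg", "rb") as f:
    resp = requests.put(url, data=f, headers={"Authorization": token})

print(resp.status_code)  # 401 without a valid token, 200 with one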

To troubleshoot, the Lambda function logs to Amazon CloudWatch Logs. For Lambda@Edge functions, look for the logs in the Region closest to the request, and not the same Region as the function.

The Lambda@Edge function in this example performs basic authorization. It validates the user has access to the requested resource. You can perform any custom authorization action here. For example, in a multi-tenant environment, you could restrict the prefix so that specific tenants only have permission to write to their own prefix, and validate the requested object name in the function.

Additionally, you could implement controls traditionally performed by the API Gateway such as throttling by tenant or user. Another use for the function is to validate the file type. If users can only upload images, you could validate the content-length to ensure the images are a certain size and the file extension is correct.
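As a sketch of what such checks could look like in the Python Lambda@Edge function, the following helper restricts writes to a tenant-specific prefix and validates the declared size and content type. The tenant prefix convention, size limit, and image-only rule are assumptions for illustration; the tenant ID would typically come from a claim in the verified JWT.

MAX_BYTES = 10 * 1024 * 1024  # assumed 10 MB limit for image uploads

def authorize(request, tenant_id):
    headers = request["headers"]

    # Only allow writes under the caller's own prefix, e.g. /tenant-123/photo.jpg
    if not request["uri"].startswith(f"/{tenant_id}/"):
        return False

    # Reject uploads that declare a size above the allowed limit
    length = headers.get("content-length", [{}])[0].get("value", "0")
    if int(length) > MAX_BYTES:
        return False

    # Only allow image uploads
    content_type = headers.get("content-type", [{}])[0].get("value", "")
    return content_type.startswith("image/")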

Conclusion

Which option you choose depends on your use case. This table summarizes the patterns discussed in this blog post:

 

| | API Gateway as a proxy | Presigned URLs with API Gateway | CloudFront with Lambda@Edge |
| --- | --- | --- | --- |
| Max Object Size | 10 MB | 5 GB (5 TB with multipart upload) | 5 GB |
| Client Complexity | Single HTTP Request | Multiple HTTP Requests | Single HTTP Request |
| Authorization Options | Amazon Cognito, IAM, Lambda Authorizer | Amazon Cognito, IAM, Lambda Authorizer | Lambda@Edge |
| Throttling | API Key throttling | API Key throttling | Custom throttling |

Each of the available methods has its strengths and weaknesses and the choice of which one to use depends on your specific needs. The maximum object size supported by S3 is 5 TB, regardless of which method you use to upload objects. Additionally, some methods have more complex configuration that requires more technical expertise. Considering these factors with your specific use-case can help you make an informed decision on the best API option for uploading to S3.

For more serverless learning resources, visit Serverless Land.

Use the Amazon Redshift Data API to interact with Amazon Redshift Serverless

Post Syndicated from Debu Panda original https://aws.amazon.com/blogs/big-data/use-the-amazon-redshift-data-api-to-interact-with-amazon-redshift-serverless/

Amazon Redshift is a fast, scalable, secure, and fully managed cloud data warehouse that makes it simple and cost-effective to analyze all your data using standard SQL and your existing ETL (extract, transform, and load), business intelligence (BI), and reporting tools. Tens of thousands of customers use Amazon Redshift to process exabytes of data per day and power analytics workloads such as BI, predictive analytics, and real-time streaming analytics. Amazon Redshift Serverless makes it convenient for you to run and scale analytics without having to provision and manage data warehouses. With Redshift Serverless, data analysts, developers, and data scientists can now use Amazon Redshift to get insights from data in seconds by loading data into and querying records from the data warehouse.

As a data engineer or application developer, for some use cases, you want to interact with the Redshift Serverless data warehouse to load or query data with a simple API endpoint without having to manage persistent connections. With the Amazon Redshift Data API, you can interact with Redshift Serverless without having to configure JDBC or ODBC. This makes it easier and more secure to work with Redshift Serverless and opens up new use cases.

This post explains how to use the Data API with Redshift Serverless from the AWS Command Line Interface (AWS CLI) and Python. If you want to use the Data API with Amazon Redshift clusters, refer to Using the Amazon Redshift Data API to interact with Amazon Redshift clusters.

Introducing the Data API

The Data API enables you to seamlessly access data from Redshift Serverless with all types of traditional, cloud-native, and containerized serverless web service-based applications and event-driven applications.

The following diagram illustrates this architecture.

The Data API simplifies data access, ingest, and egress from programming languages and platforms supported by the AWS SDK such as Python, Go, Java, Node.js, PHP, Ruby, and C++.

The Data API simplifies access to Amazon Redshift by eliminating the need for configuring drivers and managing database connections. Instead, you can run SQL commands to Redshift Serverless by simply calling a secured API endpoint provided by the Data API. The Data API takes care of managing database connections and buffering data. The Data API is asynchronous, so you can retrieve your results later. Your query results are stored for 24 hours. The Data API federates AWS Identity and Access Management (IAM) credentials so you can use identity providers like Okta or Azure Active Directory or database credentials stored in Secrets Manager without passing database credentials in API calls.

For customers using AWS Lambda, the Data API provides a secure way to access your database without the additional overhead for Lambda functions to be launched in an Amazon VPC. Integration with the AWS SDK provides a programmatic interface to run SQL statements and retrieve results asynchronously.

Relevant use cases

The Data API is not a replacement for JDBC and ODBC drivers, and is suitable for use cases where you don’t need a persistent connection to a serverless data warehouse. It’s applicable in the following use cases:

  • Accessing Amazon Redshift from custom applications with any programming language supported by the AWS SDK. This enables you to integrate web service-based applications to access data from Amazon Redshift using an API to run SQL statements. For example, you can run SQL from JavaScript.
  • Building a serverless data processing workflow.
  • Designing asynchronous web dashboards because the Data API lets you run long-running queries without having to wait for them to complete.
  • Running your query one time and retrieving the results multiple times without having to run the query again within 24 hours.
  • Building your ETL pipelines with AWS Step Functions, Lambda, and stored procedures.
  • Having simplified access to Amazon Redshift from Amazon SageMaker and Jupyter notebooks.
  • Building event-driven applications with Amazon EventBridge and Lambda.
  • Scheduling SQL scripts to simplify data load, unload, and refresh of materialized views.

The Data API GitHub repository provides examples for different use cases for both Redshift Serverless and provisioned clusters.

Create a Redshift Serverless workgroup

If you haven’t already created a Redshift Serverless data warehouse, or want to create a new one, refer to the Getting Started Guide. The guide walks you through the steps of creating a namespace and workgroup, both named default. Also, ensure that the IAM role you attach to your Redshift Serverless namespace has the AmazonS3ReadOnlyAccess permission. You can use the AWS Management Console to create an IAM role and assign Amazon Simple Storage Service (Amazon S3) privileges (refer to Loading in data from Amazon S3). In this post, we create a table and load data using the COPY command.

Prerequisites for using the Data API

You must be authorized to access the Data API. Amazon Redshift provides the RedshiftDataFullAccess managed policy, which offers full access to Data API. This policy also allows access to Redshift Serverless workgroups, Secrets Manager, and API operations needed to authenticate and access a Redshift Serverless workgroup by using IAM credentials.

You can also create your own IAM policy that allows access to specific resources by starting with RedshiftDataFullAccess as a template.

The Data API allows you to access your database either using your IAM credentials or secrets stored in Secrets Manager. In this post, we use IAM credentials.

When you federate your IAM credentials to connect with Amazon Redshift, it automatically creates a database user for the IAM user that is being used. It uses the GetCredentials API to get temporary database credentials. If you want to provide specific database privileges to your users with this API, you can use an IAM role with the tag name RedshiftDBRoles with a list of roles separated by colons. For example, if you want to assign database roles such as sales and analyst, you can have a value sales:analyst assigned to RedshiftDBRoles.
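For reference, the following is a minimal boto3 sketch of retrieving temporary credentials for a Redshift Serverless workgroup with GetCredentials. The workgroup and database names match the examples in this post; the duration is an assumption.

import boto3

serverless = boto3.client("redshift-serverless")

# Returns a temporary dbUser/dbPassword pair scoped to the calling IAM identity
creds = serverless.get_credentials(
    workgroupName="default",
    dbName="dev",
    durationSeconds=900,
)
print(creds["dbUser"], creds["expiration"])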

Use the Data API from the AWS CLI

You can use the Data API from the AWS CLI to interact with the Redshift Serverless workgroup and namespace. For instructions on configuring the AWS CLI, see Setting up the AWS CLI. The Amazon Redshift Serverless CLI (aws redshift-serverless) is a part of the AWS CLI that lets you manage Amazon Redshift workgroups and namespaces, such as creating, deleting, setting usage limits, tagging resources, and more. The Data API provides a command line interface to the AWS CLI (aws redshift-data) that allows you to interact with the databases in Redshift Serverless.

You can invoke help using the following command:

aws redshift-data help

The following table shows you the different commands available with the Data API CLI.

| Command | Description |
| --- | --- |
| list-databases | Lists the databases in a workgroup. |
| list-schemas | Lists the schemas in a database. You can filter this by a matching schema pattern. |
| list-tables | Lists the tables in a database. You can filter the tables list by a schema name pattern, a matching table name pattern, or a combination of both. |
| describe-table | Describes the detailed information about a table including column metadata. |
| execute-statement | Runs a SQL statement, which can be SELECT, DML, DDL, COPY, or UNLOAD. |
| batch-execute-statement | Runs multiple SQL statements in a batch as a part of a single transaction. The statements can be SELECT, DML, DDL, COPY, or UNLOAD. |
| cancel-statement | Cancels a running query. To be canceled, a query must not be in the FINISHED or FAILED state. |
| describe-statement | Describes the details of a specific SQL statement run. The information includes when the query started, when it finished, the number of rows processed, and the SQL statement. |
| list-statements | Lists the SQL statements run in the last 24 hours. By default, only finished statements are shown. |
| get-statement-result | Fetches the temporarily cached result of the query. The result set contains the complete result set and the column metadata. You can paginate through a set of records to retrieve the entire result as needed. |

If you want to get help on a specific command, run the following command:

aws redshift-data list-tables help

Now we look at how you can use these commands.

List databases

Most organizations use a single database in their Amazon Redshift workgroup. You can use the following command to list the databases in your Serverless endpoint. This operation requires you to connect to a database and therefore requires database credentials.

aws redshift-data list-databases --database dev --workgroup-name default

List schemas

Similar to listing databases, you can list your schemas by using the list-schemas command:

aws redshift-data list-schemas --database dev --workgroup-name default

If you have several schemas that match demo (demo, demo2, demo3, and so on), you can optionally provide a pattern to filter your results matching to that pattern:

aws redshift-data list-schemas --database dev --workgroup-name default --schema-pattern "demo%"

List tables

The Data API provides a simple command, list-tables, to list tables in your database. You might have thousands of tables in a schema; the Data API lets you paginate your result set or filter the table list by providing filter conditions.

You can search across your schema with table-pattern; for example, you can filter the table list by a table name prefix across all your schemas in the database or filter your tables list in a specific schema pattern by using schema-pattern.

The following is a code example that uses both:

aws redshift-data list-tables --database dev --workgroup-name default --schema-pattern "demo%" --table-pattern "orders%"

Run SQL commands

You can run SELECT, DML, DDL, COPY, or UNLOAD commands for Amazon Redshift with the Data API. You can optionally specify the --with-event option if you want to send an event to EventBridge after the query runs; the Data API then sends the event with the query ID and the final run status.

Create a schema

Let’s use the Data API to see how you can create a schema. The following command lets you create a schema in your database. You don’t have to run this SQL if you have pre-created the schema. You use the --sql parameter to specify your SQL commands.

aws redshift-data execute-statement --database dev --workgroup-name default \
--sql "CREATE SCHEMA demo;"

The following shows an example output of execute-statement:

{
    "CreatedAt": "2023-04-07T17:14:43.038000+00:00",
    "Database": "dev",
    "DbUser": "IAMR:Admin",
    "Id": "8e4e5af3-9af9-4567-8e70-7849515b3a79",
    "WorkgroupName": "default"
}

We discuss later in this post how you can check the status of a SQL that you ran with execute-statement.

Create a table

You can use the following command to create a table with the CLI:

aws redshift-data execute-statement --database dev --workgroup-name default  \
   --sql "CREATE TABLE demo.green_201601( \
  vendorid                VARCHAR(4), \
  pickup_datetime         TIMESTAMP, \
  dropoff_datetime        TIMESTAMP, \
  store_and_fwd_flag      VARCHAR(1), \
  ratecode                INT, \
  pickup_longitude        FLOAT4, \
  pickup_latitude         FLOAT4, \
  dropoff_longitude       FLOAT4, \
  dropoff_latitude        FLOAT4, \
  passenger_count         INT, \
  trip_distance           FLOAT4, \
  fare_amount             FLOAT4, \
  extra                   FLOAT4, \
  mta_tax                 FLOAT4, \
  tip_amount              FLOAT4, \
  tolls_amount            FLOAT4, \
  ehail_fee               FLOAT4, \
  improvement_surcharge   FLOAT4, \
  total_amount            FLOAT4, \
  payment_type            VARCHAR(4),\
  trip_type               VARCHAR(4));" 

Load sample data

The COPY command lets you load bulk data into your table in Amazon Redshift. You can use the following command to load data into the table we created earlier:

aws redshift-data execute-statement --database dev --workgroup-name default --sql "COPY demo.green_201601 \
FROM 's3://us-west-2.serverless-analytics/NYC-Pub/green/green_tripdata_2016-01' \
IAM_ROLE default \
DATEFORMAT 'auto' \
IGNOREHEADER 1 \
DELIMITER ',' \
IGNOREBLANKLINES \
REGION 'us-west-2';" 

Retrieve data

The following query uses the table we created earlier:

aws redshift-data execute-statement --database dev --workgroup-name default --sql "SELECT ratecode,  \
COUNT(*) FROM demo.green_201601 WHERE \
trip_distance > 5 GROUP BY 1 ORDER BY 1;"

The following shows an example output:

{
    "CreatedAt": "2023-04-07T17:25:16.030000+00:00",
    "Database": "dev",
    "DbUser": "IAMR:Admin",
    "Id": "cae88c08-0bb4-4279-8845-d5a8fefafade",
    "WorkgroupName": "default"
}

You can fetch results using the statement ID that you receive as an output of execute-statement.

Check the status of a statement

You can check the status of your statement by using describe-statement. The output for describe-statement provides additional details such as the PID, query duration, the number of rows in the result set, the size of the result set, and the query ID given by Amazon Redshift. You have to specify the statement ID that you get when you run the execute-statement command. See the following command:

aws redshift-data describe-statement --id cae88c08-0bb4-4279-8845-d5a8fefafade

The following is an example output:

{
     "CreatedAt": "2023-04-07T17:27:15.937000+00:00",
     "Duration": 2602410468,
     "HasResultSet": true,
     "Id": "cae88c08-0bb4-4279-8845-d5a8fefafade",
     "QueryString": " SELECT ratecode, COUNT(*) FROM 
     demo.green_201601 WHERE
     trip_distance > 5 GROUP BY 1 ORDER BY 1;",
     "RedshiftPid": 1073815670,
     "WorkgroupName": "default",
     "UpdatedAt": "2023-04-07T17:27:18.539000+00:00"
}

The status of a statement can be STARTED, FINISHED, ABORTED, or FAILED.

Run SQL statements with parameters

You can run SQL statements with parameters. The following example uses two named parameters in the SQL that is specified using a name-value pair:

aws redshift-data execute-statement --database dev --workgroup-name default --sql "select sellerid,sum(pricepaid) totalsales from sales where eventid >= :eventid and sellerid > :selrid group by sellerid"  --parameters "[{\"name\": \"selrid\", \"value\": \"100\"},{\"name\": \"eventid\", \"value\": \"100\"}]"

The describe-statement returns QueryParameters along with QueryString.

You can map the name-value pairs in the parameters list to one or more parameters in the SQL text, and the parameters can appear in any order. You can’t specify a NULL value or zero-length value as a parameter.

Cancel a running statement

If your query is still running, you can use cancel-statement to cancel a SQL query. See the following command:

aws redshift-data cancel-statement --id 39a0de2f-e85e-45ff-a0d7-cd074c348120

Fetch results from your query

You can fetch the query results by using get-statement-result. The query result is stored for 24 hours. See the following command:

aws redshift-data get-statement-result --id 7b61da88-1b11-4ade-956a-21085a29118d

The output of the result contains metadata such as the number of records fetched, column metadata, and a token for pagination.

Run multiple SQL statements

You can run multiple SELECT, DML, DDL, COPY, or UNLOAD commands for Amazon Redshift in a single transaction with the Data API. The batch-execute-statement command enables you to create tables and run multiple COPY commands, or to create temporary tables as part of your reporting system and run queries on those temporary tables. See the following code:

aws redshift-data batch-execute-statement --database dev --workgroup-name default \
--sqls "create temporary table mysales \
(firstname, lastname, total_quantity ) as \
SELECT firstname, lastname, total_quantity \
FROM   (SELECT buyerid, sum(qtysold) total_quantity \
        FROM  sales  \
        GROUP BY buyerid \
        ORDER BY total_quantity desc limit 10) Q, users \
WHERE Q.buyerid = userid \ 
ORDER BY Q.total_quantity desc;" "select * from mysales limit 100;"

The describe-statement for a multi-statement query shows the status of all sub-statements:

{
    "CreatedAt": "2023-04-10T14:01:11.257000-07:00",
    "Duration": 30564173,
    "HasResultSet": true,
    "Id": "23d99d7f-fd13-4686-92c8-e2c279715c21",
    "RedshiftPid": 1073922185,
    "RedshiftQueryId": 0,
    "ResultRows": -1,
    "ResultSize": -1,
    "Status": "FINISHED",
    "SubStatements": [
        {
            "CreatedAt": "2023-04-10T14:01:11.357000-07:00",
            "Duration": 12779028,
            "HasResultSet": false,
            "Id": "23d99d7f-fd13-4686-92c8-e2c279715c21:1",
            "QueryString": "create temporary table mysales (firstname, lastname, total_quantity ) as \nSELECT firstname, lastname, total_quantity \nFROM (SELECT buyerid, sum(qtysold) total_quantity\nFROM sales\nGROUP BY buyerid\nORDER BY total_quantity desc limit 10) Q, users\nWHERE Q.buyerid = userid\nORDER BY Q.total_quantity desc;",
            "RedshiftQueryId": 0,
            "ResultRows": 0,
            "ResultSize": 0,
            "Status": "FINISHED",
            "UpdatedAt": "2023-04-10T14:01:11.807000-07:00"
        },
        {
            "CreatedAt": "2023-04-10T14:01:11.357000-07:00",
            "Duration": 17785145,
            "HasResultSet": true,
            "Id": "23d99d7f-fd13-4686-92c8-e2c279715c21:2",
            "QueryString": "select *\nfrom mysales limit 100;",
            "RedshiftQueryId": 0,
            "ResultRows": 40,
            "ResultSize": 1276,
            "Status": "FINISHED",
            "UpdatedAt": "2023-04-10T14:01:11.911000-07:00"
        }
    ],
    "UpdatedAt": "2023-04-10T14:01:11.970000-07:00",
    "WorkgroupName": "default"
}

In the preceding example, we had two SQL statements and therefore the output includes the ID for the SQL statements as 23d99d7f-fd13-4686-92c8-e2c279715c21:1 and 23d99d7f-fd13-4686-92c8-e2c279715c21:2. Each sub-statement of a batch SQL statement has a status, and the status of the batch statement is updated with the status of the last sub-statement. For example, if the last statement has status FAILED, then the status of the batch statement shows as FAILED.

You can fetch query results for each statement separately. In our example, the first statement is a SQL statement to create a temporary table, so there are no results to retrieve for the first statement. You can retrieve the result set for the second statement by providing the statement ID for the sub-statement:

aws redshift-data get-statement-result --id 23d99d7f-fd13-4686-92c8-e2c279715c21:2

Use the Data API with Secrets Manager

The Data API allows you to use database credentials stored in Secrets Manager. You can create a secret of the type Other type of secret and then specify a username and password. Note that you can’t choose an Amazon Redshift cluster as the secret type because Redshift Serverless is different from a cluster.

Let’s assume that you created a secret for your credentials named defaultWG. You can pass that secret with the --secret-arn parameter as follows:

aws redshift-data list-tables --database dev --workgroup-name default --secret-arn defaultWG --region us-west-1

Export the data

Amazon Redshift allows you to export from database tables to a set of files in an S3 bucket by using the UNLOAD command with a SELECT statement. You can unload data in either text or Parquet format. The following command shows you an example of how to use the data lake export with the Data API:

aws redshift-data execute-statement --database dev --workgroup-name default --sql "unload ('select * from demo.green_201601') to '<your-S3-bucket>' iam_role '<your-iam-role>'; " 

You can use batch-execute-statement if you want to use multiple statements with UNLOAD or combine UNLOAD with other SQL statements.

Use the Data API from the AWS SDK

You can use the Data API in any of the programming languages supported by the AWS SDK. For this post, we use the AWS SDK for Python (Boto3) as an example to illustrate the capabilities of the Data API.

We first import the Boto3 package and establish a session:

import botocore.session as bc
import boto3

def get_client(service, endpoint=None, region="us-west-2"):
    session = bc.get_session()
    s = boto3.Session(botocore_session=session, region_name=region)
    if endpoint:
        return s.client(service, endpoint_url=endpoint)
    return s.client(service)

Get a client object

You can create a client object from the boto3.Session object by specifying the redshift-data service:

rsd = get_client('redshift-data')

If you don’t want to create a session, your client is as simple as the following code:

import boto3
client = boto3.client('redshift-data')

Run a statement

The following example code runs a statement against the default workgroup using IAM credentials. For this post, we use the table we created earlier. You can use DDL, DML, COPY, and UNLOAD in the SQL parameter:

resp = rsd.execute_statement(
    WorkgroupName="default",
    Database="dev",
    Sql="SELECT ratecode, COUNT(*) totalrides FROM demo.green_201601 WHERE trip_distance > 5 GROUP BY 1 ORDER BY 1;"
)

As we discussed earlier, running a query is asynchronous; running a statement returns an ExecuteStatementOutput, which includes the statement ID.

If you want to publish an event to EventBridge when the statement is complete, you can use the additional parameter WithEvent set to true:

resp = rsd.execute_statement(
    Database="dev",
    WorkgroupName="default",
    Sql="SELECT ratecode, COUNT(*) totalrides FROM demo.green_201601 WHERE trip_distance > 5 GROUP BY 1 ORDER BY 1;",
    WithEvent=True
)
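
You can also pass named parameters from the SDK, mirroring the CLI example shown earlier in this post. The following is a minimal sketch using the same sample query:

resp = rsd.execute_statement(
    Database="dev",
    WorkgroupName="default",
    Sql="select sellerid, sum(pricepaid) totalsales from sales "
        "where eventid >= :eventid and sellerid > :selrid group by sellerid",
    Parameters=[
        {"name": "eventid", "value": "100"},
        {"name": "selrid", "value": "100"},
    ],
)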

Describe a statement

You can use describe_statement to find the status of the query and number of records retrieved:

qid = resp["Id"]
desc = rsd.describe_statement(Id=qid)
if desc["Status"] == "FINISHED":
    print(desc["ResultRows"])

Fetch results from your query

You can use get_statement_result to retrieve results for your query if your query is complete:

if desc and desc["ResultRows"]  > 0:
    result = rsd.get_statement_result(Id=qid)

The get_statement_result command returns a JSON object that includes metadata for the result and the actual result set. You might need to process the data to format the result if you want to display it in a user-friendly format.
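Large result sets are returned in pages; when more data is available, the response includes a NextToken. The following is a small sketch that drains all pages, reusing the rsd client and a statement ID from the previous snippets:

def fetch_all_records(statement_id):
    records, token = [], None
    while True:
        kwargs = {"Id": statement_id}
        if token:
            kwargs["NextToken"] = token
        page = rsd.get_statement_result(**kwargs)
        records.extend(page["Records"])
        token = page.get("NextToken")
        if not token:
            return records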

Fetch and format results

For this post, we demonstrate how to format the results with the Pandas framework. The post_process function processes the metadata and results to populate a DataFrame. The query function retrieves the result from a database in your Redshift Serverless workgroup. See the following code:

import pandas as pd

def post_process(meta, records):
    columns = [k["name"] for k in meta]
    rows = []
    for r in records:
        tmp = []
        for c in r:
            tmp.append(c[list(c.keys())[0]])
        rows.append(tmp)
    return pd.DataFrame(rows, columns=columns)

def query(sql, workgroup="default", database="dev"):
    resp = rsd.execute_statement(
        Database=database,
        WorkgroupName=workgroup,
        Sql=sql
    )
    qid = resp["Id"]
    print(qid)
    # Poll until the statement finishes or fails
    while True:
        desc = rsd.describe_statement(Id=qid)
        if desc["Status"] in ("FINISHED", "FAILED"):
            break
    print(desc["ResultRows"])
    if desc["Status"] == "FINISHED" and desc["ResultRows"] > 0:
        result = rsd.get_statement_result(Id=qid)
        rows, meta = result["Records"], result["ColumnMetadata"]
        return post_process(meta, rows)

pf = query("select * from demo.customer_activity limit 100;")
print(pf)

In this post, we demonstrated using the Data API with Python with Redshift Serverless. However, you can use the Data API with other programming languages supported by the AWS SDK. You can read how Roche democratized access to Amazon Redshift data using the Data API with Google Sheets. You can also address this type of use case with Redshift Serverless.

Best practices

We recommend the following best practices when using the Data API:

  • Federate your IAM credentials to the database to connect with Amazon Redshift. Redshift Serverless allows users to get temporary database credentials with GetCredentials. Redshift Serverless scopes the access to the specific IAM user and the database user is automatically created.
  • Use a custom policy to provide fine-grained access to the Data API in the production environment if you don’t want your users to use temporary credentials. You have to use Secrets Manager to manage your credentials in such use cases.
  • Don’t retrieve large result sets through your client. The Data API limits result retrieval to 100 MB; use the UNLOAD command to export larger query results to Amazon S3.
  • Don’t forget to retrieve your results within 24 hours; results are stored only for 24 hours.

Conclusion

In this post, we introduced how to use the Data API with Redshift Serverless. We also demonstrated how to use the Data API from the AWS CLI and from Python using the AWS SDK. Additionally, we discussed best practices for using the Data API.

To learn more, refer to Using the Amazon Redshift Data API or visit the Data API GitHub repository for code examples.


About the authors

Debu Panda is a Senior Manager, Product Management at AWS. He is an industry leader in analytics, application platform, and database technologies, and has more than 25 years of experience in the IT world. Debu has published numerous articles on analytics, enterprise Java, and databases and has presented at multiple conferences such as re:Invent, Oracle OpenWorld, and JavaOne. He is the lead author of EJB 3 in Action (Manning Publications 2007, 2014) and Middleware Management (Packt).

Fei Peng is a Software Dev Engineer working in the Amazon Redshift team.

How CyberCRX cut ML processing time from 8 days to 56 minutes with AWS Step Functions Distributed Map

Post Syndicated from Marcia Villalba original https://aws.amazon.com/blogs/aws/how-cybercrx-cut-ml-processing-time-from-8-days-to-56-minutes-with-aws-step-functions-distributed-map/

Last December, Sébastien Stormacq wrote about the availability of a distributed map state for AWS Step Functions, a new feature that allows you to orchestrate large-scale parallel workloads in the cloud. That’s when Charles Burton, a data systems engineer for a company called CyberGRX, found out about it and refactored his workflow, reducing the processing time for his machine learning (ML) processing job from 8 days to 56 minutes. Before, running the job required an engineer to constantly monitor it; now, it runs in less than an hour with no support needed. In addition, the new implementation with AWS Step Functions Distributed Map costs less than what it did originally.

What CyberGRX achieved with this solution is a perfect example of what serverless technologies embrace: letting the cloud do as much of the undifferentiated heavy lifting as possible so the engineers and data scientists have more time to focus on what’s important for the business. In this case, that means continuing to improve the model and the processes for one of the key offerings from CyberGRX, a cyber risk assessment of third parties using ML insights from its large and growing database.

What’s the business challenge?
CyberGRX shares third-party cyber risk management (TPCRM) data with its customers. They predict, with high confidence, how a third-party company will respond to a risk assessment questionnaire. To do this, they have to run their predictive model on every company in their platform; they currently have predictive data on more than 225,000 companies. Whenever there’s a new company or the data changes for a company, they regenerate their predictive model by processing their entire dataset. Over time, CyberGRX data scientists improve the model or add new features to it, which also requires the model to be regenerated.

The challenge is running this job for 225,000 companies in a timely manner, with as few hands-on resources as possible. The job runs a set of operations for each company, and every company calculation is independent of other companies. This means that in the ideal case, every company can be processed at the same time. However, implementing such a massive parallelization is a challenging problem to solve.

First iteration
With that in mind, the company built their first iteration of the pipeline using Kubernetes and Argo Workflows, an open-source container-native workflow engine for orchestrating parallel jobs on Kubernetes. These were tools they were familiar with, as they were already using them in their infrastructure.

But as soon as they tried to run the job for all the companies on the platform, they ran up against the limits of what their system could handle efficiently. Because the solution depended on a centralized controller, Argo Workflows, it was not robust, and the controller was scaled to its maximum capacity during this time. At that time, they only had 150,000 companies. And running the job with all of the companies took around 8 days, during which the system would crash and need to be restarted. It was very labor intensive, and it always required an engineer on call to monitor and troubleshoot the job.

The tipping point came when Charles joined the Analytics team at the beginning of 2022. One of his first tasks was to do a full model run on approximately 170,000 companies at that time. The model run lasted the whole week and ended at 2:00 AM on a Sunday. That’s when he decided their system needed to evolve.

Second iteration
With the pain of the last time he ran the model fresh in his mind, Charles thought through how he could rewrite the workflow. His first thought was to use AWS Lambda and SQS, but he realized that he needed an orchestrator in that solution. That’s why he chose Step Functions, a serverless service that helps you automate processes, orchestrate microservices, and create data and ML pipelines; plus, it scales as needed.

Charles got the new version of the workflow with Step Functions working in about 2 weeks. The first step he took was adapting his existing Docker image to run in Lambda using Lambda’s container image packaging format. Because the container already worked for his data processing tasks, this update was simple. He scheduled Lambda provisioned concurrency to make sure that all functions he needed were ready when he started the job. He also configured reserved concurrency to make sure that Lambda would be able to handle this maximum number of concurrent executions at a time. In order to support so many functions executing at the same time, he raised the concurrent execution quota for Lambda per account.
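As a rough illustration of that configuration (not CyberGRX’s actual code), the following boto3 calls reserve concurrency for the function and provision concurrency on a published alias. The function name, alias, and numbers are assumptions; reserved concurrency of this size also assumes the account-level quota has been raised, as described above.

import boto3

lambda_client = boto3.client("lambda")

# Reserve capacity so other workloads in the account cannot starve this function
lambda_client.put_function_concurrency(
    FunctionName="model-run-worker",       # assumed function name
    ReservedConcurrentExecutions=9000,     # assumed value, after raising the account quota
)

# Pre-initialize execution environments on a published alias before the job starts
lambda_client.put_provisioned_concurrency_config(
    FunctionName="model-run-worker",
    Qualifier="live",                      # assumed alias
    ProvisionedConcurrentExecutions=1000,  # assumed value
)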

And to make sure that the steps were run in parallel, he used Step Functions and the map state. The map state allowed Charles to run a set of workflow steps for each item in a dataset, with the iterations running in parallel. Because the Step Functions map state offers a maximum of 40 concurrent iterations and CyberGRX needed more parallelization, they created a solution that launched multiple state machines in parallel; in this way, they were able to iterate fast across all the companies. Creating this complex solution required a preprocessor that handled the heuristics of the concurrency of the system and split the input data across multiple state machines.

This second iteration was already better than the first one, as now it was able to finish the execution with no problems, and it could iterate over 200,000 companies in 90 minutes. However, the preprocessor was a very complex part of the system, and it was hitting the limits of the Lambda and Step Functions APIs due to the amount of parallelization.

Second iteration with AWS Step Functions

Third and final iteration
Then, during AWS re:Invent 2022, AWS announced a distributed map for Step Functions, a new type of map state that allows you to write Step Functions to coordinate large-scale parallel workloads. Using this new feature, you can easily iterate over millions of objects stored in Amazon Simple Storage Service (Amazon S3), and then the distributed map can launch up to 10,000 parallel sub-workflows to process the data.

When Charles read in the News Blog article about the 10,000 parallel workflow executions, he immediately thought about trying this new state. In a couple of weeks, Charles built the new iteration of the workflow.

Because the distributed map state split the input into different processors and handled the concurrency of the different executions, Charles was able to drop the complex preprocessor code.

The new process was the simplest that it’s ever been; now whenever they want to run the job, they just upload a file to Amazon S3 with the input data. This action triggers an Amazon EventBridge rule that targets the state machine with the distributed map. The state machine then executes with that file as an input and publishes the results to an Amazon Simple Notification Service (Amazon SNS) topic.

Final iteration with AWS Step Functions

What was the impact?
A few weeks after completing the third iteration, they had to run the job on all 227,000 companies in their platform. When the job finished, Charles’ team was blown away; the whole process took only 56 minutes to complete. They estimated that during those 56 minutes, the job ran more than 57 billion calculations.

Processing of the Distributed Map State

The following image shows an Amazon CloudWatch graph of the concurrent executions for one Lambda function during the time that the workflow was running. There are almost 10,000 functions running in parallel during this time.

Lambda concurrency CloudWatch graph

Simplifying and shortening the time to run the job opens a lot of possibilities for CyberGRX and the data science team. The benefits started right away the moment one of the data scientists wanted to run the job to test some improvements they had made for the model. They were able to run it independently without requiring an engineer to help them.

And, because the predictive model itself is one of the key offerings from CyberGRX, the company now has a more competitive product since the predictive analysis can be refined on a daily basis.

To learn more about using AWS Step Functions, check the Serverless Workflows Collection available on Serverless Land, where you can test and learn more about this new capability.

Marcia

AWS Lambda now supports Java 17

Post Syndicated from Benjamin Smith original https://aws.amazon.com/blogs/compute/java-17-runtime-now-available-on-aws-lambda/

This post was written by Mark Sailes, Senior Specialist Solutions Architect, Serverless.

You can now develop AWS Lambda functions with the Amazon Corretto distribution of Java 17. This version of Corretto comes with long-term support (LTS), which means it will receive updates and bug fixes for an extended period, providing stability and reliability to developers who build applications on it. This runtime also supports AWS Lambda SnapStart, so you can upgrade to the latest managed runtime without losing your performance improvements.

Java 17 comes with new language features for developers, including Java records, sealed classes, and multi-line strings. It also comes with improvements to further optimize running Java on ARM CPU architectures, such as Graviton.

This blog explains how to get started using Java 17 with Lambda, how to use the new language features, and what else has changed with the runtime.

New language features

In Java, it is common to pass data using an immutable object. Before Java 17, this resulted in boilerplate code or the use of an external library like Lombok. For example, a generic Person object may look like this:

public class Person {
    
    private final String name;
    private final int age;

    public Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    public String getName() {
        return name;
    }

    public int getAge() {
        return age;
    }
    
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;

        Person person = (Person) o;

        if (age != person.age) return false;
        return Objects.equals(name, person.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, age);
    }

    @Override
    public String toString() {
        return "Person{" +
                "name='" + name + '\'' +
                ", age=" + age +
                '}';
    }
}

In Java 17, you can replace this entire class with a record, expressed as:

public record Person(String name, int age) {

}

The equals, hashCode, and toString methods, as well as the private, final fields and public constructor, are generated by the Java compiler. This simplifies the code that you have to maintain.

The Java 17 managed runtime lets developers use records as the object that represents event data in the handler method. Records, introduced in Java 14, provide a simpler syntax for declaring classes that are primarily used to store data: you define an immutable class with a set of named properties and methods to access those properties, which makes records a good fit for event data. This simplifies code, making it easier to read and maintain. It can also improve performance, because records are immutable by default and the Java runtime can optimize memory allocation and garbage collection. To use a record as the parameter of the event handler method, define the record with the required properties and pass the record to the method.

For example, the following Lambda function uses a Person record to represent the event data:

public class App implements RequestHandler<Person, APIGatewayProxyResponseEvent> {

    public APIGatewayProxyResponseEvent handleRequest(Person person, Context context) {
        
        String id = UUID.randomUUID().toString();
        Optional<Person> savedPerson = createPerson(id, person.name(), person.age());
        if (savedPerson.isPresent()) {
            return new APIGatewayProxyResponseEvent().withStatusCode(200);
        } else {
            return new APIGatewayProxyResponseEvent().withStatusCode(500);
        }
    }
}

Garbage collection

Java 17 makes two additional Java garbage collectors (GCs) available: the Z Garbage Collector (ZGC), introduced in Java 15, and Shenandoah, introduced in Java 12.

You can evaluate GCs against three axes:

  • Throughput: the amount of work that can be done.
  • Latency: how long work takes to complete.
  • Memory footprint: how much additional memory is required.

Both the ZGC and Shenandoah GCs trade throughput and footprint to focus on reducing latency where possible. They perform all expensive work concurrently, without stopping the execution of application threads for more than a few milliseconds.

In the Java 17 managed runtime, Lambda continues to use the Serial GC, as it does in Java 11. This is a low-footprint GC that is well suited to the single-processor environments typical of Lambda functions.

You can change the default GC to an alternative using the JAVA_TOOL_OPTIONS environment variable, if required. For example, if you are running with more memory, and therefore multiple vCPUs, consider the Parallel GC. To use this, set JAVA_TOOL_OPTIONS to -XX:+UseParallelGC.
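For example, a minimal AWS SAM sketch (the function name, handler, code location, and memory size are placeholders) that applies this setting through the environment variable:

ParallelGcFunction:
  Type: AWS::Serverless::Function
  Properties:
    Runtime: java17
    Handler: helloworld.App::handleRequest
    CodeUri: HelloWorldFunction
    MemorySize: 2048
    Environment:
      Variables:
        JAVA_TOOL_OPTIONS: "-XX:+UseParallelGC"   # switch from the default Serial GC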

Runtime JVM configuration changes

In the Java 17 runtime, the JVM flag for tiered compilation is now set to stop at level 1 by default. In previous versions, you would have to do this by setting the JAVA_TOOL_OPTIONS to -XX:+TieredCompilation -XX:TieredStopAtLevel=1.

This is helpful for the majority of synchronous workloads because it can reduce startup latency by up to 60%. For more information on configuring tiered compilation, see “Optimizing AWS Lambda function performance for Java”.

If you are running a workload that processes large numbers of batches, simulates events, or performs any other highly repetitive action, you might find that this setting increases the duration of your function. An example of this is Monte Carlo simulations. To change back to the previous settings, set JAVA_TOOL_OPTIONS to -XX:-TieredCompilation.

Using Java 17 in Lambda

AWS Management Console

To use the Java 17 runtime to develop your Lambda functions, set the runtime value to Java 17 when creating or updating a function.

To update an existing Lambda function to Java 17, navigate to the function in the Lambda console, then choose Edit in the Runtime settings panel. The new version is available in the Runtime dropdown:

AWS Serverless Application Model (AWS SAM)

In AWS SAM, set the Runtime attribute to java17 to use this version:

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Simple Lambda Function

Resources:
  HelloWorldFunction:
    Type: AWS::Serverless::Function 
    Properties:
      CodeUri: HelloWorldFunction
      Handler: helloworld.App::handleRequest
      Runtime: java17
      MemorySize: 1024

AWS SAM supports generating this template with Java 17 out of the box for new serverless applications using the sam init command. Refer to the AWS SAM documentation for details.

AWS Cloud Development Kit (AWS CDK)

In the AWS CDK, set the runtime attribute to Runtime.JAVA_17 to use this version. In Java:

import software.amazon.awscdk.core.Construct;
import software.amazon.awscdk.core.Stack;
import software.amazon.awscdk.core.StackProps;
import software.amazon.awscdk.services.lambda.Code;
import software.amazon.awscdk.services.lambda.Function;
import software.amazon.awscdk.services.lambda.Runtime;

public class InfrastructureStack extends Stack {

    public InfrastructureStack(final Construct parent, final String id, final StackProps props) {
        super(parent, id, props);

        Function.Builder.create(this, "HelloWorldFunction")
                .runtime(Runtime.JAVA_17)
                .code(Code.fromAsset("target/hello-world.jar"))
                .handler("helloworld.App::handleRequest")
                .memorySize(1024)
                .build();
    }
}

Application frameworks

The Java application frameworks Spring and Micronaut have announced that their latest versions, Spring Boot 3 and Micronaut 4, require Java 17 as a minimum, while Quarkus 3 continues to support Java 11. Java 17 is faster than Java 8 or 11, and framework developers want to pass those performance improvements on to customers. They also want to use the improvements to the Java language in their own code and to show code examples using the most modern ways of working.

To try Micronaut 4 and Java 17, you can use the Micronaut launch web service to generate an example project that includes all the application code and AWS Cloud Development Kit (CDK) infrastructure as code you need to deploy it to Lambda.

The following command creates a Micronaut application that uses the common controller pattern to handle REST requests. The generated infrastructure code creates an Amazon API Gateway endpoint and proxies all requests to the Lambda function.

curl --location --request GET 'https://launch.micronaut.io/create/default/blog.example.lambda-java-17?lang=JAVA&build=MAVEN&test=JUNIT&javaVersion=JDK_17&features=amazon-api-gateway&features=aws-cdk&features=crac' --output lambda-java-17.zip

Unzip the downloaded file, then run the following Maven command to generate the deployable artifact.

./mvnw package

Finally, deploy the resources to AWS with CDK:

cd infra
cdk deploy

Conclusion

This blog post describes how to create a new Lambda function running the Amazon Corretto Java 17 managed runtime. It introduces the new records language feature to model the event being sent to your Lambda function and explains how changes to the default JVM configuration might affect the performance of your functions.

If you’re interested in learning more, visit serverlessland.com. If this has inspired you to try migrating an existing application to Lambda, read our re-platforming guide.

How the BMW Group analyses semiconductor demand with AWS Glue

Post Syndicated from Göksel SARIKAYA original https://aws.amazon.com/blogs/big-data/how-the-bmw-group-analyses-semiconductor-demand-with-aws-glue/

This is a guest post co-written by Maik Leuthold and Nick Harmening from BMW Group.

The BMW Group is headquartered in Munich, Germany, where the company oversees 149,000 employees and manufactures cars and motorcycles in over 30 production sites across 15 countries. This multinational production strategy follows an even more international and extensive supplier network.

Like many automobile companies across the world, the BMW Group has been facing challenges in its supply chain due to the worldwide semiconductor shortage. Creating transparency about the BMW Group’s current and future demand for semiconductors is one key strategic aspect of resolving shortages together with suppliers and semiconductor manufacturers. The manufacturers need to know the BMW Group’s exact current and future semiconductor volumes, which effectively helps steer the available worldwide supply.

The main requirement is to have an automated, transparent, and long-term semiconductor demand forecast. Additionally, this forecasting system needs to provide data enrichment steps including byproducts, serve as the master data around the semiconductor management, and enable further use cases at the BMW Group.

To enable this use case, we used the BMW Group’s cloud-native data platform called the Cloud Data Hub. In 2019, the BMW Group decided to re-architect and move its on-premises data lake to the AWS Cloud to enable data-driven innovation while scaling with the dynamic needs of the organization. The Cloud Data Hub processes and combines anonymized data from vehicle sensors and other sources across the enterprise to make it easily accessible for internal teams creating customer-facing and internal applications. To learn more about the Cloud Data Hub, refer to BMW Group Uses AWS-Based Data Lake to Unlock the Power of Data.

In this post, we share how the BMW Group analyzes semiconductor demand using AWS Glue.

Logic and systems behind the demand forecast

The first step towards the demand forecast is the identification of semiconductor-relevant components of a vehicle type. Each component is described by a unique part number, which serves as a key in all systems to identify this component. A component can be a headlight or a steering wheel, for example.

For historic reasons, the required data for this aggregation step is siloed and represented differently in diverse systems. Because each source system and data type has its own schema and format, it’s particularly difficult to perform analytics on this data. Some source systems are already available in the Cloud Data Hub (for example, part master data), so it’s straightforward to consume them from our AWS account. To access the remaining data sources, we need to build specific ingest jobs that read data from the respective systems.

The following diagram illustrates the approach.

The data enrichment starts with an Oracle Database (Software Parts) that contains part numbers that are related to software. This can be the control unit of a headlight or a camera system for automated driving. Because semiconductors are the basis for running software, this database builds the foundation of our data processing.

In the next step, we use REST APIs (Part Relations) to enrich the data with further attributes. This includes how parts are related (for example, a specific control unit that will be installed into a headlight) and over which timespan a part number will be built into a vehicle. The knowledge about the part relations is essential to understand how a specific semiconductor, in this case the control unit, is relevant for a more general part, the headlight. The temporal information about the use of part numbers allows us to filter out outdated part numbers, which will not be used in the future and therefore have no relevance in the forecast.

The part master data can be consumed directly from the Cloud Data Hub. This database includes attributes about the status and material types of a part number. This information is required to filter out part numbers that we gathered in the previous steps but that have no relevance for semiconductors. With the information gathered from the APIs, this data is also queried to extract further part numbers that weren’t ingested in the previous steps.

After data enrichment and filtering, a third-party system reads the filtered part data and enriches the semiconductor information. Subsequently, it adds the volume information of the components. Finally, it provides the overall semiconductor demand forecast centrally to the Cloud Data Hub.

Applied services

Our solution uses the serverless services AWS Glue and Amazon Simple Storage Service (Amazon S3) to run ETL (extract, transform, and load) workflows without managing infrastructure. It also reduces costs because you pay only for the time that jobs are running. The serverless approach fits our workflow’s schedule well because we run the workload only once a week.

Because we’re using diverse data source systems as well as complex processing and aggregation, it’s important to decouple ETL jobs. This allows us to process each data source independently. We also split the data transformation into several modules (Data Aggregation, Data Filtering, and Data Preparation) to make the system more transparent and easier to maintain. This approach also helps in case of extending or modifying existing jobs.

Although each module is specific to a data source or a particular data transformation, we utilize reusable blocks inside of every job. This allows us to unify each type of operation and simplifies the procedure of adding new data sources and transformation steps in the future.

In our setup, we follow the security best practice of least privilege to ensure the information is protected from accidental or unnecessary access. Therefore, each module has AWS Identity and Access Management (IAM) roles with only the necessary permissions, namely access only to the data sources and buckets that the job works with. For more information regarding security best practices, refer to Security best practices in IAM.
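The following CloudFormation sketch illustrates this idea for one module; it is not the BMW Group’s actual template, and the role and bucket names are placeholders:

PartRelationsIngestRole:
  Type: AWS::IAM::Role
  Properties:
    AssumeRolePolicyDocument:
      Version: "2012-10-17"
      Statement:
        - Effect: Allow
          Principal:
            Service: glue.amazonaws.com
          Action: sts:AssumeRole
    Policies:
      - PolicyName: IngestSourceBucketAccess
        PolicyDocument:
          Version: "2012-10-17"
          Statement:
            - Effect: Allow
              Action:
                - s3:GetObject
              Resource: arn:aws:s3:::example-part-relations-source/*   # hypothetical source bucket
            - Effect: Allow
              Action:
                - s3:PutObject
              Resource: arn:aws:s3:::example-ingest-target/*           # hypothetical target bucket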

Solution overview

The following diagram shows the overall workflow where several AWS Glue jobs are interacting with each other sequentially.

As mentioned earlier, we use the Cloud Data Hub, the Oracle database, and other data sources that we communicate with via REST APIs. The first step of the solution is the Data Source Ingest module, which ingests the data from the different data sources. For that purpose, AWS Glue jobs read information from the different data sources and write it into the S3 source buckets. Ingested data is stored in encrypted buckets, and the keys are managed by AWS Key Management Service (AWS KMS).

After the Data Source Ingest step, intermediate jobs aggregate and enrich the tables with other data sources, such as component versions and categories, model manufacturing dates, and so on. They then write the results into the intermediate buckets in the Data Aggregation module, creating a comprehensive data representation. Additionally, according to the business logic workflow, the Data Filtering and Data Preparation modules create the final master data table with only current and production-relevant information.

The AWS Glue workflow manages all of these ingestion and filtering jobs end to end. A workflow schedule is configured to run weekly on Wednesdays. While the workflow is running, each job publishes execution results (info or error) to Amazon Simple Notification Service (Amazon SNS) and writes logs to Amazon CloudWatch for monitoring purposes. Amazon SNS forwards the execution results to monitoring tools such as email, Teams, or Slack channels. If a job fails, Amazon SNS also alerts the listeners so they can take action.
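As a sketch of how such a weekly schedule can be expressed (the trigger name, workflow, first job, and exact start time are all placeholders, not the BMW Group’s actual configuration), an AWS Glue trigger in CloudFormation might look like this:

WeeklyWorkflowTrigger:
  Type: AWS::Glue::Trigger
  Properties:
    Name: semiconductor-demand-weekly-trigger        # hypothetical trigger name
    Type: SCHEDULED
    Schedule: cron(0 5 ? * WED *)                    # every Wednesday at 05:00 UTC (placeholder time)
    StartOnCreation: true
    WorkflowName: !Ref SemiconductorDemandWorkflow   # hypothetical AWS::Glue::Workflow resource
    Actions:
      - JobName: !Ref DataSourceIngestJob            # hypothetical first job in the workflow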

As the last step of the solution, the third-party system reads the master table from the prepared data bucket via Amazon Athena. After further data engineering steps like semiconductor information enrichment and volume information integration, the final master data asset is written into the Cloud Data Hub. With the data now provided in the Cloud Data Hub, other use cases can use this semiconductor master data without building several interfaces to different source systems.

Business outcome

The project results give the BMW Group substantial transparency about its semiconductor demand across the entire vehicle portfolio, both now and in the future. A database of this magnitude enables the BMW Group to establish further use cases that improve supply chain transparency and allow a clearer and deeper exchange with first-tier suppliers and semiconductor manufacturers. It helps not only to resolve the current demanding market situation, but also to become more resilient in the future. It is therefore a major step toward a digital, transparent supply chain.

Conclusion

This post describes how to analyze semiconductor demand from many data sources with big data jobs in an AWS Glue workflow. A serverless architecture with minimal diversity of services makes the code base and architecture simple to understand and maintain. To learn more about how to use AWS Glue workflows and jobs for serverless orchestration, visit the AWS Glue service page.


About the authors

Maik Leuthold is a Project Lead at the BMW Group for advanced analytics in the business field of supply chain and procurement, and leads the digitalization strategy for the semiconductor management.

Nick Harmening is an IT Project Lead at the BMW Group and an AWS certified Solutions Architect. He builds and operates cloud-native applications with a focus on data engineering and machine learning.

Göksel Sarikaya is a Senior Cloud Application Architect at AWS Professional Services. He enables customers to design scalable, cost-effective, and competitive applications through the innovative production of the AWS platform. He helps them to accelerate customer and partner business outcomes during their digital transformation journey.

Alexander Tselikov is a Data Architect at AWS Professional Services who is passionate about helping customers to build scalable data, analytics and ML solutions to enable timely insights and make critical business decisions.

Rahul Shaurya is a Senior Big Data Architect at Amazon Web Services. He helps and works closely with customers building data platforms and analytical applications on AWS. Outside of work, Rahul loves taking long walks with his dog Barney.

Building private serverless APIs with AWS Lambda and Amazon VPC Lattice

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-private-serverless-apis-with-aws-lambda-and-amazon-vpc-lattice/

This post was written by Josh Kahn, Tech Leader, Serverless.

Amazon VPC Lattice is a new, generally available application networking service that simplifies connectivity between services. Builders can connect, secure, and monitor services on instances, containers, or serverless compute in a simplified and consistent manner.

VPC Lattice supports AWS Lambda functions as both a target and a consumer of services. This blog post explores how to incorporate VPC Lattice into your serverless workloads to simplify private access to HTTP-based APIs built with Lambda.

Overview

VPC Lattice is an application networking service that enables discovery and connectivity of services across VPCs and AWS accounts. VPC Lattice includes features that allow builders to define policies for network access, traffic management, and monitoring. It also supports custom domain names for private endpoints.

VPC Lattice is composed of several key components:

  • Service network – a logical grouping mechanism for a collection of services on which you can apply common policies. Associate one or more VPCs to allow access from services in the VPC to the service network.
  • Service – a unit of software that fulfills a specific task or function. Services using VPC Lattice can run on instances, containers, or serverless compute. This post focuses on services built with Lambda functions.
  • Target group – in a serverless application, a Lambda function that performs business logic in response to a request. Routing rules within the service route requests to the appropriate target group.
  • Auth policy – an AWS Identity and Access Management (IAM) resource policy that can be associated with a service network and a service that defines access to those services.

VPC Lattice enables connectivity across VPC and account boundaries, while alleviating the complexity of the underlying networking. It supports HTTP/HTTPS and gRPC protocols, though gRPC is not currently applicable for Lambda target groups.

VPC Lattice and Lambda

Lambda is one of the options to build VPC Lattice services. The AWS Lambda console supports VPC Lattice as a trigger, similar to previously existing triggers such as Amazon API Gateway and Amazon EventBridge. You can also connect VPC Lattice as an event source using infrastructure as code, such as AWS CloudFormation and Terraform.

To configure VPC Lattice as a trigger for a Lambda function in the Console, navigate to the desired function and select the Configuration tab. Select the Triggers menu on the left and then choose Add trigger.

The trigger configuration wizard allows you to define a new VPC Lattice service provided by the Lambda function or to add to an existing service. When adding to an existing service, the wizard allows configuration of path-based routing that sends requests to the target group that includes the function. Path-based and other routing mechanisms available from VPC Lattice are useful in migration scenarios.

This example shows creating a new service. Provide a unique name for the service and select the desired VPC Lattice service network. If you have not created a service network yet, follow the link to create one in the VPC console (for more details, read the VPC Lattice documentation).

The listener configuration allows you to configure the protocol and port on which the service is accessible. HTTPS (port 443) is the default configuration, though you can also configure the listener for HTTP (port 80). Note that configuring the listener for HTTP does not change the behavior of Lambda: it is still invoked by VPC Lattice over an HTTPS endpoint, but the service endpoint is available as HTTP. Choose Add to complete setup.

In addition to configuring the VPC Lattice service and target group, the Lambda wizard also adds a resource policy to the function that allows the VPC Lattice target group to invoke the function.

Add trigger
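Outside of the console wizard, you can describe the same pieces with infrastructure as code. The following CloudFormation sketch is an approximation only: the resource names are placeholders, and the Lambda permission principal and source ARN attribute are assumptions rather than details confirmed by this post.

LatticeTargetGroup:
  Type: AWS::VpcLattice::TargetGroup
  Properties:
    Name: my-lambda-target-group              # placeholder name
    Type: LAMBDA
    Targets:
      - Id: !GetAtt MyServiceFunction.Arn     # placeholder Lambda function

LatticeListener:
  Type: AWS::VpcLattice::Listener
  Properties:
    ServiceIdentifier: !Ref LatticeService    # placeholder AWS::VpcLattice::Service
    Protocol: HTTPS
    Port: 443
    DefaultAction:
      Forward:
        TargetGroups:
          - TargetGroupIdentifier: !Ref LatticeTargetGroup

LatticeInvokePermission:
  Type: AWS::Lambda::Permission
  Properties:
    FunctionName: !Ref MyServiceFunction
    Action: lambda:InvokeFunction
    Principal: vpc-lattice.amazonaws.com       # assumed service principal
    SourceArn: !GetAtt LatticeTargetGroup.Arn  # assumed attribute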

VPC Lattice integration

When a client sends a request to a VPC Lattice service backed by a Lambda target group, VPC Lattice synchronously invokes the target Lambda function. During a synchronous invocation, the client waits for the result of the function, and all retry handling is performed by the client. VPC Lattice has an idle timeout of one minute and a connection timeout of ten minutes for both the client and the target.

The event payload received by the Lambda function when invoked by VPC Lattice is similar to the following example. Note that base64 encoding is dependent on the content type.

{
    "body": "{ "\userId\": 1234, \"orderId\": \"5C71D3EB-3B8A-457B-961D\" }",
    "headers": {
        "accept": "application/json, text/plain, */*",
        "content-length": "156",
        "user-agent": "axios/1.3.4",
        "host": "myvpclattice-service-xxxx.xxxx.vpc-lattice-svcs.us-east-2.on.aws",
        "x-forwarded-for": "10.0.129.151"
    },
    "is_base64_encoded": false,
    "method": "PUT",
    "query_string_parameters": {
        "action": "add"
    },
    "raw_path": "/points?action=add"
}

The response payload returned by the Lambda function includes a status code, headers, base64 encoding, and an optional body as shown in the following example. A response payload that does not meet the required specification results in an error. To return binary content, you must set isBase64Encoded to true.

{
    "isBase64Encoded": false,
    "statusCode": 200,
    "statusDescription": "200 OK",
    "headers": {
        "Set-Cookie": "cookies",
        "Content-Type": "application/json"
    },
    "body": "Hello from Lambda (optional)"
}

For more details on the integration between VPC Lattice and Lambda, visit the Lambda documentation.

Calling VPC Lattice services from Lambda

VPC Lattice services support connectivity over HTTP/HTTPS and gRPC protocols, as well as open access or authorization using IAM. To call a VPC Lattice service, the Lambda function must be attached to a VPC that is associated with a VPC Lattice service network.

While a function that calls a VPC Lattice service must be associated with an appropriate VPC, a Lambda function that is part of a VPC Lattice service target group does not need to be attached to a VPC. Remember that Lambda functions are always invoked via an AWS endpoint, with access controlled by AWS IAM.
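For example, a minimal AWS SAM sketch for such a caller function; the function details, subnet IDs, and security group are placeholders for resources in a VPC associated with the service network:

LatticeClientFunction:
  Type: AWS::Serverless::Function
  Properties:
    Runtime: nodejs18.x
    Handler: app.handler
    CodeUri: src/
    VpcConfig:
      SecurityGroupIds:
        - sg-0123456789abcdef0        # placeholder security group
      SubnetIds:
        - subnet-0123456789abcdef0    # placeholder subnets in the associated VPC
        - subnet-0fedcba9876543210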

Calls to a VPC Lattice service are similar to requests to any other HTTP/HTTPS service. VPC Lattice allows builders to define an optional auth policy to enforce authentication and perform context-specific authorization, and to implement network-level controls with security groups. Callers of the service must meet both the networking and the authorization requirements to access the service. VPC Lattice blocks traffic that does not explicitly meet all conditions before your function is invoked.

A Lambda function that calls a VPC Lattice service must have explicit permission to invoke that service, unless the auth type for the service is NONE. You provide that permission through a policy attached to the Lambda function’s execution role, for example:

{
    "Action": "vpc-lattice-svcs:Invoke",
    "Resource": "arn:aws:vpc-lattice:us-east-2:123456789012:service/svc-123abc/*",
    "Effect": "Allow"
}

If the auth policy associated with your service network or service requires authenticated requests, any requests made to that service must contain a valid request signature computed using Signature Version 4 (SigV4). An example of computing a SigV4 signature can be found in the VPC Lattice documentation. VPC Lattice does not support payload signing at this time. In TypeScript, you can sign a request using the AWS SDK and Axios library as follows:

import { SignatureV4 } from "@aws-sdk/signature-v4";
import { Sha256 } from "@aws-crypto/sha256-js";
import axios from "axios";

const endpointUrl = new URL(VPC_LATTICE_SERVICE_ENDPOINT);
const sigv4 = new SignatureV4({
    service: "vpc-lattice-svcs",
    region: process.env.AWS_REGION!,
    credentials: {
        accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
        secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
        sessionToken: process.env.AWS_SESSION_TOKEN
    },
    sha256: Sha256
});

const signedRequest = await sigv4.sign({
    method: "PUT",
    hostname: endpointUrl.host,
    path: endpointUrl.pathname,
    protocol: endpointUrl.protocol,
    headers: {
        'Content-Type': 'application/json',
        host: endpointUrl.hostname,
        // Include following header as VPC Lattice does not support signed payloads
        "x-amz-content-sha256": "UNSIGNED-PAYLOAD"
    }
  });    
  const { data } = await axios({
    ...signedRequest,
    data: {
        // some data
    },
    url: VPC_LATTICE_SERVICE_ENDPOINT
  });

VPC Lattice provides several layers of security controls, including network-level and auth policies, that allow (or deny) access from a client to your service. These controls can be implemented at the service network, applying those controls across all services in the network.

Connecting to any VPC Lattice service

VPC Lattice supports services built using Amazon EKS and Amazon EC2 in addition to Lambda. Calling services built using these other compute options looks exactly the same to the caller as the preceding sample. VPC Lattice provides an endpoint that abstracts how the service itself is actually implemented.

A Lambda function configured to access resources in a VPC can potentially access VPC Lattice services that are part of the service network associated with that VPC. IAM permissions, the auth policy associated with the service, and security groups may also impact whether the function can invoke the service (see VPC Lattice documentation for details on securing your services).

Services deployed to an Amazon EKS cluster can also invoke Lambda functions exposed as VPC Lattice services using native Kubernetes semantics. They can use either the VPC Lattice-generated domain name or a configured custom domain name to invoke the Lambda function instead of API Gateway or an Application Load Balancer (ALB). Refer to this blog post on the AWS Container Blog for details on how an Amazon EKS service invokes a VPC Lattice service with access control enabled.

Building private serverless APIs

With the launch of VPC Lattice, AWS now offers several options to build serverless APIs accessible only within your customer VPC. These options include API Gateway, ALB, and VPC Lattice. Each of these services offers a unique set of features and trade-offs that may make one a better fit for your workload than others.

Private APIs with API Gateway provide a rich set of features, including throttling, caching, and API keys. API Gateway also offers a rich set of authorization and routing options. Detailed networking and DNS knowledge may be required in complex environments. Both network-level and resource policy controls are available to control access and the OpenAPI specification allows schema sharing.

Application Load Balancer provides flexibility and a rich set of routing options, including to a variety of targets. ALB also can offer a static IP address via AWS Global Accelerator. Detailed networking knowledge is required to configure cross-VPC/account connectivity. ALB relies on network-level controls.

Service networks in VPC Lattice simplify access to services on EC2, EKS, and Lambda across VPCs and accounts without requiring detailed knowledge of networking and DNS. VPC Lattice provides a centralized means of managing access control and guardrails for service-to-service communication. VPC Lattice also readily supports custom domain names and routing features (path, method, header) that enable customers to build complex private APIs without the complexity of managing networking. VPC Lattice can be used to provide east-west interservice communication in combination with API Gateway and AWS AppSync to provide public endpoints for your services.

Conclusion

We’re excited about the simplified connectivity now available with VPC Lattice. Builders can focus on creating customer value and differentiated features instead of complex networking in much the same way that Lambda allows you to focus on writing code. If you are interested in learning more about VPC Lattice, we recommend the VPC Lattice User Guide.

To learn more about serverless, visit Serverless Land for a wide array of reusable patterns, tutorials, and learning materials.

Let’s Architect! Getting started with containers

Post Syndicated from Luca Mezzalira original https://aws.amazon.com/blogs/architecture/lets-architect-getting-started-with-containers/

Most AWS customers building cloud-native applications or modernizing applications choose containers to run their microservices, accelerating innovation and time to market while lowering their total cost of ownership (TCO). Using containers on AWS comes with other benefits, such as increased portability, scalability, and flexibility.

The combination of container technologies and AWS services also provides features such as load balancing, auto scaling, and service discovery, making it easier to deploy and manage applications at scale.

In this edition of Let’s Architect!, we share useful resources to help you get started with containers on AWS.

Container Build Lens

This whitepaper describes the Container Build Lens for the AWS Well-Architected Framework. It helps customers review and improve their cloud-based architectures and better understand the business impact of their design decisions. The document describes general design principles for containers, as well as specific best practices and implementation guidance using the Six Pillars of the Well-Architected Framework.

Take me to explore the Container Build Lens!

Follow the Container Build Lens best practices to architect your container-based workloads.

EKS Workshop

The EKS Workshop is a useful resource to familiarize yourself with Amazon Elastic Kubernetes Service (Amazon EKS) by practicing on real use-cases. It is built to help users learn about Amazon EKS features and integrations with popular open-source projects. The workshop is abstracted into high-level learning modules, including Networking, Security, DevOps Automation, and more. These are further broken down into standalone labs focusing on a particular feature, tool, or use case.

Once you’re done experimenting with EKS Workshop, start building your environments with Amazon EKS Blueprints, a collection of Infrastructure as Code (IaC) modules that helps you configure and deploy consistent, batteries-included Amazon EKS clusters across accounts and regions following AWS best practices. Amazon EKS Blueprints are available in both Terraform and CDK.

Take me to this workshop!

The workshop is abstracted into high-level learning modules, including Networking, Security, DevOps Automation, and more.

Architecting for resiliency on AWS App Runner

Learn how to architect a highly available and resilient application using AWS App Runner. With App Runner, you can start with just the source code of your application or a container image. The complexity of running containerized applications is abstracted away, including the cloud resources needed to run your web application or API. App Runner manages load balancers, TLS certificates, auto scaling, logs, metrics, traceability, and more, so you can focus on implementing your business logic in a highly scalable and elastic environment.

Take me to this blog post!

A high-level architecture for an available and resilient application with AWS App Runner.

Securing Kubernetes: How to address Kubernetes attack vectors

As part of designing any modern system on AWS, it is necessary to think about the security implications and what can affect your security posture. This session introduces the fundamentals of the Kubernetes architecture and common attack vectors. It also includes security controls provided by Amazon EKS and suggestions on how to address them. With these strategies, you can learn how to reduce risk for your Kubernetes-based workloads.

Take me to this video!

Some common attack vectors that need addressing with Kubernetes

See you next time!

Thanks for exploring architecture tools and resources with us!

Next time we’ll talk about serverless.

To find all the posts from this series, check out the Let’s Architect! page of the AWS Architecture Blog.

Implementing error handling for AWS Lambda asynchronous invocations

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/implementing-error-handling-for-aws-lambda-asynchronous-invocations/

This blog is written by Poornima Chand, Senior Solutions Architect, Strategic Accounts and Giedrius Praspaliauskas, Senior Solutions Architect, Serverless.

AWS Lambda functions allow both synchronous and asynchronous invocations, which have different behaviors and error handling:

When you invoke a function synchronously, Lambda returns any unhandled errors in the function code back to the caller, and the caller can then decide how to handle them. With asynchronous invocations, the caller does not wait for a response from the function code; it hands off the event to the Lambda service, which handles the rest of the process.

As the caller does not have visibility of any downstream errors, error handling for asynchronous invocations can be more challenging and must be implemented at the Lambda service layer.

This post explains the error behaviors and approaches for handling errors in Lambda asynchronous invocations to build reliable serverless applications.

Overview

AWS services such as Amazon S3, Amazon SNS, and Amazon EventBridge invoke Lambda functions asynchronously. When you invoke a function asynchronously, the Lambda service places the event in an internal queue and returns a success response without additional information. A separate process reads the events from the queue and sends those to the function.

You can configure how a Lambda function handles errors by implementing error handling within the function code, by using the error handling features provided by the Lambda service, or both. The following diagram depicts the options for observing and handling errors in asynchronous invocations.

Architectural overview

Understanding the error behavior

When you invoke a function, two types of errors can occur. Invocation errors occur if the Lambda service rejects the request before the function receives it (throttling and system errors (400-series and 500-series)). Function errors occur when the function’s code or runtime returns an error (exceptions and timeouts). The Lambda service retries the function invocation if it encounters unhandled errors in an asynchronous invocation.

The retry behavior is different for invocation errors and function errors. For function errors, the Lambda service retries twice by default, and these additional invocations incur cost. For throttling and system errors, the service returns the event to the event queue and attempts to run the function again for up to 6 hours, using exponential backoff. You can control the default retry behavior by setting the maximum age of an event (up to 6 hours) and the number of retry attempts (0, 1, or 2). This allows you to limit the number of retries and avoid retrying obsolete events.
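For example, an AWS SAM sketch (the function name and handler are placeholders) that limits retries to one attempt and discards events older than one hour could look like the following:

ProcessOrderEvents:
    Type: AWS::Serverless::Function
    Properties:
      Description: Function that processes incoming order events
      Handler: src/process_order_events.lambda_handler
      EventInvokeConfig:
        MaximumEventAgeInSeconds: 3600   # Discard events older than one hour
        MaximumRetryAttempts: 1          # Retry failed asynchronous invocations once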

Handling the errors

Depending on the error type and behaviors, you can use the following options to implement error handling in Lambda asynchronous invocations.

Lambda function code

The most typical approach to handling errors is to address failures directly in the function code. While implementing this approach varies across programming languages, it commonly involves the use of a try/catch block in your code.

Error handling within the code may not cover all potential errors that could occur during the invocation. It may also affect Lambda error metrics in CloudWatch if you suppress the error. You can address these scenarios by using the error handling features provided by Lambda.

Failure destinations

You can configure Lambda to send an invocation record to another service, such as Amazon SQS, Amazon SNS, another Lambda function, or EventBridge, using Lambda destinations. The invocation record contains details about the request and response in JSON format. You can configure separate destinations for events that are processed successfully and for events that fail all processing attempts.

With failure destinations, after exhausting all retries, Lambda sends a JSON document with details about the invocation and error to the destination. You can use this information to determine re-processing strategy (for example, extended logging, separate error flow, manual processing).

For example, to use Lambda destinations in an AWS Serverless Application Model (AWS SAM) template:

ProcessOrderForShipping:
    Type: AWS::Serverless::Function
    Properties:
      Description: Function that processes order before shipping
      Handler: src/process_order_for_shipping.lambda_handler
      EventInvokeConfig:
        DestinationConfig:
          OnSuccess:
            Type: SQS
            Destination: !GetAtt ShipmentsJobsQueue.Arn 
          OnFailure:
            Type: Lambda
            Destination: !GetAtt ErrorHandlingFunction.Arn

Dead-letter queues

You can use dead-letter queues (DLQ) to capture failed events for re-processing. With DLQs, message attributes capture error details. You can configure a standard SQS queue or standard SNS topic as a dead-letter queue for discarded events. For dead-letter queues, Lambda only sends the content of the event, without details about the response.

This is an example of using dead-letter queues in an AWS SAM template:

SendOrderToShipping:
    Type: AWS::Serverless::Function
    Properties:
      Description: Function that sends order to shipping
      Handler: src/send_order_to_shipping.lambda_handler
      DeadLetterQueue:
        Type: SQS
        TargetArn: !GetAtt OrderShippingFunctionDLQ.Arn 

Design considerations

There are a number of design considerations when choosing between these error handling options:

  • Error handling within the function code works well for issues that you can easily address in the code. For example, retrying database transactions in the case of failures because of disruptions in network connectivity.
  • Scenarios that require complex error handling logic (for example, sending failed messages for manual re-processing) are better handled using Lambda service features. This approach would keep the function code simpler and easy to maintain.
  • Even though the dead-letter queue’s behavior is the same as an on-failure destination, a dead-letter queue is part of a function’s version-specific configuration.
  • Invocation records sent to on-failure destinations contain more information about the failure than DLQ message attributes. This includes the failure condition, error message, stack trace, request, and response payloads.
  • Lambda destinations also support additional targets, such as other Lambda functions and EventBridge. This allows destinations to give you more visibility and control of function execution results, and reduce code.

Gaining visibility into errors

Understanding behavior and errors cannot rely on error handling alone.

You also want to know why errors occur so that you can address the underlying issues. You must also know when the error rate is elevated, what the expected baseline for errors is, and what else is happening in the system when errors occur. Monitoring and observability, including metrics, logs, and tracing, bring visibility to the errors and the underlying issues.

Metrics

When a function finishes processing an event, Lambda sends metrics about the invocation to Amazon CloudWatch. This includes metrics for the errors that happen during the invocation that you should monitor and react to:

  • Errors – the number of invocations that result in a function error (including exceptions thrown by both your code and the Lambda runtime).
  • Throttles – the number of invocation requests that are throttled (note that throttled requests and other invocation errors don’t count as errors in the previous metric).

There are also metrics specific to the errors in asynchronous invocations, which you can alarm on as shown in the sketch after this list:

  • AsyncEventsDropped – the number of events that are dropped without successfully running the function.
  • DeadLetterErrors – the number of times that Lambda attempts to send an event to a dead-letter queue (DLQ) but fails (typically because of mis-configured resources or size limits).
  • DestinationDeliveryFailures – the number of times that Lambda attempts to send an event to a destination but fails (typically because of permissions, mis-configured resources, or size limits).
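As a sketch, the following CloudFormation alarm watches the DeadLetterErrors metric for the order-processing function from the earlier template; the alarm name and notification topic are placeholders:

DeadLetterErrorsAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmName: order-processing-dead-letter-errors    # hypothetical alarm name
    Namespace: AWS/Lambda
    MetricName: DeadLetterErrors
    Dimensions:
      - Name: FunctionName
        Value: !Ref ProcessOrderForShipping
    Statistic: Sum
    Period: 300
    EvaluationPeriods: 1
    Threshold: 1
    ComparisonOperator: GreaterThanOrEqualToThreshold
    TreatMissingData: notBreaching
    AlarmActions:
      - !Ref ErrorNotificationTopic                   # hypothetical SNS topic for alerts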

CloudWatch Logs

Lambda automatically sends logs to Amazon CloudWatch Logs. You can write to these logs using the standard logging functionality for your programming language. The resulting logs are in the CloudWatch Logs group that is specific to your function, named /aws/lambda/<function name>. You can use CloudWatch Logs Insights to query logs across multiple functions.

AWS X-Ray

AWS X-Ray can visualize the components of your application, identify performance bottlenecks, and troubleshoot requests that resulted in an error. Keep in mind that X-Ray does not trace all requests: the sampling rate is one request per second plus 5 percent of additional requests (this is non-configurable). Do not rely on X-Ray as the only tool when troubleshooting a particular failed invocation, as it may be missing from the sampled traces.

Conclusion

This blog post walks through error handling in the asynchronous Lambda function invocations using various approaches and discusses how to gain observability into those errors.

For more detail on the topics covered, visit:

For more serverless learning resources, visit Serverless Land.

Understanding techniques to reduce AWS Lambda costs in serverless applications

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/understanding-techniques-to-reduce-aws-lambda-costs-in-serverless-applications/

This post is written by Josh Kahn, Tech Leader, and Chloe Jeon, Senior PMT-ES, Lambda.

Serverless applications can lower the total cost of ownership (TCO) when compared to a server-based cloud execution model because they effectively shift operational responsibilities, such as managing servers, to a cloud provider. Deloitte research on serverless TCO with Fortune 100 clients across industries shows that serverless applications can offer up to 57% cost savings compared with server-based solutions.

Serverless applications can offer lower costs in:

  • Initial development: Serverless enables builders to be more agile, deliver features more rapidly, and focus on differentiated business logic.
  • Ongoing maintenance and infrastructure: Serverless shifts the operational burden to AWS, including ongoing maintenance tasks such as patching and operating system updates.

This post focuses on options available to reduce direct AWS costs when building serverless applications. AWS Lambda is often the compute layer in these workloads and may comprise a meaningful portion of the overall cost.

To help optimize your Lambda-related costs, this post discusses some of the most commonly used cost optimization techniques with an emphasis on configuration changes over code updates. This post is intended for architects and developers new to building with serverless.

Building with serverless makes both experimentation and iterative improvement easier. All of the techniques described here can be applied before application development, or after you have deployed your application to production. The techniques are ordered roughly by applicability: the first can apply to any workload, while the last applies to a smaller number of workloads.

Right-sizing your Lambda functions

Lambda uses a pay-per-use cost model that is driven by three metrics:

  • Memory configuration: the memory allocated to the function, from 128 MB to 10,240 MB. CPU and other resources available to the function are allocated proportionally to memory.
  • Function duration: the time the function runs, measured in milliseconds.
  • Number of invocations: the number of times your function runs.

Over-provisioning memory is one of the primary drivers of increased Lambda cost. This is particularly acute among builders new to Lambda who are used to provisioning hosts running multiple processes. Lambda scales such that each execution environment of a function only handles one request at a time. Each execution environment has access to fixed resources (memory, CPU, storage) to complete work on the request.

By right-sizing the memory configuration, the function has the resources to complete its work and you are paying for only the needed resources. While you also have direct control of function duration, this is a less effective cost optimization to implement. The engineering costs to create a few milliseconds of savings may outweigh the cost savings. Depending on the workload, the number of times your function is invoked may be outside your control. The next section discusses a technique to reduce the number of invocations for some types of workloads.

Memory configuration is accessible via the AWS Management Console or your favorite infrastructure as code (IaC) option. The memory configuration setting defines allocated memory, not memory used by your function. Right-sizing memory is an adjustment that can reduce the cost (or increase performance) of your function. However, lowering the function-memory may not always result in cost savings. Lowering function memory means lowering available CPU for the Lambda function, which could increase the function duration, resulting in either no cost savings or higher cost. It is important to identify the optimal memory configuration for cost savings while preserving performance.

AWS offers two approaches to right-sizing memory allocation: AWS Lambda Power Tuning and AWS Compute Optimizer.

AWS Lambda Power Tuning is an open-source tool that can be used to empirically find the optimal memory configuration for your function by trading off cost against execution time. The tool runs multiple concurrent versions of your function against mock input data at different memory allocations. The result is a chart that can help you find the “sweet spot” between cost and duration/performance. Depending on the workload, you may prioritize one over the other. AWS Lambda Power Tuning is a good choice for new functions and can also help select between the two instruction set architectures offered by Lambda.

AWS Power Tuning Tool

AWS Compute Optimizer uses machine learning to recommend an optimal memory configuration based on historical data. Compute Optimizer requires that your function be invoked at least 50 times over the trailing 14 days to provide a recommendation based on past utilization, so is most effective once your function is in production.

Both Lambda Power Tuning and Compute Optimizer help derive the right-sized memory allocation for your function. Use this value to update the configuration of your function using the AWS Management Console or IaC.

This post includes AWS Serverless Application Model (AWS SAM) sample code throughout to demonstrate how to implement optimizations. You can also use AWS Cloud Development Kit (AWS CDK), Terraform, Serverless Framework, and other IaC tools to implement the same changes.

MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    Runtime: nodejs18.x
    Handler: app.handler
    MemorySize: 1024   # Set memory configuration to optimize for cost or performance

Setting a realistic function timeout

Lambda functions are configured with a maximum time that each invocation can run, up to 15 minutes. Setting an appropriate timeout can be beneficial in containing costs of your Lambda-based application. Unhandled exceptions, blocking actions (for example, opening a network connection), slow dependencies, and other conditions can lead to longer-running functions or functions that run until the configured timeout. Proper timeouts are the best protection against both slow and erroneous code. At some point, the work the function is performing and the per-millisecond cost of that work is wasted.

Our recommendation is to set a timeout of less than 29 seconds for all synchronous invocations, or those in which the caller is waiting for a response. Longer timeouts are appropriate for asynchronous invocations, but consider timeouts longer than one minute to be an exception that requires review and testing.
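For example, in AWS SAM (the function name and handler are placeholders):

MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    Runtime: nodejs18.x
    Handler: app.handler
    Timeout: 29   # Keep synchronous invocations under the 29-second recommendation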

Using Graviton

Lambda offers two instruction set architectures in most AWS Regions: x86 and arm64.

Choosing Graviton can save money in two ways. First, your functions may run more efficiently due to the Graviton2 architecture. Second, you may pay less for the time that they run. Lambda functions powered by Graviton2 are designed to deliver up to 19 percent better performance at 20 percent lower cost. Consider starting with Graviton when developing new Lambda functions, particularly those that do not require natively compiled binaries.

If your function relies on native compiled binaries or is packaged as a container image, you must rebuild to move between arm64 and x86. Lambda layers may also include dependencies targeted for one architecture or the other. We encourage you to review dependencies and test your function before changing the architecture. The AWS Lambda Power Tuning tool also allows you to compare the price and performance of arm64 and x86 at different memory settings.

You can modify the architecture configuration of your function in the console or your IaC of choice. For example, in AWS SAM:

MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    Architectures:
      - arm64 # Set architecture to use Graviton2
    Runtime: nodejs18.x
    Handler: app.handler

Filtering incoming events

Lambda is integrated with over 200 event sources, including Amazon SQS, Amazon Kinesis Data Streams, Amazon DynamoDB Streams, Amazon Managed Streaming for Apache Kafka, and Amazon MQ. The Lambda service integrates with these event sources to retrieve messages and invokes your function as needed to process those messages.

When working with one of these event sources, builders can configure filters to limit the events sent to your function. This technique can greatly reduce the number of times your function is invoked depending on the number of events and specificity of your filters. When not using event filtering, the function must be invoked to first determine if an event should be processed before performing the actual work. Event filtering alleviates the need to perform this upfront check while reducing the number of invocations.

For example, you may only want a function to run when orders of over $200 are found in a message on a Kinesis data stream. You can configure an event filtering pattern using the console or IaC in a manner similar to memory configuration.

To implement the Kinesis stream filter using AWS SAM:

MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    Runtime: nodejs18.x
    Handler: app.handler
    Events:
      OrderStream:
        Type: Kinesis
        Properties:
          StartingPosition: LATEST
          Stream: "arn:aws:kinesis:us-east-1:0123456789012:stream/orders"
          FilterCriteria:
            Filters:
              - Pattern: '{ "data" : { "order" : { "value" : [{ "numeric": [">", 200] }] } } }'

If an event satisfies one of the event filters associated with the event source, Lambda sends the event to your function for processing. Otherwise, the event is discarded as processed successfully without invoking the function.

If you are building or running a Lambda function that is invoked by one of the previously mentioned event sources, it’s recommended that you review the filtering options available. This technique requires no code changes to your Lambda function – even if the function performs some preprocessing check, that check still completes successfully with filtering implemented.

To learn more, read Filtering event sources for AWS Lambda functions.

Avoiding recursion

You may be familiar with the programming concept of a recursive function, a function or routine that calls itself. Though rare, customers sometimes unintentionally build recursion into their architecture, so that a Lambda function continuously calls itself.

The most common recursive pattern is between Lambda and Amazon S3. Actions in an S3 bucket can trigger a Lambda function, and recursion can occur when that Lambda function writes back to the same bucket.

Consider a use case in which a Lambda function is used to generate a thumbnail of user-submitted images. You configure the bucket to trigger the thumbnail generation function when a new object is put in the bucket. What happens if the Lambda function writes the thumbnail to the same bucket? The process starts anew and the Lambda function then runs on the thumbnail image itself. This is recursion and can lead to an infinite loop condition.

While there are multiple ways to prevent this condition, it's a best practice to use a second S3 bucket to store thumbnails. This approach minimizes changes to the architecture, as you do not need to change the notification settings or the primary S3 bucket. To learn about other approaches, read Avoiding recursive invocation with Amazon S3 and AWS Lambda.
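
As a rough sketch of the two-bucket approach, the handler below reads the uploaded image from the source bucket and writes the generated thumbnail to a separate destination bucket. The bucket names, environment variable, and thumbnail helper are illustrative assumptions, not part of the original example.

import os
import boto3

s3 = boto3.client("s3")
# The destination bucket is different from the source bucket to avoid recursion.
THUMBNAIL_BUCKET = os.environ.get("THUMBNAIL_BUCKET", "my-thumbnails-bucket")


def handler(event, context):
    for record in event.get("Records", []):
        source_bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        image = s3.get_object(Bucket=source_bucket, Key=key)["Body"].read()
        thumbnail = generate_thumbnail(image)  # hypothetical image-processing helper

        # Writing to a second bucket means this upload never re-triggers the function.
        s3.put_object(Bucket=THUMBNAIL_BUCKET, Key=f"thumbnails/{key}", Body=thumbnail)


def generate_thumbnail(image_bytes):
    # Placeholder for real image resizing logic (for example, with Pillow).
    return image_bytes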

If you do encounter recursion in your architecture, set the Lambda function's reserved concurrency to zero to stop it from running, and allow anywhere from minutes to hours before lifting the cap. Because S3 events are asynchronous invocations with automatic retries, you may continue to see recursive invocations until you resolve the issue or all pending events have expired.

Conclusion

Lambda offers a number of techniques that you can use to minimize infrastructure costs whether you are just getting started with Lambda or have numerous functions already deployed in production. When combined with the lower costs of initial development and ongoing maintenance, serverless can offer a low total cost of ownership. Get hands-on with these techniques and more with the Serverless Optimization Workshop.

To learn more about serverless architectures, find reusable patterns, and keep up-to-date, visit Serverless Land.

Configure SAML federation for Amazon OpenSearch Serverless with AWS IAM Identity Center

Post Syndicated from Utkarsh Agarwal original https://aws.amazon.com/blogs/big-data/configure-saml-federation-for-amazon-opensearch-serverless-with-aws-iam-identity-center/

Amazon OpenSearch Serverless is a serverless option of Amazon OpenSearch Service that makes it easy for you to run large-scale search and analytics workloads without having to configure, manage, or scale OpenSearch clusters. It automatically provisions and scales the underlying resources to deliver fast data ingestion and query responses for even the most demanding and unpredictable workloads. With OpenSearch Serverless, you can configure SAML to enable users to access data through OpenSearch Dashboards using an external SAML identity provider (IdP).

AWS IAM Identity Center (successor to AWS Single Sign-On) helps you securely create or connect your workforce identities and manage their access centrally across AWS accounts and applications, including OpenSearch Dashboards.

In this post, we show you how to configure SAML authentication for OpenSearch Dashboards using IAM Identity Center as its IdP.

Solution overview

The following diagram illustrates how the solution allows users or groups to authenticate into OpenSearch Dashboards using single sign-on (SSO) with IAM Identity Center using its built-in directory as the identity source.

The workflow steps are as follows:

  1. A user accesses the OpenSearch Dashboard URL in their browser and chooses the SAML provider.
  2. OpenSearch Serverless redirects the login to the specified IdP.
  3. The IdP provides a login form for the user to specify the credentials for authentication.
  4. After the user is authenticated successfully, a SAML assertion is sent back to OpenSearch Serverless.

  5. OpenSearch Serverless validates the SAML assertion, and the user logs in to OpenSearch Dashboards.

Prerequisites

To get started, you must have an active OpenSearch Serverless collection. Refer to Creating and managing Amazon OpenSearch Serverless collections to learn more about creating a collection. Furthermore, you must have the correct AWS Identity and Access Management (IAM) permissions for configuring SAML authentication along with relevant IAM permissions for configuring the data access policy.

IAM Identity Center should be enabled, and you should have the relevant IAM permissions to create an application in IAM Identity Center and create and manage users and groups.

Create and configure the application in IAM Identity Center

To set up your application in IAM Identity Center, complete the following steps:

  1. On the IAM Identity Center dashboard, choose Applications in the navigation pane.
  2. Choose Add application.
  3. For Custom application, select Add custom SAML 2.0 application.
  4. Choose Next.
  5. Under Configure application, enter a name and description for the application.
  6. Under IAM Identity Center metadata, choose Download under IAM Identity Center SAML metadata file.

We use this metadata file to create a SAML provider under OpenSearch Serverless. It contains the public certificate used to verify the signature of the IAM Identity Center SAML assertions.

  7. Under Application properties, leave Application start URL and Relay state blank.
  8. For Session duration, choose 1 hour (the default value).

Note that the session duration you configure in this step takes precedence over the OpenSearch Dashboards timeout setting specified in the configuration of the SAML provider details on the OpenSearch Serverless end.

  9. Under Application metadata, select Manually type your metadata values.
  10. For Application ACS URL, enter your URL using the format https://collection.<REGION>.aoss.amazonaws.com/_saml/acs. For example, we enter https://collection.us-east-1.aoss.amazonaws.com/_saml/acs for this post.
  11. For Application SAML audience, enter your service provider in the format aws:opensearch:<aws account id>.
  12. Choose Submit.

Now you modify the attribute settings. The attribute mappings you configure here become part of the SAML assertion that is sent to the application.

  13. On the Actions menu, choose Edit attribute mappings.
  14. Configure Subject to map to ${user:email}, with the format unspecified.

Using ${user:email} here ensures that the email address for the user in IAM Identity Center is passed in the <NameId> tag of the SAML response.

  15. Choose Save changes.

Now we assign a user to the application.

  16. Create a user in IAM Identity Center to use to log in to OpenSearch Dashboards.

Alternatively, you can use an existing user.

  17. On the IAM Identity Center console, navigate to your application, choose Assign Users, and select the users that you want to assign.

You have now created a custom SAML application. Next, you will configure the SAML provider in OpenSearch Serverless.

Create a SAML provider

The SAML provider you create in this step can be assigned to any collection in the same Region. Complete the following steps:

  1. On the OpenSearch Service console, under Serverless in the navigation pane, choose SAML authentication under Security.
  2. Choose Create SAML provider.
  3. Enter a name and description for your SAML provider.
  4. Enter the metadata from your IdP that you downloaded earlier.
  5. Under Additional settings, you can optionally add custom user ID and group attributes. We leave these settings blank for now.
  6. Choose Create SAML provider.

You have now configured a SAML provider for OpenSearch Serverless. Next, we walk you through configuring the data access policy for accessing collections.
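
If you prefer to script this step, the following is a minimal sketch using the AWS SDK for Python (Boto3) and the OpenSearch Serverless CreateSecurityConfig API. The provider name, description, and metadata file path are placeholder assumptions.

import boto3

aoss = boto3.client("opensearchserverless")

# Read the IAM Identity Center SAML metadata file downloaded earlier.
with open("identity-center-metadata.xml") as f:
    saml_metadata = f.read()

aoss.create_security_config(
    name="identity-center-saml",  # placeholder provider name
    type="saml",
    description="SAML provider backed by IAM Identity Center",
    samlOptions={
        "metadata": saml_metadata,
        "sessionTimeout": 60,  # minutes
    },
)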

Create the data access policy

In this section, you set up data access policies for OpenSearch Serverless and allow access to the users. Complete the following steps:

  1. On the OpenSearch Service console, under Serverless in the navigation pane, choose Data access policies under Security.
  2. Choose Create access policy.
  3. Enter a name and description for your access policy.
  4. For Policy definition method, select Visual Editor.
  5. In the Rules section, enter a rule name.
  6. Under Select principals, for Add principals, choose SAML users and groups.
  7. For SAML provider name, choose the SAML provider you created earlier.
  8. Specify the user in the format user/<email address>.

The value of the email address should match the email address in IAM Identity Center.

  9. Choose Save.
  10. Choose Grant and specify the permissions.

You can configure what access you want to provide for the specific user at the collection level and specific indexes at the index pattern level.

You should select the access the user needs based on the least privilege model. Refer to Supported policy permissions and Supported OpenSearch API operations and permissions to set up more granular access for your users.

  11. Choose Save and configure any additional rules, if required.

You can now review and edit your configuration if needed.

  12. Choose Create to create the data access policy.

Now you have the data access policy that will allow the users to perform the allowed actions on OpenSearch Dashboards.
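
You can also create the same data access policy programmatically. The sketch below uses Boto3 and assumes placeholder values for the account ID, SAML provider name, collection name, and user email; the permissions shown are illustrative and should follow the least privilege guidance above.

import json
import boto3

aoss = boto3.client("opensearchserverless")

policy = [
    {
        "Description": "Allow a SAML user to query the collection",
        "Rules": [
            {
                "ResourceType": "collection",
                "Resource": ["collection/my-collection"],
                "Permission": ["aoss:DescribeCollectionItems"],
            },
            {
                "ResourceType": "index",
                "Resource": ["index/my-collection/*"],
                "Permission": ["aoss:ReadDocument", "aoss:DescribeIndex"],
            },
        ],
        # Principal format: saml/<account-id>/<saml-provider-name>/user/<email>
        "Principal": ["saml/123456789012/identity-center-saml/user/jane@example.com"],
    }
]

aoss.create_access_policy(
    name="dashboards-user-access",
    type="data",
    policy=json.dumps(policy),
)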

Access OpenSearch Dashboards

To sign in to OpenSearch Dashboards, complete the following steps:

  1. On the OpenSearch Service dashboard, under Serverless in the navigation pane, choose Dashboard.
  2. Locate your dashboard and copy the OpenSearch Dashboards URL (in the format <collection-endpoint>/_dashboards).
  3. Enter this URL into a new browser tab.
  4. On the OpenSearch login page, choose your IdP and specify your SSO credentials.
  5. Choose Login.

Configure SAML authentication using groups in IAM Identity Center

Groups can help you organize your users and permissions in a coherent way. With groups, you can add multiple users from the IdP, and then use the group ID as the identifier in the data access policy. For more information, refer to Add groups and Add users to groups.

To configure group access to OpenSearch Dashboards, complete the following steps:

  1. On the IAM Identity Center console, navigate to your application.
  2. In the Attribute mappings section, add an additional attribute named group, map it to ${user:groups}, and leave the format as unspecified.
  3. Choose Save changes.
  4. For the SAML provider in OpenSearch Serverless, under Additional settings, for Group attribute, enter group.
  5. For the data access policy, create a new rule or add an additional principal in the previous rule.
  6. Choose the SAML provider name and enter group/<GroupId>.

You can fetch the value for the group ID by navigating to the Group section on the IAM Identity Center console.

Clean up

If you don’t want to continue using the solution, be sure to delete the resources you created:

  1. On the IAM Identity Center console, remove the application.
  2. On the OpenSearch Service console, under Serverless, delete the following resources:
    1. Delete your collection.
    2. Delete the data access policy.
    3. Delete the SAML provider.

Conclusion

In this post, you learned how to set up IAM Identity Center as an IdP to access OpenSearch Dashboards using SAML for SSO. You also learned how to set up users and groups within IAM Identity Center and how to control their access to OpenSearch Dashboards. For more details, refer to SAML authentication for Amazon OpenSearch Serverless.

Stay tuned for a series of posts focusing on the various options available for you to build effective log analytics and search solutions using OpenSearch Serverless. You can also refer to the Getting started with Amazon OpenSearch Serverless workshop to learn more about OpenSearch Serverless.

If you have feedback about this post, submit it in the comments section. If you have questions about this post, start a new thread on the OpenSearch Service forum or contact AWS Support.


About the Authors

Utkarsh Agarwal is a Cloud Support Engineer in the Support Engineering team at Amazon Web Services. He specializes in Amazon OpenSearch Service and provides guidance and technical assistance to customers, enabling them to build scalable, highly available, and secure solutions in the AWS Cloud. In his free time, he enjoys watching movies and TV series and, of course, cricket. Lately, he is also attempting to master the art of cooking – the taste buds are excited, but the kitchen might disagree.

Ravi Bhatane is a software engineer with Amazon OpenSearch Serverless Service. He is passionate about security, distributed systems, and building scalable services. When he’s not coding, Ravi enjoys photography and exploring new hiking trails with his friends.

Prashant Agrawal is a Sr. Search Specialist Solutions Architect with Amazon OpenSearch Service. He works closely with customers to help them migrate their workloads to the cloud and helps existing customers fine-tune their clusters to achieve better performance and save on cost. Before joining AWS, he helped various customers use OpenSearch and Elasticsearch for their search and log analytics use cases. When not working, you can find him traveling and exploring new places. In short, he likes doing Eat → Travel → Repeat.

Python 3.10 runtime now available in AWS Lambda

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/python-3-10-runtime-now-available-in-aws-lambda/

This post is written by Suresh Poopandi, Senior Solutions Architect, Global Life Sciences.

AWS Lambda now supports Python 3.10 as both a managed runtime and container base image. With this release, Python developers can now take advantage of new features and improvements introduced in Python 3.10 when creating serverless applications on Lambda.

Enhancements in Python 3.10 include structural pattern matching, improved error messages, and performance enhancements. This post outlines some of the benefits of Python 3.10 and how to use this version in your Lambda functions.

AWS has also published a preview Lambda container base image for Python 3.11. Customers can use this image to get an early look at Python 3.11 support in Lambda. This image is subject to change and should not be used for production workloads. To provide feedback on this image, and for future updates on Python 3.11 support, see https://github.com/aws/aws-lambda-base-images/issues/62.

What’s new in Python 3.10

Thanks to its simplicity, readability, and extensive community support, Python is a popular language for building serverless applications. The Python 3.10 release includes several new features, such as:

  • Structural pattern matching (PEP 634): Structural pattern matching is one of the most significant additions to Python 3.10. With structural pattern matching, developers can use patterns to match against data structures such as lists, tuples, and dictionaries and run code based on the match. This feature enables developers to write code that processes complex data structures more easily and can improve code readability and maintainability. A short sketch of this and the new union syntax follows this list.
  • Parenthesized context managers (BPO-12782): Python 3.10 introduces a new syntax for parenthesized context managers, making it easier to read and write code that uses the “with” statement. This feature simplifies managing resources such as file handles or database connections, ensuring they are released correctly.
  • Writing union types as X | Y (PEP 604): Python 3.10 allows writing union types as X | Y instead of the typing.Union[X, Y] syntax used in previous versions. Union types represent a value that can be one of several types. This change is backward-compatible, so code written with the previous syntax still works. The new syntax reduces boilerplate and improves the readability and maintainability of Python code by providing a more concise and intuitive way to express union types.
  • User-defined type guards (PEP 647): User-defined type guards allow developers to define their own type guards to handle custom data types or to refine the types of built-in types. Developers can write functions that perform more complex type checks and have type checkers treat them as type guards. This feature improves Python code readability, maintainability, and correctness, especially in projects with complex data structures or custom data types.
  • Improved error messages: Python 3.10 has improved error messages, providing developers with more information about the source of the error and suggesting possible solutions. This helps developers identify and fix issues more quickly. The improved error messages in Python 3.10 include more context about the error, such as the line number and location where the error occurred, as well as the exact nature of the error. Additionally, Python 3.10 error messages now provide more helpful information about how to fix the error, such as suggestions for correct syntax or usage.
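
To make a couple of these features concrete, here is a small, hypothetical snippet that combines structural pattern matching, the new X | Y union syntax, and parenthesized context managers. It illustrates the syntax rather than a complete application.

def describe(event: dict | None) -> str:
    # Structural pattern matching (PEP 634) on a dictionary-shaped event.
    match event:
        case {"order": {"value": value}} if value > 200:
            return f"large order: {value}"
        case {"order": {"value": value}}:
            return f"order: {value}"
        case None:
            return "no event"
        case _:
            return "unknown event"


def copy_lines(src: str, dst: str) -> None:
    # Parenthesized context managers (new syntax in Python 3.10).
    with (
        open(src) as fin,
        open(dst, "w") as fout,
    ):
        fout.writelines(fin)


print(describe({"order": {"value": 250}}))  # large order: 250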

Performance improvements

The faster PEP 590 vectorcall calling convention allows for quicker and more efficient Python function calls, particularly those that take multiple arguments. The built-in functions that benefit from this optimization include map(), filter(), reversed(), bool(), and float(). According to the Python 3.10 release notes, the performance of these built-in functions improved by a factor of around 1.26x.

When a function is defined with annotations, these are stored in a dictionary that maps the parameter names to their respective annotations. In previous versions of Python, this dictionary was created immediately when the function was defined. However, in Python 3.10, this dictionary is created only when the annotations are accessed, which can happen when the function is called. By delaying the creation of the annotation dictionary until it is needed, Python can avoid the overhead of creating and initializing the dictionary during function definition. This can result in a significant reduction in CPU time, as the dictionary creation can be a time-consuming operation, particularly for functions with many parameters or complex annotations.
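
As a small illustration, the mapping below is only materialized when __annotations__ is accessed. The release notes describe this lazy creation for stringized annotations, so the sketch assumes the __future__ import.

from __future__ import annotations


def process_order(order_id: str, amount: float, priority: int = 0) -> dict:
    return {"order_id": order_id, "amount": amount, "priority": priority}


# Accessing __annotations__ is what triggers building the dictionary of
# parameter names to annotations.
print(process_order.__annotations__)
# {'order_id': 'str', 'amount': 'float', 'priority': 'int', 'return': 'dict'}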

In Python 3.10, the LOAD_ATTR instruction, which is responsible for loading attributes from objects in the code, has been improved with a new mechanism called the “per opcode cache”. This mechanism works by storing frequently accessed attributes in a cache specific to each LOAD_ATTR instruction, which reduces the need for repeated attribute lookups. As a result of this improvement, according to Python 3.10 release notes, the LOAD_ATTR instruction is now approximately 36% faster when accessing regular attributes and 44% faster when accessing attributes defined using the slots mechanism.

In Python, the str(), bytes(), and bytearray() constructors are used to create new instances of these types from existing data or values. Based on the results of the performance tests conducted as part of BPO-41334, the str(), bytes(), and bytearray() constructors are around 30–40% faster for small objects.

Lambda functions developed with Python that read and process gzip-compressed files can also gain a performance improvement. Adding _BlocksOutputBuffer to the bz2/lzma/zlib modules eliminated the overhead of resizing the bz2/lzma buffers and prevented an excessive memory footprint in the zlib buffer. According to the Python 3.10 release notes, bz2 decompression is now 1.09x faster, lzma decompression 1.20x faster, and GzipFile reads 1.11x faster.
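
For example, a function that decompresses gzip objects from Amazon S3, as sketched below, benefits from the faster GzipFile reads without any code changes. The bucket and key names are placeholders.

import gzip
import io

import boto3

s3 = boto3.client("s3")


def handler(event, context):
    # Download a gzip-compressed log file and decompress it in memory.
    obj = s3.get_object(Bucket="my-log-bucket", Key="logs/app.log.gz")
    with gzip.GzipFile(fileobj=io.BytesIO(obj["Body"].read())) as gz:
        text = gz.read().decode("utf-8")
    return {"lines": len(text.splitlines())}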

Using Python 3.10 in Lambda

AWS Management Console

To use the Python 3.10 runtime to develop your Lambda functions, specify a runtime parameter value of Python 3.10 when creating or updating a function. The Python 3.10 version is now available in the Runtime dropdown on the Create function page.

Lambda create function page

To update an existing Lambda function to Python 3.10, navigate to the function in the Lambda console, then choose Edit in the Runtime settings panel. The new version of Python is available in the Runtime dropdown:

Edit runtime settings

AWS Serverless Application Model (AWS SAM)

In AWS SAM, set the Runtime attribute to python3.10 to use this version.

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Simple Lambda Function

Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      Description: My Python Lambda Function
      CodeUri: my_function/
      Handler: lambda_function.lambda_handler
      Runtime: python3.10

AWS SAM supports the generation of this template with Python 3.10 out of the box for new serverless applications using the sam init command. Refer to the AWS SAM documentation here.

AWS Cloud Development Kit (AWS CDK)

In the AWS CDK, set the runtime attribute to Runtime.PYTHON_3_10 to use this version. In Python:

from constructs import Construct
from aws_cdk import (
    App, Stack,
    aws_lambda as _lambda
)


class SampleLambdaStack(Stack):

    def __init__(self, scope: Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)

        base_lambda = _lambda.Function(self, 'SampleLambda',
                                       handler='lambda_handler.handler',
                                       runtime=_lambda.Runtime.PYTHON_3_10,
                                       code=_lambda.Code.from_asset('lambda'))

In TypeScript:

import * as cdk from 'aws-cdk-lib';
import * as lambda from 'aws-cdk-lib/aws-lambda'
import * as path from 'path';
import { Construct } from 'constructs';

export class CdkStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // The code that defines your stack goes here

    // The python3.10 enabled Lambda Function
    const lambdaFunction = new lambda.Function(this, 'python310LambdaFunction', {
      runtime: lambda.Runtime.PYTHON_3_10,
      memorySize: 512,
      code: lambda.Code.fromAsset(path.join(__dirname, '/../lambda')),
      handler: 'lambda_handler.handler'
    })
  }
}

AWS Lambda – Container Image

Change the Python base image version by modifying FROM statement in the Dockerfile:

FROM public.ecr.aws/lambda/python:3.10

# Copy function code
COPY lambda_handler.py ${LAMBDA_TASK_ROOT}

To learn more, refer to the usage tab on building functions as container images.

Conclusion

You can build and deploy functions using Python 3.10 with the AWS Management Console, AWS CLI, AWS SDK, AWS SAM, AWS CDK, or your choice of Infrastructure as Code (IaC). You can also use the Python 3.10 container base image if you prefer to build and deploy your functions using container images.

We are excited to bring Python 3.10 runtime support to Lambda and empower developers to build more efficient, powerful, and scalable serverless applications. Try the Python 3.10 runtime in Lambda today to take advantage of the updated language features and improved performance.

For more serverless learning resources, visit Serverless Land.

Optimizing AWS Lambda extensions in C# and Rust

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/optimizing-aws-lambda-extensions-in-c-and-rust/

This post is written by Siarhei Kazhura, Senior Specialist Solutions Architect, Serverless.

Customers use AWS Lambda extensions to integrate monitoring, observability, security, and governance tools with their Lambda functions. AWS, along with AWS Lambda Ready Partners such as Datadog, Dynatrace, and New Relic, provides ready-to-run extensions. You can also develop your own extensions to address your specific needs.

External Lambda extensions are designed as a companion process running in the same execution environment as the function code. That means the Lambda function shares resources such as memory, CPU, and disk I/O with the extension. Improperly designed extensions can result in performance degradation and extra cost.
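
For context, an external extension is a separate process that registers with the Lambda Extensions API and then polls for events. The following sketch shows that lifecycle in Python for brevity; the extensions discussed in this post implement the same loop in C# and Rust.

import json
import os
import urllib.request

API = f"http://{os.environ['AWS_LAMBDA_RUNTIME_API']}/2020-01-01/extension"


def register(name: str) -> str:
    req = urllib.request.Request(
        f"{API}/register",
        data=json.dumps({"events": ["INVOKE", "SHUTDOWN"]}).encode(),
        headers={"Lambda-Extension-Name": name},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        # The identifier returned here must accompany every subsequent request.
        return resp.headers["Lambda-Extension-Identifier"]


def event_loop(extension_id: str) -> None:
    while True:
        req = urllib.request.Request(
            f"{API}/event/next",
            headers={"Lambda-Extension-Identifier": extension_id},
        )
        with urllib.request.urlopen(req) as resp:
            event = json.load(resp)
        if event["eventType"] == "SHUTDOWN":
            break
        # On INVOKE, do the extension's work here (for example, push telemetry to S3).


if __name__ == "__main__":
    event_loop(register("my-extension"))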

This post shows how to measure the impact an extension has on the function performance using key performance metrics on an Amazon CloudWatch dashboard.

This post focuses on Lambda extensions written in C# and Rust. It shows the benefits of choosing to write Lambda extensions in Rust. Also, it explains how you can optimize a Lambda extension written in C# to deliver three times better performance. The solution can be converted to the programming languages of your choice.

Overview

A C# Lambda function (running on .NET 6) called HeaderCounter is used as a baseline. The function counts the number of headers in a request and returns the number in the response. A static delay of 500 ms is inserted in the function code to simulate extra computation. The function has the minimum memory setting (128 MB), which magnifies the impact that extension has on performance.

A load test is performed via a curl command that is issuing 5000 requests (with 250 requests running simultaneously) against a public Amazon API Gateway endpoint backed by the Lambda function. A CloudWatch dashboard, named lambda-performance-dashboard, displays performance metrics for the function.

Lambda performance dashboard

Metrics captured by the dashboard:

  1. The Max Duration, and Average Duration metrics allow you to assess the impact the extension has on the function execution duration.
  2. The PostRuntimeExtensionsDuration metric measures the extra time that the extension takes after the function invocation.
  3. The Average Memory Used, and Memory Allocated metrics allow you to assess the impact the extension has on the function memory consumption.
  4. The Cold Start Duration, and Cold Starts metrics allow you to assess the impact the extension has on the function cold start.

Running the extensions

There are a few differences between how the extensions written in C# and Rust are run.

The extension written in Rust is published as an executable. The advantage of an executable is that it is compiled to native code, and is ready to run. The extension is environment agnostic, so it can run alongside with a Lambda function written in another runtime.

The disadvantage of an executable is the size. Extensions are served as Lambda layers, and the size of the extension counts towards the deployment package size. The maximum unzipped deployment package size for Lambda is 250 MB.

The extension written in C# is published as a dynamic-link library (DLL). The DLL contains the Common Intermediate Language (CIL), that must be converted to native code via a just-in-time (JIT) compiler. The .NET runtime must be present for the extension to run. The dotnet command runs the DLL in the example provided with the solution.

Blank extension


Three instances of the HeaderCounter function are deployed:

  1. The first instance, available via a no-extension endpoint, has no extensions.
  2. The second instance, available via a dotnet-extension endpoint, is instrumented with a blank extension written in C#. The extension does not provide any extra functionality, except logging the event received to CloudWatch.
  3. The third instance, available via a rust-extension endpoint, is instrumented with a blank extension written in Rust. The extension does not provide any extra functionality, except logging the event received to CloudWatch.

Dashboard results

The dashboard shows that the extensions add minimal overhead to the Lambda function. The extension written in C# adds more overhead in the higher percentile metrics, such as the Maximum Cold Start Duration and Maximum Duration.

EventCollector extension


Three instances of the HeaderCounter function are deployed:

  1. The first instance, available via a no-extension endpoint, has no extensions.
  2. The second instance, available via a dotnet-extension endpoint, is instrumented with an EventCollector extension written in C#. The extension is pushing all the extension invocation events to Amazon S3.
  3. The third instance, available via a rust-extension endpoint, is instrumented with an EventCollector extension written in Rust. The extension is pushing all the extension invocation events to S3.

Performance dashboard

The Rust extension adds little overhead in terms of the Duration, number of Cold Starts, and Average PostRuntimeExtensionDuration metrics. Yet there is a clear performance degradation for the function that is instrumented with an extension written in C#. Average Duration jumped almost three times, and the Maximum Duration is now around six times higher.

The function is now consuming almost all the memory allocated. CPU, networking, and storage for Lambda functions are allocated based on the amount of memory selected. Currently, the memory is set to 128 MB, the lowest setting possible. Constrained resources influence the performance of the function.

Performance dashboard

Increasing the memory to 512 MB and re-running the load test improves the performance. Maximum Duration is now 721 ms (including the static 500 ms delay).

For the C# function, the Average Duration is now only 59 ms longer than the baseline. The Average PostRuntimeExtensionDuration is at 36.9 ms (compared with 584 ms previously). This performance gain is due to the memory increase without any code changes.

You can also use the AWS Lambda Power Tuning tool to determine the optimal memory setting for a Lambda function.

Garbage collection

Unlike C#, Rust is not a garbage collected language. Garbage collection (GC) is a process of managing the allocation and release of memory for an application. This process can be resource intensive, and can affect higher percentile metrics. The impact of GC is visible with the blank extension’s and EventCollector extension’s metrics.

Rust uses ownership and borrowing features, allowing for safe memory release without relying on GC. This makes Rust a good runtime choice for tools like Lambda extensions.

EventCollector native AOT extension

Native ahead-of-time (Native AOT) compilation, available in .NET 7 and .NET 8, allows extensions written in C# to be delivered as executables, similar to the extensions written in Rust.

Native AOT does not use a JIT compiler. The application is compiled into a self-contained (all the resources that it needs are encapsulated) executable. The executable runs in the target environment (for example, Linux x64) that is specified at compilation time.

These are the results of compiling the .NET extension using Native AOT and re-running the performance test (with function memory set to 128 MB):

Performance dashboard

For the C# extension, Average Duration is now close to the baseline (compared to three times the baseline as a DLL). Average PostRuntimeExtensionDuration is now 0.77 ms (compared with 584 ms as a DLL). The C# extension also outperforms the Rust extension for the Maximum PostRuntimeExtensionDuration metric – 297 ms versus 497 ms.

Overall, the Rust extension still has better Average/Maximum Duration, Average/Maximum Cold Start Duration, and Memory Consumption. The Lambda function with the C# extension still uses almost all the allocated memory.

Another metric to consider is the binary size. The Rust extension compiles into a 12.3 MB binary, while the C# extension compiles into a 36.4 MB binary.

Example walkthroughs

To follow the example walkthrough, visit the GitHub repository. The walkthrough explains:

  1. The prerequisites required.
  2. A detailed solution deployment walkthrough.
  3. The cleanup process.
  4. Cost considerations.

Conclusion

This post demonstrates techniques that can be used for running and profiling different types of Lambda extensions, with a focus on extensions written in C# and Rust. It outlines the benefits of writing Lambda extensions in Rust and shows how an extension written in C# can be optimized to deliver better performance.

Start writing Lambda extensions with Rust by using the Runtime extensions for AWS Lambda crate. This is a part of a Rust runtime for AWS Lambda.

For more serverless learning resources, visit Serverless Land.

Introducing AWS Lambda response streaming

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/introducing-aws-lambda-response-streaming/

Today, AWS Lambda is announcing support for response payload streaming. Response streaming is a new invocation pattern that lets functions progressively stream response payloads back to clients.

You can use Lambda response payload streaming to send response data to callers as it becomes available. This can improve performance for web and mobile applications. Response streaming also allows you to build functions that return larger payloads and perform long-running operations while reporting incremental progress.

In traditional request-response models, the response needs to be fully generated and buffered before it is returned to the client. This can delay time to first byte (TTFB) while the client waits for the response to be generated. Web applications are especially sensitive to TTFB and page load performance. Response streaming lets you send partial responses back to the client as they become ready, improving TTFB latency to within milliseconds. For web applications, this can improve visitor experience and search engine rankings.

Other applications may have large payloads, like images, videos, large documents, or database results. Response streaming lets you transfer these payloads back to the client without having to buffer the entire payload in memory. You can use response streaming to send responses larger than Lambda’s 6 MB response payload limit up to a soft limit of 20 MB.

Response streaming currently supports the Node.js 14.x and subsequent managed runtimes. You can also implement response streaming using custom runtimes. You can progressively stream response payloads through Lambda function URLs, including as an Amazon CloudFront origin, along with using the AWS SDK or using Lambda’s invoke API. You can also use Amazon API Gateway and Application Load Balancer to stream larger payloads.

Writing response streaming enabled functions

Writing the handler for response streaming functions differs from typical Node handler patterns. To indicate to the runtime that Lambda should stream your function’s responses, you must wrap your function handler with the streamifyResponse() decorator. This tells the runtime to use the correct stream logic path, allowing the function to stream responses.

This is an example handler with response streaming enabled:

exports.handler = awslambda.streamifyResponse(
    async (event, responseStream, context) => {
        responseStream.setContentType("text/plain");
        responseStream.write("Hello, world!");
        responseStream.end();
    }
);

In addition to the default Node handler parameters, event and context, the streamifyResponse decorator passes an additional parameter, responseStream, to your handler.

The new responseStream object provides a stream object that your function can write data to. Data written to this stream is sent immediately to the client. You can optionally set the Content-Type header of the response to pass additional metadata to your client about the contents of the stream.

Writing to the response stream

The responseStream object implements Node’s Writable Stream API. This offers a write() method to write information to the stream. However, we recommend that you use pipeline() wherever possible to write to the stream. This can improve performance, ensuring that a faster readable stream does not overwhelm the writable stream.

An example function using pipeline() showing how you can stream compressed data:

const pipeline = require("util").promisify(require("stream").pipeline);
const zlib = require('zlib');
const { Readable } = require('stream');

exports.gzip = awslambda.streamifyResponse(async (event, responseStream, _context) => {
    // As an example, convert event to a readable stream.
    const requestStream = Readable.from(Buffer.from(JSON.stringify(event)));
    
    await pipeline(requestStream, zlib.createGzip(), responseStream);
});

Ending the response stream

When using the write() method, you must end the stream before the handler returns. Use responseStream.end() to signal that you are not writing any more data to the stream. This is not required if you write to the stream with pipeline().

Reading streamed responses

Response streaming introduces a new InvokeWithResponseStream API. You can read a streamed response from your function via a Lambda function URL or use the AWS SDK to call the new API directly.

Neither API Gateway nor Lambda's target integration with Application Load Balancer supports chunked transfer encoding, so they do not provide faster TTFB for streamed responses. You can, however, use response streaming with API Gateway to return larger payload responses, up to API Gateway's 10 MB limit. To implement this, you must configure an HTTP_PROXY integration between your API Gateway and a Lambda function URL, instead of using the LAMBDA_PROXY integration.

You can also configure CloudFront with a function URL as origin. When streaming responses through a function URL and CloudFront, you can have faster TTFB performance and return larger payload sizes.

Using Lambda response streaming with function URLs

You can configure a function URL to invoke your function and stream the raw bytes back to your HTTP client via chunked transfer encoding. You configure the Function URL to use the new InvokeWithResponseStream API by changing the invoke mode of your function URL from the default BUFFERED to RESPONSE_STREAM.

RESPONSE_STREAM enables your function to stream payload results as they become available if you wrap the function with the streamifyResponse() decorator. Lambda invokes your function using the InvokeWithResponseStream API. If InvokeWithResponseStream invokes a function that is not wrapped with streamifyResponse(), Lambda does not stream the response and instead returns a buffered response which is subject to the 6 MB size limit.

Using AWS Serverless Application Model (AWS SAM) or AWS CloudFormation, set the InvokeMode property:

  MyFunctionUrl:
    Type: AWS::Lambda::Url
    Properties:
      TargetFunctionArn: !Ref StreamingFunction
      AuthType: AWS_IAM
      InvokeMode: RESPONSE_STREAM

Using generic HTTP client libraries with function URLs

Each language or framework may use different methods to form an HTTP request and parse a streamed response. Some HTTP client libraries only return the response body after the server closes the connection. These clients do not work with functions that return a response stream. To get the benefit of response streams, use an HTTP client that returns response data incrementally. Many HTTP client libraries already support streamed responses, including the Apache HttpClient for Java, Node’s built-in http client, and Python’s requests and urllib3 packages. Consult the documentation for the HTTP library that you are using.
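
As a rough illustration of a client that reads response data incrementally, the following Python snippet streams from a function URL with the requests library. It assumes a function URL that does not require IAM authorization; with IAM auth you would also need to sign the request with SigV4.

import requests

# Placeholder function URL.
url = "https://<url>.lambda-url.<Region>.on.aws/"

# stream=True tells requests not to buffer the whole body before returning.
with requests.get(url, stream=True, timeout=60) as resp:
    resp.raise_for_status()
    for chunk in resp.iter_content(chunk_size=None):
        # Each chunk is printed as soon as it arrives from the stream.
        print(chunk.decode("utf-8"), end="", flush=True)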

Example applications

There are a number of example Lambda streaming applications in the Serverless Patterns Collection. They use AWS SAM to build and deploy the resources in your AWS account.

Clone the repository and explore the examples. The README file in each pattern folder contains additional information.

git clone https://github.com/aws-samples/serverless-patterns/ 
cd serverless-patterns

Time to first byte using write()

  1. To show how streaming improves time to first byte, deploy the lambda-streaming-ttfb-write-sam pattern.
  2. cd lambda-streaming-ttfb-write-sam
  3. Use AWS SAM to deploy the resources to your AWS account. Run a guided deployment to set the default parameters for the first deployment.
  4. sam deploy -g --stack-name lambda-streaming-ttfb-write-sam

    For subsequent deployments you can use sam deploy.

  5. Enter a Stack Name and accept the initial defaults.
  6. AWS SAM deploys a Lambda function with streaming support and a function URL.

    AWS SAM deploy -g

    Once the deployment completes, AWS SAM provides details of the resources.

    AWS SAM resources


    The AWS SAM output returns a Lambda function URL.

  7. Use curl with your AWS credentials to view the streaming response as the URL uses AWS Identity and Access Management (IAM) for authorization. Replace the URL and Region parameters for your deployment.
curl --request GET https://<url>.lambda-url.<Region>.on.aws/ --user AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY --aws-sigv4 'aws:amz:<Region>:lambda'

You can see the gradual display of the streamed response.

Using curl to stream response from write () function


Time to first byte using pipeline()

  1. To try an example using pipeline(), deploy the lambda-streaming-ttfb-pipeline-sam pattern.
  2. cd ..
    cd lambda-streaming-ttfb-pipeline-sam
  3. Use AWS SAM to deploy the resources to your AWS account. Run a guided deployment to set the default parameters for the first deploy.
  4. sam deploy -g --stack-name lambda-streaming-ttfb-pipeline-sam
  5. Enter a Stack Name and accept the initial defaults.
  6. Use curl with your AWS credentials to view the streaming response. Replace the URL and Region parameters for your deployment.
curl --request GET https://<url>.lambda-url.<Region>.on.aws/ --user AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY --aws-sigv4 'aws:amz:<Region>:lambda'

You can see the pipelined response stream returned.

Using curl to stream response from function


Large payloads

  1. To show how streaming enables you to return larger payloads, deploy the lambda-streaming-large-sam application. AWS SAM deploys a Lambda function, which returns a 7 MB PDF file which is larger than Lambda’s non-stream 6 MB response payload limit.
  2. cd ..
    cd lambda-streaming-large-sam
    sam deploy -g --stack-name lambda-streaming-large-sam
  3. The AWS SAM output returns a Lambda function URL. Use curl with your AWS credentials to view the streaming response.
curl --request GET https://<url>.lambda-url.<Region>.on.aws/ --user AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY --aws-sigv4 'aws:amz:<Region>:lambda' -o SVS401-ri22.pdf -w '%{content_type}'

This downloads the PDF file SVS401-ri22.pdf to your current directory and displays the content type as application/pdf.

You can also use API Gateway to stream a large payload with an HTTP_PROXY integration with a Lambda function URL.

Invoking a function with response streaming using the AWS SDK

You can use the AWS SDK to stream responses directly from the new Lambda InvokeWithResponseStream API. This provides additional functionality such as handling midstream errors. This can be helpful when building, for example, internal microservices. Response streaming is supported with the AWS SDK for Java 2.x, AWS SDK for JavaScript v3, and AWS SDKs for Go version 1 and version 2.

The SDK response returns an event stream that you can read from. The event stream contains two event types. PayloadChunk contains a raw binary buffer with partial response data received by the client. InvokeComplete signals that the function has completed sending data. It also contains additional metadata, such as whether the function encountered an error in the middle of the stream. Errors can include unhandled exceptions thrown by your function code and function timeouts.

Using the AWS SDK for Javascript v3

  1. To see how to use the AWS SDK to stream responses from a function, deploy the lambda-streaming-sdk-sam pattern.
  2. cd ..
    cd lambda-streaming-sdk-sam
    sam deploy -g --stack-name lambda-streaming-sdk-sam
  3. Enter a Stack Name and accept the initial defaults.
  4. AWS SAM deploys three Lambda functions with streaming support.

  • HappyPathFunction: Returns a full stream.
  • MidstreamErrorFunction: Simulates an error midstream.
  • TimeoutFunction: Function times out before stream completes.
  5. Run the SDK example application, which invokes each Lambda function and outputs the result.
     npm install @aws-sdk/client-lambda
     node index.mjs

    You can see each function and how the midstream and timeout errors are returned back to the SDK client.

    Streaming midstream error

    Streaming midstream error

    Streaming timeout error

    Streaming timeout error

    Quotas and pricing

    Streaming responses incur an additional cost for network transfer of the response payload. You are billed based on the number of bytes generated and streamed out of your Lambda function over the first 6 MB. For more information, see Lambda pricing.

    There is an initial maximum response size of 20 MB, which is a soft limit you can increase. There is a maximum bandwidth throughput limit of 16 Mbps (2 MB/s) for streaming functions.

    Conclusion

    Today, AWS Lambda is announcing support for response payload streaming to send partial responses to callers as the responses become available. This can improve performance for web and mobile applications. You can also use response streaming to build functions that return larger payloads and perform long-running operations while reporting incremental progress. Stream partial responses through Lambda function URLs, or using the AWS SDK. Response streaming currently supports the Node.js 14.x and subsequent runtimes, as well as custom runtimes.

    There are a number of example Lambda streaming applications in the Serverless Patterns Collection to explore the functionality.

    Lambda response streaming support is also available through many AWS Lambda Partners such as Datadog, Dynatrace, New Relic, Pulumi and Lumigo.

    For more serverless learning resources, visit Serverless Land.

    Serverless ICYMI Q1 2023

    Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/serverless-icymi-q1-2023/

    Welcome to the 21st edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share all the most recent product launches, feature enhancements, blog posts, webinars, live streams, and other interesting things that you might have missed!


    In case you missed our last ICYMI, check out what happened last quarter here.

    Artificial intelligence (AI) technologies, ChatGPT, and DALL-E are creating significant interest in the industry at the moment. Find out how to integrate serverless services with ChatGPT and DALL-E to generate unique bedtime stories for children.

    Example notification of a story hosted with Next.js and App Runner


    Serverless Land is a website maintained by the Serverless Developer Advocate team to help you build serverless applications and includes workshops, code examples, blogs, and videos. There is now enhanced search functionality so you can search across resources, patterns, and video content.


    ServerlessLand search

    AWS Lambda

    AWS Lambda has improved how concurrency works with Amazon SQS. You can now control the maximum number of concurrent Lambda function invocations for an SQS event source.

    The launch blog post explains the scaling behavior of Lambda using this architectural pattern, challenges this feature helps address, and a demo of maximum concurrency in action.
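
For reference, this setting can also be applied to an existing SQS event source mapping with the AWS SDK for Python (Boto3); the mapping UUID below is a placeholder.

import boto3

lambda_client = boto3.client("lambda")

# Cap concurrent function invocations for this SQS event source at 10.
lambda_client.update_event_source_mapping(
    UUID="<event-source-mapping-uuid>",
    ScalingConfig={"MaximumConcurrency": 10},
)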

    Maximum concurrency is set to 10 for the SQS queue.


    AWS Lambda Powertools is an open-source library to help you discover and incorporate serverless best practices more easily. Lambda Powertools for .NET is now generally available and currently focused on three observability features: distributed tracing (Tracer), structured logging (Logger), and asynchronous business and application metrics (Metrics). Powertools is also available for Python, Java, and Typescript/Node.js programming languages.

    To learn more:

    Lambda announced a new feature, runtime management controls, which provide more visibility and control over when Lambda applies runtime updates to your functions. The runtime controls are optional capabilities for advanced customers that require more control over their runtime changes. You can now specify a runtime management configuration for each function with three settings: Automatic (default), Function update, or Manual.

    There are three new Amazon CloudWatch metrics for asynchronous Lambda function invocations: AsyncEventsReceived, AsyncEventAge, and AsyncEventsDropped. You can track the asynchronous invocation requests sent to Lambda functions to monitor any delays in processing and take corrective actions if required. The launch blog post explains the new metrics and how to use them to troubleshoot issues.

    Lambda now supports Amazon DocumentDB change streams as an event source. You can use Lambda functions to process new documents, track updates to existing documents, or log deleted documents. You can use any programming language that is supported by Lambda to write your functions.

    There is a helpful blog post suggesting best practices for developing portable Lambda functions that allow you to port your code to containers if you later choose to.

    AWS Step Functions

    AWS Step Functions has expanded its AWS SDK integrations with support for 35 additional AWS services including Amazon EMR Serverless, AWS Clean Rooms, AWS IoT FleetWise, AWS IoT RoboRunner and 31 other AWS services. In addition, Step Functions also added support for 1000+ new API actions from new and existing AWS services such as Amazon DynamoDB and Amazon Athena. For the full list of added services, visit AWS SDK service integrations.

    Amazon EventBridge

    Amazon EventBridge has launched the AWS Controllers for Kubernetes (ACK) for EventBridge and Pipes. This allows you to manage EventBridge resources, such as event buses, rules, and pipes, using the Kubernetes API and resource model (custom resource definitions).

    EventBridge event buses now also support enhanced integration with Service Quotas. Your quota increase requests for limits such as PutEvents transactions-per-second, number of rules, and invocations per second among others will be processed within one business day or faster, enabling you to respond quickly to changes in usage.

    AWS SAM

    The AWS Serverless Application Model (SAM) Command Line Interface (CLI) has added the sam list command. You can now show resources defined in your application, including the endpoints, methods, and stack outputs required to test your deployed application.

    AWS SAM has a preview of sam build support for building and packaging serverless applications developed in Rust. You can use cargo-lambda in the AWS SAM CLI build workflow and AWS SAM Accelerate to iterate on your code changes rapidly in the cloud.

    You can now use AWS SAM connectors as a source resource parameter. Previously, you could only define AWS SAM connectors as a AWS::Serverless::Connector resource. Now you can add the resource attribute on a connector’s source resource, which makes templates more readable and easier to update over time.

    AWS SAM connectors now also support multiple destinations to simplify your permissions. You can now use a single connector between a single source resource and multiple destination resources.

    In October 2022, AWS released OpenID Connect (OIDC) support for AWS SAM Pipelines. This improves your security posture by creating integrations that use short-lived credentials from your CI/CD provider. There is a new blog post on how to implement it.

    Find out how best to build serverless Java applications with the AWS SAM CLI.

    AWS App Runner

    AWS App Runner now supports retrieving secrets and configuration data stored in AWS Secrets Manager and AWS Systems Manager (SSM) Parameter Store in an App Runner service as runtime environment variables.

    App Runner also now supports incoming requests based on the HTTP 1.0 protocol, and has added service-level concurrency, CPU, and memory utilization metrics.

    Amazon S3

    Amazon S3 now automatically applies default encryption to all new objects added to S3, at no additional cost and with no impact on performance.

    You can now use an S3 Object Lambda Access Point alias as an origin for your Amazon CloudFront distribution to tailor or customize data to end users. For example, you can resize an image depending on the device that an end user is visiting from.

    S3 has introduced Mountpoint for S3, a high performance open source file client that translates local file system API calls to S3 object API calls like GET and LIST.

    S3 Multi-Region Access Points now support datasets that are replicated across multiple AWS accounts. They provide a single global endpoint for your multi-region applications, and dynamically route S3 requests based on policies that you define. This helps you to more easily implement multi-Region resilience, latency-based routing, and active-passive failover, even when data is stored in multiple accounts.

    Amazon Kinesis

    Amazon Kinesis Data Firehose now supports streaming data delivery to Elastic. This is an easier way to ingest streaming data to Elastic and consume the Elastic Stack (ELK Stack) solutions for enterprise search, observability, and security without having to manage applications or write code.

    Amazon DynamoDB

    Amazon DynamoDB now supports table deletion protection to protect your tables from accidental deletion when performing regular table management operations. You can set the deletion protection property for each table, which is set to disabled by default.
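
For example, deletion protection can be enabled on an existing table with a single Boto3 call; the table name is a placeholder.

import boto3

dynamodb = boto3.client("dynamodb")

# The table cannot be deleted until deletion protection is disabled again.
dynamodb.update_table(
    TableName="orders",
    DeletionProtectionEnabled=True,
)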

    Amazon SNS

    Amazon SNS now supports AWS X-Ray active tracing to visualize, analyze, and debug application performance. You can now view traces that flow through Amazon SNS topics to destination services, such as Amazon Simple Queue Service, Lambda, and Kinesis Data Firehose, in addition to traversing the application topology in Amazon CloudWatch ServiceLens.

    SNS also now supports setting content-type request headers for HTTPS notifications so applications can receive their notifications in a more predictable format. Topic subscribers can create a DeliveryPolicy that specifies the content-type value that SNS assigns to their HTTPS notifications, such as application/json, application/xml, or text/plain.
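
A minimal sketch of that subscription-level setting with Boto3 follows; the subscription ARN is a placeholder and the delivery policy shape (requestPolicy.headerContentType) is based on the SNS documentation.

import json
import boto3

sns = boto3.client("sns")

# Ask SNS to send HTTPS notifications for this subscription as application/json.
sns.set_subscription_attributes(
    SubscriptionArn="arn:aws:sns:us-east-1:123456789012:my-topic:<subscription-id>",
    AttributeName="DeliveryPolicy",
    AttributeValue=json.dumps(
        {"requestPolicy": {"headerContentType": "application/json"}}
    ),
)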

    EDA Visuals collection added to Serverless Land

    The Serverless Developer Advocate team has extended Serverless Land and introduced EDA visuals. These are small, bite-sized visuals to help you understand concepts and patterns in event-driven architectures. Find out about batch processing vs. event streaming, commands vs. events, message queues vs. event brokers, and point-to-point messaging. Discover bounded contexts, migrations, idempotency, claims, enrichment, and more!


    EDA Visuals

    To learn more:

    Serverless Repos Collection on Serverless Land

    There is also a new section on Serverless Land containing helpful code repositories. You can search for code repos to use for examples, learning or building serverless applications. You can also filter by use-case, runtime, and level.

    Serverless Repos Collection


    Serverless Blog Posts

    January

    Jan 12 – Introducing maximum concurrency of AWS Lambda functions when using Amazon SQS as an event source

    Jan 20 – Processing geospatial IoT data with AWS IoT Core and the Amazon Location Service

    Jan 23 – AWS Lambda: Resilience under-the-hood

    Jan 24 – Introducing AWS Lambda runtime management controls

    Jan 24 – Best practices for working with the Apache Velocity Template Language in Amazon API Gateway

    February

    Feb 6 – Previewing environments using containerized AWS Lambda functions

    Feb 7 – Building ad-hoc consumers for event-driven architectures

    Feb 9 – Implementing architectural patterns with Amazon EventBridge Pipes

    Feb 9 – Securing CI/CD pipelines with AWS SAM Pipelines and OIDC

    Feb 9 – Introducing new asynchronous invocation metrics for AWS Lambda

    Feb 14 – Migrating to token-based authentication for iOS applications with Amazon SNS

    Feb 15 – Implementing reactive progress tracking for AWS Step Functions

    Feb 23 – Developing portable AWS Lambda functions

    Feb 23 – Uploading large objects to Amazon S3 using multipart upload and transfer acceleration

    Feb 28 – Introducing AWS Lambda Powertools for .NET

    March

    Mar 9 – Server-side rendering micro-frontends – UI composer and service discovery

    Mar 9 – Building serverless Java applications with the AWS SAM CLI

    Mar 10 – Managing sessions of anonymous users in WebSocket API-based applications

    Mar 14 – Implementing an event-driven serverless story generation application with ChatGPT and DALL-E

    Videos

    Serverless Office Hours – Tues 10AM PT

    Weekly office hours live stream. In each session we talk about a specific topic or technology related to serverless and open it up to helping you with your real serverless challenges and issues. Ask us anything you want about serverless technologies and applications.

    January

    Jan 10 – Building .NET 7 high performance Lambda functions

    Jan 17 – Amazon Managed Workflows for Apache Airflow at Scale

    Jan 24 – Using Terraform with AWS SAM

    Jan 31 – Preparing your serverless architectures for the big day

    February

    Feb 07- Visually design and build serverless applications

    Feb 14 – Multi-tenant serverless SaaS

    Feb 21 – Refactoring to Serverless

    Feb 28 – EDA visually explained

    March

    Mar 07 – Lambda cookbook with Python

    Mar 14 – Succeeding with serverless

    Mar 21 – Lambda Powertools .NET

    Mar 28 – Server-side rendering micro-frontends

    FooBar Serverless YouTube channel

    Marcia Villalba frequently publishes new videos on her popular serverless YouTube channel. You can view all of Marcia’s videos at https://www.youtube.com/c/FooBar_codes.

    January

    Jan 12 – Serverless Badge – A new certification to validate your Serverless Knowledge

    Jan 19 – Step functions Distributed map – Run 10k parallel serverless executions!

    Jan 26 – Step Functions Intrinsic Functions – Do simple data processing directly from the state machines!

    February

    Feb 02 – Unlock the Power of EventBridge Pipes: Integrate Across Platforms with Ease!

    Feb 09 – Amazon EventBridge Pipes: Enrichment and filter of events Demo with AWS SAM

    Feb 16 – AWS App Runner – Deploy your apps from GitHub to Cloud in Record Time

    Feb 23 – AWS App Runner – Demo hosting a Node.js app in the cloud directly from GitHub (AWS CDK)

    March

    Mar 02 – What is Amazon DynamoDB? What are the most important concepts? What are the indexes?

    Mar 09 – Choreography vs Orchestration: Which is Best for Your Distributed Application?

    Mar 16 – DynamoDB Single Table Design: Simplify Your Code and Boost Performance with Table Design Strategies

    Mar 23 – 8 Reasons You Should Choose DynamoDB for Your Next Project and How to Get Started

    Sessions with SAM & Friends


    AWS SAM & Friends

    Eric Johnson is exploring how developers are building serverless applications. We spend time talking about AWS SAM as well as others like AWS CDK, Terraform, Wing, and AMPT.

    Feb 16 – What’s new with AWS SAM

    Feb 23 – AWS SAM with AWS CDK

    Mar 02 – AWS SAM and Terraform

    Mar 10 – Live from ServerlessDays ANZ

    Mar 16 – All about AMPT

    Mar 23 – All about Wing

    Mar 30 – SAM Accelerate deep dive

    Still looking for more?

    The Serverless landing page has more information. The Lambda resources page contains case studies, webinars, whitepapers, customer stories, reference architectures, and even more Getting Started tutorials.

    You can also follow the Serverless Developer Advocacy team on Twitter to see the latest news, follow conversations, and interact with the team.