Tag Archives: serverless

Deploying AWS Lambda functions using AWS Controllers for Kubernetes (ACK)

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/deploying-aws-lambda-functions-using-aws-controllers-for-kubernetes-ack/

This post is written by Rajdeep Saha, Sr. SSA, Containers/Serverless.

AWS Controllers for Kubernetes (ACK) allows you to manage AWS services directly from Kubernetes. With the ACK service controller for AWS Lambda, you can provision and manage Lambda functions with kubectl and custom resources. With ACK, you can have a single consolidated approach to managing container workloads and other AWS services, such as Lambda, directly from Kubernetes without needing additional infrastructure automation tools.

This post walks you through deploying a sample Lambda function from a Kubernetes cluster provided by Amazon EKS.

Use cases

Some of the use cases for provisioning Lambda functions from ACK include:

  • Your organization already has a DevOps process to deploy resources into the Amazon EKS cluster using Kubernetes declarative YAMLs (known as manifest files). With ACK for AWS Lambda, you can now use manifest files to provision Lambda functions without creating separate infrastructure as a code template.
  • Your project has implemented GitOps with Kubernetes. With GitOps, git becomes the single source of truth, and all the changes are done via git repo. In this model, Kubernetes continuously reconciles the git repo (desired state) with the resources running inside the cluster (current state). If any differences are found, the GitOps process automatically implements changes to the cluster from the git repo. Using ACK for AWS Lambda, since you are creating the Lambda function using Kubernetes custom resource, the GitOps model is applied for Lambda.
  • Your organization has established permissions boundaries for different users and groups using role-based access control (RBAC) and IAM roles for service accounts (IRSA). You can reuse this security model for Lambda without having to create new users and policies.

How ACK for AWS Lambda works

  1. The ‘Ops’ team deploys the ACK service controller for Lambda. This controller runs as a pod within the Amazon EKS cluster.
  2. The controller pod needs permission to read the Lambda function code and create the Lambda function. The Lambda function code is stored as a zip file in an S3 bucket for this example. The permissions are granted to the pod using IRSA.
  3. Each AWS service has separate ACK service controllers. This specific controller for AWS Lambda can act on the custom resource type ‘Function’.
  4. The ‘Dev’ team deploys Kubernetes manifest file with custom resource type ‘Function’. This manifest file defines the necessary fields required to create the function, such as S3 bucket name, zip file name, Lambda function IAM role, etc.
  5. The ACK service controller creates the Lambda function using the values from the manifest file.

Prerequisites

You need a few tools before deploying the sample application. Ensure that you have each of the following in your working environment:

This post uses shell variables to make it easier to substitute the actual names for your deployment. When you see placeholders like NAME=<your xyz name>, substitute in the name for your environment.

Setting up the Amazon EKS cluster

  1. Run the following to create an Amazon EKS cluster. The following single command creates a two-node Amazon EKS cluster with a unique name.
    eksctl create cluster
  2. It may take 15–30 minutes to provision the Amazon EKS cluster. When the cluster is ready, run:
    kubectl get nodes
  3. The output shows the following:
    Output
  4. To get the Amazon EKS cluster name to use throughout the walkthrough, run:
    eksctl get cluster
    
    export EKS_CLUSTER_NAME=<provide the name from the previous command>

Setting up the ACK Controller for Lambda

To set up the ACK Controller for Lambda:

  1. Install an ACK Controller with Helm by following these instructions:
    – Change ‘export SERVICE=s3’ to ‘export SERVICE=lambda’.
    – Change ‘export AWS_REGION=us-west-2’ to reflect your Region appropriately.
  2. To configure IAM permissions for the pod running the Lambda ACK Controller to permit it to create Lambda functions, follow these instructions.
    – Replace ‘SERVICE=”s3”’ with ‘SERVICE=”lambda”’.
  3. Validate that the ACK Lambda controller is running:
    kubectl get pods -n ack-system
  4. The output shows the running ACK Lambda controller pod:
    Output

Provisioning a Lambda function from the Kubernetes cluster

In this section, you write a sample “Hello world” Lambda function. You zip up the code and upload the zip file to an S3 bucket. Finally, you deploy that zip file to a Lambda function using the ACK Controller from the EKS cluster you created earlier. For this example, use Python3.9 as your language runtime.

To provision the Lambda function:

  1. Run the following to create the sample “Hello world” Lambda function code, and then zip it up:
    mkdir my-helloworld-function
    cd my-helloworld-function
    cat << EOF > lambda_function.py 
    import json
    
    def lambda_handler(event, context):
        # TODO implement
        return {
            'statusCode': 200,
            'body': json.dumps('Hello from Lambda!')
        }
    EOF
    zip my-deployment-package.zip lambda_function.py
    
  2. Create an S3 bucket following the instructions here. Alternatively, you can use an existing S3 bucket in the same Region of the Amazon EKS cluster.
  3. Run the following to upload the zip file into the S3 bucket from the previous step:
    export BUCKET_NAME=<provide the bucket name from step 2>
    aws s3 cp  my-deployment-package.zip s3://${BUCKET_NAME}
  4. The output shows:
    upload: ./my-deployment-package.zip to s3://<BUCKET_NAME>/my-deployment-package.zip
  5. Create your Lambda function using the ACK Controller. The full spec with all the available fields is listed here. First, provide a name for the function:
    export FUNCTION_NAME=hello-world-s3-ack
  6. Create and deploy the Kubernetes manifest file. The command at the end, kubectl create -f function.yaml submits the manifest file, with kind as ‘Function’. The ACK Controller for Lambda identifies this custom ‘Function’ object and deploys the Lambda function based on the manifest file.
    export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query "Account" --output text)
    export LAMBDA_ROLE="arn:aws:iam::${AWS_ACCOUNT_ID}:role/lambda_basic_execution"
    
    cat << EOF > lambdamanifest.yaml 
    apiVersion: lambda.services.k8s.aws/v1alpha1
    kind: Function
    metadata:
     name: $FUNCTION_NAME
     annotations:
       services.k8s.aws/region: $AWS_REGION
    spec:
     name: $FUNCTION_NAME
     code:
       s3Bucket: $BUCKET_NAME
       s3Key: my-deployment-package.zip
     role: $LAMBDA_ROLE
     runtime: python3.9
     handler: lambda_function.lambda_handler
     description: function created by ACK lambda-controller e2e tests
    EOF
    kubectl create -f lambdamanifest.yaml
    
  7. The output shows:
    function.lambda.services.k8s.aws/< FUNCTION_NAME> created
  8. To retrieve the details of the function using a Kubernetes command, run:
    kubectl describe function/$FUNCTION_NAME
  9. This Lambda function returns a “Hello world” message. To invoke the function, run:
    aws lambda invoke --function-name $FUNCTION_NAME  response.json
    cat response.json
    
  10. The Lambda function returns the following output:
    {"statusCode": 200, "body": "\"Hello from Lambda!\""}

Congratulations! You created a Lambda function from your Kubernetes cluster.

To learn how to provision the Lambda function using the ACK controller from an OCI container image instead of a zip file in an S3 bucket, follow these instructions.

Cleaning up

This section cleans up all the resources that you have created. To clean up:

  1. Delete the Lambda function:
    kubectl delete function $FUNCTION_NAME
  2. If you have created a new S3 bucket, delete it by running:
    aws s3 rm s3://${BUCKET_NAME} --recursive
    aws s3api delete-bucket --bucket ${BUCKET_NAME}
  3. Delete the EKS cluster:
    eksctl delete cluster --name $EKS_CLUSTER_NAME
  4. Delete the IAM role created for the ACK Controller. Get the IAM role name by running the following command, then delete the role from the IAM console:
    echo $ACK_CONTROLLER_IAM_ROLE

Conclusion

This blog post shows how AWS Controllers for Kubernetes enables you to deploy a Lambda function directly from your Amazon EKS environment. AWS Controllers for Kubernetes provides a convenient way to connect your Kubernetes applications to AWS services directly from Kubernetes.

ACK is open source: you can request new features and report issues on the ACK community GitHub repository.

For more serverless learning resources, visit Serverless Land.

Speeding up incremental changes with AWS SAM Accelerate and nested stacks

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/speeding-up-incremental-changes-with-aws-sam-accelerate-and-nested-stacks/

This blog written by Jeff Marcinko, Sr. Technical Account Manager, Health Care & Life Sciencesand Brian Zambrano, Sr. Specialist Solutions Architect, Serverless.

Developers and operators have been using the AWS Serverless Application Model (AWS SAM) to author, build, test, and deploy serverless applications in AWS for over three years. Since its inception, the AWS SAM team has focused on developer productivity, simplicity, and best practices.

As good as AWS SAM is at making your serverless development experience easier and faster, building non-trivial cloud applications remains a challenge. Developers and operators want a development experience that provides high-fidelity and fast feedback on incremental changes. With serverless development, local emulation of an application composed of many AWS resources and managed services can be incomplete and inaccurate. We recommend developing serverless applications in the AWS Cloud against live AWS services to increase developer confidence. However, the latency of deploying an entire AWS CloudFormation stack for every code change is a challenge that developers face with this approach.

In this blog post, I show how to increase development velocity by using AWS SAM Accelerate with AWS CloudFormation nested stacks. Nested stacks are an application lifecycle management best practice at AWS. We recommend nested stacks for deploying complex serverless applications, which aligns to the Serverless Application Lens of the AWS Well-Architected Framework. AWS SAM Accelerate speeds up deployment from your local system by bypassing AWS CloudFormation to deploy code and resource updates when possible.

AWS CloudFormation nested stacks and AWS SAM

A nested stack is a CloudFormation resource that is part of another stack, referred to as the parent, or root stack.

Nested stack architecture

Nested stack architecture

The best practice for modeling complex applications is to author a root stack template and declare related resources in their own nested stack templates. This partitioning improves maintainability and encourages reuse of common template patterns. It is easier to reason about the configuration of the AWS resources in the example application because they are described in nested templates for each application component.

With AWS SAM, developers create nested stacks using the AWS::Serverless::Application resource type. The following example shows a snippet from a template.yaml file, which is the root stack for an AWS SAM application.

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Resources:
  DynamoDB:
    Type: AWS::Serverless::Application
    Properties:
      Location: db/template.yaml

  OrderWorkflow:
    Type: AWS::Serverless::Application
    Properties:
      Location: workflow/template.yaml

  ApiIntegrations:
    Type: AWS::Serverless::Application
    Properties:
      Location: api-integrations/template.yaml

  Api:
    Type: AWS::Serverless::Application
    Properties:
      Location: api/template.yaml

Each AWS::Serverless::Application resource type references a child stack, which is an independent AWS SAM template. The Location property tells AWS SAM where to find the stack definition.

Solution overview

The sample application exposes an API via Amazon API Gateway. One API endpoint (#2) forwards POST requests to Amazon SQS, an AWS Lambda function polls (#3) the SQS Queue and starts an Amazon Step Function workflow execution (#4) for each message.

Sample application architecture

Sample application architecture

Prerequisites

  1. AWS SAM CLI, version 1.53.0 or higher
  2. Python 3.9

Deploy the application

To deploy the application:

  1. Clone the repository:
    git clone <a href="https://github.com/aws-samples/sam-accelerate-nested-stacks-demo.git" target="_blank" rel="noopener">https://github.com/aws-samples/sam-accelerate-nested-stacks-demo.git</a>
  2. Change to the root directory of the project and run the following AWS SAM CLI commands:
    cd sam-accelerate-nested-stacks-demo
    sam build
    sam deploy --guided --capabilities CAPABILITY_IAM CAPABILITY_AUTO_EXPAND

    You must include the CAPABILITY_IAM and CAPABILITY_AUTO_EXPAND capabilities to support nested stacks and the creation of permissions.

  3. Use orders-app as the stack name during guided deployment. During the deploy process, enter your email for the SubscriptionEmail value. This requires confirmation later. Accept the defaults for the rest of the values.

    SAM deploy example

    SAM deploy example

  4. After the CloudFormation deployment completes, save the API endpoint URL from the outputs.

Confirming the notifications subscription

After the deployment finishes, you receive an Amazon SNS subscription confirmation email at the email address provided during the deployment. Choose the Confirm Subscription link to receive notifications.

You have chosen to subscribe to the topic: 
arn:aws:sns:us-east-1:123456789012:order-topic-xxxxxxxxxxxxxxxxxx

To confirm this subscription, click or visit the link below (If this was in error no action is necessary): 
Confirm subscription

Testing the orders application

To test the application, use the curl command to create a new Order request with the following JSON payload:

{
    "quantity": 1,
    "name": "Pizza",
    "restaurantId": "House of Pizza"
}
curl -s --header "Content-Type: application/json" \
  --request POST \
  --data '"quantity":1,"name":"Pizza","quantity":1,"restaurantId":"House of Pizza"}' \
  https://xxxxxxxxxx.execute-api.us-east-1.amazonaws.com/Dev/orders  | python -m json.tool

API Gateway responds with the following message, showing it successfully sent the request to the SQS queue:

API Gateway response

API Gateway response

The application sends an order notification once the Step Functions workflow completes processing. The workflow intentionally randomizes the SUCCESS or FAILURE status message.

Accelerating development with AWS SAM sync

AWS SAM Accelerate enhances the development experience. It automatically observes local code changes and synchronizes them to AWS without building and deploying every function in my project.

However, when you synchronize code changes directly into the AWS Cloud, it can introduce drift between your CloudFormation stacks and its deployed resources. For this reason, you should only use AWS SAM Accelerate to publish changes in a development stack.

In your terminal, change to the root directory of the project folder and run the sam sync command. This runs in the foreground while you make code changes:

cd sam-accelerate-nested-stacks-demo
sam sync --watch --stack-name orders-app

The –watch option causes AWS SAM to perform an initial CloudFormation deployment. After the deployment is complete, AWS SAM watches for local changes and synchronizes them to AWS. This feature allows you to make rapid iterative code changes and sync to the Cloud automatically in seconds.

Making a code change

In the editor, update the Subject argument in the send_order_notification function in workflow/src/complete_order/app.py.

def send_order_notification(message):
    topic_arn = TOPIC_ARN
    response = sns.publish(
        TopicArn=topic_arn,
        Message=json.dumps(message),
        Subject=f'Orders-App: Update for order {message["order_id"]}'
        #Subject='Orders-App: SAM Accelerate for the win!'
    )

On save, AWS SAM notices the local code change, and updates the CompleteOrder Lambda function. AWS SAM does not trigger updates to other AWS resources across the different stacks, since they are unchanged. This can result in increased development velocity.

SAM sync output

SAM sync output

Validate the change by sending a new order request and review the notification email subject.

curl -s --header "Content-Type: application/json" \
  --request POST \
  --data '"quantity":1,"name":"Pizza","quantity":1,"restaurantId":"House of Pizza"}' \
  https://xxxxxxxxxx.execute-api.us-east-1.amazonaws.com/Dev/orders  | python -m json.tool

In this example, AWS SAM Accelerate is 10–15 times faster than the CloudFormation deployment workflow (sam deploy) for single function code changes.

Deployment speed comparison between SAM accelerate and CloudFormation

Deployment speed comparison between SAM accelerate and CloudFormation

Deployment times vary based on the size and complexity of your Lambda functions and the number of resources in your project.

Making a configuration change

Next, make an infrastructure change to show how sync –watch handles configuration updates.

Update ReadCapacityUnits and WriteCapacityUnits in the DynamoDB table definition by changing the values from five to six in db/template.yaml.

Resources:
  OrderTable:
    Type: AWS::DynamoDB::Table
    Properties:
      TableName: order-table-test
      AttributeDefinitions:
        - AttributeName: user_id
          AttributeType: S
        - AttributeName: id
          AttributeType: S
      KeySchema:
        - AttributeName: user_id
          KeyType: HASH
        - AttributeName: id
          KeyType: RANGE
      ProvisionedThroughput:
        ReadCapacityUnits: 5
        WriteCapacityUnits: 5

The sam sync –watch command recognizes the configuration change requires a CloudFormation deployment to update the db nested stack. Nested stacks reflect an UPDATE_COMPLETE status because CloudFormation starts an update to every nested stack to determine if changes must be applied.

SAM sync infrastructure update

SAM sync infrastructure update

Cleaning up

Delete the nested stack resources to make sure that you don’t continue to incur charges. After stopping the sam sync –watch command, run the following command to delete your resources:

sam delete orders-app

You can also delete the CloudFormation root stack from the console by following these steps.

Conclusion

Local emulation of complex serverless applications, built with nested stacks, can be challenging. AWS SAM Accelerate helps builders achieve a high-fidelity development experience by rapidly synchronizing code changes into the AWS Cloud.

This post shows AWS SAM Accelerate features that push code changes in near real time to a development environment in the Cloud. I use a non-trivial sample application to show how developers can push code changes to a live environment in seconds while using CloudFormation nested stacks to achieve the isolation and maintenance benefits.

For more serverless learning resources, visit Serverless Land.

How Fresenius Medical Care aims to save dialysis patient lives using real-time predictive analytics on AWS

Post Syndicated from Kanti Singh original https://aws.amazon.com/blogs/big-data/how-fresenius-medical-care-aims-to-save-dialysis-patient-lives-using-real-time-predictive-analytics-on-aws/

This post is co-written by Kanti Singh, Director of Data & Analytics at Fresenius Medical Care.

Fresenius Medical Care is the world’s leading provider of kidney care products and services, and operates more than 2,600 dialysis centers in the US alone. The company provides comprehensive solutions for people living with chronic kidney disease and related conditions, with a mission to improve the quality of life of every patient, every day, by transforming healthcare through research, innovation, and compassion. Data analysis that leads to timely interventions is critical to this mission, and essential to reduce hospitalizations and prevent adverse events.

In this post, we walk you through the solution architecture, performance considerations, and how a research partnership with AWS around medical complexity led to an automated solution that helped deliver alerts for potential adverse events.

Why Fresenius Medical Care chose AWS

The Fresenius Medical Care technical team chose AWS as their preferred cloud platform for two key reasons.

First, we determined that AWS IoT Core was more mature than other solutions and would likely face fewer issues with deployment and certificates. As an organization, we wanted to go with a cloud platform that had a proven track record and established technical solutions and services in the IoT and data analytics space. This included Amazon Athena, which is an easy-to-use serverless service that you can use to run queries on data stored in Amazon Simple Storage Service (Amazon S3) for analysis.

Another factor that played a major role in our decision was the fact that AWS offered the largest set of serverless services for analytics than any other cloud provider. We ultimately determined that AWS innovations met the company’s current needs as well as positioned the company for the future as we worked to expand our predictive capabilities.

Solution overview

We needed to develop a near-real-time analytics solution that would collect dynamic dialysis machine data every 10 seconds during hemodialysis treatment in near-real time and personalize it to predict every 30 minutes if a patient is at a health risk for intradialytic hypotension (IDH) within the next 15–75 minutes. This solution needed to scale to all our dialysis centers nationwide, with each location sending 10 MBps of treatment data at peak times.

The complexities that needed to be managed in the solution included handling high throughput data, a low-latency time-sensitive solution of 10 seconds from data origination to reporting and notification, a highly available solution, and a cost-effective solution with on-demand scaling up or down based on data volume.

Fresenius Medical Care partnered with AWS on this mission and developed an architecture that met our technical and business requirements. Core components in the architecture included Amazon Kinesis Data Streams, Amazon Kinesis Data Analytics, and Amazon SageMaker. We chose Kinesis Data Streams and Kinesis Data Analytics primarily because they’re serverless and highly available (99.9%), offer very high throughput, and are easy to scale. We chose SageMaker due to its unique capability that allows ease of building, training, and running machine learning (ML) models at scale.

The following diagram illustrates the architecture.

The solution consists of the following key components:

  1. Data collection
  2. Data ingestion and aggregation
  3. Data lake storage
  4. ML Inference and operational analytics

Let’s discuss each stage in the workflow in more detail.

Data collection

Dialysis machines located in Fresenius Medical Care centers help patients in the treatment of end-stage renal disease by performing hemodialysis. The dialysis machines provide immediate access to all treatment and clinical trending data across the fleet of hemodialysis machines in all centers in the US.

These machines transmit a data payload every 10 seconds to Kafka brokers located in Fresenius Medical Care’s on-premises data center for use by several applications.

Data ingestion and aggregation

We use a Kinesis-Kafka connector hosted on self-managed Amazon Elastic Compute Cloud (Amazon EC2) instances to ingest data from a Kafka topic in near-real time into Kinesis Data Streams.

We use AWS Lambda to read the data points and filter the datasets accordingly to Kinesis Data Analytics. Upon reaching the batch size threshold, Lambda sends the data to Kinesis Data Analytics for instream analytics.

We chose Kinesis Data Analytics due to the ease-of-use it provides for SQL-based stream analytics. By using SQL with KDA (KDA Studio/Flink SQL), we can create dynamic features based on machine interval data arriving in real time. This data is joined with the patient demographic, historical clinical, treatment, and laboratory data (enriched with Amazon S3 data) to create the complete set of features required for a downstream ML model.

Data lake storage

Amazon Kinesis Data Firehose was the simplest way to consistently load streaming data to build a raw data lake in Amazon S3. Kinesis Data Firehose micro-batches data into 128 MB file sizes and delivers streaming data to Amazon S3.

Clinical datasets are required to enrich stream data sourced from on-premises data warehouses via AWS Glue Spark jobs on a nightly basis. The AWS Glue jobs extract patient demographic, historical clinical, treatment, and laboratory data from the data warehouse to Amazon S3 and transform machine data from JSON to Parquet format for better storage and retrieval costs in Amazon S3. AWS Glue also helps build the static features for the intradialytic hypotension (IDH) ML model, which are required for downstream ML inference.

ML Inference and Operational analytics

Lambda batches the stream data from Kinesis Data Analytics that has all the features required for IDH ML model inference.

SageMaker, a fully managed service, trains and deploys the IDH predictive model. The deployed ML model provides a SageMaker endpoint that is used by Lambda for ML inference.

Amazon OpenSearch Service helps store the IDH inference results it received from Lambda. The results are then used for visualization through Kibana, which displays a personalized health prediction dashboard visual for each patient undergoing treatment and is available in near-real time for the care team to provide intervention proactively.

Observability and traceability for failures

Because this solution offers the potential for life-saving interventions, it’s considered business critical. The following key measures are taken to proactively monitor the AWS jobs in Fresenius Medical Care’s VPC account:

  • For AWS Glue jobs that have failures and errors in Lambda functions, an immediate email and Amazon CloudWatch alert is sent to the Data Ops team for resolution.
  • CloudWatch alarms are also generated for Amazon OpenSearch Service whenever there are blocks on writes or the cluster is overloaded with shard capacity, CPU utilization, or other issues, as recommended by AWS.
  • Kinesis Data Analytics and Kinesis Data Streams generate data quality alerts on data rejections or empty results.
  • Data quality alerts are also generated whenever data quality rules on data points are mismatched. To check mismatched data, we use quality rule comparison and sanity checks between message payloads in the stream with data loaded in the data lake.

These systematic and automated monitoring and alerting mechanisms help our team stay one step ahead to ensure that systems are running smoothly and successfully, and any unforeseen problems can be resolved as quickly as possible before it causes any adverse impact on users of the system.

AWS partnership

After Fresenius Medical Care took advantage of the AWS Data Lab to create a working prototype within one week, expert Solutions Architects from AWS became trusted advisors, helping our team with prescriptive guidance from ideation to production. The AWS team helped with both solution-based and service-specific best practices, helped resolve key blockers in every phase from development through production, and performed architecture reviews to ensure the solution was robust and resilient to business needs.

Solution results

This solution allows Fresenius Medical Care to better personalize care to patients undergoing dialysis treatment with a proactive intervention by clinicians at the point of care that has the potential to save patient lives. The following are some of the key benefits due to this solution:

  • Cloud computing resources enable the development, analysis, and integration of real-time predictive IDH that can be easily and seamlessly scaled as needed to reach additional clinics.
  • The use of our tool may be particularly useful in institutions facing staff shortages and, possibly, during home dialysis. Additionally, it may provide insights on strategies to prevent and manage IDH.
  • The solution enables modern and innovative solutions that improve patient care by providing world-class research and data-driven insights.

This solution has been proven to scale to an acceptable performance level of 6,000 messages per second, translating to 19 MB/sec with 60,000/sec concurrent Lambda invocations. The ability to adapt by scaling up and down every component in the architecture with ease kept costs very low, which wouldn’t have been possible elsewhere.

Conclusion

Successful implementation of this solution led to a think big approach in modernizing several legacy data assets and has set Fresenius Medical Care on the path of building an enterprise unified data analytics platform on AWS using Amazon S3, AWS Glue, Amazon EMR, and AWS Lake Formation. The unified data analytics platform offers robust data security and data sharing for multi-tenants in various geographies across the US. Similar to Fresenius, you can accelerate time to market by using the right tool for the job, using the broad and deep variety of AWS analytic native services.


About the authors

Kanti Singh is a Director of Data & Analytics at Fresenius Medical Care, leading the big data platform, architecture, and the engineering team. She loves to explore new technologies and how to leverage them to solve complex business problems. In her free time, she loves traveling, dancing, and spending time with family.

Harsha Tadiparthi is a Specialist Principal Solutions Architect specialized in analytics at Amazon Web Services. He enjoys solving complex customer problems in databases and analytics, and delivering successful outcomes. Outside of work, he loves to spend time with his family, watch movies, and travel whenever possible.

Extending your SaaS platform with AWS Lambda

Post Syndicated from Hasan Tariq original https://aws.amazon.com/blogs/architecture/extending-your-saas-platform-with-aws-lambda/

Software as a service (SaaS) providers continuously add new features and capabilities to their products to meet their growing customer needs. As enterprises adopt SaaS to reduce the total cost of ownership and focus on business priorities, they expect SaaS providers to enable customization capabilities.

Many SaaS providers allow their customers (tenants) to provide customer-specific code that is triggered as part of various workflows by the SaaS platform. This extensibility model allows customers to customize system behavior and add rich integrations, while allowing SaaS providers to prioritize engineering resources on the core SaaS platform and avoid per-customer customizations.

To simplify experience for enterprise developers to build on SaaS platforms, SaaS providers are offering the ability to host tenant’s code inside the SaaS platform. This blog provides architectural guidance for running custom code on SaaS platforms using AWS serverless technologies and AWS Lambda without the overhead of managing infrastructure on either the SaaS provider or customer side.

Vendor-hosted extensions

With vendor-hosted extensions, the SaaS platform runs the customer code in response to events that occur in the SaaS application. In this model, the heavy-lifting of managing and scaling the code launch environment is the responsibility of the SaaS provider.

To host and run custom code, SaaS providers must consider isolating the environment that runs untrusted custom code from the core SaaS platform, as detailed in Figure 1. This introduces additional challenges to manage security, cost, and utilization.

Distribution of responsibility between Customer and SaaS platform with vendor-hosted extensions

Figure 1. Distribution of responsibility between Customer and SaaS platform with vendor-hosted extensions

Using AWS serverless services to run custom code

Using AWS serverless technologies removes the tasks of infrastructure provisioning and management, as there are no servers to manage, and SaaS providers can take advantage of automatic scaling, high availability, and security, while only paying for value.

Example use case

Let’s take an example of a simple SaaS to-do list application that supports the ability to initiate custom code when a new to-do item is added to the list. This application is used by customers who supply custom code to enrich the content of newly added to-do list items. The requirements for the solution consist of:

  • Custom code provided by each tenant should run in isolation from all other tenants and from the SaaS core product
  • Track each customer’s usage and cost of AWS resources
  • Ability to scale per customer

Solution overview

The SaaS application in Figure 2 is the core application used by customers, and each customer is considered a separate tenant. For the sake of brevity, we assume that the customer code was already stored in an Amazon Simple Storage Service (Amazon S3) bucket as part of the onboarding. When an eligible event is generated in the SaaS application as a result of user action, like a new to-do item added, it gets propagated down to securely launch the associated customer code.

Example use case architecture

Figure 2. Example use case architecture

Walkthrough of custom code run

Let’s detail the initiation flow of custom code when a user adds a new to-do item:

  1. An event is generated in the SaaS application when a user performs an action, like adding a new to-do list item. To extend the SaaS application’s behavior, this event is linked to the custom code. Each event contains a tenant ID and any additional data passed as part of the payload. Each of these events is an “initiation request” for the custom code Lambda function.
  2. Amazon EventBridge is used to decouple the SaaS Application from event processing implementation specifics. EventBridge makes it easier to build event-driven applications at scale and keeps the future prospect of adding additional consumers. In case of unexpected failure in any downstream service, EventBridge retries sending events a set number of times.
  3. EventBridge sends the event to an Amazon Simple Queue Service (Amazon SQS) queue as a message that is subsequently picked up by a Lambda function (Dispatcher) for further routing. Amazon SQS enables decoupling and scaling of microservices and also provides a buffer for the events that are awaiting processing.
  4. The Dispatcher polls the messages from SQS queue and is responsible for routing the events to respective tenants for further processing. The Dispatcher retrieves the tenant ID from the message and performs a lookup in the database (we recommend Amazon DynamoDB for low latency), retrieves tenant SQS Amazon Resource Name (ARN) to determine which queue to route the event. To further improve performance, you can cache the tenant-to-queue mapping.
  5. The tenant SQS queue acts as a message store buffer and is configured as an event source for a Lambda function. Using Amazon SQS as an event source for Lambda is a common pattern.
  6. Lambda executes the code uploaded by the tenant to perform the desired operation. Common utility and management code (including logging and telemetry code) is kept in Lambda layers that get added to every custom code Lambda function provisioned.
  7. After performing the desired operation on data, custom code Lambda returns a value back to the SaaS application. This completes the run cycle.

This architecture allows SaaS applications to create a self-managed queue infrastructure for running custom code for tenants in parallel.

Tenant code upload

The SaaS platform can allow customers to upload code either through a user interface or using a command line interface that the SaaS provider provides to developers to facilitate uploading custom code to the SaaS platform. Uploaded code is saved in the custom code S3 bucket in .zip format that can be used to provision Lambda functions.

Custom code Lambda provisioning

The tenant environment includes a tenant SQS queue and a Lambda function that polls initiation requests from the queue. This Lambda function serves several purposes, including:

  1. It polls messages from the SQS queue and constructs a JSON payload that will be sent an input to custom code.
  2. It “wraps” the custom code provided by the customer using boilerplate code, so that custom code is fully abstracted from the processing implementation specifics. For example, we do not want custom code to know that the payload it is getting is coming from Amazon SQS or be aware of the destination where launch results will be sent.
  3. Once custom code initiation is complete, it sends a notification with launch results back to the SaaS application. This can be done directly via EventBridge or Amazon SQS.
  4. This common code can be shared across tenants and deployed by the SaaS provider, either as a library or as a Lambda layer that gets added to the Lambda function.

Each Lambda function execution environment is fully isolated by using a combination of open-source and proprietary isolation technologies, it helps you to address the risk of cross-contamination. By having a separate Lambda function provisioned per-tenant, you achieve the highest level of isolation and benefit from being able to track per-tenant costs.

Conclusion

In this blog post, we explored the need to extend SaaS platforms using custom code and why AWS serverless technologies—using Lambda and Amazon SQS—can be a good fit to accomplish that. We also looked at a solution architecture that can provide the necessary tenant isolation and is cost-effective for this use case.

For more information on building applications with Lambda, visit Serverless Land. For best practices on building SaaS applications, visit SaaS on AWS.

Sequence Diagrams enrich your understanding of distributed architectures

Post Syndicated from Kevin Hakanson original https://aws.amazon.com/blogs/architecture/sequence-diagrams-enrich-your-understanding-of-distributed-architectures/

Architecture diagrams visually communicate and document the high-level design of a solution. As the level of detail increases, so does the diagram’s size, density, and layout complexity. Using Sequence Diagrams, you can explore additional usage scenarios and enrich your understanding of the distributed architecture while continuing to communicate visually.

This post takes a sample architecture and iteratively builds out a set of Sequence Diagrams. Each diagram adds to the vocabulary and graphical notation of Sequence Diagrams, then shows how the diagram deepened understanding of the architecture. All diagrams in this post were rendered from a text-based domain specific language using a diagrams-as-code tool instead of being drawn with graphical diagramming software.

Sample architecture

The architecture is based on Implementing header-based API Gateway versioning with Amazon CloudFront from the AWS Compute Blog, which uses the AWS Lambda@Edge feature to dynamically route the request to the targeted API version.

Amazon API Gateway is a fully managed service that makes it easier for developers to create, publish, maintain, monitor, and secure APIs at any scale. Amazon CloudFront is a global content delivery network (CDN) service built for high-speed, low-latency performance, security, and developer ease-of-use. Lambda@Edge lets you run functions that customize the content that CloudFront delivers.

The numbered labels in Figure 1 correspond to the following text descriptions:

  1. User sends an HTTP request to CloudFront, including a version header.
  2. CloudFront invokes the Lambda@Edge function for the Origin Request event.
  3. The function matches the header value to data fetched from an Amazon DynamoDB table, then modifies the Host header and path of the request and returns it to CloudFront.
  4. CloudFront routes the HTTP request to the matching API Gateway.

Figure 1 architecture diagram is a free-form mixture between a structure diagram and a behavior diagram. It includes structural aspects from a high-level Deployment Diagram, which depicts network connections between AWS services. It also demonstrates behavioral aspects from a Communication Diagram, which uses messages represented by arrows labeled with chronological numbers.

High-level architecture diagram

Figure 1. High-level architecture diagram

Sequence Diagrams

Sequence Diagrams are part of a subset of behavior diagrams known as interaction diagrams, which emphasis control and data flow. Sequence Diagrams model the ordered logic of usage scenarios in a consistent visual manner and capture detailed behaviors. I use this diagram type for analysis and design purposes and to validate my assumptions about data flows in distributed architectures. Let’s investigate the system use case where the API is called without a header indicating the requested version using a Sequence Diagram.

Examining the system use case

In Figure 2, User, Web Distribution, and Origin Request are each actors or system participants. The parallel vertical lines underneath these participants are lifelines. The horizontal arrows between participants are messages, with the arrowhead indicating message direction. Messages are arranged in time sequence from top to bottom. The dashed lines represent reply messages. The text inside guillemets («like this») indicate a stereotype, which refines the meaning of a model element. The rectangle with the bent upper-right corner is a note containing additional useful information.

Missing accept-version header

Figure 2. Missing accept-version header

The message from User to Web Distribution lacks any HTTP header that indicates the version, which precipitates the choice of Accept-Version for this name. The return message requires a decision about HTTP status code for this error case (400). The interaction with the Origin Request prompts a selection of Lambda runtimes (nodejs14.x) and understanding the programming model for generating an HTTP response for this request.

Designing the interaction

Next, let’s design the interaction when the Accept-Version header is present, but the corresponding value is not found in the Version Mappings table.

Figure 3 adds new notation to the diagram. The rectangle with “opt” in the upper-left corner and bolded text inside square brackets is an interaction fragment. The “opt” indicates this operation is an option based on the constraint (or guard) that “version mappings not cached” is true.

API version not found

Figure 3. API version not found

A DynamoDB scan operation on every request consumes table read capacity. Caching Version Mappings data inside the Lambda@Edge function’s memory optimizes for on-demand capacity mode. The «on-demand» stereotype on the DynamoDB participant succinctly communicates this decision. The “API V3 not found” note on Figure 3 provides clarity to the reader. The HTTP status code for this error case is decided as 404 with a custom description of “API Version Not Found.”

Now, let’s design the interaction where the API version is found and the caller receives a successful response.

Figure 4 is similar to Figure 3 up until the note, which now indicates “API V1 found.” Consulting the documentation for Writing functions for Lambda@Edge, the request event is updated with the HTTP Host header and path for the “API V1” Amazon API Gateway.

API version found

Figure 4. API version found

Instead of three separate diagrams for these individual scenarios, a single, combined diagram can represent the entire set of use cases. Figure 5 includes two new “alt” interaction fragments that represent choices of alternative behaviors.

The first “alt” has a guard of “missing Accept-Version header” mapping to our Figure 2 use case. The “else” guard encompasses the remaining use cases containing a second “alt” splitting where Figure 3 and Figure 4 diverge. That “version not found” guard is the Figure 3 use case returning the 404, while that “else” guard is the Figure 4 success condition. The added notes improve visual clarity.

Header-based API Gateway versioning with CloudFront

Figure 5. Header-based API Gateway versioning with CloudFront

Diagrams as code

After diagrams are created, the next question is where to save them and how to keep them updated. Because diagrams-as-code use text-based files, they can be stored and versioned in the same source control system as application code. Also consider an architectural decision record (ADR) process to document and communicate architecturally significant decisions. Then as application code is updated, team members can revise both the ADR narrative and the text-based diagram source. Up-to-date documentation is important for operationally supporting production deployments, and these diagrams quickly provide a visual understanding of system component interactions.

Conclusion

This post started with a high-level architecture diagram and ended with an additional Sequence Diagram that captures multiple usage scenarios. This improved understanding of the system design across success and error use cases. Focusing on system interactions prior to coding facilitates the interface definition and emergent properties discovery, before thinking in terms of programming language specific constructs and SDKs.

Experiment to see if Sequence Diagrams improve the analysis and design phase of your next project. View additional examples of diagrams-as-code from the AWS Icons for PlantUML GitHub repository. The Workload Discovery on AWS solution can even build detailed architecture diagrams of your workloads based on live data from AWS.

For vetted architecture solutions and reference architecture diagrams, visit the AWS Architecture Center. For more serverless learning resources, visit Serverless Land.

Related information

  • The Unified Modeling Language specification provides the full definition of Sequence Diagrams. This includes notations for additional interaction frame operators, using open arrow heads to represent asynchronous messages, and more.
  • Diagrams were created for this blog post using PlantUML and the AWS Icons for PlantUML. PlantUML integrates with IDEs, wikis, and other external tools. PlantUML is distributed under multiple open-source licenses, allowing local server rendering for diagrams containing sensitive information. AWS Icons for PlantUML include the official AWS Architecture Icons.

Using custom consumer group ID support for the AWS Lambda event sources for MSK and self-managed Kafka

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/using-custom-consumer-group-id-support-for-the-aws-lambda-event-sources-for-msk-and-self-managed-kafka/

This post is written by Adam Wagner, Principal Serverless Specialist SA.

AWS Lambda already supports Amazon Managed Streaming for Apache Kafka (MSK) and self-managed Apache Kafka clusters as event sources. Today, AWS adds support for specifying a custom consumer group ID for the Lambda event source mappings (ESMs) for MSK and self-managed Kafka event sources.

With this feature, you can create a Lambda ESM that uses a consumer group that has already been created. This enables you to use Lambda as a Kafka consumer for topics that are replicated with MirrorMaker v2 or with consumer groups you create to start consuming at a particular offset or timestamp.

Overview

This blog post shows how to use this feature to enable Lambda to consume a Kafka topic starting at a specific timestamp. This can be useful if you must reprocess some data but don’t want to reprocess all of the data in the topic.

In this example application, a client application writes to a topic on the MSK cluster. It creates a consumer group that points to a specific timestamp within that topic as the starting point for consuming messages. A Lambda ESM is created using that existing consumer group that triggers a Lambda function. This processes and writes the messages to an Amazon DynamoDB table.

Reference architecture

  1. A Kafka client writes messages to a topic in the MSK cluster.
  2. A Kafka consumer group is created with a starting point of a specific timestamp
  3. The Lambda ESM polls the MSK topic using the existing consumer group and triggers the Lambda function with batches of messages.
  4. The Lambda function writes the messages to DynamoDB

Step-by-step instructions

To get started, create an MSK cluster and a client Amazon EC2 instance from which to create topics and publish messages. If you don’t already have an MSK cluster, follow this blog on setting up an MSK cluster and using it as an event source for Lambda.

  1. On the client instance, set an environment variable to the MSK cluster bootstrap servers to make it easier to reference them in future commands:
    export MSKBOOTSTRAP='b-1.mskcluster.oy1hqd.c23.kafka.us-east-1.amazonaws.com:9094,b-2.mskcluster.oy1hqd.c23.kafka.us-east-1.amazonaws.com:9094,b-3.mskcluster.oy1hqd.c23.kafka.us-east-1.amazonaws.com:9094'
  2. Create the topic. This example has a three-node MSK cluster so the replication factor is also set to three. The partition count is set to three in this example. In your applications, set this according to throughput and parallelization needs.
    ./bin/kafka-topics.sh --create --bootstrap-server $MSKBOOT --replication-factor 3 --partitions 3 --topic demoTopic01
  3. Write messages to the topic using this Python script:
    #!/usr/bin/env python3
    import json
    import time
    from random import randint
    from uuid import uuid4
    from kafka import KafkaProducer
    
    BROKERS = ['b-1.mskcluster.oy1hqd.c23.kafka.us-east-1.amazonaws.com:9094', 
            'b-2.mskcluster.oy1hqd.c23.kafka.us-east-1.amazonaws.com:9094',
            'b-3.mskcluster.oy1hqd.c23.kafka.us-east-1.amazonaws.com:9094']
    TOPIC = 'demoTopic01'
    
    producer = KafkaProducer(bootstrap_servers=BROKERS, security_protocol='SSL',
            value_serializer=lambda x: json.dumps(x).encode('utf-8'))
    
    def create_record(sequence_num):
        number = randint(1000000,10000000)
        record = {"id": sequence_num, "record_timestamp": int(time.time()), "random_number": number, "producer_id": str(uuid4()) }
        print(record)
        return record
    
    def publish_rec(seq):
        data = create_record(seq)
        producer.send(TOPIC, value=data).add_callback(on_send_success).add_errback(on_send_error)
        producer.flush()
    
    def on_send_success(record_metadata):
        print(record_metadata.topic, record_metadata.partition, record_metadata.offset)
    
    def on_send_error(excp):
        print('error writing to kafka', exc_info=excp)
    
    for num in range(1,10000000):
        publish_rec(num)
        time.sleep(0.5) 
    
  4. Copy the script into a file on the client instance named producer.py. The script uses the kafka-python library, so first create a virtual environment and install the library.
    python3 -m venv venv
    source venv/bin/activate
    pip3 install kafka-python
    
  5. Start the script. Leave it running for a few minutes to accumulate some messages in the topic.
    Output
  6. Previously, a Lambda function would choose between consuming messages starting at the beginning of the topic or starting with the latest messages. In this example, it starts consuming messages from a few hours ago at 14:30 UTC. To do this, first create a new consumer group on the client instance:
    ./bin/kafka-consumer-groups.sh --command-config client.properties --bootstrap-server $MSKBOOTSTRAP --topic demoTopic01 --group specificTimeCG --to-datetime 2022-08-10T16:00:00.000 --reset-offsets --execute
  7. In this case, specificTimeCG is the consumer group ID used when creating the Lambda ESM. Listing the consumer groups on the cluster shows the new group:
    ./bin/kafka-consumer-groups.sh --list --command-config client.properties --bootstrap-server $MSKBOOTSTRAP

    Output

  8. With the consumer group created, create the Lambda function along with the Event Source Mapping that uses this new consumer group. In this case, the Lambda function and DynamoDB table are already created. Create the ESM with the following AWS CLI Command:
    aws lambda create-event-source-mapping --region us-east-1 --event-source-arn arn:aws:kafka:us-east-1:0123456789:cluster/demo-us-east-1/78a8d1c1-fa31-4f59-9de3-aacdd77b79bb-23 --function-name msk-consumer-demo-ProcessMSKfunction-IrUhEoDY6X9N --batch-size 3 --amazon-managed-kafka-event-source-config '{"ConsumerGroupId":"specificTimeCG"}' --topics demoTopic01

    The event source in the Lambda console or CLI shows the starting position set to TRIM_HORIZON. However, if you specify a custom consumer group ID that already has existing offsets, those offsets take precedent.

  9. With the event source created, navigate to the DynamoDB console. Locate the DynamoDB table to see the records written by the Lambda function.
    DynamoDB table

Converting the record timestamp of the earliest record in DynamoDB, 1660147212, to a human-readable date shows that the first record was created on 2022-08-10T16:00:12.

In this example, the consumer group is created before the Lambda ESM so that you can specify the timestamp to start from.

If you create an ESM and specify a custom consumer group ID that does not exist, it is created. This is a convenient way to create a new consumer group for an ESM with an ID of your choosing.

Deleting an ESM does not delete the consumer group, regardless of whether it is created before, or during, the ESM creation.

Using the AWS Serverless Application Model (AWS SAM)

To create the event source mapping with a custom consumer group using an AWS Serverless Application Model (AWS SAM) template, use the following snippet:

Events:
  MyMskEvent:
    Type: MSK
    Properties:
      Stream: !Sub arn:aws:kafka:${AWS::Region}:012345678901:cluster/ demo-us-east-1/78a8d1c1-fa31-4f59-9de3-aacdd77b79bb-23
      Topics:
        - "demoTopic01"
      ConsumerGroupId: specificTimeCG

Other types of Kafka clusters

This example uses the custom consumer group ID feature when consuming a Kafka topic from an MSK cluster. In addition to MSK clusters, this feature also supports self-managed Kafka clusters. These could be clusters running on EC2 instances or managed Kafka clusters from a partner such as Confluent.

Conclusion

This post shows how to use the new custom consumer group ID feature of the Lambda event source mapping for Amazon MSK and self-managed Kafka. This feature can be used to consume messages with Lambda starting at a specific timestamp or offset within a Kafka topic. It can also be used to consume messages from a consumer group that is replicated from another Kafka cluster using MirrorMaker v2.

For more serverless learning resources, visit Serverless Land.

Introducing bidirectional event integrations with Salesforce and Amazon EventBridge

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/introducing-bidirectional-event-integrations-with-salesforce-and-amazon-eventbridge/

This post is written by Alseny Diallo, Prototype Solutions Architect and Rohan Mehta, Associate Cloud Application Architect.

AWS now supports Salesforce as a partner event source for Amazon EventBridge, allowing you to send Salesforce events to AWS. You can also configure Salesforce with EventBridge API Destinations and send EventBridge events to Salesforce. These integrations enable you to act on changes to your Salesforce data in real-time and build custom applications with EventBridge and over 100 built-in sources and targets.

In this blog post, you learn how to set up a bidirectional integration between Salesforce and EventBridge and use cases for working with Salesforce events. You see an example application for interacting with Salesforce support case events with automated workflows for detecting sentiment with AWS AI/ML services and enriching support cases with customer order data.

Integration overview

Salesforce is a customer relationship management (CRM) platform that gives companies a single, shared view of customers across their marketing, sales, commerce, and service departments. Salesforce Event Relays for AWS enable bidirectional event flows between Salesforce and AWS through EventBridge.

Amazon EventBridge is a serverless event bus that makes it easier to build event-driven applications at scale using events generated from your applications, integrated software as a service (SaaS) applications, and AWS services. EventBridge partner event source integrations enable customers to receive events from over 30 SaaS applications and ingest them into their AWS applications.

Salesforce as a partner event source for EventBridge makes it easier to build event-driven applications that span customers’ data in Salesforce and applications running on AWS. Customers can send events from Salesforce to EventBridge and vice versa without having to write custom code or manage an integration.

EventBridge joins Amazon AppFlow as a way to integrate Salesforce with AWS. The Salesforce Amazon AppFlow integration is well suited for use cases that require ingesting large volumes of data, like a daily scheduled data transfer sending Salesforce records into an Amazon Redshift data warehouse or an Amazon S3 data lake. The Salesforce EventBridge integration is a good fit for real-time processing of changes to individual Salesforce records.

Use cases

Customers can act on new or modified Salesforce records through integrations with a variety of EventBridge targets, including AWS Lambda, AWS Step Functions, and API Gateway. The integration can enable use cases across industries that must act on customer events in real time.

  • Retailers can automatically unify their Salesforce data with AWS data sources. When a new customer support case is created in Salesforce, enrich the support case with recent order data from that customer retrieved from an orders database running on AWS.
  • Media and entertainment providers can augment their omnichannel experiences with AWS AI/ML services to increase customer engagement. When a new customer account is created in Salesforce, use Amazon Personalize and Amazon Simple Email Service to send a welcome email with personalized media recommendations.
  • Insurers can automate form processing workflows. When a new insurance claim form PDF is uploaded to Salesforce, extract the submitted information with Amazon Textract and orchestrate processing the claim information with AWS Step Functions.

Solution overview

The example application shows how the integration can enhance customer support experiences by unifying support tickets with customer order data, detecting customer sentiment, and automating support case workflows.

Reference architecture

  1. A new case is created in Salesforce and an event is sent to an EventBridge partner event bus.
  2. If the event matches the EventBridge rule, the rule sends the event to both the Enrich Case and Case Processor Workflows in parallel.
  3. The Enrich Case Workflow uses the Customer ID in the event payload to query the Orders table for the customer’s recent order. If this step fails, the event is sent to an Amazon SQS dead letter queue.
  4. The Enrich Case Workflow publishes a new event with the customer’s recent order to an EventBridge custom event bus.
  5. The Case Processor Workflow performs sentiment analysis on the support case content and sends a customized text message to the customer. See the diagram below for details on the workflow.
  6. The Case Processor Workflow publishes a new event with the sentiment analysis results to the custom event bus.
  7. EventBridge rules match the events published to the associated rules: CaseProcessorEventRule and EnrichCaseAppEventRule.
  8. These rules send the events to EventBridge API Destinations. API Destinations sends the events to Salesforce HTTP endpoints to create two Salesforce Platform Events.
  9. Salesforce data is updated with the two Platform Events:
    1. The support case record is updated with the customer’s recent order details and the support case sentiment.
    2. If the support case sentiment is negative, a task is created for an agent to follow up with the customer.

The Case Processor workflow uses Step Functions to process the Salesforce events.

Case processor workflow

  1. Detect the sentiment of the customer feedback using Amazon Comprehend. This is positive, negative, or neutral.
  2. Check if the customer phone number is a mobile number and can receive SMS using Amazon Pinpoint’s mobile number validation endpoint.
  3. If the customer did not provide a mobile number, bypass the SMS steps and put an event with the detected sentiment onto the custom event bus.
  4. If the customer provided a mobile number, send them an SMS with the appropriate message based on the sentiment of their case.
    1. If sentiment is positive or neutral, the message is thanking the customer for their feedback.
    2. If the sentiment is negative, the message offers additional support.
  5. The state machine then puts an event with the sentiment analysis results onto the custom event bus.

Prerequisites

Environment setup

  1. Follow the instructions here to set up your Salesforce Event Relay. Once you have an event bus created with the partner event source, proceed to step 2.
  2. Copy the ARN of the event bus.
  3. Create a Salesforce Connected App. This is used for the API Destinations configuration to send updates back into Salesforce.
  4. You can create a new user within Salesforce with appropriate API permissions to update records. The user name and password is used by the API Destinations configuration.
  5. The example provided by Salesforce uses a Platform Event called “Carbon Comparison”. For this sample app, you create three custom platform events with the following configurations:
    1. Customer Support Case (Salesforce to AWS):
      Customer support case
    2. Processed Support Case (AWS to Salesforce):
      Processed Support case
    3. Enrich Case (AWS to Salesforce):
      Enrich case example
  6. This example application assumes that a custom Sentiment field is added to the Salesforce Case record type. See this link for how to create custom fields in Salesforce.
  7. The example application uses Salesforce Flows to trigger outbound platform events and handle inbound platform events. See this link for how to use Salesforce Flows to build event driven applications on Salesforce.
  8. Clone the AWS SAM template here.
    sam build
    sam deploy —guided

    For the parameter prompts, enter:

  • SalesforceOauthClientId and SalesforceOauthClientSecret: Use the values created with the Connected App in step 3.
  • SalesforceUsername and SalesforcePassword: Use the values created for the new user in step 4.
  • SalesforceOauthUrl: Salesforce URL for OAuth authentication
  • SalesforceCaseProcessorEndpointUrl: Salesforce URL for creating a new Processed Support Case Platform Event object, in this case: https://MyDomainName.my.salesforce.com/services/data/v54.0/sobjects/Processed_Support_Case__e
  • SFEnrichCaseEndpointUrl: Salesforce URL for creating a new Enrich Case Platform Event object, in this case: https://MyDomainName.my.salesforce.com/services/data/v54.0/sobjects/Enrich_Case__e
  • SalesforcePartnerEventBusArn: Use the value from step 2.
  • SalesforcePartnerEventPattern: The detail-type value should be the API name of the custom platform event, in thiscase: {“detail-type”: [“Customer_Support_Case__e”]}

Conclusion

This blog shows how to act on changes to your Salesforce data in real-time using the new Salesforce partner event source integration with EventBridge. The example demonstrated how your Salesforce data can be processed and enriched with custom AWS applications and updates sent back to Salesforce using EventBridge API Destinations.

To learn more about EventBridge partner event sources and API Destinations, see the EventBridge Developer Guide. For more serverless resources, visit Serverless Land.

Build a pseudonymization service on AWS to protect sensitive data, part 1

Post Syndicated from Rahul Shaurya original https://aws.amazon.com/blogs/big-data/part-1-build-a-pseudonymization-service-on-aws-to-protect-sensitive-data/

According to an article in MIT Sloan Management Review, 9 out of 10 companies believe their industry will be digitally disrupted. In order to fuel the digital disruption, companies are eager to gather as much data as possible. Given the importance of this new asset, lawmakers are keen to protect the privacy of individuals and prevent any misuse. Organizations often face challenges as they aim to comply with data privacy regulations like Europe’s General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). These regulations demand strict access controls to protect sensitive personal data.

This is a two-part post. In part 1, we walk through a solution that uses a microservice-based approach to enable fast and cost-effective pseudonymization of attributes in datasets. The solution uses the AES-GCM-SIV algorithm to pseudonymize sensitive data. In part 2, we will walk through useful patterns for dealing with data protection for varying degrees of data volume, velocity, and variety using Amazon EMR, AWS Glue, and Amazon Athena.

Data privacy and data protection basics

Before diving into the solution architecture, let’s look at some of the basics of data privacy and data protection. Data privacy refers to the handling of personal information and how data should be handled based on its relative importance, consent, data collection, and regulatory compliance. Depending on your regional privacy laws, the terminology and definition in scope of personal information may differ. For example, privacy laws in the United States use personally identifiable information (PII) in their terminology, whereas GDPR in the European Union refers to it as personal data. Techgdpr explains in detail the difference between the two. Through the rest of the post, we use PII and personal data interchangeably.

Data anonymization and pseudonymization can potentially be used to implement data privacy to protect both PII and personal data and still allow organizations to legitimately use the data.

Anonymization vs. pseudonymization

Anonymization refers to a technique of data processing that aims to irreversibly remove PII from a dataset. The dataset is considered anonymized if it can’t be used to directly or indirectly identify an individual.

Pseudonymization is a data sanitization procedure by which PII fields within a data record are replaced by artificial identifiers. A single pseudonym for each replaced field or collection of replaced fields makes the data record less identifiable while remaining suitable for data analysis and data processing. This technique is especially useful because it protects your PII data at record level for analytical purposes such as business intelligence, big data, or machine learning use cases.

The main difference between anonymization and pseudonymization is that the pseudonymized data is reversible (re-identifiable) to authorized users and is still considered personal data.

Solution overview

The following architecture diagram provides an overview of the solution.

Solution overview

This architecture contains two separate accounts:

  • Central pseudonymization service: Account 111111111111 – The pseudonymization service is running in its own dedicated AWS account (right). This is a centrally managed pseudonymization API that provides access to two resources for pseudonymization and reidentification. With this architecture, you can apply authentication, authorization, rate limiting, and other API management tasks in one place. For this solution, we’re using API keys to authenticate and authorize consumers.
  • Compute: Account 222222222222 – The account on the left is referred to as the compute account, where the extract, transform, and load (ETL) workloads are running. This account depicts a consumer of the pseudonymization microservice. The account hosts the various consumer patterns depicted in the architecture diagram. These solutions are covered in detail in part 2 of this series.

The pseudonymization service is built using AWS Lambda and Amazon API Gateway. Lambda enables the serverless microservice features, and API Gateway provides serverless APIs for HTTP or RESTful and WebSocket communication.

We create the solution resources via AWS CloudFormation. The CloudFormation stack template and the source code for the Lambda function are available in GitHub Repository.

We walk you through the following steps:

  1. Deploy the solution resources with AWS CloudFormation.
  2. Generate encryption keys and persist them in AWS Secrets Manager.
  3. Test the service.

Demystifying the pseudonymization service

Pseudonymization logic is written in Java and uses the AES-GCM-SIV algorithm developed by codahale. The source code is hosted in a Lambda function. Secret keys are stored securely in Secrets Manager. AWS Key Management System (AWS KMS) makes sure that secrets and sensitive components are protected at rest. The service is exposed to consumers via API Gateway as a REST API. Consumers are authenticated and authorized to consume the API via API keys. The pseudonymization service is technology agnostic and can be adopted by any form of consumer as long as they’re able to consume REST APIs.

As depicted in the following figure, the API consists of two resources with the POST method:

API Resources

  • Pseudonymization – The pseudonymization resource can be used by authorized users to pseudonymize a given list of plaintexts (identifiers) and replace them with a pseudonym.
  • Reidentification – The reidentification resource can be used by authorized users to convert pseudonyms to plaintexts (identifiers).

The request response model of the API utilizes Java string arrays to store multiple values in a single variable, as depicted in the following code.

Request/Response model

The API supports a Boolean type query parameter to decide whether encryption is deterministic or probabilistic.

The implementation of the algorithm has been modified to add the logic to generate a nonce, which is dependent on the plaintext being pseudonymized. If the incoming query parameters key deterministic has the value True, then the overloaded version of the encrypt function is called. This generates a nonce using the HmacSHA256 function on the plaintext, and takes 12 sub-bytes from a predetermined position for nonce. This nonce is then used for the encryption and prepended to the resulting ciphertext. The following is an example:

  • IdentifierVIN98765432101234
  • NonceNjcxMDVjMmQ5OTE5
  • PseudonymNjcxMDVjMmQ5OTE5q44vuub5QD4WH3vz1Jj26ZMcVGS+XB9kDpxp/tMinfd9

This approach is useful especially for building analytical systems that may require PII fields to be used for joining datasets with other pseudonymized datasets.

The following code shows an example of deterministic encryption.Deterministic Encryption

If the incoming query parameters key deterministic has the value False, then the encrypt method is called without the deterministic parameter and the nonce generated is a random 12 bytes. This generates a different ciphertext for the same incoming plaintext.

The following code shows an example of probabilistic encryption.

Probabilistic Encryption

The Lambda function utilizes a couple of caching mechanisms to boost the performance of the function. It uses Guava to build a cache to avoid generation of the pseudonym or identifier if it’s already available in the cache. For the probabilistic approach, the cache isn’t utilized. It also uses SecretCache, an in-memory cache for secrets requested from Secrets Manager.

Prerequisites

For this walkthrough, you should have the following prerequisites:

Deploy the solution resources with AWS CloudFormation

The deployment is triggered by running the deploy.sh script. The script runs the following phases:

  1. Checks for dependencies.
  2. Builds the Lambda package.
  3. Builds the CloudFormation stack.
  4. Deploys the CloudFormation stack.
  5. Prints to standard out the stack output.

The following resources are deployed from the stack:

  • An API Gateway REST API with two resources:
    • /pseudonymization
    • /reidentification
  • A Lambda function
  • A Secrets Manager secret
  • A KMS key
  • IAM roles and policies
  • An Amazon CloudWatch Logs group

You need to pass the following parameters to the script for the deployment to be successful:

  • STACK_NAME – The CloudFormation stack name.
  • AWS_REGION – The Region where the solution is deployed.
  • AWS_PROFILE – The named profile that applies to the AWS Command Line Interface (AWS CLI). command
  • ARTEFACT_S3_BUCKET – The S3 bucket where the infrastructure code is stored. The bucket must be created in the same account and Region where the solution lives.

Use the following commands to run the ./deployments_scripts/deploy.sh script:

chmod +x ./deployment_scripts/deploy.sh ./deployment_scripts/deploy.sh -s STACK_NAME -b ARTEFACT_S3_BUCKET -r AWS_REGION -p AWS_PROFILE AWS_REGION

Upon successful deployment, the script displays the stack outputs, as depicted in the following screenshot. Take note of the output, because we use it in subsequent steps.

Stack Output

Generate encryption keys and persist them in Secrets Manager

In this step, we generate the encryption keys required to pseudonymize the plain text data. We generate those keys by calling the KMS key we created in the previous step. Then we persist the keys in a secret. Encryption keys are encrypted at rest and in transit, and exist in plain text only in-memory when the function calls them.

To perform this step, we use the script key_generator.py. You need to pass the following parameters for the script to run successfully:

  • KmsKeyArn – The output value from the previous stack deployment
  • AWS_PROFILE – The named profile that applies to the AWS CLI command
  • AWS_REGION – The Region where the solution is deployed
  • SecretName – The output value from the previous stack deployment

Use the following command to run ./helper_scripts/key_generator.py:

python3 ./helper_scripts/key_generator.py -k KmsKeyArn -s SecretName -p AWS_PROFILE -r AWS_REGION

Upon successful deployment, the secret value should look like the following screenshot.

Encryption Secrets

Test the solution

In this step, we configure Postman and query the REST API, so you need to make sure Postman is installed in your machine. Upon successful authentication, the API returns the requested values.

The following parameters are required to create a complete request in Postman:

  • PseudonymizationUrl – The output value from stack deployment
  • ReidentificationUrl – The output value from stack deployment
  • deterministic – The value True or False for the pseudonymization call
  • API_Key – The API key, which you can retrieve from API Gateway console

Follow these steps to set up Postman:

  1. Start Postman in your machine.
  2. On the File menu, choose Import.
  3. Import the Postman collection.
  4. From the collection folder, navigate to the pseudonymization request.
  5. To test the pseudonymization resource, replace all variables in the sample request with the parameters mentioned earlier.

The request template in the body already has some dummy values provided. You can use the existing one or exchange with your own.

  1. Choose Send to run the request.

The API returns in the body of the response a JSON data type.

Reidentification

  1. From the collection folder, navigate to the reidentification request.
  2. To test the reidentification resource, replace all variables in the sample request with the parameters mentioned earlier.
  3. Pass to the response template in the body the pseudonyms output from earlier.
  4. Choose Send to run the request.

The API returns in the body of the response a JSON data type.

Pseudonyms

Cost and performance

There are many factors that can determine the cost and performance of the service. Performance especially can be influenced by payload size, concurrency, cache hit, and managed service limits on the account level. The cost is mainly influenced by how much the service is being used. For our cost and performance exercise, we consider the following scenario:

The REST API is used to pseudonymize Vehicle Identification Numbers (VINs). On average, consumers request pseudonymization of 1,000 VINs per call. The service processes on average 40 requests per second, or 40,000 encryption or decryption operations per second. The average process time per request is as follows:

  • 15 milliseconds for deterministic encryption
  • 23 milliseconds for probabilistic encryption
  • 6 milliseconds for decryption

The number of calls hitting the service per month is distributed as follows:

  • 50 million calls hitting the pseudonymization resource for deterministic encryption
  • 25 million calls hitting the pseudonymization resource for probabilistic encryption
  • 25 million calls hitting the reidentification resource for decryption

Based on this scenario, the average cost is $415.42 USD per month. You may find the detailed cost breakdown in the estimate generated via the AWS Pricing Calculator.

We use Locust to simulate a similar load to our scenario. Measurements from Amazon CloudWatch metrics are depicted in the following screenshots (network latency isn’t considered during our measurement).

The following screenshot shows API Gateway latency and Lambda duration for deterministic encryption. Latency is high at the beginning due to the cold start, and flattens out over time.

API Gateway Latency & Lamdba Duration for deterministic encryption. Latency is high at the beginning due to the cold start and flattens out over time.

The following screenshot shows metrics for probabilistic encryption.

metrics for probabilistic encryption

The following shows metrics for decryption.

metrics for decryption

Clean up

To avoid incurring future charges, delete the CloudFormation stack by running the destroy.sh script. The following parameters are required to run the script successfully:

  • STACK_NAME – The CloudFormation stack name
  • AWS_REGION – The Region where the solution is deployed
  • AWS_PROFILE – The named profile that applies to the AWS CLI command

Use the following commands to run the ./deployment_scripts/destroy.sh script:

chmod +x ./deployment_scripts/destroy.sh ./deployment_scripts/destroy.sh -s STACK_NAME -r AWS_REGION -p AWS_PROFILE

Conclusion

In this post, we demonstrated how to build a pseudonymization service on AWS. The solution is technology agnostic and can be adopted by any form of consumer as long as they’re able to consume REST APIs. We hope this post helps you in your data protection strategies.

Stay tuned for part 2, which will cover consumption patterns of the pseudonymization service.


About the authors

Edvin Hallvaxhiu is a Senior Global Security Architect with AWS Professional Services and is passionate about cybersecurity and automation. He helps customers build secure and compliant solutions in the cloud. Outside work, he likes traveling and sports.

Rahul Shaurya is a Senior Big Data Architect with AWS Professional Services. He helps and works closely with customers building data platforms and analytical applications on AWS. Outside of work, Rahul loves taking long walks with his dog Barney.

Andrea Montanari is a Big Data Architect with AWS Professional Services. He actively supports customers and partners in building analytics solutions at scale on AWS.

María Guerra is a Big Data Architect with AWS Professional Services. Maria has a background in data analytics and mechanical engineering. She helps customers architecting and developing data related workloads in the cloud.

Pushpraj is a Data Architect with AWS Professional Services. He is passionate about Data and DevOps engineering. He helps customers build data driven applications at scale.

Introducing the new AWS Serverless Snippets Collection

Post Syndicated from dboyne original https://aws.amazon.com/blogs/compute/introducing-the-new-aws-serverless-snippets-collection/

Today, the AWS Serverless Developer Advocate team introduces the Serverless Snippets Collection. This is a new page hosted on Serverless Land that makes it easier to discover, copy, and share common code that can help with serverless application development.

Builders are writing serverless applications in many programming languages and spend a growing amount of time finding and reusing code that is trusted and tested.

With many online resources and code located within private repositories, it can be hard to find reusable or up-to-date code snippets that you can copy and paste into your applications or use with your AWS accounts. Code examples can soon become out of date or replaced by new best practices.

The Serverless Snippets Collection is designed to enable reusable, tested, and recommended snippets driven and maintained by the community. Builders can use serverless snippets to find and integrate tools, code examples, and Amazon CloudWatch Logs Insights queries to help with their development workflow.

This blog post explains what serverless snippets are and what challenges they help to solve. It shows how to use the snippets and how builders can contribute to the new collection.

Overview

The new Serverless Snippets Collection helps builders explore and reuse common code snippets to help accelerate application development workflows. Builders can also write their own snippets and contribute them to the site using standard GitHub pull requests.

Serverless snippets are organized into a number of categories, initially supporting Amazon CloudWatch Logs Insights queries, tools, and service integrations.

Code snippets can easily become outdated as new functionality emerges and best practices are discovered. Serverless snippets offer a platform where application developers can collaborate together to keep code examples up to date and relevant, while supporting many programming languages.

Snippets can contain code from any programming language. You can include multiple languages within a single snippet giving you the option to be creative and flexible.

Serverless snippets use tags to simplify discovery. You can use tags to filter by snippet type, programming language, AWS service, or custom tags to find relevant code snippets for your own use cases.

Each snippet type has a custom interface, giving builders a simplified experience and quick deployment methods.

CloudWatch Logs Insights snippets

CloudWatch Logs Insights enables you to search interactively and analyze your log data in CloudWatch Logs. You can use CloudWatch Logs Insights to help you efficiently and effectively search for operational issues, and debug your applications.

Serverless snippets contain a number of CloudWatch Logs Insights queries. These help you analyze your applications faster and include tags such as memory, latency, duration, or errors. You can launch queries directly into your AWS Management Console with one click.

Using CloudWatch Insights snippets

  1. Select the CloudWatch Logs Insights as the snippet type and choose View on a snippet.Filering by CloudWach Logs Insights Queries
  2. Select Open in CloudWatch Insights to launch the snippet directly into your AWS account or copy the code and follow the manual instructions on the page to run the query.Open CloudWatch Insights query with open click deploy button

Tool snippets

Another snippet type supported are tools. Builders can search for tools by programming language or AWS service. Tool snippets include detailed instructions on how to install the tool and example usage with additional resource links. Tools can also be tagged by programming language allowing you to select the tools for your particular language using the snippet tabs functionality.

To use tools snippets:

  1. Select Tools as the selected snippet type and View any tool snippet.Selecting tools on Serverless Snippets
  2. Each snippet may have many steps. Follow the instructions documented in the tool snippet to install and use within your own application.

Integration snippets

Service integrations are part of many applications built on AWS. Serverless snippets include integration type snippets to share integration code between AWS services.

For example, you can add a snippet for Amazon S3 integration with AWS Lambda and also split your snippet into programming languages.

To use integration snippets:

  1. Select Integration as the selected snippet type and View any tool snippet.
  2. The integrations snippets give you examples of how you can integrate between AWS services. You can select your desired programming language by selecting the language buttons.

Contributing to the Serverless Snippets Collection

You can write your own snippets and contribute them to the serverless snippets collection, which is stored in the AWS snippets-collection repository. Requests are reviewed for quality and relevancy before publishing.

To submit a snippet:

  1. Choose Submit a Serverless Snippet.
  2. Read the Adding new snippet guide and fill out the GitHub issue template.
  3. Clone the repository. Duplicate and rename the example snippet_model directory (or _snippet-model-multi-files if you want to support multiple files in your snippet)
  4. Add required information in the README.md file.
  5. Add the required meta information to `snippet-data.json`
  6. Add the snippet code to the snippet.txt file.
  7. Submit a pull request to the repository with the new snippet files.

To write snippets with multiple code blocks or to support different runtimes, read the guide.

Conclusion

When building serverless applications, builders reuse and share code across many applications and organizations. These code snippets can be difficult to find across your own applications and local development environments.

Today, the AWS Serverless Developer Advocate team is adding the Serverless Snippets Collection on Serverless Land to help builders search, discover, and contribute reusable code snippets across the world.

The Serverless Snippet Collection includes tools, integration code examples and CloudWatch Logs Insights queries. The collection supports many types of snippets, examples, and code that can be shared across the community.

Builders can use the custom filter functionality to search for snippets by programming language, AWS service, or custom snippet tags.

All serverless developers are invited to contribute to the collection. You can submit a pull request to the Serverless Snippets Collection GitHub repository, which is reviewed for quality before publishing.

For more information on building serverless applications visit Serverless Land.

How NerdWallet uses AWS and Apache Hudi to build a serverless, real-time analytics platform

Post Syndicated from Kevin Chun original https://aws.amazon.com/blogs/big-data/how-nerdwallet-uses-aws-and-apache-hudi-to-build-a-serverless-real-time-analytics-platform/

This is a guest post by Kevin Chun, Staff Software Engineer in Core Engineering at NerdWallet.

NerdWallet’s mission is to provide clarity for all of life’s financial decisions. This covers a diverse set of topics: from choosing the right credit card, to managing your spending, to finding the best personal loan, to refinancing your mortgage. As a result, NerdWallet offers powerful capabilities that span across numerous domains, such as credit monitoring and alerting, dashboards for tracking net worth and cash flow, machine learning (ML)-driven recommendations, and many more for millions of users.

To build a cohesive and performant experience for our users, we need to be able to use large volumes of varying user data sourced by multiple independent teams. This requires a strong data culture along with a set of data infrastructure and self-serve tooling that enables creativity and collaboration.

In this post, we outline a use case that demonstrates how NerdWallet is scaling its data ecosystem by building a serverless pipeline that enables streaming data from across the company. We iterated on two different architectures. We explain the challenges we ran into with the initial design and the benefits we achieved by using Apache Hudi and additional AWS services in the second design.

Problem statement

NerdWallet captures a sizable amount of spending data. This data is used to build helpful dashboards and actionable insights for users. The data is stored in an Amazon Aurora cluster. Even though the Aurora cluster works well as an Online Transaction Processing (OLTP) engine, it’s not suitable for large, complex Online Analytical Processing (OLAP) queries. As a result, we can’t expose direct database access to analysts and data engineers. The data owners have to solve requests with new data derivations on read replicas. As the data volume and the diversity of data consumers and requests grow, this process gets more difficult to maintain. In addition, data scientists mostly require data files access from an object store like Amazon Simple Storage Service (Amazon S3).

We decided to explore alternatives where all consumers can independently fulfill their own data requests safely and scalably using open-standard tooling and protocols. Drawing inspiration from the data mesh paradigm, we designed a data lake based on Amazon S3 that decouples data producers from consumers while providing a self-serve, security-compliant, and scalable set of tooling that is easy to provision.

Initial design

The following diagram illustrates the architecture of the initial design.

The design included the following key components:

  1. We chose AWS Data Migration Service (AWS DMS) because it’s a managed service that facilitates the movement of data from various data stores such as relational and NoSQL databases into Amazon S3. AWS DMS allows one-time migration and ongoing replication with change data capture (CDC) to keep the source and target data stores in sync.
  2. We chose Amazon S3 as the foundation for our data lake because of its scalability, durability, and flexibility. You can seamlessly increase storage from gigabytes to petabytes, paying only for what you use. It’s designed to provide 11 9s of durability. It supports structured, semi-structured, and unstructured data, and has native integration with a broad portfolio of AWS services.
  3. AWS Glue is a fully managed data integration service. AWS Glue makes it easier to categorize, clean, transform, and reliably transfer data between different data stores.
  4. Amazon Athena is a serverless interactive query engine that makes it easy to analyze data directly in Amazon S3 using standard SQL. Athena scales automatically—running queries in parallel—so results are fast, even with large datasets, high concurrency, and complex queries.

This architecture works fine with small testing datasets. However, the team quickly ran into complications with the production datasets at scale.

Challenges

The team encountered the following challenges:

  • Long batch processing time and complexed transformation logic – A single run of the Spark batch job took 2–3 hours to complete, and we ended up getting a fairly large AWS bill when testing against billions of records. The core problem was that we had to reconstruct the latest state and rewrite the entire set of records per partition for every job run, even if the incremental changes were a single record of the partition. When we scaled that to thousands of unique transactions per second, we quickly saw the degradation in transformation performance.
  • Increased complexity with a large number of clients – This workload contained millions of clients, and one common query pattern was to filter by single client ID. There were numerous optimizations that we were forced to tack on, such as predicate pushdowns, tuning the Parquet file size, using a bucketed partition scheme, and more. As more data owners adopted this architecture, we would have to customize each of these optimizations for their data models and consumer query patterns.
  • Limited extendibility for real-time use cases – This batch extract, transform, and load (ETL) architecture wasn’t going to scale to handle hourly updates of thousands of records upserts per second. In addition, it would be challenging for the data platform team to keep up with the diverse real-time analytical needs. Incremental queries, time-travel queries, improved latency, and so on would require heavy investment over a long period of time. Improving on this issue would open up possibilities like near-real-time ML inference and event-based alerting.

With all these limitations of the initial design, we decided to go all-in on a real incremental processing framework.

Solution

The following diagram illustrates our updated design. To support real-time use cases, we added Amazon Kinesis Data Streams, AWS Lambda, Amazon Kinesis Data Firehose and Amazon Simple Notification Service (Amazon SNS) into the architecture.

The updated components are as follows:

  1. Amazon Kinesis Data Streams is a serverless streaming data service that makes it easy to capture, process, and store data streams. We set up a Kinesis data stream as a target for AWS DMS. The data stream collects the CDC logs.
  2. We use a Lambda function to transform the CDC records. We apply schema validation and data enrichment at the record level in the Lambda function. The transformed results are published to a second Kinesis data stream for the data lake consumption and an Amazon SNS topic so that changes can be fanned out to various downstream systems.
  3. Downstream systems can subscribe to the Amazon SNS topic and take real-time actions (within seconds) based on the CDC logs. This can support use cases like anomaly detection and event-based alerting.
  4. To solve the problem of long batch processing time, we use Apache Hudi file format to store the data and perform streaming ETL using AWS Glue streaming jobs. Apache Hudi is an open-source transactional data lake framework that greatly simplifies incremental data processing and data pipeline development. Hudi allows you to build streaming data lakes with incremental data pipelines, with support for transactions, record-level updates, and deletes on data stored in data lakes. Hudi integrates well with various AWS analytics services such as AWS Glue, Amazon EMR, and Athena, which makes it a straightforward extension of our previous architecture. While Apache Hudi solves the record-level update and delete challenges, AWS Glue streaming jobs convert the long-running batch transformations into low-latency micro-batch transformations. We use the AWS Glue Connector for Apache Hudi to import the Apache Hudi dependencies in the AWS Glue streaming job and write transformed data to Amazon S3 continuously. Hudi does all the heavy lifting of record-level upserts, while we simply configure the writer and transform the data into Hudi Copy-on-Write table type. With Hudi on AWS Glue streaming jobs, we reduce the data freshness latency for our core datasets from hours to under 15 minutes.
  5. To solve the partition challenges for high cardinality UUIDs, we use the bucketing technique. Bucketing groups data based on specific columns together within a single partition. These columns are known as bucket keys. When you group related data together into a single bucket (a file within a partition), you significantly reduce the amount of data scanned by Athena, thereby improving query performance and reducing cost. Our existing queries are filtered on the user ID already, so we significantly improve the performance of our Athena usage without having to rewrite queries by using bucketed user IDs as the partition scheme. For example, the following code shows total spending per user in specific categories:
    SELECT ID, SUM(AMOUNT) SPENDING
    FROM "{{DATABASE}}"."{{TABLE}}"
    WHERE CATEGORY IN (
    'ENTERTAINMENT',
    'SOME_OTHER_CATEGORY')
    AND ID_BUCKET ='{{ID_BUCKET}}'
    GROUP BY ID;

  1. Our data scientist team can access the dataset and perform ML model training using Amazon SageMaker.
  2. We maintain a copy of the raw CDC logs in Amazon S3 via Amazon Kinesis Data Firehose.

Conclusion

In the end, we landed on a serverless stream processing architecture that can scale to thousands of writes per second within minutes of freshness on our data lakes. We’ve rolled out to our first high-volume team! At our current scale, the Hudi job is processing roughly 1.75 MiB per second per AWS Glue worker, which can automatically scale up and down (thanks to AWS Glue auto scaling). We’ve also observed an outstanding improvement of end-to-end freshness at less than 5 minutes due to Hudi’s incremental upserts vs. our first attempt.

With Hudi on Amazon S3, we’ve built a high-leverage foundation to personalize our users’ experiences. Teams that own data can now share their data across the organization with reliability and performance characteristics built into a cookie-cutter solution. This enables our data consumers to build more sophisticated signals to provide clarity for all of life’s financial decisions.

We hope that this post will inspire your organization to build a real-time analytics platform using serverless technologies to accelerate your business goals.


About the authors

Kevin Chun is a Staff Software Engineer in Core Engineering at NerdWallet. He builds data infrastructure and tooling to help NerdWallet provide clarity for all of life’s financial decisions.

Dylan Qu is a Specialist Solutions Architect focused on big data and analytics with Amazon Web Services. He helps customers architect and build highly scalable, performant, and secure cloud-based solutions on AWS.

Building AWS Lambda governance and guardrails

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/building-aws-lambda-governance-and-guardrails/

When building serverless applications using AWS Lambda, there are a number of considerations regarding security, governance, and compliance. This post highlights how Lambda, as a serverless service, simplifies cloud security and compliance so you can concentrate on your business logic. It covers controls that you can implement for your Lambda workloads to ensure that your applications conform to your organizational requirements.

The Shared Responsibility Model

The AWS Shared Responsibility Model distinguishes between what AWS is responsible for and what customers are responsible for with cloud workloads. AWS is responsible for “Security of the Cloud” where AWS protects the infrastructure that runs all the services offered in the AWS Cloud. Customers are responsible for “Security in the Cloud”, managing and securing their workloads. When building traditional applications, you take on responsibility for many infrastructure services, including operating systems and network configuration.

Traditional application shared responsibility

Traditional application shared responsibility

One major benefit when building serverless applications is shifting more responsibility to AWS so you can concentrate on your business applications. AWS handles managing and patching the underlying servers, operating systems, and networking as part of running the services.

Serverless application shared responsibility

Serverless application shared responsibility

For Lambda, AWS manages the application platform where your code runs, which includes patching and updating the managed language runtimes. This reduces the attack surface while making cloud security simpler. You are responsible for the security of your code and AWS Identity and Access Management (IAM) to the Lambda service and within your function.

Lambda is SOCHIPAAPCI, and ISO-compliant. For more information, see Compliance validation for AWS Lambda and the latest Lambda certification and compliance readiness services in scope.

Lambda isolation

Lambda functions run in separate isolated AWS accounts that are dedicated to the Lambda service. Lambda invokes your code in a secure and isolated runtime environment within the Lambda service account. A runtime environment is a collection of resources running in a dedicated hardware-virtualized Micro Virtual Machines (MVM) on a Lambda worker node.

Lambda workers are bare metalEC2 Nitro instances, which are managed and patched by the Lambda service team. They have a maximum lease lifetime of 14 hours to keep the underlying infrastructure secure and fresh. MVMs are created by Firecracker, an open source virtual machine monitor (VMM) that uses Linux’s Kernel-based Virtual Machine (KVM) to create and manage MVMs securely at scale.

MVMs maintain a strong separation between runtime environments at the virtual machine hardware level, which increases security. Runtime environments are never reused across functions, function versions, or AWS accounts.

Isolation model for AWS Lambda workers

Isolation model for AWS Lambda workers

Network security

Lambda functions always run inside secure Amazon Virtual Private Cloud (Amazon VPCs) owned by the Lambda service. This gives the Lambda function access to AWS services and the public internet. There is no direct network inbound access to Lambda workers, runtime environments, or Lambda functions. All inbound access to a Lambda function only comes via the Lambda Invoke API, which sends the event object to the function handler.

You can configure a Lambda function to connect to private subnets in a VPC in your account if necessary, which you can control with IAM condition keys . The Lambda function still runs inside the Lambda service VPC but sends all network traffic through your VPC. Function outbound traffic comes from your own network address space.

AWS Lambda service VPC with VPC-to-VPC NAT to customer VPC

AWS Lambda service VPC with VPC-to-VPC NAT to customer VPC

To give your VPC-connected function access to the internet, route outbound traffic to a NAT gateway in a public subnet. Connecting a function to a public subnet doesn’t give it internet access or a public IP address, as the function is still running in the Lambda service VPC and then routing network traffic into your VPC.

All internal AWS traffic uses the AWS Global Backbone rather than traversing the internet. You do not need to connect your functions to a VPC to avoid connectivity to AWS services over the internet. VPC connected functions allow you to control and audit outbound network access.

You can use security groups to control outbound traffic for VPC-connected functions and network ACLs to block access to CIDR IP ranges or ports. VPC endpoints allow you to enable private communications with supported AWS services without internet access.

You can use VPC Flow Logs to audit traffic going to and from network interfaces in your VPC.

Runtime environment re-use

Each runtime environment processes a single request at a time. After Lambda finishes processing the request, the runtime environment is ready to process an additional request for the same function version. For more information on how Lambda manages runtime environments, see Understanding AWS Lambda scaling and throughput.

Data can persist in the local temporary filesystem path, in globally scoped variables, and in environment variables across subsequent invocations of the same function version. Ensure that you only handle sensitive information within individual invocations of the function by processing it in the function handler, or using local variables. Do not re-use files in the local temporary filesystem to process unencrypted sensitive data. Do not put sensitive or confidential information into Lambda environment variables, tags, or other freeform fields such as Name fields.

For more Lambda security information, see the Lambda security whitepaper.

Multiple accounts

AWS recommends using multiple accounts to isolate your resources because they provide natural boundaries for security, access, and billing. Use AWS Organizations to manage and govern individual member accounts centrally. You can use AWS Control Tower to automate many of the account build steps and apply managed guardrails to govern your environment. These include preventative guardrails to limit actions and detective guardrails to detect and alert on non-compliance resources for remediation.

Lambda access controls

Lambda permissions define what a Lambda function can do, and who or what can invoke the function. Consider the following areas when applying access controls to your Lambda functions to ensure least privilege:

Execution role

Lambda functions have permission to access other AWS resources using execution roles. This is an AWS principal that the Lambda service assumes which grants permissions using identity policy statements assigned to the role. The Lambda service uses this role to fetch and cache temporary security credentials, which are then available as environment variables during a function’s invocation. It may re-use them across different runtime environments that use the same execution role.

Ensure that each function has its own unique role with the minimum set of permissions..

Identity/user policies

IAM identity policies are attached to IAM users, groups, or roles. These policies allow users or callers to perform operations on Lambda functions. You can restrict who can create functions, or control what functions particular users can manage.

Resource policies

Resource policies define what identities have fine-grained inbound access to managed services. For example, you can restrict which Lambda function versions can add events to a specific Amazon EventBridge event bus. You can use resource-based policies on Lambda resources to control what AWS IAM identities and event sources can invoke a specific version or alias of your function. You also use a resource-based policy to allow an AWS service to invoke your function on your behalf. To see which services support resource-based policies, see “AWS services that work with IAM”.

Attribute-based access control (ABAC)

With attribute-based access control (ABAC), you can use tags to control access to your Lambda functions. With ABAC, you can scale an access control strategy by setting granular permissions with tags without requiring permissions updates for every new user or resource as your organization scales. You can also use tag policies with AWS Organizations to standardize tags across resources.

Permissions boundaries

Permissions boundaries are a way to delegate permission management safely. The boundary places a limit on the maximum permissions that a policy can grant. For example, you can use boundary permissions to limit the scope of the execution role to allow only read access to databases. A builder with permission to manage a function or with write access to the applications code repository cannot escalate the permissions beyond the boundary to allow write access.

Service control policies

When using AWS Organizations, you can use Service control policies (SCPs) to manage permissions in your organization. These provide guardrails for what actions IAM users and roles within the organization root or OUs can do. For more information, see the AWS Organizations documentation, which includes example service control policies.

Code signing

As you are responsible for the code that runs in your Lambda functions, you can ensure that only trusted code runs by using code signing with the AWS Signer service. AWS Signer digitally signs your code packages and Lambda validates the code package before accepting the deployment, which can be part of your automated software deployment process.

Auditing Lambda configuration, permissions and access

You should audit access and permissions regularly to ensure that your workloads are secure. Use the IAM console to view when an IAM role was last used.

IAM last used

IAM last used

IAM access advisor

Use IAM access advisor on the Access Advisor tab in the IAM console to review when was the last time an AWS service was used from a specific IAM user or role. You can use this to remove IAM policies and access from your IAM roles.

IAM access advisor

IAM access advisor

AWS CloudTrail

AWS CloudTrail helps you monitor, log, and retain account activity to provide a complete event history of actions across your AWS infrastructure. You can monitor Lambda API actions to ensure that only appropriate actions are made against your Lambda functions. These include CreateFunction, DeleteFunction, CreateEventSourceMapping, AddPermission, UpdateEventSourceMapping,  UpdateFunctionConfiguration, and UpdateFunctionCode.

AWS CloudTrail

AWS CloudTrail

IAM Access Analyzer

You can validate policies using IAM Access Analyzer, which provides over 100 policy checks with security warnings for overly permissive policies. To learn more about policy checks provided by IAM Access Analyzer, see “IAM Access Analyzer policy validation”.

You can also generate IAM policies based on access activity from CloudTrail logs, which contain the permissions that the role used in your specified date range.

IAM Access Analyzer

IAM Access Analyzer

AWS Config

AWS Config provides you with a record of the configuration history of your AWS resources. AWS Config monitors the resource configuration and includes rules to alert when they fall into a non-compliant state.

For Lambda, you can track and alert on changes to your function configuration, along with the IAM execution role. This allows you to gather Lambda function lifecycle data for potential audit and compliance requirements. For more information, see the Lambda Operators Guide.

AWS Config includes Lambda managed config rules such as lambda-concurrency-check, lambda-dlq-check, lambda-function-public-access-prohibited, lambda-function-settings-check, and lambda-inside-vpc. You can also write your own rules.

There are a number of other AWS services to help with security compliance.

  1. AWS Audit Manager: Collect evidence to help you audit your use of cloud services.
  2. Amazon GuardDuty: Detect unexpected and potentially unauthorized activity in your AWS environment.
  3. Amazon Macie: Evaluates your content to identify business-critical or potentially confidential data.
  4. AWS Trusted Advisor: Identify opportunities to improve stability, save money, or help close security gaps.
  5. AWS Security Hub: Provides security checks and recommendations across your organization.

Conclusion

Lambda makes cloud security simpler by taking on more responsibility using the AWS Shared Responsibility Model. Lambda implements strict workload security at scale to isolate your code and prevent network intrusion to your functions. This post provides guidance on assessing and implementing best practices and tools for Lambda to improve your security, governance, and compliance controls. These include permissions, access controls, multiple accounts, and code security. Learn how to audit your function permissions, configuration, and access to ensure that your applications conform to your organizational requirements.

For more serverless learning resources, visit Serverless Land.

Estimating cost for Amazon SQS message processing using AWS Lambda

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/estimating-cost-for-amazon-sqs-message-processing-using-aws-lambda/

This post was written by Sabha Parameswaran, Senior Solutions Architect.

AWS Lambda enables fully managed asynchronous messaging processing through integration with Amazon SQS. This blog post helps estimate the cost and performance benefits when using Lambda to handle millions of messages per day by using a simulated setup.

Overview

Lambda supports asynchronous handling of messages using SQS integration as an event source and can scale for handling millions of messages per day. Customers often ask about the cost of implementing a Lambda-based messaging solution.

There are multiple variables like Lambda function runtime, individual message size, batch size for consuming from SQS, processing latency per message (depending on the backend services invoked), and function memory size settings. These can determine the overall performance and associated cost of a Lambda-based messaging solution.

This post provides cost estimation using these variables, along with guidance around optimization. The estimates focus on consuming from standard queues and not FIFO queues.

SQS event source

The Lambda event source mapping supports integration for SQS. Lambda users specify the SQS queue to consume messages. Lambda internally polls the queue and invokes the function synchronously with an event containing the queue messages.

The configuration controls in Lambda for consuming messages from an SQS queue are:

  • Batch size: The maximum number of records that can be batched as one event delivered to the consuming Lambda function. The maximum batch size is 10,000 records.
  • Batch window: The maximum time (in seconds) to gather records as a single batch. A larger batch window size means waiting longer for a larger SQS batch of messages before passing to the Lambda function.
  • SQS content filtering: Selecting only the messages that match a defined content criteria. This can reduce cost by removing unwanted or irrelevant messages. Lambda now supports content filtering (for SQS, Kinesis, and DynamoDB) and developers can use the filtering capabilities to avoid processing SQS messages, reducing unnecessary invocations and associated cost.

Lambda sends as many records in a single batch as allowed by the batch size, as long as it’s earlier than the batch window value, and smaller than the maximum payload size of 6 MB. Having large batch sizes means that a single Lambda invocation can handle more messages rather than multiple Lambda invocations to handle smaller batches (which translates to setting higher concurrency limits).

The cost and time to process might vary based on the actual number of messages in the batch. A larger batch size can imply longer processing but requires lower concurrency (number of concurrent Lambda invocations).

Lambda configurations

Lambda function costs are calculated based on memory used and time spent (in GB-second) in execution of a function. Aside from the event source configuration, there are several other Lambda function configurations that impact cost and performance:

  • Processor type: Lambda functions provide options to choose between x86 and Arm/Graviton processors. The newer Arm/Graviton processors can yield a higher performance and lower cost compared to x86 based on the workload. Compare the options and run tests before selecting.
  • Memory allotted: This is directly proportional to the CPU allotted to the function and translates to price for each invocation. Higher memory can lead to faster execution but also higher cost. The optimal memory required for a small batch versus large batch can vary based on the workload, size of incoming messages, transformations, requirements to store intermediate, or final results. Optimal tuning of the memory configurations is key to ensuring right cost versus performance. See the AWS Lambda Power Tuning documentation for more details on identifying the optimal memory versus performance for a fixed batch size and then extrapolate the memory settings for larger batch sizes.
  • Lambda function runtime: Some runtimes have a smaller memory footprint and may be more cost effective than others that are memory intensive. Choosing the runtime affects the memory allocation.
  • Function performance: This can be considered as TPS – total number of requests completed per second. Or conversely measured as time to complete one request. The performance – time to finish a function execution can be dependent on the event containing the batch of messages; bigger batches mean more time to complete an event and complexity and dependencies (performance of the backend that needs to be invoked) of the message processing. The calculations are based on the assumption that the Lambda function and related dependencies have been optimized and tuned to scale linearly with various batch sizes and number of invocations.
  • Concurrency: Number of concurrent Lambda function executions. Concurrency is important for scaling of Lambda functions, allowing users to delegate the capacity planning and scaling to thee Lambda service.

The higher the concurrency, the more workloads it can process in a shorter time, allowing better performance, but this does not change the overall cost. Concurrency is not equivalent to TPS: it is more of a scaling factor in overall TPS. For example, a workload comprised of a set of messages takes 20 seconds to complete. 100 workloads would mean 2000 seconds to complete. With a concurrency of 10, it takes 200 seconds. With a concurrency of 100, the time drops to 20 seconds as each of the 100 workloads are handled concurrently. But each function essentially runs for the same duration and memory, regardless of concurrency. So the cost remains the same, as it is measured in GB-hours (memory multiplied by time). But the performance view differs. So, the cost estimations do not consider the concurrency settings of Lambda functions as the workloads have to be processed either sequential or concurrently.

Assumptions

The cost estimation tool presented helps users estimate monthly Lambda function costs for processing SQS standard queue messages based on the following assumptions:

  • The system has reached steady state and has millions of messages available to be consumed per day in standard queues. The number of messages per day remains constant throughout the entire month.
  • Since it’s a steady state, there are no associated Lambda function cold start delays.
  • All SQS messages that need to be processed successfully have already met the filter criteria. Also, no poison messages that have to be re-tried repeatedly. Messages are not going to be rejected, unacknowledged, or reprocessed.
  • The workload scales linearly in performance versus batch size. All the associated dependencies can scale linearly and a batch of N messages should take the same time as N x a single message with a fixed overhead per function invocation irrespective of the batch size. For example, a function’s overhead is 50 ms irrespective of the batch size. Processing a single message takes 20 ms. So a batch of 20 messages should take 490 ms (50 + 20*20) versus a batch of 5 messages takes 150 ms (50 + 5*20).
  • Function memory increases in steps, based on increasing the batch size. For example, 100 messages uses a 256 MB of baseline memory. Every additional 500 messages require additional 128 MB of memory. A sliding window of memory to batch size:
Batch size Memory
1–100 256 MB
100–600 384 MB
600–1100 512 MB
1100–1600 640 MB

Lambda uses SQS APIs internally to poll and dequeue the messages. The costs for the polling and dequeue operations using SQS APIs are not included as part of the estimations. The internal SQS dequeue portion is outside the control of the Lambda developer and the cost estimates only cover the message processing using Lambda. Also, the tool does not consider any reprocessing or duplicate processing of messages due to exceptions or errors that can vary the cost.

Using the cost estimation tool

The estimator tool is a Python-based command line program that takes in an input properties file that specifies the various input parameters to come up with Lambda function cost versus performance estimations for various batch sizes, messages per day, etc. The tool does take into account the eligible monthly free tier for Lambda function executions.

Pre-requisites: Running the tool requires Python 3.9 and installation of Plotly package (5.7.+) or creating and using Docker images.

To run the tool:

  1. Clone the repo:
    git clone https://github.com/aws-samples/aws-lambda-sqs-cost-estimator
  2. Install the tool:
    cd aws-lambda-sqs-cost-estimator/code
    pip3 install -r requirements.txt
  3. Edit the input.prop file and run the tool to generate cost estimations:
    python3 LambdaPlotly.py

This shows the cost estimates on a local browser instance. Running the code as a Docker image is also supported. Refer to the GitHub repo for additional instructions.

  1. Clone the repo and build the Docker container:
    git clone https://github.com/aws-samples/aws-lambda-sqs-cost-estimator
    cd aws-lambda-sqs-cost-estimator/code
    docker build -t lambda-dash .
  2. Edit the input.prop file and run the tool to generate cost estimations:
    docker run -it -v `pwd`:/app -p 8080:8080 lambda-dash
  3. Navigate to http://0.0.0.0:8080/app in a browser to view the generated cost estimate plot.

There are various input parameters for the cost estimations specified inside the input.prop file. Tune the input parameters as needed:

Parameter Description Sample value (units not included)
base_lambda_memory_mb Baseline memory for the Lambda function (in MB) 128
warm_latency_ms Invocation time for Lambda handler method (going with warm start) irrespective of batch size in the incoming event payload in ms 20
process_per_message_ms Time to process a single message (linearly scales with number of messages per batch in event payload) in ms 10
max_batch_size Maximum batch size per event payload processed by a single Lambda instance 1000 (max is 10000)
batch_memory_overhead_mb Additional memory for processing increments in batch size (in MB) 128
batch_increment Increments of batch size for increased memory 300

The following is sample input.prop file content:

base_lambda_memory_mb=128

# Total process time for N messages in batch = warm_latency_ms + (process_per_message_ms * N)

# Time spent in function initialization/warm-up

warm_latency_ms=20

# Time spent for processing each message in milliseconds

process_per_message_ms=10

# Max batch size

max_batch_size=1000

# Additional lambda memory X mb required for managing/parsing/processing N additional messages processed when using variable batch sizes

#batch_memory_overhead_mb=X

#batch_increment=N

batch_memory_overhead_mb=128

batch_increment=300

The tool generates a page with plot graphs and tables with 3 sections:

Cost example

There is an accompanying interactive legend showing cost and batch size. The top section shows a graph of cost versus message volumes versus batch size:

cost versus message volumes vs Batch size

The second section shows the actual cost variation for different batch sizes for 10 million messages:

actual cost variation for different batch sizes for 10 million messages.

The third section shows the memory and time required to process with different batch sizes:

memory and time required to process with different batch sizes

The various control input parameters used for graph generation are shown at the bottom of the page.

Double-clicking on a specific batch size or line on the right-hand legend displays that specific plot with its pricing details.

specific plot to be displayed with its pricing details

You can modify the input parameters with different settings for memory, batch sizes, memory for increased batches and rerun the program to create different cost estimations. You can also export the generated graphs as PNG image files for reference.

Conclusion

You can use Lambda functions to handle fully managed asynchronous processing of SQS messages. Estimating the cost and optimal setup depends on leveraging the various configurations of SQS and Lambda functions. The cost estimator tool presented in this blog should help you understand these configurations and their impact on the overall cost and performance of the Lambda function-based messaging solutions.

For more serverless learning resources, visit Serverless Land.

Securely retrieving secrets with AWS Lambda

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/securely-retrieving-secrets-with-aws-lambda/

AWS Lambda functions often need to access secrets, such as certificates, API keys, or database passwords. Storing secrets outside the function code in an external secrets manager helps to avoid exposing secrets in application source code. Using a secrets manager also allows you to audit and control access, and can help with secret rotation. Do not store secrets in Lambda environment variables, as these are visible to anyone who has access to view function configuration.

This post highlights some solutions to store secrets securely and retrieve them from within your Lambda functions.

AWS Partner Network (APN) member Hashicorp provides Vault to secure secrets and application data. Vault allows you to control access to your secrets centrally, across applications, systems, and infrastructure. You can store secrets in Vault and access them from a Lambda function to access a database, for example. The Vault Agent for AWS helps you authenticate with Vault, retrieve the database credentials, and then perform the queries. You can also use the Vault AWS Lambda extension to manage connectivity to Vault.

AWS Systems Manager Parameter Store enables you to store configuration data securely, including secrets, as parameter values. For information on Parameter Store pricing, see the documentation.

AWS Secrets Manager allows you to replace hardcoded credentials in your code with an API call to Secrets Manager to retrieve the secret programmatically. You can generate, protect, rotate, manage, and retrieve secrets throughout their lifecycle. By default, Secrets Manager does not write or cache the secret to persistent storage. Secrets Manager supports cross-account access to secrets. For information on Secrets Manager pricing, see the documentation.

Parameter Store integrates directly with Secrets Manager as a pass-through service for references to Secrets Manager secrets. Use this integration if you prefer using Parameter Store as a consistent solution for calling and referencing secrets across your applications. For more information, see “Referencing AWS Secrets Manager secrets from Parameter Store parameters.”

For an example application to show Secrets Manager functionality, deploy the example detailed in “How to securely provide database credentials to Lambda functions by using AWS Secrets Manager”.

When to retrieve secrets

When Lambda first invokes your function, it creates a runtime environment. It runs the function’s initialization (init) code, which is the code outside the main handler. Lambda then runs the function handler code as the invocation. This receives the event payload and processes your business logic. Subsequent invocations can use the same runtime environment.

You can retrieve secrets during each function invocation from within your handler code. This ensures that the secret value is always up to date but can lead to increased function duration and cost, as the function calls the secret manager during each invocation. There may also be additional retrieval costs from Secret Manager.

Retrieving secret during each invocation

Retrieving secret during each invocation

You can reduce costs and improve performance by retrieving the secret during the function init process. During subsequent invocations using the same runtime environment, your handler code can use the same secret.

Retrieving secret during function initialization.

Retrieving secret during function initialization.

The Serverless Land pattern example shows how to retrieve a secret during the init phase using Node.js and top-level await.

If a secret may change between subsequent invocations, ensure that your handler can check for the secret validity and, if necessary, retrieve the secret again.

Retrieve changed secret during subsequent invocation.

Retrieve changed secret during subsequent invocation.

You can also use Lambda extensions to retrieve secrets from Secrets Manager, cache them, and automatically refresh the cache based on a time value. The extension retrieves the secret from Secrets Manager before the init process and makes it available via a local HTTP endpoint. The function then retrieves the secret from the local HTTP endpoint, rather than directly from Secrets Manager, increasing performance. You can also share the extension with multiple functions, which can reduce function code. The extension handles refreshing the cache based on a configurable timeout value. This ensures that the function has the updated value, without handling the refresh in your function code, which increases reliability.

Using Lambda extensions to cache and refresh secret.

Using Lambda extensions to cache and refresh secret.

You can deploy the solution using the steps in Cache secrets using AWS Lambda extensions.

Lambda Powertools

Lambda Powertools provides a suite of utilities for Lambda functions to simplify the adoption of serverless best practices. AWS Lambda Powertools for Python and AWS Lambda Powertools for Java both provide a parameters utility that integrates with Secrets Manager.

from aws_lambda_powertools.utilities import parameters
def handler(event, context):
    # Retrieve a single secret
    value = parameters.get_secret("my-secret")
import software.amazon.lambda.powertools.parameters.SecretsProvider;
import software.amazon.lambda.powertools.parameters.ParamManager;

public class AppWithSecrets implements RequestHandler<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent> {
    // Get an instance of the Secrets Provider
    SecretsProvider secretsProvider = ParamManager.getSecretsProvider();

    // Retrieve a single secret
    String value = secretsProvider.get("/my/secret");

Rotating secrets

You should rotate secrets to prevent the misuse of your secrets. This helps you to replace long-term secrets with short-term ones, which reduces the risk of compromise.

Secrets Manager has built-in functionality to rotate secrets on demand or according to a schedule. Secrets Manager has native integrations with Amazon RDS, Amazon DocumentDB, and Amazon Redshift, using a Lambda function to manage the rotation process for you. It deploys an AWS CloudFormation stack and populates the function with the Amazon Resource Name (ARN) of the secret. You specify the permissions to rotate the credentials, and how often you want to rotate the secret. You can view and edit Secrets Manager rotation settings in the Secrets Manager console.

Secrets Manager rotation settings

Secrets Manager rotation settings

You can also create your own rotation Lambda function for other services.

Auditing secrets access

You should continually review how applications are using your secrets to ensure that the usage is as you expect. You should also log any changes to them so you can investigate any potential issues, and roll back changes if necessary.

When using Hashicorp Vault, use Audit devices to log all requests and responses to Vault. Audit devices can append logs to a file, write to syslog, or write to a socket.

Secrets Manager supports logging API calls using AWS CloudTrail. CloudTrail monitors and records all API calls for Secrets Manager as events. This includes calls from code calling the Secrets Manager APIs and access via the Secrets Manager console. CloudTrail data is considered sensitive, so you should use AWS KMS encryption to protect it.

The CloudTrail event history shows the requests to secretsmanager.amazonaws.com.

Viewing CloudTrail access to Secrets Manager

Viewing CloudTrail access to Secrets Manager

You can use Amazon EventBridge to respond to alerts based on specific operations recorded in CloudTrail. These include secret rotation or deleted secrets. You can also generate an alert if someone tries to use a version of a secret version while it is pending deletion. This may help identify and alert you when an outdated certificate is used.

Securing secrets

You must tightly control access to secrets because of their sensitive nature. Create AWS Identity and Access Management (IAM) policies and resource policies to enable minimal access to secrets. You can use role-based, as well as attribute-based, access control. This can prevent credentials from being accidentally used or compromised. For more information, see “Authentication and access control for AWS Secrets Manager”.

Secrets Manager supports encryption at rest using AWS Key Management Service (AWS KMS) using keys that you manage. Secrets are encrypted in transit using TLS by default, which requires request signing.

You can access secrets from inside an Amazon Virtual Private Cloud (Amazon VPC) without requiring internet access. Use AWS PrivateLink and configure a Secrets Manager specific VPC endpoint.

Do not store plaintext secrets in Lambda environment variables. Ensure that you do not embed secrets directly in function code, commit these secrets to code repositories, or log the secret to CloudWatch.

Conclusion

Using a secrets manager to store secrets such as certificates, API keys or database passwords helps to avoid exposing secrets in application source code. This post highlights some AWS and third-party solutions, such as Hashicorp Vault, to store secrets securely and retrieve them from within your Lambda functions.

Secrets Manager is the preferred AWS solution for storing and managing secrets. I explain when to retrieve secrets, including using Lambda extensions to cache secrets, which can reduce cost and improve performance.

You can use the Lambda Powertools parameters utility, which integrates with Secrets Manager. Rotating secrets reduces the risk of compromise and you can audit secrets using CloudTrail and respond to alerts using EventBridge. I also cover security considerations for controlling access to your secrets.

For more serverless learning resources, visit Serverless Land.

Build your next big idea with Cloudflare Pages

Post Syndicated from Nevi Shah original https://blog.cloudflare.com/big-ideas-on-pages/

Build your next big idea with Cloudflare Pages

Build your next big idea with Cloudflare Pages

Have you ever had a surge of inspiration for a project? That feeling when you have a great idea – a big idea — that you just can’t shake? When all you can think about is putting your hands to your keyboard and hacking away? Building a website takes courage, creativity, passion and drive, and with Cloudflare Pages we believe nothing should stand in the way of that vision.

Especially not a price tag.

Big ideas

We built Pages to be at the center of your developer experience – a way for you to get started right away without worrying about the heavy lift of setting up a fullstack app.  A quick commit to your git provider or direct upload to our platform, and your rich and powerful site is deployed to our network of 270+ data centers in seconds. And above all, we built Pages to scale with you as you grow exponentially without getting hit by an unexpected bill.

The limit does not exist

We’re a platform that’s invested in your vision – no matter how wacky and wild (the best ones usually are!). That’s why for many parts of Pages we want your experience to be limitless.

Unlimited requests: As your idea grows, so does your traffic. While thousands and millions of end users flock to your site, Pages is prepared to handle all of your traffic with no extra cost to you – even when millions turn to billions.

Unlimited bandwidth: As your traffic grows, you’ll need more bandwidth – and with Pages, we got you. If your site takes off in popularity one day, the next day shouldn’t be a cause for panic because of a nasty bill. It should be a day of celebration and pride. We’re giving unlimited bandwidth so you can keep your eyes focused on moving up and to the right.

Unlimited free seats: With a rise in demand for your app, you’re going to need more folks working with you. We know from experience that more great ideas don’t just come from one person but a strong team of people. We want to be there every step of the way along with every person you want on this journey with you. Just because your team grows doesn’t mean your bill has to.

Unlimited projects: With one great idea, means many more to follow. With Pages, you can deploy as many projects as you want – keep them coming! Not every idea is going to be the right one – we know this! Prove out your vision in private org-owned repos for free! Try out a plethora of development frameworks until you’ve found the perfect combination for your idea. Test your changes locally using our Wrangler integration so you can be confident in whatever you choose to put out into the world.

Quick, easy and free integrations

Workers: Take your idea from static to dynamic with Pages’ native integration with Cloudflare Workers, our serverless functions offering. Drop your functions into your functions folder and deploy them alongside your static assets no extra configuration required! We announced built-in support for Cloudflare Workers back in November and have since announced framework integrations with Remix, Sveltekit and Qwik, and are working on fullstack support for Next.js within the year!

Cloudflare Access: Working with more than just a team of developers? Send your staging changes to product managers and marketing teams with a unique preview URL for every deployment. And what’s more? You can enable protection for your preview links using our native integration with Cloudflare Access at no additional cost. With one click, send around your latest version without fear of getting into the wrong hands.

Custom domains: With every Pages project, get a free pages.dev subdomain to deploy your project under. When you’re ready for the big day, with built in SSL for SaaS, bring that idea to life with a custom domain of its own!

Web Analytics: When launch day comes around, check out just how well it’s going with our deep, privacy-first integration with Cloudflare’s Web Analytics. Track every single one of your sites’ progress and performance, including metrics about your traffic and core web vitals with just one click – completely on us!

Wicked fast performance

And the best part? Our generous free tier never means compromising site performance. Bring your site closer to your users on your first deployment no matter where they are in the world. The Cloudflare network spans across 270+ cities around the globe and your site is distributed to each of them faster than you can say “it’s go time”. There’s also no need to choose regions for your deployments, we want you to have them all and get even more granular, so your big idea can truly go global.

What else?

Building on Pages is just the start of what your idea could grow to become. In the coming months you can expect deep integrations with our new Cloudflare storage offerings like R2, our object storage service with zero egress fees (open beta), and D1 our first SQL database on the edge (private beta).

We’ve talked a lot about building your entire platform on Cloudflare. We’re reimagining this experience to be even easier and even more powerful.

Using just Cloudflare, you’ll be able to build big projects – like an entire store! You can use R2 to host the images, D1 to store product info, inventory data and user details, and Pages to seamlessly build and deploy. A frictionless dev experience for a full stack app that can live and work entirely from the edge. Best of all, don’t worry about getting hung up on cost, we’ll always have a generous free tier so you can get started right away.

At Cloudflare, we believe that every developer deserves to dream big. For the developers who love to build, who are curious, who explore, let us take you there – no surprises! Leave the security and scalability to us, so you can put your fingers to the keyboard and do what you do best!

Give it a go

Learn more about Pages and check out our developer documentation. Be sure to join our active Cloudflare Developer Discord and meet our community of developers building on our platform. You can chat directly with our product and engineering teams and get exclusive access to our offered betas!

Building scheduling system with Workers and Durable Objects

Post Syndicated from Adam Janiš original https://blog.cloudflare.com/building-scheduling-system-with-workers-and-durable-objects/

Building scheduling system with Workers and Durable Objects

Building scheduling system with Workers and Durable Objects

We rely on technology to help us on a daily basis – if you are not good at keeping track of time, your calendar can remind you when it’s time to prepare for your next meeting. If you made a reservation at a really nice restaurant, you don’t want to miss it! You appreciate the app to remind you a day before your plans the next evening.

However, who tells the application when it’s the right time to send you a notification? For this, we generally rely on scheduled events. And when you are relying on them, you really want to make sure that they occur. Turns out, this can get difficult. The scheduler and storage backend need to be designed with scale in mind – otherwise you may hit limitations quickly.

Workers, Durable Objects, and Alarms are actually a perfect match for this type of workload. Thanks to the distributed architecture of Durable Objects and their storage, they are a reliable and scalable option. Each Durable Object has access to its own isolated storage and alarm scheduler, both being automatically replicated and failover in case of failures.

There are many use cases where having a reliable scheduler can come in handy: running a webhook service, sending emails to your customers a week after they sign up to keep them engaged, sending invoices reminders, and more!

Today, we’re going to show you how to build a scalable service that will schedule HTTP requests on a specific schedule or as one-off at a specific time as a way to guide you through any use case that requires scheduled events.

Building scheduling system with Workers and Durable Objects

Quick intro into the application stack

Before we dive in, here are some of the tools we’re going to be using today:

The application is going to have following components:

  • Scheduling system API to accept scheduled requests and manage Durable Objects
  • Unique Durable Object per scheduled request, each with
  • Storage – keeping the request metadata, such as URL, body, or headers.
  • Alarm – a timer (trigger) to wake Durable Object up.

While we will focus on building the application, the Cloudflare global network will take care of the rest – storing and replicating our data, and making sure to wake our Durable Objects up when the time’s right. Let’s build it!

Initialize new Workers project

Get started by generating a completely new Workers project using the wrangler init command, which makes creating new projects quick & easy.

wrangler init -y durable-objects-requests-scheduler

Building scheduling system with Workers and Durable Objects

For more information on how to install, authenticate, or update Wrangler, check out https://developers.cloudflare.com/workers/wrangler/get-started

Preparing TypeScript types

From my personal experience, at least a draft of TypeScript types significantly helps to be more productive down the road, so let’s prepare and describe our scheduled request in advance. Create a file types.ts in src directory and paste the following TypeScript definitions.

src/types.ts

export interface Env {
    DO_REQUEST: DurableObjectNamespace
}
 
export interface ScheduledRequest {
  url: string // URL of the request
  triggerAt?: number // optional, unix timestamp in milliseconds, defaults to `new Date()`
  requestInit?: RequestInit // optional, includes method, headers, body
}

A scheduled request Durable Object class & alarm

Based on our architecture design, each scheduled request will be saved into its own Durable Object, effectively separating storage and alarms from each other and allowing our scheduling system to scale horizontally – there is no limit to the number of Durable Objects we create.

In the end, the Durable Object class is a matter of a couple of lines. The code snippet below accepts and saves the request body to a persistent storage and sets the alarm timer. Workers runtime will wake up the Durable Object and call the alarm() method to process the request.

The alarm method reads the scheduled request data from the storage, then processes the request, and in the end reschedules itself in case it’s configured to be executed on a recurring schedule.

src/request-durable-object.ts

import { ScheduledRequest } from "./types";
 
export class RequestDurableObject {
  id: string|DurableObjectId
  storage: DurableObjectStorage
 
  constructor(state:DurableObjectState) {
    this.storage = state.storage
    this.id = state.id
  }
 
    async fetch(request:Request) {
    // read scheduled request from request body
    const scheduledRequest:ScheduledRequest = await request.json()
     
    // save scheduled request data to Durable Object storage, set the alarm, and return Durable Object id
    this.storage.put("request", scheduledRequest)
    this.storage.setAlarm(scheduledRequest.triggerAt || new Date())
    return new Response(JSON.stringify({
      id: this.id.toString()
    }), {
      headers: {
        "content-type": "application/json"
      }
    })
  }
 
  async alarm() {
    // read the scheduled request from Durable Object storage
    const scheduledRequest:ScheduledRequest|undefined = await this.storage.get("request")
 
    // call fetch on scheduled request URL with optional requestInit
    if (scheduledRequest) {
      await fetch(scheduledRequest.url, scheduledRequest.requestInit ? webhook.requestInit : undefined)
 
      // cleanup scheduled request once done
      this.storage.deleteAll()
    }
  }
}

Wrangler configuration

Once we have the Durable Object class, we need to create a Durable Object binding by instructing Wrangler where to find it and what the exported class name is.

wrangler.toml

name = "durable-objects-request-scheduler"
main = "src/index.ts"
compatibility_date = "2022-08-02"
 
# added Durable Objects configuration
[durable_objects]
bindings = [
  { name = "DO_REQUEST", class_name = "RequestDurableObject" },
]
 
[[migrations]]
tag = "v1"
new_classes = ["RequestDurableObject"]

Scheduling system API

The API Worker will accept POST HTTP methods only, and is expecting a JSON body with scheduled request data – what URL to call, optionally what method, headers, or body to send. Any other method than POST will return 405 – Method Not Allowed HTTP error.

HTTP POST /:scheduledRequestId? will create or override a scheduled request, where :scheduledRequestId is optional Durable Object ID returned from a scheduling system API before.

src/index.ts

import { Env } from "./types"
export { RequestDurableObject } from "./request-durable-object"
 
export default {
    async fetch(
        request: Request,
        env: Env
    ): Promise<Response> {
        if (request.method !== "POST") {
            return new Response("Method Not Allowed", {status: 405})
        }
 
        // parse the URL and get Durable Object ID from the URL
        const url = new URL(request.url)
        const idFromUrl = url.pathname.slice(1)
 
        // construct the Durable Object ID, use the ID from pathname or create a new unique id
        const doId = idFromUrl ? env.DO_REQUEST.idFromString(idFromUrl) : env.DO_REQUEST.newUniqueId()
 
        // get the Durable Object stub for our Durable Object instance
        const stub = env.DO_REQUEST.get(doId)
 
        // pass the request to Durable Object instance
        return stub.fetch(request)
    },
}

It’s good to mention that the script above does not implement any listing of scheduled or processed webhooks. Depending on how the scheduling system would be integrated, you can save each created Durable Object ID to your existing backend, or write your own registry – using one of the Workers storage options.

Starting a local development server and testing our application

We are almost done! Before we publish our scheduler system to the Cloudflare edge, let’s start Wrangler in a completely local mode to run a couple of tests against it and to see it in action – which will work even without an Internet connection!

wrangler dev --local
Building scheduling system with Workers and Durable Objects

The development server is listening on localhost:8787, which we will use for scheduling our first request. The JSON request payload should match the TypeScript schema we defined in the beginning – required URL, and optional triggerEverySeconds number or triggerAt unix timestamp. When only the required URL is passed, the request will be dispatched right away.

An example of request payload that will send a GET request to https://example.com every 30 seconds.

{
	"url":  "https://example.com",
	"triggerEverySeconds": 30,
}
> curl -X POST -d '{"url": "https://example.com", "triggerEverySeconds": 30}' http://localhost:8787
{"id":"000000018265a5ecaa5d3c0ab6a6997bf5638fdcb1a8364b269bd2169f022b0f"}
Building scheduling system with Workers and Durable Objects

From the wrangler logs we can see the scheduled request ID 000000018265a5ecaa5d3c0ab6a6997bf5638fdcb1a8364b269bd2169f022b0f is being triggered in 30s intervals.

Need to double the interval? No problem, just send a new POST request and pass the request ID as a pathname.

> curl -X POST -d '{"url": "https://example.com", "triggerEverySeconds": 60}' http://localhost:8787/000000018265a5ecaa5d3c0ab6a6997bf5638fdcb1a8364b269bd2169f022b0f
{"id":"000000018265a5ecaa5d3c0ab6a6997bf5638fdcb1a8364b269bd2169f022b0f"}

Every scheduled request gets a unique Durable Object ID with its own storage and alarm. As we demonstrated, the ID becomes handy when you need to change the settings of the scheduled request, or to deschedule them completely.

Publishing to the network

Following command will bundle our Workers application, export and bind Durable Objects, and deploy it to our workers.dev subdomain.

wrangler publish
Building scheduling system with Workers and Durable Objects

That’s it, we are live! 🎉 The URL of your deployment is shown in the Workers logs. In a reasonably short period of time we managed to write our own scheduling system that is ready to handle requests at scale.

You can check full source code in Workers templates repository, or experiment from your browser without installing any dependencies locally using the StackBlitz template.

What to see or build next

Introducing tiered pricing for AWS Lambda

Post Syndicated from Sam Dengler original https://aws.amazon.com/blogs/compute/introducing-tiered-pricing-for-aws-lambda/

This blog post is written by Heeki Park, Principal Solutions Architect, Serverless.

AWS Lambda charges for on-demand function invocations based on two primary parameters: invocation requests and compute duration, measured in GB-seconds. If you configure additional ephemeral storage for your function, Lambda also charges for ephemeral storage duration, measured in GB-seconds.

AWS continues to find ways to help customers reduce cost for running on Lambda. In February 2020, AWS announced that AWS Lambda would participate in Compute Savings Plans. In December 2020, AWS announced 1 ms billing granularity to help customers save on cost for their Lambda function invocations. With that pricing change, customers whose function duration is less than 100 ms pay less for those function invocations. In September 2021, AWS announced Graviton2 support for running your function on ARM and potential improvements for price performance for compute.

Today, AWS introduces tiered pricing for Lambda. With tiered pricing, customers who run large workloads on Lambda can automatically save on their monthly costs. Tiered pricing is based on compute duration measured in GB-seconds. The tiered pricing breaks down as follows:

Compute duration (GB-seconds) Architecture New tiered discount
0 – 6 billion x86 Same as today
6 – 15 billion x86 10%
Anything over 15 billion x86 20%
0 – 7.5 billion arm64 Same as today
7.5 – 18.75 billion arm64 10%
Anything over 18.75 billion arm64 20%

The Lambda pricing page lists the pricing for all Regions and architectures.

Tiered pricing discount example

Consider a financial services provider who provides on-demand stock portfolio analysis. The customers pay per portfolio analyzed and find the service valuable for providing them insight into the performance of those assets. The application is built using Lambda, runs on x86, and is optimized to use 2048 MB (2 GB) of memory with an average function duration of 60 seconds. This current month resulted in 75 million function invocations.

Without tiered pricing, this workload costs the following:

Monthly request charges: 75M * $0.20/million = $15.00
Monthly compute duration (seconds): 75M * 60 seconds = 4.5B seconds
Monthly compute (GB-seconds): 4.5B seconds * 2 GB = 9B GB-seconds
Monthly compute duration charges: 9B GB-s * $0.0000166667/GB-s = $150,000.30
Total monthly charges = request charges + compute duration charges = $15.00 + $150,000.30 = $150,015.30

With tiered pricing, the portion of compute duration that exceeds 6B GB-seconds receives an automatic discount as follows:

Monthly request charges: 75M * $0.20/million = $15.00
Monthly compute duration (seconds): 75M * 60 seconds = 4.5B seconds
Monthly compute (GB-seconds): 4.5B seconds * 2GB = 9B GB-seconds
Monthly compute duration charge (tier 1): 6B Gb-s * $0.0000166667/GB-s = $100,000.20
Monthly compute duration charge (tier 2): 3B Gb-s * $0.0000150000/GB-s = $45,000.09
Monthly compute duration charges (post-discount): $100,000.20 + $45,000.09 = $145,000.29.
Total monthly charges = request charges + compute duration charges = $15.00 + $145,000.29 = $145,015.29 ($5,000.01 cost savings)

Tiered pricing discount example with increased growth

The service is successful and usage in the following month quadruples, resulting in 300 million function invocations.

Without tiered pricing, this workload costs the following:

Monthly request charges: 300M * $0.20/million = $60.00
Monthly compute duration (seconds): 300M * 60 seconds = 18B seconds
Monthly compute (GB-seconds): 18B seconds * 2GB = 36B GB-seconds
Monthly compute duration charges: 36B GB-s * $0.0000166667/GB-s = $600,001.20
Total monthly charges = request charges + compute duration charges = $60.00 + $600,001.20 = $600,061.20

With tiered pricing, the compute duration portion now also exceeds 15B GB-seconds and receives an automatic discount as follows:

Monthly request charges: 300M * $0.20/million = $60.00
Monthly compute duration (seconds): 300M * 60 seconds = 18B seconds
Monthly compute (GB-seconds): 18B seconds * 2GB = 36B GB-seconds
Monthly compute duration charge (tier 1): 6B GB-s * $0.0000166667/GB-s = $100,000.02
Monthly compute duration charge (tier 2): 9B GB-s * $0.0000150000/GB-s = $135,000.27
Monthly compute duration charge (tier 3): 21B GB-s * $0.0000133333/GB-s = $280,000.56
Monthly compute duration charges (post-discount): $100,000.02 + $135,000.27 + $280,000.56 = $515,001.03.
Total monthly charges = request charges + compute duration charges = $60.00 + $515,001.03 = $515,061.03 ($85,000.17 cost savings)

Tiered pricing discount example with decreased growth

Alternatively, customers used the service less frequently than expected. As a result, usage in the following month is one-third the prior month’s usage, resulting in 25 million function invocations.

Without tiered pricing, this workload costs the following:

Monthly request charges: 25M * $0.20/million = $5.00
Monthly compute duration (seconds): 25M * 60 seconds = 1.5B seconds
Monthly compute (GB-seconds): 1.5B seconds * 2GB = 3B GB-seconds
Monthly compute duration charges: 3B GB-s * $0.0000166667/GB-s = $50,000.10
Total monthly charges = request charges + compute duration charges = $5.00 + $50,000.10 = $50,005.10

When considering tiered pricing, the compute duration portion is under 6B GB-s and is priced without any additional pricing discounts. In this case, the financial services provider did not grow the business as expected or take advantage of tiered pricing. However, they did take advantage of Lambda’s pay-as-you-go model, paying only for the compute that this application used.

Summary and other considerations

Tiered pricing for Lambda applies to the compute duration portion of your on-demand function invocations. It is specific to the architecture (x86 or arm64) and is bucketed by the Region. Refer to the previous table for the specific pricing tiers.

For example, consider a function that is using x86 architecture, deployed in both us-east-1 and us-west-2. Usage in us-east-1 is bucketed and priced separately from usage in us-west-2. If there is a function using arm64 architecture in us-east-1 and us-west-2, that function is also in a separate bucket.

The cost for invocation requests remains the same. The discount applies only to on-demand compute duration and does not apply to provisioned concurrency. Customers who also purchase Compute Savings Plans (CSPs) can take advantage of both, where Lambda applies tiered pricing first, followed by CSPs.

Conclusion

With tiered pricing for Lambda, you can save on the compute duration portion of your monthly Lambda bills. This allows you to architect, build, and run large-scale applications on Lambda and take advantage of these tiered prices automatically.

For more information on tiered pricing for Lambda, see: https://aws.amazon.com/lambda/pricing/.

Using certificate-based authentication for iOS applications with Amazon SNS

Post Syndicated from Sam Dengler original https://aws.amazon.com/blogs/compute/using-certificate-based-authentication-for-ios-applications-with-amazon-sns/

This blog post is written by Yashlin Naidoo, Arnav Thakur, Kim Read, Guilherme Silva.

Amazon SNS enables you to send notifications to a mobile push endpoint using a platform application endpoint by dispatching the notification on your application’s behalf. Push notifications for iOS apps are sent using Apple Push Notification Service (APNs).

To send push notifications using SNS for APNS certificate-based authentication, you must provide a set of credentials for connecting to the Apple Push Notification Service (see prerequisites for push). SNS supports using certificate-based authentication (.p12), in addition to the new token-based authentication (.p8).

Certificate-based authentication uses a provider certificate to establish a secure connection between your provider and APNs. These certificates are tied to a single application and are used to send notifications to this application. This approach can be useful when you haven’t migrated to the new token-based authentication.

For new applications, we recommend using token-based authentication as it provides improved security. It removes the need for yearly renewal of the certificates and can also be shared amongst multiple applications. To learn about how to use token-based authentication, visit Token-Based authentication for iOS applications with Amazon SNS in the AWS Compute Blog.

This blog shows step-by-step instructions on how to build an iOS application. You learn how to create a new certificate from your Apple developer account, and set up a platform application and endpoint in the SNS console. Next, you will learn how to test your application by sending a push notification via SNS to your device. Finally, you view the push notification delivered to your device.

Setting up your iOS application

This section will go over:

  • Creating an iOS application.
  • Creating a .p12 certificate to upload to SNS.

Prerequisites:

Creating an iOS application

  1. Create a new XCode project. Select iOS as the platform.

    New XCode project

    New XCode project

  2. Select your Apple Developer Account team and organization identifier.

    Select your Apple Developer Account team

    Select your Apple Developer Account team

  3. In your project, go to Signing & Capabilities. Under signing, ensure that “Automatically manage signing” is checked and your team is selected.

    Signing & Capabilities

    Signing & Capabilities

  4. To add the push notification capability to your application, select “+” and select Push Notifications.
    Add push notification capability

    Add push notification capability

    This step creates resources on your Apple Developer Account (the App ID and adds Push notification capability to it). You can also verify this in your Apple Developer Account.

  5. Add the following code to AppDelegate.swift:
        import UIKit
        import UserNotifications
    
        @main
        class AppDelegate: UIResponder, UIApplicationDelegate {
    
        func application(_ application: UIApplication, didFinishLaunchingWithOptions launchOptions: [UIApplication.LaunchOptionsKey: Any]?) -> Bool {
        // Override point for customization after application launch
    
        //Call to register for push notifications when launched
        registerForPushNotifications()
    
        return true
        }
    
        // MARK: UISceneSession Lifecycle
    
        func application(_ application: UIApplication, configurationForConnecting connectingSceneSession: UISceneSession, options: UIScene.ConnectionOptions) -> UISceneConfiguration {
        // Called when a new scene session is being created.
        // Use this method to select a configuration to create the new scene with.
        return UISceneConfiguration(name: "Default Configuration", sessionRole: connectingSceneSession.role)
        }
    
        func application(_ application: UIApplication, didDiscardSceneSessions sceneSessions: Set<UISceneSession>) {
        // Called when the user discards a scene session.
        // If any sessions were discarded while the application was not running, this will be called shortly after application:didFinishLaunchingWithOptions.
        // Use this method to release any resources that were specific to the discarded scenes, as they will not return.
        }
    
        func getNotificationSettings() {
        UNUserNotificationCenter.current().getNotificationSettings { settings in
        print("Notification settings: \(settings)")
    
        guard settings.authorizationStatus == .authorized else { return }
        DispatchQueue.main.async {
        UIApplication.shared.registerForRemoteNotifications()
        }
    
        }
        }
    
        func registerForPushNotifications() {
        //1 this handles all notification-related activities in the app including push notifications
        UNUserNotificationCenter.current()
    
        //2 this requests authorization to send the types of notifications specifies in the options
        .requestAuthorization(
        options: [.alert, .sound, .badge]) { [weak self] granted, _ in
        print("Permission granted: \(granted)")
        guard granted else { return }
        self?.getNotificationSettings()
        }
    
        }
    
        func application(
        _ application: UIApplication,
        didRegisterForRemoteNotificationsWithDeviceToken deviceToken: Data
        ) {
        let tokenParts = deviceToken.map { data in String(format: "%02.2hhx", data) }
        let token = tokenParts.joined()
        print("Device Token: \(token)")
        }
    
        func application(
        _ application: UIApplication,
        didFailToRegisterForRemoteNotificationsWithError error: Error
        ) {
        print("Failed to register: \(error)")
        }
    
        }
  6. Build and run the application on an iPhone. Note that the push notification feature does not work with a simulator.
  7. On your phone, select “Allow” when prompted to allow push notifications.

    Allow push notifications

    Allow push notifications

  8. The debugger prints “Permission granted: true” if successful and returns the Device Token.

    Device token

    Device token

You have now configured an iOS application that can receive push notifications. Next, use the application to test sending push notifications with SNS using certificate-based authentication.

Creating a .p12 certificate to upload to SNS

After completing the previous step, you need:

  • An app identifier
  • A certificate signing request (CSR)
  • An SSL certificate

Create an identifier

  1. Log in to your Apple Developer Account.
  2. Choose Certificates, Identifiers & Profiles.
  3. In the Identifiers section, choose the Add button (+).
  4. In the Register a new identifier section, choose App IDs and select Continue.
  5. In the Select a type section, choose App, and select Continue.
  6. For Description, type the application description.
  7. For Bundle ID, use the Bundle ID assigned to your application. You can find this ID under Signing & Capabilities of your application in XCode (see step 3 under “Creating an application”).
  8. Under Capabilities, choose Push Notifications.
  9. Select Continue. In the Confirm your App ID panel, check that all values were entered correctly. The identifier should match your app ID and bundle ID.
  10. Select Register to register the new app ID.

Create a certificate signing request (CSR)

  1. Open Keychain Access located in /Applications/Utilities or search for it on Finder.
  2. Once opened, choose the tab Keychain Access Tab (next to the Apple icon). Navigate to Certificate Assistant and choose Request a Certificate from a Certificate Authority.
  3. Enter the Username, Email Address, Common Name and leave CA Email Address empty.
  4. Choose Saved to disk and choose Continue.

Create a certificate

  1. Log in to your Apple Developer Account.
  2. Choose Certificates, Identifiers & Profiles.
  3. In the Certificate section, select Create new certificate.
  4. Under services, choose your certificate: Apple Push Notification service SSL (Sandbox)/Apple Push Notification service SSL (Sandbox & Production).
  5. Keep Platform as iOS and choose App ID (Identifier) created previously.
  6. Upload the Certificate Signing Request created in the previous step and Download your certificate.

Create .p12 certificate to upload to SNS

  1. Once your certificate.cer file is downloaded (for example, “aps_development.cer”), open it to show in keychain access. Find Apple Development iOS Push Services: (Your Identifier Name/App ID Name) and ensure that the file is placed in the “Login” folder.
  2. Right-click and choose Export as file format .p12 and choose Save. Optionally, set a password.

Creating a new platform application using APNs certificate-based authentication

Prerequisites

To implement APNs certificate-based authentication from SNS, you must have:

  • An Apple Developer Account
  • An iOS mobile application

For creating a new SNS Platform Application that is used to store Push Notification Platform credentials, configurations and related configurations:

  1. Navigate to the SNS Console. Expand the Mobile menu and choose Create platform application.
  2. For the Application name field, enter an application name such as “myfirstiOSapp”. For Push Notification Platform, select Apple iOS/ VoIP/ macOS.

    Create platform application

    Create platform application

  3. Under the Apple Credentials section:
    1. If your application is in development, select the radio button for Used for development in sandbox. If your application is in production, uncheck Used for development in sandbox.
    2. For Push service, choose iOS and for Authentication method, choose Certificate.
    3. Under Certificate, select Choose file to upload the .p12 certificate file.
    4. If you configured a password while creating the certificate, enter this in the Certificate Password field.
    5. Choose Load Credentials from File to extract the Certificate and private key components.
  4. Event Notifications, Delivery Status Logging – Optional: Refer to the guide for enabling Delivery Status logs and the guide to set up Mobile Event related Notifications. More on this step can also be found in the best practices guide.

    Enter Apple credentials

    Enter Apple credentials

  5. Choose Create Platform Application. This creates a certificate-based authentication APNs Platform Application for iOS.

    Create platform application

    Create platform application

Creating a new platform endpoint using APNs token-based authentication

To send Push Notifications using SNS, a platform endpoint resource is created to store the destination address of the corresponding iOS application that is associated with the SNS platform application.

A destination address of a user’s device with the iOS application installed is identified by an unique device token. It is obtained once the app has registered successfully with APNs to receive push notifications. The details of the device token captured in the Platform Endpoint resource along with the configurations in the SNS Platform application are used in conjunction by the service to deliver a push notification message.

In the following steps, you create a new platform endpoint for a destination device that has the iOS application installed and is capable of receiving push notifications.

  1. Open your Platform Application. Choose Create Application Endpoint.

    Application endpoints list

    Application endpoints list

  2. Locate the Device token in the application logs of the iOS app provisioned earlier. Enter it in the Device Token Field.
  3. To store any additional arbitrary data for the endpoint, you can include in the User data field and choose Create application endpoint.

    Create application endpoint

    Create application endpoint

  4. Choose Create application endpoint and the details are shown on the console.

    Application endpoint detail

    Application endpoint detail

Testing a push notification from your device

In this section, you test sending a push notification to your device.

  1. From the SNS console, navigate to your platform endpoint and choose Publish message.
  2. Enter a message to send. This example uses a custom payload that allows you to provide additional APNs headers.

    Publish message

    Publish message

  3. Choose Publish message.
  4. The push notification is delivered to your device.

    Notification

    Notification

Conclusion

Developers send mobile push notifications for APNs certificate-based authentication by using a .p12 certificate to authenticate an Apple device endpoint. Certificate-based authentication ensures a secure connection through TLS (Transport Layer Security). The provider (SNS) initiates the request to APNs and validation from the provider and APNS is required to complete the secure connection.

Certificates expire annually and must be renewed to ensure that SNS can continue to deliver to the endpoint. In this post, you learn how to create an iOS application for APNs certificate-based authentication and integrate it with SNS to send push notifications to your device using a .p12 certificate to authenticate your application with the mobile endpoint.

To learn more about APNs certificate-based authentication with Amazon SNS, visit the Amazon SNS Developer Guide.

For more serverless learning resources, visit Serverless Land.

Using AWS Lambda to run external transactions on Db2 for IBM i

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/using-aws-lambda-to-run-external-transactions-on-db2-for-ibm-i/

This post is written by Basil Lin, Cloud Application Architect, and Jud Neer, Delivery Practice Manager.

Db2 for IBM i (Db2) is a relational database management system that can pose connectivity challenges with cloud environments because of a lack of native support. However, by using Docker on Amazon ECR and AWS Lambda container images, you can transfer data between the two environments with a serverless architecture.

While mainframe modernization solutions are helping customers migrate from on-premises technologies to agile cloud solutions, a complete migration is not always immediately possible. AWS offers broad modernization support from rehosting to refactoring, and platform augmentation is a common scenario for customers getting started on their cloud journey.

Db2 is a common database in on-premises workloads. One common use case of platform augmentation is maintaining Db2 as the existing system-of-record while rehosting applications in AWS. To ensure Db2 data consistency, a change data capture (CDC) process must be able to capture any database changes as SQL transactions. A mechanism then runs these transactions on the existing Db2 database.

While AWS provides CDC tools for multiple services, converting and running these changes for Db2 requires proprietary IBM drivers. Conventionally, you can implement this by hosting a stream-processing application on a server. However, this approach relies on traditional server architecture. This may be less efficient, incur higher overhead, and may not meet availability requirements.

To avoid these issues, you can build this transaction mechanism using a serverless architecture. This blog post’s approach uses ECR and Lambda to externalize and run serverless, on-demand transactions on Db2 for IBM i databases.

Overview

The solution you deploy relies on a Lambda container image to run SQL queries on Db2. While you provide your own Lambda invocation methods and queries, this solution includes the drivers and connection code required to interface with Db2. The following architecture diagram shows this generic solution with no application-specific triggers:

Architecture diagram

This solution builds a Docker image containerized with Db2 interfacing code. The code consists of a Lambda handler to run the specified database transactions, a base class that helps create database Python functions via Open Database Connectivity (ODBC), and finally a forwarder class to establish encrypted connections with the target database.

Deployment scripts create the Docker image, deploy the image to ECR, and create a Lambda function from the image. Lambda then runs your queries on your target Db2 database. This solution does not include the Lambda invocation trigger, the Amazon VPC, and the AWS Direct Connect connection as part of the deployment, but these components may be necessary depending on your use case. The README in the sample repository shows the complete deployment prerequisites.

To interface with Db2, the Lambda function establishes an ODBC session using a proprietary IBM driver. This enables the use of high-level ODBC functions to manipulate the Db2 database management system.

Even with the proprietary driver, ODBC does not properly support TLS encryption with Db2. During testing, enabling the TLS encryption option can cause issues with database connectivity. To work around this limitation, a forwarding package captures all ODBC traffic and forwards packets using TLS encryption to the database. The forwarder opens a local socket listener on port 8471 for unencrypted loopback connections. Once the Lambda function initializes an unencrypted ODBC connection locally, the forwarding package then captures, encrypts, and forwards all ODBC calls to the target Db2 database. This method allows Lambda to form encrypted connections with your target database while still using ODBC to control transactions.

With secure connectivity in place, you can invoke the Lambda function. The function starts the forwarder and retrieves Db2 access credentials from AWS Secrets Manager, as shown in the following diagram. The function then attempts an ODBC loopback connection to send transactions to the forwarder.

Flow process

If the connection is successful, the Lambda function runs the queries, and the forwarder sends the queries to the target Db2. However, if the connection fails, it makes a second connection attempt. The second attempt consists of both restarting the forwarder module and the loopback connection. If the second attempt fails again, the function errors out.

After the transactions complete, a cleanup process runs and the function exits with a success status, unless an exception occurs during the function invocation. If an exception arises during the transaction, the function exits with a failure status. This is an important consideration when building retry mechanisms. You must review Lambda exit statuses to prevent default AWS retry mechanisms from causing unintended invocations.

To simplify deployment, the solution contains scripts you can use. Once you provide AWS credentials, the deployment script deploys a base set of infrastructure into AWS, including the ECR repository for the Docker images and the Secrets Manager secret for the Db2 configuration details.

The deployment script also asks for Db2 configuration details. After you finish entering these, the script sends the information to AWS to configure the previously deployed secret.

Once the secret configuration is complete, the script then builds and pushes a base Docker image to the deployed ECR repository. This base image contains a few basic Python prerequisite libraries necessary for the final code, and also the RPM driver for interfacing with Db2 via ODBC.

Finally, the script builds the solution infrastructure and deploys it into the AWS Cloud. Using the base image in ECR, it creates a Lambda function from a new Docker container image containing the SQL queries and the ODBC transaction code. After deployment, the solution is ready for testing and customization for your use case.

Prerequisites

Before deployment, you must have the following:

  1. The cloned code repository locally.
  2. A local environment configured for deployment.
  3. Amazon VPC and networking configured with Db2 access.

You can find detailed prerequisites and associated instructions in the README file.

Deploying the solution

The deployment creates an ECR repository, a Secrets Manager secret, a Lambda function built from a base container image uploaded to the ECR repo, and associated elastic network interfaces (ENIs) for VPC access.

Because of the complexity of the deployment, a combination of Bash and Python scripts automates the process by automatically deploying infrastructure templates, building and pushing container images, and prompting for input where required. Refer to the README included in the repository for detailed instructions.

To deploy:

  1. Ensure you have met the prerequisites.
  2. Open the README file in the repository and follow the deployment instructions
    1. Configure your local AWS CLI environment.
    2. Configure the project environment variables file.
    3. Run the deployment scripts.
  3. Test connectivity by invoking the deployed Lambda function
  4. Change infrastructure and code for specific queries and use cases

Cleanup

To avoid incurring additional charges, ensure that you delete unused resources. The README contains detailed instructions. You may either manually delete the resources provisioned through the AWS Management Console, or use the automated cleanup script in the repository. The deletion of resources may take up to 45 minutes to complete because of the ENIs created for Lambda in your VPC.

Conclusion

In this blog post, you learn how to run external transactions securely on Db2 for IBM i databases using a combination of Amazon ECR and AWS Lambda. By using Docker to package the driver, forwarder, and custom queries, you can execute transactions from Lambda, allowing modern architectures to interface directly with Db2 workloads. Get started by cloning the GitHub repository and following the deployment instructions.

For more serverless learning resources, visit Serverless Land.

Migrating mainframe JCL jobs to serverless using AWS Step Functions

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/migrating-mainframe-jcl-jobs-to-serverless-using-aws-step-functions/

This post is written by Raghuveer Reddy Talakola, Sr. Modernization Architect, Sanjay Rao, Sr. Mainframe Consultant, and Aneel Murari, Solution Architect.

JCL (Job Control Language) is a scripting language used to program batch jobs on mainframe systems. A JCL can contain one to many job control statements. It can be challenging to understand the condition code parameter checking syntax, which determines the order and conditions under which these statements are run.

If a JCL fails midway through execution, mainframe programmers have no visual aids to help them understand the flow of the JCL. They must examine text-based execution logs to manually correlate condition codes in the logs with condition check rules attached to JCL statements to understand the root cause of failure.

This post explains how AWS Step Functions can make it easier to maintain batch jobs migrated from mainframes to AWS.

Overview

The sample application shows how to use AWS Step Functions to address typical challenges when maintaining a batch workflow built using JCL. The sample business case validates a feed of new employee information against an existing employee database. It identifies discrepancies between the feed and the database and sends out notifications if it finds any.

The mainframe JCL supplied with this blog has seven steps. Each step applies condition code rules to check codes emitted by previous steps to decide if it must run. The Step Functions example achieves the same result. Using its graphical user interface, you can develop each step as an independent task, and link them visually. This makes it easier to understand how to decouple, reorder, or scale tasks if needed.

Visual tools for workflow analysis

A JCL controls its flow by using condition code checking or/and using IF-ELSE statements. A JCL condition code check defines the rules under which its associated JCL step will not run. Developers may code compound rules, double negatives, or triple or more negative conditions into the flow.

Example of condition code check in JCL What it means
//STEPTS2 EXEC PGM=XYZ,COND=(4,GT,STEPTST) Do not execute PGM XYZ if previous step STEPTST ended execution with a code greater than 4
//STEPTS3 EXEC PGM=XYZ,COND=EVEN Execute PGM XYZ even if all the previous steps failed

//STEPTS5 EXEC PGM=XYZ,

COND=((6,EQ),(8,GT))

Do not execute PGM XYZ if any of the preceding steps exited with return code 6 or a code greater than 8

The sample JCL illustrates the complexity of setting up a batch workflow using JCL condition code:

  1. The first step of this JCL deletes files from a previous run. If it ends with code 0, the second JCL step extracts employee data from Db2 using a COBOL program and ends with a return code 0 if it is successful or 4 if no records were found.
  2. The next step coded with condition check (4,LT), runs if all preceding steps ended with codes less than 5. It checks the external extract and emits a condition code of 8 if the external extract is empty.
  3. The next step compares the 2 files if the extract validation step produced a return code of zero.
  4. If this comparison step detects some records that are missing in the employee Db2 database, it creates a file with missing records. If that file is empty, it sets a return code of 8, which ends the program. If the mismatch file has data, it copies the mismatch file over to another system for processing.

With Step Functions, you define the same workflow more easily by using the Amazon States Language (ASL). The Step Functions console provided a graphical representation of that state machine to visualize the application logic using a drag and drop interface.

Step Functions Workflow Studio

  1. The first task fetches the employee file from Amazon S3. It does not need a cleanup task as S3 supports versioning.
  2. If the fetched file is not empty, control passes to the step that runs business logic code inside an AWS Lambda function to validate the employee feed.
  3. The workflow retrieves an environment variable from an external parameter store. This step shows how environment parameters can be externalized in a Step Functions workflow.
  4. It publishes an event to Amazon EventBridge to trigger the external processing needed if discrepancies are found and conditions are met.
  5. The final step is a Succeeded state that marks flow completion.

The following image compares the sample JCL that is converted to a Step Functions workflow:

Sample JCL and Step Functions

Using a graphical interface instead of job control statements

In JCL, you define a batch process with a series of job control statements, which run a program, utility, or a nested procedure in a text editor. There is no visual aid. If a batch process becomes complex, it’s harder to understand the dependencies between the steps.

Step Functions makes it simpler for you to set up tasks, which are the equivalents of steps in JCL. It provides you with a graphical user interface (GUI) that enables you to configure and drag-and-drop steps into a state machine.

Decoupling tasks instead of deleting and commenting of code

To disable or change a step in a JCL, you examine the condition code logic associated with all preceding and succeeding steps of the job. Any mistake in editing these codes can lead to unintended consequences.

With Step Functions, removing or changing a step can be done using the visual editor or by updating the ASL code. This can help improve your agility and make it easier to implement change.

Using Parameter Store instead of editing parameters in code

To make JCL behave differently based on parameters, you must edit dynamic variables known as JCL Symbols inside the JCL or in control cards to affect the behavior change. The following JCL code sample shows a parameter called REGN coded to value DEV. At runtime, this REGN parameter is substituted by DEV in every statement that references this parameter. To reuse this JCL in production, you can change the value assigned to REGN to say PROD.

//   SET REGN=DEV
//    -------
//******************************************************************
//*  RUN  Db2 COBOL Batch Program 
//******************************************************************
//EXTRDB2 EXEC PGM=IKJEFT01,COND=(0,NE)                                
//    -------
//FILEOUT  DD DSN=&REGN..AWS.APG.STEPDB2,                             
//******************************************************************
//*  RUN  VSAM COBOL Batch Program 
//******************************************************************
//    -------
//FILE2    DD DSN=&REGN..AWS.APG.STEPVSM,                             

In Step Functions, configuration parameters can be decoupled from state machine code by managing them in an external data source such as the Amazon DynamoDB, AWS Systems Manager Parameter Store. In the Step Functions workflow, the following step demonstrates retrieving a configuration from Parameter Store and using it to perform branching logic:

Workflow example

Independent scaling of steps versus splitting and cloning JCLs

When a JCL takes a long time to run, mainframe programmers split the job into multiple jobs or steps. Each job is a replica addressing different ranges of data.

With Step Functions, you can run a step or a group of steps concurrently by using a parallel state or map state, without creating multiple jobs that do the same thing. This can help make maintenance easier.

Improved observability and automated retry

If a JCL fails, there are no visual aids to help debug the errors. On the mainframe, you must log into the mainframe and run through several screens of text on SDSF (System Display and Search Facility) to find the cause of the failure.

Step Functions provide visual information on failures, automated retry capabilities, and native integration with AWS services. This can make it easier to understand and recover from failed jobs compared with reading through lengthy logs.

JCL example

Workflow visualization

Benefits for developers

Step Functions provides the following improvements over jobs written in JCL or migrated from JCL.

  • Visual analysis: Step Functions provide a graphical console that shows the status of each task in a visual presentation that developers and support staff can understand and debug more easily than a failed JCL.
  • Decoupling: You can update each component in the workflow independently, unlike in a JCL, where changing a step requires redeployment of the entire batch job to production.
  • Low code: Step Functions are defined with minimal code. The workflow editor can be used to drag and drop different steps and visually edit the workflows.
  • Independent scaling of steps: Step Functions is a serverless solution, and each step can scale independently. This opens up the possibility of scaling up resources for steps that are resource-intensive.
  • Automated retry capabilities: You can configure Step Functions to retry steps and recover from failures. This is much simpler than coding restart conditions in the JCL.
  • Improved logging and visibility: Step Functions can integrate with observability tools like Amazon CloudWatch and AWS X-Ray.

Conclusion

This conversion example shows how Step Functions can help you rewrite complex batch processes written in JCL to serverless workflows. It also shows how such a conversion provides maintenance and monitoring features that make it easier to simplify and scale these batch processes.

To learn more, download the sample JCL and Step Functions workflow from the GitHub repository. To learn more about our AWS Mainframe migration and modernization services, go here.

For more serverless learning resources, visit Serverless Land.

Scaling AWS Lambda permissions with Attribute-Based Access Control (ABAC)

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/scaling-aws-lambda-permissions-with-attribute-based-access-control-abac/

This blog post is written by Chris McPeek, Principal Solutions Architect.

AWS Lambda now supports attribute-based access control (ABAC), allowing you to control access to Lambda functions within AWS Identity and Access Management (IAM) using tags. With ABAC, you can scale an access control strategy by setting granular permissions with tags without requiring permissions updates for every new user or resource as your organization scales.

This blog post shows how to use tags for conditional access to Lambda resources. You can control access to Lambda resources using ABAC by using one or more tags within IAM policy conditions. This can help you scale permissions in rapidly growing environments. To learn more about ABAC, see What is ABAC for AWS, and AWS Services that work with IAM.

Each tag in AWS is a label comprising a user-defined key and value. Customers often use tags with Lambda functions to define keys such as cost center, environment, project, and teams, along with values that map to these keys. This helps with discovery and cost allocation, especially in accounts that may have many Lambda functions. AWS best practices for tagging are included in Tagging AWS resources.

You can now use these same tags, or create new ones, and use them to grant conditional IAM access to Lambda functions more easily. As projects start and finish, employees move to different teams, and applications grow, maintaining access to resources can become cumbersome. ABAC helps developers and security administrators work together to maintain least privilege access to their resources more effectively by using the same tags on IAM roles and Lambda functions. Security administrators can allow or deny access to Lambda API actions when the IAM role tags match the tags on a Lambda function, ensuring least privilege. As developers add additional Lambda functions to the project, they simply apply the same tag when they create a new Lambda function, which grants the same security credentials.

ABAC in Lambda

Using ABAC with Lambda is similar to developing ABAC policies when working with other services. To illustrate how to use ABAC with Lambda, consider a scenario where two new developers join existing projects called Project Falcon and Project Eagle. Project Falcon uses ABAC for authorization using the tag key project-name and value falcon. Project Eagle uses the tag key project-name and value eagle.

Projects Falcon and Eagle tags

Projects Falcon and Eagle tags

The two new developers need access to the Lambda console. The security administrator creates the following policy to allow the developers to list the existing functions that are available using ListFunction. The GetAccountSettings permission allows them to retrieve Lambda-specific information about their account.

{
"Version": "2012-10-17",
"Statement": [
    {
    "Sid": "AllResourcesLambdaNoTags",
    "Effect": "Allow",
    "Action": [
        "lambda:ListFunctions",
        "lambda:GetAccountSettings"
    ],
    "Resource": "*"
    }
]
}

Condition key mappings

The developers then need access to Lambda actions that are part of their projects. The Lambda actions are API calls such as InvokeFunction or PutFunctionConcurrency (see the following table). IAM condition keys are then used to refine the conditions under which an IAM policy statement applies.

Lambda supports the existing global context key:

  • "aws:PrincipalTag/${TagKey}": Control what the IAM principal (the person making the request) is allowed to do based on the tags that are attached to their IAM user or role.

As part of ABAC support, Lambda now supports three additional condition keys:

  • "aws:ResourceTag/${TagKey}": Control access based on the tags that are attached to Lambda functions.
  • "aws:RequestTag/${TagKey}": Require tags to be present in a request, such as when creating a new function.
  • "aws:TagKeys": Control whether specific tag keys can be used in a request.

For more details on these condition context keys, see AWS global condition context keys.

When using condition keys in IAM policies, each Lambda API action supports different tagging condition keys. The following table maps each condition key to its Lambda actions.

Condition keys supported Description Lambda actions
aws:ResourceTag/${TagKey} Set this tag value to allow or deny user actions on resources with specific tags.
lambda:AddPermission
lambda:CreateAlias
lambda:CreateFunctionUrlConfig
lambda:DeleteAlias
lambda:DeleteFunction
lambda:DeleteFunctionCodeSigningConfig
lambda:DeleteFunctionConcurrency
lambda:DeleteFunctionEventInvokeConfig
lambda:DeleteFunctionUrlConfig
lambda:DeleteProvisionedConcurrencyConfig
lambda:DisableReplication
lambda:EnableReplication
lambda:GetAlias
lambda:GetFunction
lambda:GetFunctionCodeSigningConfig
lambda:GetFunctionConcurrency
lambda:GetFunctionConfiguration
lambda:GetFunctionEventInvokeConfig
lambda:GetFunctionUrlConfig
lambda:GetPolicy
lambda:GetProvisionedConcurrencyConfig
lambda:InvokeFunction
lambda:InvokeFunctionUrl
lambda:ListAliases
lambda:ListFunctionEventInvokeConfigs
lambda:ListFunctionUrlConfigs
lambda:ListProvisionedConcurrencyConfigs
lambda:ListTags
lambda:ListVersionsByFunction
lambda:PublishVersion
lambda:PutFunctionCodeSigningConfig
lambda:PutFunctionConcurrency
lambda:PutFunctionEventInvokeConfig
lambda:PutProvisionedConcurrencyConfig
lambda:RemovePermission
lambda:UpdateAlias
lambda:UpdateFunctionCode
lambda:UpdateFunctionConfiguration
lambda:UpdateFunctionEventInvokeConfig
lambda:UpdateFunctionUrlConfig

aws:ResourceTag/${TagKey}
aws:RequestTag/${TagKey}

aws:TagKeys
Set this tag value to allow or deny user requests to create a Lambda function. lambda:CreateFunction
aws:ResourceTag/${TagKey}
aws:RequestTag/${TagKey}

aws:TagKeys
Set this tag value to allow or deny user requests to add or update tags. lambda:TagResource
aws:ResourceTag/${TagKey}
aws:TagKeys
Set this tag value to allow or deny user requests to remove tags. lambda:UntagResource

Security administrators create conditions that only permit the action if the tag matches between the role and the Lambda function.
In this example, the policy grants access to all Lambda function API calls when a project-name tag exists and matches on both the developer’s IAM role and the Lambda function.

{
"Version": "2012-10-17",
"Statement": [
    {
    "Sid": "AllActionsLambdaSameProject",
    "Effect": "Allow",
    "Action": [
        "lambda:InvokeFunction",
        "lambda:UpdateFunctionConfiguration",
        "lambda:CreateAlias",
        "lambda:DeleteAlias",
        "lambda:DeleteFunction",
        "lambda:DeleteFunctionConcurrency", 
        "lambda:GetAlias",
        "lambda:GetFunction",
        "lambda:GetFunctionConfiguration",
        "lambda:GetPolicy",
        "lambda:ListAliases", 
        "lambda:ListVersionsByFunction",
        "lambda:PublishVersion",
        "lambda:PutFunctionConcurrency",
        "lambda:UpdateAlias",
        "lambda:UpdateFunctionCode"
    ],
    "Resource": "arn:aws:lambda:*:*:function:*",
    "Condition": {
        "StringEquals": {
        "aws:ResourceTag/project-name": "${aws:PrincipalTag/project-name}"
        }
    }
    }
]
}

In this policy, Resource is wild-carded as "*" for all Lambda functions. The condition limits access to only resources that have the same project-name key and value, without having to list each individual Amazon Resource Name (ARN).

The security administrator creates an IAM role for each developer’s project, such as falcon-developer-role or eagle-developer-role. Since the policy references both the function tags and the IAM role tags, she can reuse the previous policy and apply it to both of the project roles. Each role should have the tag key project-name with the value set to the project, such as falcon or eagle. The following shows the tags for Project Falcon:

Tags for Project Falcon

Tags for Project Falcon

The developers now have access to the existing Lambda functions in their respective projects. The developer for Project Falcon needs to create additional Lambda functions for only their project. Since the project-name tag also authorizes who can access the function, the developer should not be able to create a function without the correct tags. To enforce this, the security administrator applies a new policy to the developer’s role using the RequestTag condition key to specify that a project-name tag exists:

{
"Version": "2012-10-17",
"Statement": [
    {
    "Sid": "AllowLambdaTagOnCreate",
    "Effect": "Allow",
    "Action": [
        "lambda:CreateFunction",
        “lambda:TagResource”
    ]
    "Resource": "arn:aws:lambda:*:*:function:*",
    "Condition": {
        "StringEquals": {,
            “aws:RequestTag/project-name”: “${aws:PrincipalTag/project-name}”
        },
        "ForAllValues:StringEquals": {
            "aws:TagKeys": [
                 “project-name”
            ]
        }
    }
    }
]
}

To create the functions, the developer must add the key project-name and value falcon to the tags. Without the tag, the developer cannot create the function.

Project Falcon tags

Project Falcon tags

Because Project Falcon is using ABAC, by tagging the Lambda functions during creation, they did not need to engage the security administrator to add additional ARNs to the IAM policy. This provides flexibility to the developers to support their projects. This also helps scale the security administrators’ function by no longer needing to coordinate which resources need to be added to IAM policies to maintain least privilege access.

The project must then add a manager who requires read access to projects as long as they are also in the organization labeled birds and cost-center : it.

Organization and Cost Center tags

Organization and Cost Center tags

The security administrator creates a new IAM policy called manager-policy with the following statements:

{
"Version": "2012-10-17",
"Statement": [
    {
    "Sid": "AllActionsLambdaManager",
    "Effect": "Allow",
    "Action": [
        "lambda:GetAlias",
        "lambda:GetFunction",
        "lambda:GetFunctionConfiguration",
        "lambda:GetPolicy",
        "lambda:GetPolicy",
        "lambda:ListAliases", 
        "lambda:ListVersionsByFunction"
    ],
    "Resource": "arn:aws:lambda:*:*:function:*",
    "Condition": {
        "StringEquals": {
            “aws:ResourceTag/organization”: “${aws:PrincipalTag/organization}”,
            “aws:ResourceTag/cost-center”: “$}aws:PrincipalTag/cost-center}”
        }
    }
    }
]
}

The security administrator attaches the policy to the manager’s role along with the tag organization:birds, and cost-center:it. If any of the projects change organization, the manager no longer has access, even if the cost-center remains IT.

In this policy, the condition ensures both the cost-center and organization tags exist for the function and the values are equal to the tags in the manager’s role. Even if the cost-center tag matches for both the Lambda function and the manager’s role, yet the manager’s organization tag doesn’t match, IAM denies access to the Lambda function. Tags themselves are only a key:value pair with no relationship to other tags. You can use multiple tags, as in this example, to more granularly define Lambda function permissions.

Conclusion

You can now use attribute-based access control (ABAC) with Lambda to control access to functions using tags. This allows you to scale your access controls by simplifying the management of permissions while still maintaining least privilege security best practices. Security administrators can coordinate with developers on a tagging strategy and create IAM policies with ABAC condition keys. This then gives freedom to developers to grow their applications by adding tags to functions, without needing a security administrator to update individual IAM policies.

Attribute-based Access Control (ABAC) for Lambda functions support is also available through many AWS Lambda Partners such as Lumigo, Pulumi and Vertical Relevance.

For additional documentation on ABAC with Lambda see Attribute-based access control for Lambda.