Tag Archives: serverless

Monitoring and troubleshooting serverless data analytics applications

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/monitoring-and-troubleshooting-serverless-data-analytics-applications/

This series is about building serverless solutions in streaming data workloads. The application example used in this series is Alleycat, which allows bike racers to compete with each other virtually on home exercise bikes.

The first four posts have explored the architecture behind the application, which is enabled by Amazon Kinesis, Amazon DynamoDB, and AWS Lambda. This post explains how to monitor and troubleshoot issues that are common in streaming applications.

To set up the example, visit the GitHub repo and follow the instructions in the README.md file. Note that this walkthrough uses services that are not covered by the AWS Free Tier and incur cost.

Monitoring the Alleycat application

The business requirements for Alleycat state that it must handle up to 1,000 simultaneous racers. With each racer emitting a message every second, each 5-minute race results in 300,000 messages.

Reference architecture

While the architecture can support this throughput, the settings for each service determine how the workload scales up. The deployment templates in the GitHub repo do not use sufficiently high settings to handle this amount of data. In the section, I show how this results in errors and what steps you can take to resolve the issues. To start, I run the simulator for several races with the maximum racers configuration set to 1,000.

Monitoring the Kinesis stream

The monitoring tab of the Kinesis stream provides visualizations of stream metrics. This immediately shows that there is a problem in the application when running at full capacity:

Monitoring the Kinesis stream

  1. The iterator age is growing, indicating that the data consumers are falling behind the data producers. The Get records graph also shows the number of records in the stream growing.
  2. The Incoming data (count) metric shows the number of separate records ingested by the stream. The red line indicates the maximum capacity of this single-shard stream. With 1,000 active racers, this is almost at full capacity.
  3. However, the Incoming data – sum (bytes) graph shows that the total amount of data ingested by the stream is currently well under the maximum level shown by the red line.

There are two solutions for improving the capacity on the stream. First, the data producer application (the Alleycat frontend) could combine messages before sending. It’s currently reaching the total number of messages per second but the total byte capacity is significantly below the maximum. This action improves message packing but increases latency since the frontend waits to group messages.

Alternatively, you can add capacity by resharding. This enables you to increase (or decrease) the number of shards in a stream to adapt to the rate of data flowing through the application. You can do this with the UpdateShardCount API action. The existing stream goes into an Updating status and the stream scales by splitting shards. This creates two new child shards that split the partition keyspace of the parent. It also results in another, separate Lambda consumer for the new shard.

Monitoring the Lambda function

The monitoring tab of the consuming Lambda function provides visualization of metrics that can highlight problems in the workload. At full capacity, the monitoring highlights issues to resolve:

Monitoring the Lambda function

  1. The Duration chart shows that the function is exceeding its 15-second timeout, when the function normally finishes in under a second. This typically indicates that there are too many records to process in a single batch or throttling is occurring downstream.
  2. The Error count metric is growing, which highlights either logical errors in the code or errors from API calls to downstream resources.
  3. The IteratorAge metric appears for Lambda functions that are consuming from streams. In this case, the growing metric confirms that data consumption is falling behind data production in the stream.
  4. Concurrent executions remain at 1 throughout. This is set by the parallelization factor in the event source mapping and can be increased up to 10.

Monitoring the DynamoDB table

The metric tab on the application’s table in the DynamoDB console provides visualizations for the performance of the service:

Monitoring the DynamoDB table

  1. The consumed Read usage is well within the provisioned maximum and there is no read throttling on the table.
  2. Consumed Write usage, shown in blue, is frequently bursting through the provisioned capacity.
  3. The number of Write throttled requests confirms that the DynamoDB service is throttling requests since the table is over capacity.

You can resolve this issue by increasing the provisioned throughput on the table and related global secondary indexes. Write capacity units (WCUs) provide 1 KB of write throughput per second. You can set this value manually, use automatic scaling to match varying throughout, or enable on-demand mode. Read more about the pricing models for each to determine the best approach for your workload.

Monitoring Kinesis Data Streams

Kinesis Data Streams ingests data into shards, which are fixed capacity sequences of records, up to 1,000 records or 1 MB per second. There is no limit to the amount of data held within a stream but there is a configurable retention period. By default, Kinesis stores records for 24 hours but you can increase this up to 365 days as needed.

Kinesis is integrated with Amazon CloudWatch. Basic metrics are published every minute, and you can optionally enable enhanced metrics for an additional charge. In this section, I review the most commonly used metrics for monitoring the health of streams in your application.

Metrics for monitoring data producers

When data producers are throttled, they cannot put new records onto a Kinesis stream. Use the WriteProvisionedThroughputExceeded metric to detect if producers are throttled. If this is more than zero, you won’t be able to put records to the stream. Monitoring the Average for this statistic can help you determine if your producers are healthy.

When producers succeed in sending data to a stream, the PutRecord.Success and PutRecords.Success are incremented. Monitoring for spikes or drops in these metrics can help you monitor the health of producers and catch problems early. There are two separate metrics for each of the API calls, so watch the Average statistic for whichever of the two calls your application uses.

Metrics for monitoring data consumers

When data consumers are throttled or start to generate errors, Kinesis continues to accept new records from producers. However, there is growing latency between when records are written and when they are consumed for processing.

Using the GetRecords.IteratorAgeMilliseconds metric, you can measure the difference between the age of the last record consumed and the latest record put to the stream. It is important to monitor the iterator age. If the age is high in relation to the stream’s retention period, you can lose data as records expire from the stream. This value should generally not exceed 50% of the stream’s retention period – when the value reaches 100% of the stream retention period, data is lost.

If the iterator age is growing, one temporary solution is to increase the retention time of the stream. This gives you more time to resolve the issue before losing data. A more permanent solution is to add more consumers to keep up with data production, or resolve any errors that are slowing consumers.

When consumers exceed the ReadProvisionedThroughputExceeded metric, they are throttled and you cannot read from the stream. This results in a growth of records in the stream waiting for processing. Monitor the Average statistic for this metric and aim for values as close to 0 as possible.

The GetRecords.Success metric is the consumer-side equivalent of PutRecords.Success. Monitor this value for spikes or drops to ensure that your consumers are healthy. The Average is usually the most useful statistic for this purpose.

Increasing data processing throughput for Kinesis Data Streams

Adjusting the parallelization factor

Kinesis invokes Lambda consumers every second with a configurable batch size of messages. It’s important that the processing in the function keeps pace with the rate of traffic to avoid a growing iterator age. For compute intensive functions, you can increase the memory allocated in the function, which also increases the amount of virtual CPU available. This can help reduce the duration of a processing function.

If this is not possible or the function is falling behind data production in the stream, consider increasing the parallelization factor. By default, this is set to 1, meaning that each shard has a single instance of a Lambda function it invokes. You can increase this up to 10, which results in multiple instances of the consumer function processing additional batches of messages.

Adjusting the parallelization factor

Using enhanced fan-out to reduce iterator age

Standard consumers use a pull model over HTTP to fetch batches of records. Each consumer operates in serial. A stream with five consumers averages 200 ms of latency each, meaning it takes up to 1 second for all five to receive batches of records.

You can improve the overall latency by removing any unnecessary data consumers. If you use Kinesis Data Firehose and Kinesis Data Analytics on a stream, these count as consumers too. If you can remove subscribers, this helps with over data consumption throughput.

If the workload needs all of the existing subscribers, use enhanced fan-out (EFO). EFO consumers use a push model over HTTP/2 and are independent of each other. With EFO, the same five consumers in the previous example would receive batches of messages in parallel, using dedicated throughput. Overall latency averages 70 ms and typically data delivery speed is improved by up to 65%. There is an additional charge for this feature.

Enhanced fan-out

To learn more about processing streaming data with Lambda, see this AWS Online Tech Talk presentation.

Conclusion

In this post, I show how the existing settings in the Alleycat application are not sufficient for handling the expected amount of traffic. I walk through the metrics visualizations for Kinesis Data Streams, Lambda, and DynamoDB to find which quotas should be increased.

I explain which CloudWatch metrics can be used with Kinesis Data Stream to ensure that data producers and data consumers are healthy. Finally, I show how you can use the parallelization factor and enhanced fan-out features to increase the throughput of data consumers.

For more serverless learning resources, visit Serverless Land.

Using GitHub Actions to deploy serverless applications

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/using-github-actions-to-deploy-serverless-applications/

This post is written by Gopi Krishnamurthy, Senior Solutions Architect.

Continuous integration and continuous deployment (CI/CD) is one of the major DevOps components. This allows you to build, test, and deploy your applications rapidly and reliably, while improving quality and reducing time to market.

GitHub is an AWS Partner Network (APN) with the AWS DevOps Competency. GitHub Actions is a GitHub feature that allows you to automate tasks within your software development lifecycle. You can use GitHub Actions to run a CI/CD pipeline to build, test, and deploy software directly from GitHub.

The AWS Serverless Application Model (AWS SAM) is an open-source framework for building serverless applications. It provides shorthand syntax to express functions, APIs, databases, and event source mappings. With a few lines per resource, you can define the application you want and model it using YAML.

During deployment, AWS SAM transforms and expands the AWS SAM syntax into AWS CloudFormation syntax, enabling you to build serverless applications faster. The AWS SAM CLI allows you to build, test, and debug applications locally, defined by AWS SAM templates. You can also use the AWS SAM CLI to deploy your applications to AWS. For AWS SAM example code, see the serverless patterns collection.

In this post, you learn how to create a sample serverless application using AWS SAM. You then use GitHub Actions to build, and deploy the application in your AWS account.

New GitHub action setup-sam

A GitHub Actions runner is the application that runs a job from a GitHub Actions workflow. You can use a GitHub hosted runner, which is a virtual machine hosted by GitHub with the runner application installed. You can also host your own runners to customize the environment used to run jobs in your GitHub Actions workflows.

AWS has released a GitHub action called setup-sam to install AWS SAM, which is pre-installed on GitHub hosted runners. You can use this action to install a specific, or the latest AWS SAM version.

This demo uses AWS SAM to create a small serverless application using one of the built-in templates. When the code is pushed to GitHub, a GitHub Actions workflow triggers a GitHub CI/CD pipeline. This builds, and deploys your code directly from GitHub to your AWS account.

Prerequisites

  1. A GitHub account: This post assumes you have the required permissions to configure GitHub repositories, create workflows, and configure GitHub secrets.
  2. Create a new GitHub repository and clone it to your local environment. For this example, create a repository called github-actions-with-aws-sam.
  3. An AWS account with permissions to create the necessary resources.
  4. Install AWS Command Line Interface (CLI) and AWS SAM CLI locally. This is separate from using the AWS SAM CLI in a GitHub Actions runner. If you use AWS Cloud9 as your integrated development environment (IDE), AWS CLI and AWS SAM are pre-installed.
  5. Create an Amazon S3 bucket in your AWS account to store the build package for deployment.
  6. An AWS user with access keys, which the GitHub Actions runner uses to deploy the application. The user also write requires access to the S3 bucket.

Creating the AWS SAM application

You can create a serverless application by defining all required resources in an AWS SAM template. AWS SAM provides a number of quick-start templates to create an application.

  1. From the CLI, open a terminal, navigate to the parent of the cloned repository directory, and enter the following:
  2. sam init -r python3.8 -n github-actions-with-aws-sam --app-template "hello-world"
  3. When asked to select package type (zip or image), select zip.

This creates an AWS SAM application in the root of the repository named github-actions-with-aws-sam, using the default configuration. This consists of a single AWS Lambda Python 3.8 function invoked by an Amazon API Gateway endpoint.

To see additional runtimes supported by AWS SAM and options for sam init, enter sam init -h.

Local testing

AWS SAM allows you to test your applications locally. AWS SAM provides a default event in events/event.json that includes a message body of {\"message\": \"hello world\"}.

    1. Invoke the HelloWorldFunction Lambda function locally, passing the default event:
    2. sam local invoke HelloWorldFunction -e events/event.json
    3. The function response is:
    4. {"message": "hello world"}

    5. Test the API Gateway functionality in front of the Lambda function by first starting the API locally:
    6. sam local start-api
    7. AWS SAM launches a Docker container with a mock API Gateway endpoint listening on localhost:3000.
    8. Use curl to call the hello API:
    curl http://127.0.0.1:3000/hello

    The API response should be:

    {"message": "hello world"}

    Creating the sam-pipeline.yml file

    GitHub CI/CD pipelines are configured using a YAML file. This file configures what specific action triggers a workflow, such as push on main, and what workflow steps are required.

    In the root of the repository containing the files generated by sam init, create the directory: .github/workflows.

    1. Create a new file called sam-pipeline.yml under the .github/workflows directory.
    2. sam-pipeline.yml file

      sam-pipeline.yml file

    3. Edit the sam-pipeline.yml file and add the following:
    4. on:
        push:
          branches:
            - main
      jobs:
        build-deploy:
          runs-on: ubuntu-latest
          steps:
            - uses: actions/checkout@v2
            - uses: actions/setup-python@v2
            - uses: aws-actions/setup-sam@v1
            - uses: aws-actions/configure-aws-credentials@v1
              with:
                aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
                aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
                aws-region: ##region##
            # sam build 
            - run: sam build --use-container
      
      # Run Unit tests- Specify unit tests here 
      
      # sam deploy
            - run: sam deploy --no-confirm-changeset --no-fail-on-empty-changeset --stack-name sam-hello-world --s3-bucket ##s3-bucket## --capabilities CAPABILITY_IAM --region ##region## 
      
    5. Replace ##s3-bucket## with the name of the S3 bucket previously created to store the deployment package.
    6. Replace both ##region## with your AWS Region.

    The configuration triggers the GitHub Actions CI/CD pipeline when code is pushed to the main branch. You can amend this if you are using another branch. For a full list of supported events, refer to GitHub documentation page.

    You can further customize the sam build –use-container command if necessary. By default the Docker image used to create the build artifact is pulled from Amazon ECR Public. The default Python 3.8 image in this example is based on the language specified during sam init. To pull a different container image, use the --build-image option as specified in the documentation.

    The AWS CLI and AWS SAM CLI are installed in the runner using the GitHub action setup-sam. To install a specific version, use the version parameter.

    uses: aws-actions/setup-sam@v1
    with:
      version: 1.23.0

    As part of the CI/CD process, we recommend you scan your code for quality and vulnerabilities in bundled libraries. You can find these security offerings from our AWS Lambda Technology Partners.

    Configuring AWS credentials in GitHub

    The GitHub Actions CI/CD pipeline requires AWS credentials to access your AWS account. The credentials must include AWS Identity and Access Management (IAM) policies that provide access to Lambda, API Gateway, AWS CloudFormation, S3, and IAM resources.

    These credentials are stored as GitHub secrets within your GitHub repository, under Settings > Secrets. For more information, see “GitHub Actions secrets”.

    In your GitHub repository, create two secrets named AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY and enter the key values. We recommend following IAM best practices for the AWS credentials used in GitHub Actions workflows, including:

    • Do not store credentials in your repository code. Use GitHub Actions secrets to store credentials and redact credentials from GitHub Actions workflow logs.
    • Create an individual IAM user with an access key for use in GitHub Actions workflows, preferably one per repository. Do not use the AWS account root user access key.
    • Grant least privilege to the credentials used in GitHub Actions workflows. Grant only the permissions required to perform the actions in your GitHub Actions workflows.
    • Rotate the credentials used in GitHub Actions workflows regularly.
    • Monitor the activity of the credentials used in GitHub Actions workflows.

    Deploying your application

    Add all the files to your local git repository, commit the changes, and push to GitHub.

    git add .
    git commit -am "Add AWS SAM files"
    git push

    Once the files are pushed to GitHub on the main branch, this automatically triggers the GitHub Actions CI/CD pipeline as configured in the sam-pipeline.yml file.

    The GitHub actions runner performs the pipeline steps specified in the file. It checks out the code from your repo, sets up Python, and configures the AWS credentials based on the GitHub secrets. The runner uses the GitHub action setup-sam to install AWS SAM CLI.

    The pipeline triggers the sam build process to build the application artifacts, using the default container image for Python 3.8.

    sam deploy runs to configure the resources in your AWS account using the securely stored credentials.

    To view the application deployment progress, select Actions in the repository menu. Select the workflow run and select the job name build-deploy.

    GitHub Actions progress

    GitHub Actions progress

    If the build fails, you can view the error message. Common errors are:

    • Incompatible software versions such as the Python runtime being different from the Python version on the build machine. Resolve this by installing the proper software versions.
    • Credentials could not be loaded. Verify that AWS credentials are stored in GitHub secrets.
    • Ensure that your AWS account has the necessary permissions to deploy the resources in the AWS SAM template, in addition to the S3 deployment bucket.

    Testing the application

    1. Within the workflow run, expand the Run sam deploy section.
    2. Navigate to the AWS SAM Outputs section. The HelloWorldAPI value shows the API Gateway endpoint URL deployed in your AWS account.
    AWS SAM outputs

    AWS SAM outputs

  1. Use curl to test the API:
curl https://<api-id>.execute-api.us-east-1.amazonaws.com/Prod/hello/

The API response should be:
{"message": "hello world"}

Cleanup

To remove the application resources, navigate to the CloudFormation console and delete the stack. Alternatively, you can use an AWS CLI command to remove the stack:

aws cloudformation delete-stack --stack-name sam-hello-world

Empty, and delete the S3 deployment bucket.

Conclusion

GitHub Actions is a GitHub feature that allows you to run a CI/CD pipeline to build, test, and deploy software directly from GitHub. AWS SAM is an open-source framework for building serverless applications.

In this post, you use GitHub Actions CI/CD pipeline functionality and AWS SAM to create, build, test, and deploy a serverless application. You use sam init to create a serverless application and tested the functionality locally. You create a sam-pipeline.yml file to define the pipeline steps for GitHub Actions.

The GitHub action setup-sam installed AWS SAM on the GitHub hosted runner. The GitHub Actions workflow uses sam build to create the application artifacts and sam deploy to deploy them to your AWS account.

For more serverless learning resources, visit https://serverlessland.com.

Deploying machine learning models with serverless templates

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/deploying-machine-learning-models-with-serverless-templates/

This post written by Sean Wilkinson, Machine Learning Specialist Solutions Architect, and Newton Jain, Senior Product Manager for Lambda

After designing and training machine learning models, data scientists deploy the models so applications can use them. AWS Lambda is a compute service that lets you run code without provisioning or managing servers. Lambda’s pay-per-request billing, automatic scaling, and ease of use make it a popular deployment choice for data science teams.

With minimal code, data scientists can turn a model into a cost effective and scalable API endpoint backed by Lambda. Lambda supports container images, Advanced Vector Extensions 2 (AVX2), and functions with up to 10 GB of memory. Using these capabilities, data science teams can deploy larger, more powerful models with improved performance.

To deploy Lambda-based applications, serverless developers can use the AWS Serverless Application Model framework (AWS SAM). AWS SAM creates and manages serverless applications based on templates. It supports local testing, aids best practices, and integrates with popular developer tools. It allows data scientists to define serverless applications, security permissions, and advanced configuration capabilities using YAML.

AWS SAM contains pre-built templates that allow developers to get started quickly. This blog shows how to use machine learning templates to deploy a Scikit-Learn based model that classifies images of handwritten digits from zero to nine. Once deployed to Lambda, you can access the model via a REST API.

This walkthrough creates resources that incur costs in an AWS account. To minimize cost, follow the Cleaning up section to remove resources after completing the walkthrough.

Overview

The AWS SAM machine learning templates are available for the Scikit-Learn, PyTorch, TensorFlow, and XGBoost frameworks. Each template deploys a Lambda function to host the model behind an Amazon API Gateway, which serves as the front end and handles authentication. The following diagram shows the architecture of the solution:

Serverless architecture for ML inference

Serverless architecture for ML inference

Creating the containerized Lambda function

This section uses AWS SAM to build, test, and deploy a Docker image containing a pre-trained digit classifier model on Lambda:

  1. Update or install AWS SAM. AWS SAM CLI v1.24.1 or later is required to use the machine learning templates.
  2. In a terminal, create a new serverless application in AWS SAM using the command:
    sam init
  3. Follow the on-screen prompts, select AWS Quick Start Templates as the template source.

    SAM: choose a template source

    SAM: choose a template source

  4. Choose Image as the package type.

    SAM: Choose a package type

    SAM: Choose a package type

  5. Select amazon/python3.8-base as the base image.

    SAM: Choose an runtime image

    SAM: Choose an runtime image

  6. When prompted, enter an application name. AWS SAM uses this to group and label resources it creates.

    SAM: Choose an runtime image

    SAM: Choose an runtime image

  7. Select the desired ML framework from the template list. The walkthrough uses the Scikit-Learn template.

    SAM: choose the application template

    SAM: choose the application template

  8. AWS SAM creates a directory with the name of your application. Change to the new directory and run the AWS SAM build command:
    sam build

    SAM: build results

    SAM: build results

Files generated by AWS SAM

After selecting the template, AWS SAM generates the following files in the application directory:

  • Dockerfile: The application uses the Lambda-provided Python 3.8 base image. It installs the relevant dependencies and defines the CMD variable for the Lambda execution environment to initialize the handler.
    FROM public.ecr.aws/lambda/python:3.8
    
    COPY app.py requirements.txt ./
    
    COPY digit_classifier.joblib /opt/ml/model/1
    
    RUN python3.8 -m pip install -r requirements.txt -t .
    
    CMD ["app.lambda_handler"]
  • app.py: This Python code runs after the Lambda handler is invoked and generates predictions from the Scikit-Learn model. The model is reused across multiple Lambda invocations by loading it outside the lambda_handler.
    import joblib
    import base64
    import numpy as np
    import json
    
    from io import BytesIO
    from PIL import Image
    from scipy.ndimage import interpolation
    
    model_file = '/opt/ml/model'
    model = joblib.load(model_file)
    
    
    # Functions to pre-process images (we used same preprocessing when training)
    
    def moments(image):
        c0, c1 = np.mgrid[:image.shape[0], :image.shape[1]]
        img_sum = np.sum(image)
        
        m0 = np.sum(c0 * image) / img_sum
        m1 = np.sum(c1 * image) / img_sum
        m00 = np.sum((c0-m0)**2 * image) / img_sum
        m11 = np.sum((c1-m1)**2 * image) / img_sum
        m01 = np.sum((c0-m0) * (c1-m1) * image) / img_sum
        
        mu_vector = np.array([m0,m1])
        covariance_matrix = np.array([[m00, m01],[m01, m11]])
        
        return mu_vector, covariance_matrix
    
    
    def deskew(image):
        c, v = moments(image)
        alpha = v[0,1] / v[0,0]
        affine = np.array([[1,0], [alpha,1]])
        ocenter = np.array(image.shape) / 2.0
        offset = c - np.dot(affine, ocenter)
    
        return interpolation.affine_transform(image, affine, offset=offset)
    
    
    def get_np_image(image_bytes):
        image = Image.open(BytesIO(base64.b64decode(image_bytes))).convert(mode='L')
        image = image.resize((28, 28))
    
        return np.array(image)
    
    
    # Lambda handler code
    
    def lambda_handler(event, context):
        image_bytes = event['body'].encode('utf-8')
        x = deskew(get_np_image(image_bytes))
    
        prediction = int(model.predict(x.reshape(1, -1))[0])
    
        return {
            'statusCode': 200,
            'body': json.dumps(
                {
                    "predicted_label": prediction,
                }
            )
        }

After completing these steps, this is the directory structure:

File structure

File structure

Testing the AWS SAM templates

For container image-based Lambda functions, sam build creates and updates a container image in the local Docker repo. It copies the template to the output directory and updates the location for the newly built image.

You can see the following top-level tree under the .aws-sam directory:

SAM build artifacts directory structure

SAM build artifacts directory structure

After building the Docker image, use AWS SAM’s local test functionality to test the endpoint. There are two ways to test the application locally:

  1. Local invoke –event uses the mock data in event.json to invoke the function and generate a prediction. An image of a handwritten digit is encoded as a base64 string in the body attribute in the event.json file. Test using mock event.json:
    sam local invoke InferenceFunction --event events/event.json

    SAM local invoke results

    SAM local invoke results

  2. The start-api command starts up a local endpoint that emulates a REST API endpoint. It downloads an execution container that runs API Gateway and the Lambda function locally. Invoke using the API Gateway emulator:
    sam local start-apiSAM local start-api monitor

SAM local start-api monitorTo test the local endpoint use a REST client, like Postman, to send a POST request to the /classify_digit endpoint.

Testing with Postman

Testing with Postman

While testing locally, use images smaller than 100 KB. If the file is larger, the request fails with status code: 502 and the error “argument list too long”. After deploying to Lambda, you can use larger images.

Deploying the application to Lambda

After testing the model locally, use the AWS SAM guided deployment process to package and deploy the application:

  1. To deploy a Lambda function based on a container image, the container image must be pushed to Amazon Elastic Container Registry (ECR). Run the following command to retrieve an authentication token and authenticate the Docker client with the ECR registry. Replace the region and accountID placeholders with your Region and AWS account ID:
    aws --region <region> ecr get-login-password | docker login --username AWS --password-stdin <accountID>.dkr.ecr.<region>.amazonaws.com

    Login Succeeded

    Login Succeeded

  2. Use the AWS CLI to create an ECR repository called classifier-demo:
    aws ecr create-repository \
    --repository-name classifier-demo \
    --image-tag-mutability MUTABLE \
    --image-scanning-configuration scanOnPush=true
    

    Create ECR repo results

    Create ECR repo results

  3. Copy the repositoryUri from the output. This is needed in the next step. Initiate the AWS SAM guided deployment using the deploy command:
    sam deploy --guided
  4. Follow the on-screen prompts. To accept the default options provided in the interactive experience, press Enter. When prompted for an ECR repository, use the Amazon ECR repository created in the previous step.
    CloudFormation change set verification screen

    CloudFormation change set verification screen

    CloudFormation outputs

    CloudFormation outputs

  5. AWS SAM packages and deploys the application as a versioned entity. After deployment, the production API endpoint is ready to use. The template produces multiple outputs. Find the unique URL of the endpoint in the “HelloWorldAPI” key in the “Outputs” section.

After retrieving the URL, test the live endpoint using a REST client:

Testing with Postman

Testing with Postman

Optimizing performance

After the Lambda function is deployed, you can optimize for latency and cost. To do this, adjust the memory allocation setting for the function, which also linearly changes the allocated vCPU (to learn more, read the AWS News Blog).

The digit classifier model is optimized with 5 GB memory (~3 vCPUs). Any gains beyond 5 GB are relatively minor. Each model responds differently to changes in vCPU and memory, so it is best practice to determine this experimentally. There are open-source tools available to automate performance tuning.

Further optimizations can be made by compiling the source code to take advantage of AVX2 instructions. AVX2 allows Lambda to run more operations per clock cycle, reducing the time it takes a model to generate predictions.

Cleaning up

This walkthrough creates a Lambda function, API Gateway endpoint, and an ECR repository. These resources incur charges so it is recommended to clean up resources to avoid incurring cost. To delete the ECR repository, run:

aws ecr delete-repository --registry-id <account-id> --repository-name classifier-demo --force

To delete the remaining resources, navigate to AWS CloudFormation in the AWS Management Console and select the Region used for the walkthrough. Select the stack created by AWS SAM (the default is “sam-app”) and choose Delete.

Conclusion

Lambda is a cost-effective, scalable, and reliable way for data scientists to deploy CPU-based machine learning models for inference. With support for larger functions sizes, AVX2 instruction sets, and container image support, Lambda can now deploy more complex models while maintaining low latency.

Use the new machine learning templates within AWS SAM today to deploy your first serverless machine learning application in minutes. We look forward to seeing the exciting machine learning applications that you build on Lambda.

For more serverless learning resources, visit Serverless Land.

Building well-architected serverless applications: Managing application security boundaries – part 1

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/building-well-architected-serverless-applications-managing-application-security-boundaries-part-1/

This series of blog posts uses the AWS Well-Architected Tool with the Serverless Lens to help customers build and operate applications using best practices. In each post, I address the serverless-specific questions identified by the Serverless Lens along with the recommended best practices. See the introduction post for a table of contents and explanation of the example application.

Security question SEC2: How do you manage your serverless application’s security boundaries?

Defining and securing your serverless application’s boundaries ensures isolation for, within, and between components.

Required practice: Evaluate and define resource policies

Resource policies are AWS Identity and Access Management (IAM) statements. They are attached to resources such as an Amazon S3 bucket, or an Amazon API Gateway REST API resource or method. The policies define what identities have fine-grained access to the resource. To see which services support resource-based policies, see “AWS Services That Work with IAM”. For more information on how resource policies and identity policies are evaluated, see “Identity-Based Policies and Resource-Based Policies”.

Understand and determine which resource policies are necessary

Resource policies can protect a component by restricting inbound access to managed services. Use resource policies to restrict access to your component based on a number of identities, such as the source IP address/range, function event source, version, alias, or queues. Resource policies are evaluated and enforced at IAM level before each AWS service applies it’s own authorization mechanisms, when available. For example, IAM resource policies for API Gateway REST APIs can deny access to an API before an AWS Lambda authorizer is called.

If you use multiple AWS accounts, you can use AWS Organizations to manage and govern individual member accounts centrally. Certain resource policies can be applied at the organizations level, providing guardrail for what actions AWS accounts within the organization root or OU can do. For more information see, “Understanding how AWS Organization Service Control Policies work”.

Review your existing policies and how they’re configured, paying close attention to how permissive individual policies are. Your resource policies should only permit necessary callers.

Implement resource policies to prevent unauthorized access

For Lambda, use resource-based policies to provide fine-grained access to what AWS IAM identities and event sources can invoke a specific version or alias of your function. Resource-based policies can also be used to control access to Lambda layers. You can combine resource policies with Lambda event sources. For example, if API Gateway invokes Lambda, you can restrict the policy to the API Gateway ID, HTTP method, and path of the request.

In the serverless airline example used in this series, the IngestLoyalty service uses a Lambda function that subscribes to an Amazon Simple Notification Service (Amazon SNS) topic. The Lambda function resource policy allows SNS to invoke the Lambda function.

Lambda resource policy document

Lambda resource policy document

API Gateway resource-based policies can restrict API access to specific Amazon Virtual Private Cloud (VPC), VPC endpoint, source IP address/range, AWS account, or AWS IAM users.

Amazon Simple Queue Service (SQS) resource-based policies provide fine-grained access to certain AWS services and AWS IAM identities (users, roles, accounts). Amazon SNS resource-based policies restrict authenticated and non-authenticated actions to topics.

Amazon DynamoDB resource-based policies provide fine-grained access to tables and indexes. Amazon EventBridge resource-based policies restrict AWS identities to send and receive events including to specific event buses.

For Amazon S3, use bucket policies to grant permission to your Amazon S3 resources.

The AWS re:Invent session Best practices for growing a serverless application includes further suggestions on enforcing security best practices.

Best practices for growing a serverless application

Best practices for growing a serverless application

Good practice: Control network traffic at all layers

Apply controls for controlling both inbound and outbound traffic, including data loss prevention. Define requirements that help you protect your networks and protect against exfiltration.

Use networking controls to enforce access patterns

API Gateway and AWS AppSync have support for AWS Web Application Firewall (AWS WAF) which helps protect web applications and APIs from attacks. AWS WAF enables you to configure a set of rules called a web access control list (web ACL). These allow you to block, or count web requests based on customizable web security rules and conditions that you define. These can include specified IP address ranges, CIDR blocks, specific countries, or Regions. You can also block requests that contain malicious SQL code, or requests that contain malicious script. For more information, see How AWS WAF Works.

private API endpoint is an API Gateway interface VPC endpoint that can only be accessed from your Amazon Virtual Private Cloud (Amazon VPC). This is an elastic network interface that you create in a VPC. Traffic to your private API uses secure connections and does not leave the Amazon network, it is isolated from the public internet. For more information, see “Creating a private API in Amazon API Gateway”.

To restrict access to your private API to specific VPCs and VPC endpoints, you must add conditions to your API’s resource policy. For example policies, see the documentation.

By default, Lambda runs your functions in a secure Lambda-owned VPC that is not connected to your account’s default VPC. Functions can access anything available on the public internet. This includes other AWS services, HTTPS endpoints for APIs, or services and endpoints outside AWS. The function cannot directly connect to your private resources inside of your VPC.

You can configure a Lambda function to connect to private subnets in a VPC in your account. When a Lambda function is configured to use a VPC, the Lambda function still runs inside the Lambda service VPC. The function then sends all network traffic through your VPC and abides by your VPC’s network controls. Functions deployed to virtual private networks must consider network access to restrict resource access.

AWS Lambda service VPC with VPC-to-VPT NAT to customer VPC

AWS Lambda service VPC with VPC-to-VPT NAT to customer VPC

When you connect a function to a VPC in your account, the function cannot access the internet, unless the VPC provides access. To give your function access to the internet, route outbound traffic to a NAT gateway in a public subnet. The NAT gateway has a public IP address and can connect to the internet through the VPC’s internet gateway. For more information, see “How do I give internet access to my Lambda function in a VPC?”. Connecting a function to a public subnet doesn’t give it internet access or a public IP address.

You can control the VPC settings for your Lambda functions using AWS IAM condition keys. For example, you can require that all functions in your organization are connected to a VPC. You can also specify the subnets and security groups that the function’s users can and can’t use.

Unsolicited inbound traffic to a Lambda function isn’t permitted by default. There is no direct network access to the execution environment where your functions run. When connected to a VPC, function outbound traffic comes from your own network address space.

You can use security groups, which act as a virtual firewall to control outbound traffic for functions connected to a VPC. Use security groups to permit your Lambda function to communicate with other AWS resources. For example, a security group can allow the function to connect to an Amazon ElastiCache cluster.

To filter or block access to certain locations, use VPC routing tables to configure routing to different networking appliances. Use network ACLs to block access to CIDR IP ranges or ports, if necessary. For more information about the differences between security groups and network ACLs, see “Compare security groups and network ACLs.”

In addition to API Gateway private endpoints, several AWS services offer VPC endpoints, including Lambda. You can use VPC endpoints to connect to AWS services from within a VPC without an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection.

Using tools to audit your traffic

When you configure a Lambda function to use a VPC, or use private API endpoints, you can use VPC Flow Logs to audit your traffic. VPC Flow Logs allow you to capture information about the IP traffic going to and from network interfaces in your VPC. Flow log data can be published to Amazon CloudWatch Logs or S3 to see where traffic is being sent to at a granular level. Here are some flow log record examples. For more information, see “Learn from your VPC Flow Logs”.

Block network access when required

In addition to security groups and network ACLs, third-party tools allow you to disable outgoing VPC internet traffic. These can also be configured to allow traffic to AWS services or allow-listed services.

Conclusion

Managing your serverless application’s security boundaries ensures isolation for, within, and between components. In this post, I cover how to evaluate and define resource policies, showing what policies are available for various serverless services. I show some of the features of AWS WAF to protect APIs. Then I review how to control network traffic at all layers. I explain how Lambda functions connect to VPCs, and how to use private APIs and VPC endpoints. I walk through how to audit your traffic.

This well-architected question will be continued where I look at using temporary credentials between resources and components. I cover why smaller, single purpose functions are better from a security perspective, and how to audit permissions. I show how to use AWS Serverless Application Model (AWS SAM) to create per-function IAM roles.

For more serverless learning resources, visit https://serverlessland.com.

Building leaderboard functionality with serverless data analytics

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-serverless-applications-with-streaming-data-part-4/

This series is about building serverless solutions in streaming data workloads. The application example used in this series is Alleycat, which allows bike racers to compete with each other virtually on home exercise bikes.

Part 1 explains the application’s functionality, how to deploy to your AWS account, and provides an architectural review. Part 2 compares different ways to ingest streaming data into Amazon Kinesis Data Streams and shows how to optimize shard capacity. Part 3 uses Amazon Kinesis Data Firehose with AWS Lambda to implement the all-time rankings functionality.

This post walks through the leaderboard functionality in the application. This solution uses Kinesis Data Streams, Lambda, and Amazon DynamoDB to provide both real-time and historical rankings.

To set up the example, visit the GitHub repo and follow the instructions in the README.md file. Note that this walkthrough uses services that are not covered by the AWS Free Tier and incur cost.

Overview of leaderboards in Alleycat

Alleycat races are 5 minutes long and run continuously throughout the day for each class. Competitors select a class type and automatically join the current race by choosing Start Race. There are up to 1,000 competitors per race. To track performance, there are two types of leaderboard in Alleycat: Race results for completed races and the Realtime rankings for active races.

Alleycat frontend

  1. The user selects a completed race from the dropdown to a leaderboard ranking of results. This table is not real time and only reflects historical results.
  2. The user selects the Here now option to see live results for all competitors in the current virtual race for the chosen class.

Architecture overview

The backend microservice supporting these features is located in the 1-streaming-kds directory in the GitHub repo. It uses the following architecture:

Solution architecture

  1. The tumbling window function receives results every second from Kinesis Data Streams. This function aggregates metrics and saves the latest results to the application’s DynamoDB table.
  2. DynamoDB Streams invoke the publishing Lambda function every time an item is changed in the table. This reformats the message and publishes to the application’s IoT topic in AWS IoT Core to update the frontend.
  3. The final results function is an additional consumer on Kinesis Data Streams. It filters for only the last result in each competitor’s race and publishes the results to the DynamoDB table.
  4. The frontend calls an Amazon API Gateway endpoint to fetch the historical results for completed races via a Lambda function.

Configuring the tumbling window function

This Lambda function aggregates data from the Kinesis stream and stores the result in the DynamoDB table:

Tumbling window architecture

This function receives batches of 1,000 messages per invocation. The messages may contain results for multiple races and competitors. The function groups the results by race and racer ID and then flattens the data structure. It writes an item to the DynamoDB table in this format:

{
 "PK": "race-5403307",
 "SK": "results",
 "ts": 1620992324960,
 "results": "{\"0\":167.04,\"1\":136,\"2\":109.52,\"3\":167.14,\"4\":129.69,\"5\":164.97,\"6\":149.86,\"7\":123.6,\"8\":154.29,\"9\":89.1,\"10\":137.41,\"11\":124.8,\"12\":131.89,\"13\":117.18,\"14\":143.52,\"15\":95.04,\"16\":109.34,\"17\":157.38,\"18\":81.62,\"19\":165.76,\"20\":181.78,\"21\":140.65,\"22\":112.35,\"23\":112.1,\"24\":148.4,\"25\":141.75,\"26\":173.24,\"27\":131.72,\"28\":133.77,\"29\":118.44}",
 "GSI": 2
}

This transformation significantly reduces the number of writes to the DynamoDB table per function invocation. Each time a write occurs, this triggers the publishing process and notifies the frontend via the IoT topic.

DynamoDB can handle almost any number of writes to the table. You can set up to 40,000 write capacity units with default limits and can request even higher limits via an AWS Support ticket. However, you may want to limit the throughput to reduce the WCU cost.

The tumbling window feature of Lambda allows invocations from a streaming source to pass state between invocations. You can specify a window interval of up to 15 minutes. During the window, a state is passed from one invocation to the next, until a final invocation at the end of the window. Alleycat uses this feature to buffer aggregated results and only write the output at the end of the tumbling window:

Tumbling window process

For a tumbling window period of 5 seconds, this means that the Lambda function is invoked multiple times, passing an intermediate state from invocation to invocation. Once the window ends, it then writes the final aggregated result to DynamoDB. The tradeoff in this solution is that it reduces the number of real-time notifications to the frontend since these are published from the table’s stream. This increases the latency of live results in the frontend application.

Buffer by the Lambda function

Implementing tumbling windows in Lambda functions

The template.yaml file describes the tumbling Lambda function. The event definition specifies the tumbling window duration in the TumblingWindowInSeconds attribute:

  TumblingWindowFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: tumblingFunction/    
      Handler: app.handler
      Runtime: nodejs14.x
      Timeout: 15
      MemorySize: 256
      Environment:
        Variables:
          DDB_TABLE: !Ref DynamoDBtableName
      Policies:
        DynamoDBCrudPolicy:
          TableName: !Ref DynamoDBtableName
      Events:
        Stream:
          Type: Kinesis
          Properties:
            Stream: !Sub "arn:aws:kinesis:${AWS::Region}:${AWS::AccountId}:stream/${KinesisStreamName}"
            BatchSize: 1000
            StartingPosition: TRIM_HORIZON      
            TumblingWindowInSeconds: 15  

When you enable tumbling windows, the function’s event payload contains several new attributes:

{
    "Records": [
        {
 	    ...
        }
    ],
    "shardId": "shardId-000000000000",
    "eventSourceARN": "arn:aws:kinesis:us-east-2:123456789012:stream/alleycat",
    "window": {
        "start": "2021-05-05T18:51:00Z",
        "end": "2021-05-05T18:51:15Z"
    },
    "state": {},
    "isFinalInvokeForWindow": false,
    "isWindowTerminatedEarly": false
}

These include:

  • Window start and end: The beginning and ending timestamps for the current tumbling window.
  • State: An object containing the state returned from the previous invocation, which is initially empty in a new window. The state object can contain up to 1 MB of data.
  • isFinalInvokeForWindow: Indicates if this is the last invocation for the current window. This only occurs once per window period.
  • isWindowTerminatedEarly: A window ends early if the state exceeds the maximum allowed size of 1 MB.

The event handler in app.js uses tumbling windows if they are defined in the AWS SAM template:

// Main Lambda handler
exports.handler = async (event) => {

	// Retrieve existing state passed during tumbling window	
	let state = event.state || {}
	
	// Process the results from event
	let jsonRecords = getRecordsFromPayload(event)
	jsonRecords.map((record) => raceMap[record.raceId] = record.classId)
	state = getResultsByRaceId(state, jsonRecords)

	// If tumbling window is not configured, save and exit
	if (event.window === undefined) {
		return await saveCurrentRaces(state) 
	}

	// If tumbling window is configured, save to DynamoDB on the 
	// final invocation in the window
	if (event.isFinalInvokeForWindow) {
		await saveCurrentRaces(state) 
	} else {
		return { state }
	}
}

The final results function and API

The final results function filters for the last event in each competitor’s race, which contains the final score. The function writes each score to the DynamoDB table. Since there may be many write events per invocation, this function uses the Promise.all construct in Node.js to complete the database operations in parallel:

const saveFinalScores = async (raceResults) => {
	let paramsArr = []

	raceResults.map((result) => {
		paramsArr.push({
		  TableName : process.env.DDB_TABLE,
		  Item: {
		    PK: `race-${result.raceId}`,
		    SK: `racer-${result.racerId}`,
		    GSI: result.output,
		    ts: Date.now()
		  }
		})
	})
	// Save to DDB in parallel
	await Promise.all(paramsArr.map((params) => documentClient.put (params).promise()))
}

Using this approach, each call to documentClient.put is made without waiting for a response from the SDK. The call returns a Promise object, which is in a pending state until the database operation returns with a status. Promise.all waits for all promises to resolve or reject before code execution continues. Comparing the serial and concurrent approach, this reduces the overall time for multiple database writes. The tradeoff is that it increases the number of writes to DynamoDB and the number of WCUs consumed.

Serial vs concurrent writes

For a large number of put operations to DynamoDB, you can also use the DocumentClient’s batchWrite operation. This delegates to the underlying DynamoDB BatchWriteItem operation in the AWS SDK and can accept up to 25 separate put requests in a single call. To handle more than 25, you can make multiple batchWrite requests and still use the parallelized invocation method shown above.

The DynamoDB table maintains a list of race results with one item per racer per race:

DynamoDB table items

The frontend calls an API Gateway endpoint that invokes the getLeaderboard function. This function uses the DocumentClient’s query API to return results for a selected race, sorted by a global secondary index containing the final score:

const AWS = require('aws-sdk')
AWS.config.region = process.env.AWS_REGION 
const documentClient = new AWS.DynamoDB.DocumentClient()

// Main Lambda handler
exports.handler = async (event) => {

    const classId = parseInt(event.queryStringParameters.classId)

    const params = {
        TableName: process.env.DDB_TABLE,
        IndexName: 'GSI_PK_Index',
        KeyConditionExpression: 'PK = :ID',
        ExpressionAttributeValues: {
          ':ID': `class-${classId}`
        },
        ScanIndexForward: false,
        Limit: 1000
    }   

    const result = await documentClient.query(params).promise()   
    return result.Items
}

By default, this returns the top 1,000 places by using the Limit parameter. You can customize this or use pagination to implement fetching large result sets more efficiently.

Conclusion

In this post, I explain the all-time leaderboard logic in the Alleycat application. This is an asynchronous, eventually consistent process that checks batches of incoming records for new personal best records. This uses Kinesis Data Firehose to provide a zero-administration way to deliver and process large batches of records continuously.

This post shows the architecture in Alleycat and how this is defined in AWS SAM. Finally, I walk through how to build a data transformation Lambda function that decodes a payload and returns records back to Kinesis.

Part 5 discusses how to troubleshoot issues in Alleycat and how to monitor streaming applications generally.

For more serverless learning resources, visit Serverless Land.

Prototyping at speed with AWS Step Functions new Workflow Studio

Post Syndicated from Benjamin Smith original https://aws.amazon.com/blogs/compute/prototyping-at-speed-with-aws-step-functions-new-workflow-studio/

AWS recently introduced Workflow Studio for AWS Step Functions. This is a new visual builder for creating Step Functions workflows in the AWS Management Console. This post shows how to use the Workflow Studio for rapid workflow prototyping. It also explains how to transition to local development, integrating the prototype with your infrastructure as code templates.

Since its release in December 2016, developers have been building Step Functions workflows with Amazon States Language (ASL) to orchestrate multiple services into business-critical applications. Developers wanted faster ways to prototype and build orchestration workflows without writing custom code or using additional services.

­­­­­­

What’s new?

The new Step Functions Workflow Studio provides an additional workflow building experience. Developers and business users can now build prototype workflows quickly with a graphical user interface in the Step Functions console.

These workflo­­­ws can include all the same workflow states, patterns, and service integrations available when building with ASL. Each state is configured using editable forms. The workflow ASL definition can be exported for further editing in the console or in your local integrated development environment (IDE). Workflow Studio can build new workflows or edit a pre-existing workflow. To get started with Workflow Studio, see this introduction video.

Business users

Workflow Studio provides new opportunities for a more diverse range of users to build step functions workflows. Business users and those in non-technical roles can quickly create workflow prototypes. This can help to reason about and understand business processes before passing to a developer to add business logic and configure service integrations.

Rapid workflow prototyping

Workflow Studio allows you to create placeholders for AWS Lambda functions and other service integrations using the ‘drag-and-drop’ interface. This means that resources do not need to exist before designing the workflow. Once a workflow is prototyped you can save and continue to edit in the console or copy the ASL definition to your local IDE. You can then incorporate the workflow with application resources and infrastructure as code templates.

In the following steps, I use Workflow Studio to build the workflow described in this post. The full application template is found in this GitHub repository. The workflow analyzes web form submissions for negative sentiment. It generates a case reference number and saves the data in an Amazon DynamoDB table. The workflow returns the case reference number and message sentiment score.

To start fast prototyping for this workflow with the visual studio:

  1. Log into the Step Functions console and choose Create state machine.
  2. Choose Design your workflow visually from the authoring method section. This opens up Workflow Studio.
  3. Choose AWS Lambda Invoke from the Actions menu and drag it into the workflow.
  4. Choose the Configuration tab from the Form panel and enter the name Detect Sentiment in the State name field.
  5. In the function name field, choose Enter Function Name.
  6. Enter ${DetectSentiment} into the function name parameters field. This is a dynamic reference to a value that is provided by an Infrastructure-as-code template.

    The Workflow Studio provides an interface to add input and output path processing configurations to the workflow.
  7. Choose the Output tab and select Combine input and result with ResultPath. Selecting this option uses the ResultPath filter to add the result into the original state input. The specified path indicates where to add the result.
  8. Enter $.SentimentResults into the path ResultsPath text input.
  9. View the workflow ASL definition by choosing Definition from the top menu. This shows:
    1. The state is named Detect Sentiment.
    2. The Lambda function name uses a dynamic reference to ${DetectSentiment}. This is provided by the infrastructure-as-code template, explained in the following steps.
    3. A default retry configuration is defined.
    4. The ResultPath is configured.

Continue building the workflow this way, adding more Task and Flow states. A completed workflow looks as follows:

Transitioning to local development

Once the workflow is created in the Workflow Studio, you can export the ASL definition to a local IDE to incorporate into an infrastructure as code template. The template describes all the AWS resources that make up the application:

  1. To copy the ASL definition, choose the Definition button in the top navigation, and copy the entire ASL workflow definition to the clipboard.
  2. Create a new directory in your local filesystem named statemachine and save the definition to a file in this directory named sfn-template.asl.json. The following screenshot shows how the workflow appears in your IDE when rendered with the AWS Toolkit for Visual Studio Code.

  3. AWS Serverless Application Model (AWS SAM) is an open-source infrastructure as code framework for building serverless applications.
  4. Create an AWS SAM template named template.yaml to describe the application resources. A completed version of this file is found in this GitHub repository.
  5. Create a directory for each Lambda function. Within each directory, save the function code to a file called app.js. The function code, can be found in this GitHub repository. The final application file directory looks as follows:
    root
    ┣ LambdaFunctions/
    ┃ ┣ GenerateReferenceNumber/
    ┃ ┃ ┗ app.js
    ┃ ┣ detectSentiment/
    ┃ ┃ ┗ app.js
    ┃ ┗ sendEmailConfirmation/
    ┃   ┗ app.js
    ┣ statemachine/
    ┃ ┗ sfn-template.asl.json
    ┗ template.yaml

The full application can be found in this GitHub repository.

The AWS SAM template describes the Step Functions workflow’s security permissions and allows for dynamic referencing of the resources described within the template such as the Lambda functions and DynamoDB table:

##########################################################################
#   STEP FUNCTION                                                        #
##########################################################################

  ProcessFormStateMachineExpressSync:
    Type: AWS::Serverless::StateMachine # More info about State Machine Resource: https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/sam-resource-statemachine.html
    Properties:
      DefinitionUri: statemachine/sfn-template.asl.json
      DefinitionSubstitutions:
        NotifyAdminWithSES: !Ref NotifyAdminWithSES
        GenerateRefernceNumber: !Ref GenerateRefernceNumber
        DetectSentiment: !Ref DetectSentiment
        DDBTable: !Ref FormDataTable
      Policies: # Find out more about SAM policy templates: https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-policy-templates.html
        - LambdaInvokePolicy:
            FunctionName: !Ref NotifyAdminWithSES
        - LambdaInvokePolicy:
            FunctionName: !Ref DetectSentiment
        - LambdaInvokePolicy:
            FunctionName: !Ref GenerateRefernceNumber
        - DynamoDBWritePolicy:
            TableName: !Ref FormDataTable
      Type: EXPRESS
  • The DefinitionURI value provides the location of the ASL definition that is exported from the Workflow Studio, in statemachine/sfn-template.asl.json.
  • The DefinitionSubstitutions values provide the names of the resources used within the workflow. Here you see $.DetectSentiment Lambda function name passed to the workflow definition. This was entered into the Workflow Studio in the previous steps.

The application is deployed using the AWS SAM CLI. Follow these steps in the GitHub repository to deploy the application.

Once the application is deployed, the workflow can be edited by updating the ASL definition in the Step Functions console or the local template file. It can also be edited from the drag-and-snap interface in the Workflow Studio. Any edits made in the AWS Management Console should be copied back to the local template file.

Conclusion

The AWS Step Functions Workflow Studio is a new visual builder for creating Step Functions workflows in the AWS Management Console. The drag-and-drop interface can be used to build new or edit existing workflows quickly. Each state is configured using editable forms, with the ASL definition visible and available for export as you build.

This post shows how to use the Workflow Studio for rapid workflow prototyping. It explains how to export the ASL definition to your local IDE and integrate it with your infrastructure as code application templates.

The Workflow Studio is included in Step Functions pricing at no additional fee and is available in all regions where Step Functions is available. To get started, visit https://aws.amazon.com/stepfunctions.

Exploring serverless patterns for Amazon DynamoDB

Post Syndicated from Talia Nassi original https://aws.amazon.com/blogs/compute/exploring-serverless-patterns-for-amazon-dynamodb/

Amazon DynamoDB is a fully managed, serverless NoSQL database. In this post, you learn about the different DynamoDB patterns used in serverless applications, and use the recently launched Serverless Patterns Collection to configure DynamoDB as an event source for AWS Lambda.

Benefits of using DynamoDB as a serverless developer

DynamoDB is a serverless service that automatically scales up and down to adjust for capacity and maintain performance. It also has built-in high availability and fault tolerance. DynamoDB provides both provisioned and on-demand capacity modes so that you can optimize costs by specifying capacity per table, or paying for only the resources you consume. You are not provisioning, patching, or maintaining any servers.

Serverless patterns with DynamoDB

The recently launched Serverless Patterns Collection is a repository of serverless architecture examples that demonstrate integrating two or more AWS services. Each pattern uses either the AWS Serverless Application Model (AWS SAM) or AWS Cloud Development Kit (AWS CDK). These simplify the creation and configuration of the services referenced.

There are currently four patterns that use DynamoDB:

Amazon API Gateway REST API to Amazon DynamoDB

This pattern creates an Amazon API Gateway REST API that integrates with an Amazon DynamoDB table named “Music”. The API includes an API key and usage plan. The DynamoDB table includes a global secondary index named “Artist-Index”. The API integrates directly with the DynamoDB API and supports PutItem and Query actions. The REST API uses an AWS Identity and Access Management (IAM) role to provide full access to the specific DynamoDB table and index created by the AWS CloudFormation template. Use this pattern to store items in a DynamoDB table that come from the specified API.

AWS Lambda to Amazon DynamoDB

This pattern deploys a Lambda function, a DynamoDB table, and the minimum IAM permissions required to run the application. A Lambda function uses the AWS SDK to persist an item to a DynamoDB table.

AWS Step Functions to Amazon DynamoDB

This pattern deploys a Step Functions workflow that accepts a payload and puts the item in a DynamoDB table. Additionally, this workflow also shows how to read an item directly from the DynamoDB table, and contains the minimum IAM permissions required to run the application.

Amazon DynamoDB to AWS Lambda

This pattern deploys the following Lambda function, DynamoDB table, and the minimum IAM permissions required to run the application. The Lambda function is invoked whenever items are written or updated in the DynamoDB table. The changes are then sent to a stream. The Lambda function polls the DynamoDB stream. The function is invoked with a payload containing the contents of the table item that changed. We use this pattern in the following steps.

AWSTemplateFormatVersion: '2010-09-09'
Transform: 'AWS::Serverless-2016-10-31'
Description: An Amazon DynamoDB trigger that logs the updates made to a table.
Resources:
  DynamoDBProcessStreamFunction:
    Type: 'AWS::Serverless::Function'
    Properties:
      Handler: app.handler
      Runtime: nodejs14.x
      CodeUri: src/
      Description: An Amazon DynamoDB trigger that logs the updates made to a table.
      MemorySize: 128
      Timeout: 3
      Events:
        MyDynamoDBtable:
          Type: DynamoDB
          Properties:
            Stream: !GetAtt MyDynamoDBtable.StreamArn
            StartingPosition: TRIM_HORIZON
            BatchSize: 100
  MyDynamoDBtable:
    Type: 'AWS::DynamoDB::Table'
    Properties:
      AttributeDefinitions:
        - AttributeName: id
          AttributeType: S
      KeySchema:
        - AttributeName: id
          KeyType: HASH
      ProvisionedThroughput:
        ReadCapacityUnits: 5
        WriteCapacityUnits: 5
      StreamSpecification:
        StreamViewType: NEW_IMAGE

Setting up the Amazon DynamoDB to AWS Lambda Pattern

Prerequisites

For this tutorial, you need:

Downloading and testing the pattern

  1. From the Serverless Patterns home page, choose Amazon DynamoDB from the Filters menu. Then choose the DynamoDB to Lambda pattern.
    DynamoDB to Lambda Pattern
  2. Clone the repository and change directories into the pattern’s directory.git clone https://github.com/aws-samples/serverless-patterns/
    cd serverless-patterns/dynamodb-lambda

    Download instructions

  3. Run sam deploy –guided. This deploys your application. Keeping the responses blank chooses the default options displayed in the brackets.
    sam deploy instructions
  4. You see the following confirmation message once your stack is created.
    sam confirmation message
  5. Navigate to the DynamoDB Console and choose Tables. Select the newly created table. Newly created DynamoDB table
  6. Choose the Items tab and choose Create Item.
    Create Item
  7. Add an item and choose Save.
    Add item to table
  8. You see that item now in the DynamoDB table.
    DynamoDB table
  9. Navigate to the Lambda console and choose your function.
  10. From the Monitor tab choose View logs in CloudWatch.
    cloudwatch logs
  11. You see the new image inserted into the DynamoDB table.
    Cloudwatch Logs

Anytime a new item is added to the DynamoDB table, the invoked Lambda function logs the event in Amazon Cloudwatch Logs.

Configuring the event source mapping for the DynamoDB table

An event source mapping defines how a particular service invokes a Lambda function. It defines how that service is going to invoke the function. In this post, you use DynamoDB as the event source for Lambda. There are a few specific attributes of a DynamoDB trigger.

The batch size controls how many items can be sent for each Lambda invocation. This template sets the batch size to 100, as shown in the following deployed resource. The batch window indicates how long to wait until it invokes the Lambda function.

These configurations are beneficial because they increase your capabilities of what the DynamoDB table can do. In a traditional trigger for a database, the trigger gets invoked once per row per trigger action. With this batching capability, you can control the size of each payload and how frequently the function is invoked.

Trigger screenshot

Using DynamoDB capacity modes

DynamoDB has two read/write capacity modes for processing reads and writes on your tables: provisioned and on-demand. The read/write capacity mode controls how you pay for read and write throughput and how you manage capacity.

With provisioned mode, you specify the number of reads and writes per second that you require for your application. You can use automatic scaling to adjust the table’s provisioned capacity automatically in response to traffic changes. This helps to govern your DynamoDB use to stay at or below a defined request rate to obtain cost predictability.

Provisioned mode is a good option if you have predictable application traffic, or you run applications whose traffic is consistent or ramps gradually. To use provisioned mode in a DynamoDB table, enter ProvisionedThroughput as a property, and then define the read and write capacity:

 MyDynamoDBtable:
    Type: 'AWS::DynamoDB::Table'
    Properties:
      AttributeDefinitions:
        - AttributeName: id
          AttributeType: S
      KeySchema:
        - AttributeName: id
          KeyType: HASH
      ProvisionedThroughput:
        ReadCapacityUnits: 5
        WriteCapacityUnits: 5
      StreamSpecification:
        StreamViewType: NEW_IMAGE

With on-demand mode, DynamoDB accommodates workloads as they ramp up or down. If a workload’s traffic level reaches a new peak, DynamoDB adapts rapidly to accommodate the workload.

On-demand mode is a good option if you create new tables with unknown workload, or you have unpredictable application traffic. Additionally, it can be a good option if you prefer paying for only what you use. To use on-demand mode for a DynamoDB table, in the properties section of the template.yaml file, enter BillingMode: PAY_PER_REQUEST.

ApplicationTable:
    Type: AWS::DynamoDB::Table
    Properties:
      TableName: !Ref ApplicationTableName
      BillingMode: PAY_PER_REQUEST    
      StreamSpecification:
        StreamViewType: NEW_AND_OLD_IMAGES  

Stream specification

When DynamoDB sends the payload to Lambda, you can decide the view type of the stream. There are three options: new images, old images, and new and old images. To view only the new updated changes to the table, choose NEW_IMAGES as the StreamViewType. To view only the old change to the table, choose OLD_IMAGES as the StreamViewType. To view both the old image and new image, choose NEW_AND_OLD_IMAGES as the StreamViewType.

ApplicationTable:
    Type: AWS::DynamoDB::Table
    Properties:
      TableName: !Ref ApplicationTableName
      BillingMode: PAY_PER_REQUEST    
      StreamSpecification:
        StreamViewType: NEW_AND_OLD_IMAGES  

Cleanup

Once you have completed this tutorial, be sure to remove the stack from CloudFormation with the commands shown below.

cleanup

Submit a pattern to the Serverless Land Patterns Collection

While there are many patterns available to use from the Serverless Land website, there is also the option to create your own pattern and submit it. From the Serverless Patterns Collection main page, choose Submit a Pattern.

There you see guidance on how to submit. We have added many patterns from the community and we are excited to see what you build!

Conclusion

In this post, I explain the benefits of using DynamoDB patterns, and the different configuration settings, including batch size and batch window, that you can use in your pattern. I explain the difference between the two capacity modes, and I also show you how to configure a DynamoDB stream as an event source for Lambda by using the existing serverless pattern.

For more serverless learning resources, visit Serverless Land.

Building serverless applications with streaming data: Part 3

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-serverless-applications-with-streaming-data-part-3/

This series is about building serverless solutions in streaming data workloads. These are traditionally challenging to build, since data can be streamed from thousands or even millions of devices continuously. The application example used in this series is Alleycat, which allows bike racers to compete with each other virtually on home exercise bikes.

Part 1 explains the application’s functionality, how to deploy to your AWS account, and provides an architectural review. Part 2 compares different ways to ingest streaming data into Amazon Kinesis Data Streams and shows how to optimize shard capacity.

In this post, I explain how the all-time rankings functionality is implemented. This uses Amazon Kinesis Data Firehose to deliver hundreds of thousands of data points for processing by AWS Lambda functions.

To set up the example, visit the GitHub repo and follow the instructions in the README.md file. Note that this walkthrough uses services that are not covered by the AWS Free Tier and incur cost.

Overview of real time rankings in Alleycat

In the example scenario, there are 40,000 users and up to 1,000 competitors may race at any given time. While a competitor is racing, there is a real time rankings display that shows their performance in the selected class:

Alleycat front end

The racer can select Here now to compare against racers in the current virtual race. Alternatively, selecting All time compares their performance against the best performance of all races who have ever competed in the race. The rankings board is dynamic and shows the rankings for the current second on the display for the local racer:

Leaderboard changes over time

In Alleycat, races occur every five minutes continuously, and the all-time data is gathered from races that are completed. For this to work, the application must do the following:

  • Continuously aggregate race data from each of the six exercise classes and deliver to an Amazon S3 bucket.
  • Compare the incoming race data with the personal best records for every competitor in the same class.
  • If the current performance is a record, update the all-time best records dataset.
  • Compress the dataset, since there are thousands of racers, each with 5 minutes of personal racing history.

The resulting dataset contains second-by-second personal records for every racer in a selected race. This is saved in the application’s history bucket in S3 and loaded by the Alleycat front end before the beginning of each race:

        // From Home.vue's loadRealtimeHistory method

        // Load gz from S3 history bucket, unzip and parse to JSON
        const URL = `https://${this.$appConfig.historyBucket}.s3.${this.$appConfig.region}.amazonaws.com/class-${this.selectedClassId}.gz`
        const response = await axios.get(URL, { responseType: 'arraybuffer' })
        const buffer = Buffer.from(response.data, 'base64')
        const dezipped = await gunzip(buffer)
        const history = JSON.parse(dezipped.toString())

Architecture overview

The backend microservice handling this feature is located in the 2-streaming-kdf directory in the GitHub repo. It uses the following architecture:

Solution architecture

  1. Amazon Kinesis Data Firehose is a consumer of Kinesis Data Streams. Records are delivered from the application’s main stream as they arrive.
  2. Kinesis Data Firehose invokes a data transformation Lambda function. This calculates the output for each racer’s data points and returns the modified records to Kinesis.
  3. Kinesis Data Firehose delivers batches of records to an S3 bucket.
  4. When objects are written to the S3 bucket, this event triggers the S3 processor Lambda function. This compares incoming racer performance with historical records.
  5. The Lambda function saves the new all-time records in a compressed format to the application’s history bucket in S3.

The process happens continuously providing that new records are delivered to the application’s main Kinesis stream.

Using Kinesis Data Firehose

Kinesis Data Firehose can consume records from Kinesis Data Streams, or directly from the AWS SDK, AWS IoT Core, and other data producers. In Alleycat, there are multiple consumers that process incoming data, so Kinesis Data Firehose is configured as a consumer on the main stream.

The process of updating historical records is not a real-time process in Alleycat. Processing individual racer messages and comparing against all-time records is possible but would be computationally more complex. In practice, only a few incoming data points are all-time records. As a result, Alleycat asynchronously processes batches of records to find and update all-time records. The process is eventually consistent and the tradeoff is latency. There may be up to 1 minute between recording a personal record and updating the historical dataset in S3.

Kinesis Data Firehose provides several key functions for this process. First, it batches groups of messages based upon the batching hints provided in the AWS Serverless Application Model (AWS SAM) template. This application buffers records for up to 60 seconds or until 1 MB of records are available, whichever is reached first. You can adjust these settings and batch for up to 900 seconds or 128 MB of data. Note that these settings are hints and not absolute – the service can dynamically adjust these if data delivery falls behind data writing in the stream. As a result, hints should be treated as guidance and the actual settings may change.

Kinesis Data Firehose also enables data compression before delivery in various common formats. Since the S3 delivery bucket is an intermediary storage location in this application, using data compression reduces the storage cost. Kinesis can also encrypt data before delivery but this feature is not used in Alleycat. Finally, Kinesis Data Firehose can transform the incoming records by invoking a Lambda function. This enables the application to calculate the racer’s output before delivering the records.

The configuration for Kinesis Data Firehose, the S3 buckets, and the Lambda function is located in the template.yaml file in the GitHub repo. This AWS SAM template shows how to define the complete integration:

  DeliveryStream:
    Type: AWS::KinesisFirehose::DeliveryStream
    DependsOn:
      - DeliveryStreamPolicy    
    Properties:
      DeliveryStreamName: "alleycat-data-firehose"
      DeliveryStreamType: "KinesisStreamAsSource"
      KinesisStreamSourceConfiguration: 
        KinesisStreamARN: !Sub "arn:aws:kinesis:${AWS::Region}:${AWS::AccountId}:stream/${KinesisStreamName}"
        RoleARN: !GetAtt DeliveryStreamRole.Arn
      ExtendedS3DestinationConfiguration: 
        BucketARN: !GetAtt DeliveryBucket.Arn
        BufferingHints: 
          SizeInMBs: 1
          IntervalInSeconds: 60
        CloudWatchLoggingOptions: 
          Enabled: true
          LogGroupName: "/aws/kinesisfirehose/alleycat-firehose"
          LogStreamName: "S3Delivery"
        CompressionFormat: "GZIP"
        EncryptionConfiguration: 
          NoEncryptionConfig: "NoEncryption"
        Prefix: ""
        RoleARN: !GetAtt DeliveryStreamRole.Arn
        ProcessingConfiguration: 
          Enabled: true
          Processors: 
            - Type: "Lambda"
              Parameters: 
                - ParameterName: "LambdaArn"
                  ParameterValue: !GetAtt FirehoseProcessFunction.Arn

Once Kinesis Data Firehose is set up, there is no ongoing administration. The service scales automatically to adjust to the amount of the data available in the application’s Kinesis data stream.

How the Lambda data transformer works

Kinesis Data Firehose buffers up to 3 MB before invoking the data transformation function (you can configure this setting with the ProcessingConfiguration API). The service delivers records as a JSON array:

{
    "invocationId": "d8ce3cc6-abcd-407a-abcd-d9bc2ce58e72",
    "sourceKinesisStreamArn": "arn:aws:kinesis:us-east-2:012345678912:stream/alleycat",
    "deliveryStreamArn": "arn:aws:firehose:us-east-2:012345678912:deliverystream/alleycat-firehose",
    "region": "us-east-2",
    "records": [
        {
            "recordId": "49617596128959546022271021234567...",
            "approximateArrivalTimestamp": 1619185789847,
            "data": "eyJ1dWlkIjoiYjM0MGQzMGYtjI1Yi00YWM4LThjY2QtM2ZhNThiMWZmZjNlIiwiZXZlbnQiOiJ1cGRhdGUiLCJkZXZpY2VUaW1lc3RhbXiOjE2MTkxODU3ODk3NUsInNlY29uZCI6MwicmFjZUlkIjoxNjE5MTg1NjIwMj1LCJuYW1lIjoiSHViZXJ0IiwicmFjZXJJZCI6MSwiY2xhc3NJZCI6MSwiY2FkZW5jZSI6ODAsInJlc2lzdGFuY2UiOjc4fQ==",
            "kinesisRecordMetadata": {
                "sequenceNumber": "49617596128959546022271020452",
                "subsequenceNumber": 0,
                "partitionKey": "1619185789798",
                "shardId": "shardId-000000000000",
                "approximateArrivalTimestamp": 1619185789847
            }
        }, 
   ...

The payload contains metadata about the invocation and data source and each record contains a recordId and a base64 encoded data attribute. The Lambda function can then decode and modify the data attribute for each record as needed. The Lambda function must finally return a JSON array of the same length as the incoming event.records array, with the following attributes:

  • recordId: this must match the incoming recordId, so Kinesis can map the modified data payload back to the original record.
  • result: this must be “Ok”, “Dropped”, or “ProcessingFailed”. “Dropped” means that the function has intentionally removed a payload from processing. Any records with “ProcessingFailed” are delivered to the S3 bucket in a folder called processing-failed. This includes metadata indicating the number of delivery attempts, the timestamp of the last attempt, and the Lambda function’s ARN.
  • data: the returned data payload must be base64 encoded and the modified record must be within the 1 MB limit per record.

The transformer function in Alleycat shows how to implement this process in Node.js:

exports.handler = (event) => {
  const output = event.records.map((record) => {
    // Extract JSON record from base64 data
    const buffer = Buffer.from(record.data, "base64").toString()
    const jsonRecord = JSON.parse(buffer)

    // Add calculated field
    jsonRecord.output = ((jsonRecord.cadence + 35) * (jsonRecord.resistance + 65)) / 100

    // Convert back to base64 + add a newline
    const dataBuffer = Buffer.from(JSON.stringify(jsonRecord) + "\n", "utf8").toString("base64")

    return {
      recordId: record.recordId,
      result: "Ok",
      data: dataBuffer,
    }
  })

  console.log(`{ recordsTotal: ${output.length} }`)
  return { records: output }
}

If you are processing the data in downstream services in JSON format, adding a newline at the end of the data buffer for each record can simplify the import process.

The data transformation Lambda function can run for up to 5 minutes and can use other AWS services for data processing. The Kinesis Data Firehose service scales up the Lambda function automatically if traffic increases, up to 5 outstanding invocations per shard (or 10 if the destination is Splunk).

Processing the Kinesis Data Firehose output

Kinesis Data Firehose puts a series of objects in the intermediary S3 bucket. Each put event invokes the S3 processor Lambda function. This function compares each data point to historical best performance, per racer ID, per class ID, at the same second of the race.

Data points that do not beat existing records are discarded. Otherwise, the function merges new historical records into the dataset and saves the result into the final S3 history bucket. This object is compressed in gzip format and contains the history for a single class ID.

When the frontend downloads this dataset from the S3 bucket, it decompresses the object. The resulting JSON structure shows personal output records per racer ID for each second of the race. The frontend uses this data to update the leaderboard on the local device during each second of the active race.

JSON output

Conclusion

In this post, I explain the all-time leaderboard logic in the Alleycat application. This is an asynchronous, eventually consistent process that checks batches of incoming records for new personal records. This uses Kinesis Data Firehose to provide a zero-administration way to deliver and process large batches of records continuously.

This post shows the architecture in Alleycat and how this is defined in AWS SAM. Finally, I walk through how to build a data transformation Lambda function that correctly decodes a payload and returns records back to Kinesis.

Part 4 show how to combine Kinesis with Amazon DynamoDB to support queries for streaming data. Alleycat uses this architecture to provide real-time rankings for competitors in the same virtual race.

For more serverless learning resources, visit Serverless Land.

Getting started with serverless for developers part 5: Sandbox developer account

Post Syndicated from Benjamin Smith original https://aws.amazon.com/blogs/compute/getting-started-with-serverless-for-developers-part-5-sandbox-developer-account/

This is part 5 of the Getting started with serverless series. In part 4, you learn how the developer workflow for building serverless applications differs to a traditional developer workflow. You see how to test business logic locally before deploying to an AWS account.

In this post, you learn how to secure and manage access to your AWS Lambda functions. I show how to invoke Lambda functions in a sandbox developer account directly from an integrated developer environment (IDE) and view output logs in near-real-time. Finally, I show how this helps to test for infrastructure and security configurations before committing changes to the main branch.

A sandbox developer account

Serverless services like Lambda and Amazon API Gateway are pay-per-use, this means developers no longer need to share multiple environments (for example, dev, staging, and production). Instead, every developer can have their own sandboxed AWS developer account. This allows developers to not have to replicate everything to their local environment but rather test with real resources in the cloud.

You can still run code locally during the development of a feature. In post 4, I show how I run Lambda function code locally, using a test harness. This allows me to maintain a fast inner loop, iteratively updating and locally testing code. If my Lambda function interacts with other AWS infrastructure, I deploy them to a sandboxed AWS developer account. This allows me to test my Lambda function code locally while still being able to access managed services in the cloud.

However, it is useful to deploy your function code to a Lambda function in a sandboxed developer account. A sandbox developer account is an AWS account allocated to a developer on a 1:1 basis. It should give developers as much freedom as possible while still protecting resources and budget.

This allows you to test for security configurations and ensure that your Lambda function code behaves as expected when run in the Lambda execution environment:

Creating a sandboxed developer account

The following best practices can help to minimize costs and prevent unauthorized usage.

After creating a sandbox account, it can be useful to associate a named profile with it. A named profile is a collection of credentials that you can apply to an AWS Command Line Interface (AWS CLI) command. When you specify a profile to run a command, the settings and credentials are used to run that command. The AWS CLI supports multiple named profiles that are stored in the config and credentials files.

Configure profiles by adding entries to the config and credentials files. To learn more about named profiles refer to the AWS CLI documentation.

In the following example I configure my credentials file with two named profiles.

The profile named prod is my production account, and the profile named default is my sandbox developer account. The CLI automatically uses the profile named default, if no --profile option is specified in a CLI command.

[default]
aws_access_key_id=AKIAIOSFODNN7EXAMPLE
aws_secret_access_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

[dev]
aws_access_key_id=AKIAIOSFODNN7EXAMPLE
aws_secret_access_key=wJalBBUtnFEMI/&7MDENG/bPxRfiCYEXAMPLEKEY

[prod]
aws_access_key_id=AKIAI44QH8DHBEXAMPLE
aws_secret_access_key=je7MtGbClwBF/2Zp9Utk/h3yCo8nvbEXAMPLEKEY

AWS Lambda security permissions

AWS Identity and Access Management (IAM) is the service used to manage access to AWS services. Lambda is fully integrated with IAM, allowing you to control precisely what each Lambda function can do within the AWS Cloud. There are two important things that define the scope of permissions in Lambda functions:

The resource policy: Defines which events are authorized to invoke the function.

The execution role policy: Limits what the Lambda function is authorized to do.

Using IAM roles to describe a Lambda function’s permissions, decouples it’s security configuration from the code. This helps reduce the complexity of a lambda function, making it easier to maintain.

A Lambda function’s resource and execution policy should be granted the minimum required permissions for the function to perform it’s task effectively. This is sometimes referred to as the rule of least privilege. As you develop a Lambda function, you expand the scope of this policy to allow access to other resources as required.

When building Lambda-based applications with frameworks such as AWS SAM, you describe both policies in the application’s template.

The following steps show how I deploy and test a Lambda function in a sandbox developer account from within my IDE.

Before you start

All the code relating to this example application can be found in this GitHub repository. To deploy this stage of the application, follow the steps from post 1 to clone the sample application.

  1. Run the following command from the root directory of the cloned repository:
    cd ./part_5
  2. After creating a sandbox developer account, deploy the example application into it by specifying the corresponding profile name in the AWS SAM CLI command. You can omit this if you named the profile default:
    sam deploy --config-file ../samconfig.toml  –guided  --profile default

    This produces the following output:

    Make a note of the StarWebhookLambdaFunctionName, you will use this in the following steps.

Logging with serverless applications

After deploying your serverless application to the sandboxed developer account, you need to verify that it’s operating properly. Lambda automatically monitors functions on your behalf, reporting metrics through Amazon CloudWatch. It collects data in the form of logs, metrics, and events and provides a unified view of AWS resources, applications, and services.

To help simplify troubleshooting, the AWS Serverless Application Model CLI (AWS SAM CLI) has a command called sam logs. This command lets you fetch CloudWatch Logs generated by your Lambda function from the command line.

Run the following command in a terminal window to view a live tail of logs generated by the StarWebhookHandler Lambda function. Replace StarWebhookLambdaFunctionName with the Lambda function name generated by your deployment:

sam logs -n StarWebhookLambdaFunctionName --tail

Checking Lambda function permissions in a sandbox developer account

I open a new terminal window and invoke the StarWebhookHandler Lambda function directly from my IDE by running the following AWS SAM CLI command. To invoke the function I pass an example payload located in events/testEvent.json.

aws lambda invoke --function-name <<replace-with-function-name>> \
--payload fileb://events/testEvent.json  \
out.txt

The following screenshot shows my two terminal windows side by side.

The response returned by the CLI command is on the right. The left window shows the tail of logs generated by the Lambda function. I observe that the CLI invocation shows a status 200 response, but the Lambda function logs report an ‘AccessDenied’ error. The function does not have the required permissions to write to Amazon S3.

I edit the Lambda function policy definition, adding permission for my Lambda function to write to an S3 bucket. I run sam build and sam deploy to re-deploy the application to the sandbox developer account. I invoke the Lambda function again. The logs show the following:

  1. The Lambda function responds with “StatusCode 200″.
  2. The Lambda function billed duration, memory size and running duration.
  3. The Lambda function has successfully copied the file to S3

IAM permission errors such as these may not be detected when running the function code locally. This is one of the advantages of deploying and running Lambda functions in a sandboxed developer account while developing an application.

Conclusion

This post explains the advantages of using a sandbox developer account. It shows how to deploy your business logic to a Lambda function in a sandboxed developer account. You are introduced to IAM policies, which control precisely what each Lambda function can do within the AWS Cloud. You learn that CloudWatch provides a unified view of logs for all AWS resources.

Finally, I show how to use the AWS SAM CLI and AWS CLI to invoke a Lambda function in the cloud and view its log output directly from the IDE. This helps to test for security configurations and to ensure that your business logic behaves as expected when run in the Lambda service. Invoking functions and observing their log output directly from your IDE helps to reduce context switching as you build.

Announcing migration of the Java 8 runtime in AWS Lambda to Amazon Corretto

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/announcing-migration-of-the-java-8-runtime-in-aws-lambda-to-amazon-corretto/

This post is written by Jonathan Tuliani, Principal Product Manager, AWS Lambda.

What is happening?

Beginning July 19, 2021, the Java 8 managed runtime in AWS Lambda will migrate from the current Open Java Development Kit (OpenJDK) implementation to the latest Amazon Corretto implementation.

To reflect this change, the AWS Management Console will change how Java 8 runtimes are displayed. The display name for runtimes using the ‘java8’ identifier will change from ‘Java 8’ to ‘Java 8 on Amazon Linux 1’. The display name for runtimes using the ‘java8.al2’ identifier will change from ‘Java 8 (Corretto)’ to ‘Java 8 on Amazon Linux 2’. The ‘java8’ and ‘java8.al2’ identifiers themselves, as used by tools such as the AWS CLI, CloudFormation, and AWS SAM, will not change.

Why are you making this change?

This change enables customers to benefit from the latest innovations and extended support of the Amazon Corretto JDK distribution. Amazon Corretto is a no-cost, multiplatform, production-ready distribution of the OpenJDK. Corretto is certified as compatible with the Java SE standard and used internally at Amazon for many production services.

Amazon is committed to Corretto, and provides regular updates that include security fixes and performance enhancements. With this change, these benefits are available to all Lambda customers. For more information on improvements provided by Amazon Corretto 8, see Amazon Corretto 8 change logs.

How does this affect existing Java 8 functions?

Amazon Corretto 8 is designed as a drop-in replacement for OpenJDK 8. Most functions benefit seamlessly from the enhancements in this update without any action from you.

In rare cases, switching to Amazon Corretto 8 introduces compatibility issues. See below for known issues and guidance on how to verify compatibility in advance of this change.

When will this happen?

This migration to Amazon Corretto takes place in several stages:

  • June 15, 2021: Availability of Lambda layers for testing the compatibility of functions with the Amazon Corretto runtime. Start of AWS Management Console changes to java8 and java8.al2 display names.
  • July 19, 2021: Any new functions using the java8 runtime will use Amazon Corretto. If you update an existing function, it will transition to Amazon Corretto automatically. The public.ecr.aws/lambda/java:8 container base image is updated to use Amazon Corretto.
  • August 16, 2021: For functions that have not been updated since June 28, AWS will begin an automatic transition to the new Corretto runtime.
  • September 10, 2021: Migration completed.

These changes are only applied to functions not using the arn:aws:lambda:::awslayer:Java8Corretto or arn:aws:lambda:::awslayer:Java8OpenJDK layers described below.

Which of my Lambda functions are affected?

Lambda supports two versions of the Java 8 managed runtime: the java8 runtime, which runs on Amazon Linux 1, and the java8.al2 runtime, which runs on Amazon Linux 2. This change only affects functions using the java8 runtime. Functions the java8.al2 runtime are already using the Amazon Corretto implementation of Java 8 and are not affected.

The following command shows how to use the AWS CLI to list all functions in a specific Region using the java8 runtime. To find all such functions in your account, repeat this command for each Region:

aws lambda list-functions --function-version ALL --region us-east-1 --output text --query "Functions[?Runtime=='java8'].FunctionArn"

What do I need to do?

If you are using the java8 runtime, your functions will be updated automatically. For production workloads, we recommend that you test functions in advance for compatibility with Amazon Corretto 8.

For Lambda functions using container images, the existing public.ecr.aws/lambda/java:8 container base image will be updated to use the Amazon Corretto Java implementation. You must manually update your functions to use the updated container base image.

How can I test for compatibility with Amazon Corretto 8?

If you are using the java8 managed runtime, you can test functions with the new version of the runtime by adding the layer reference arn:aws:lambda:::awslayer:Java8Corretto to the function configuration. This layer instructs the Lambda service to use the Amazon Corretto implementation of Java 8. It does not contain any data or code.

If you are using container images, update the JVM in your image to Amazon Corretto for testing. Here is an example Dockerfile:

FROM public.ecr.aws/lambda/java:8

# Update the JVM to the latest Corretto version
## Import the Corretto public key
rpm --import https://yum.corretto.aws/corretto.key

## Add the Corretto yum repository to the system list
curl -L -o /etc/yum.repos.d/corretto.repo https://yum.corretto.aws/corretto.repo

## Install the latest version of Corretto 8
yum install -y java-1.8.0-amazon-corretto-devel

# Copy function code and runtime dependencies from Gradle layout
COPY build/classes/java/main ${LAMBDA_TASK_ROOT}
COPY build/dependency/* ${LAMBDA_TASK_ROOT}/lib/

# Set the CMD to your handler
CMD [ "com.example.LambdaHandler::handleRequest" ]

Can I continue to use the OpenJDK version of Java 8?

You can continue to use the OpenJDK version of Java 8 by adding the layer reference arn:aws:lambda:::awslayer:Java8OpenJDK to the function configuration. This layer tells the Lambda service to use the OpenJDK implementation of Java 8. It does not contain any data or code.

This option gives you more time to address any code incompatibilities with Amazon Corretto 8. We do not recommend that you use this option to continue to use Lambda’s OpenJDK Java implementation in the long term. Following this migration, it will no longer receive bug fix and security updates. After addressing any compatibility issues, remove this layer reference so that the function uses the Lambda-Amazon Corretto managed implementation of Java 8.

What are the known differences between OpenJDK 8 and Amazon Corretto 8 in Lambda?

Amazon Corretto caches TCP sessions for longer than OpenJDK 8. Functions that create new connections (for example, new AWS SDK clients) on each invoke without closing them may experience an increase in memory usage. In the worst case, this could cause the function to consume all the available memory, which results in an invoke error and a subsequent cold start.

We recommend that you do not create AWS SDK clients in your function handler on every function invocation. Instead, create SDK clients outside the function handler as static objects that can be used by multiple invocations. For more information, see static initialization in the Lambda Operator Guide.

If you must use a new client on every invocation, make sure it is shut down at the end of every invocation. This avoids TCP session caches using unnecessary resources.

What if I need additional help?

Contact AWS Support, the AWS Lambda discussion forums, or your AWS account team if you have any questions or concerns.

For more serverless learning resources, visit Serverless Land.

Building serverless applications with streaming data: Part 2

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-serverless-applications-with-streaming-data-part-2/

Part 1 introduces the Alleycat application that allows bike racers to compete with each other virtually on home exercise bikes. I explain the application’s functionality, how to deploy to your AWS account, and provide an architectural review.

This series is about building serverless solutions in streaming data workloads. These are traditionally challenging to build, since data can be streamed from thousands or even millions of devices continuously.

In the example scenario, there are 40,000 users and up to 1,000 competitors may race at any given time. The workload must continuously ingest and buffer this data, then process and analyze the information to provide analytics and leaderboard content for the frontend application.

In this post, I focus on data ingestion. I compare the two different methods used in Alleycat, and discuss other approaches available. This post refers to Amazon Kinesis Data Streams, the AWS SDK, and AWS IoT Core in the solutions.

To set up the example, visit the GitHub repo and follow the instructions in the README.md file. Note that this walkthrough uses services that are not covered by the AWS Free Tier and incur cost.

Using AWS IoT Core to ingest streaming data

AWS IoT Core enables publish-subscribe capabilities for large numbers of client applications. Clients can send data to the backend using the AWS IoT Device SDK, which uses the MQTT standard for IoT messaging. After processing, the backend can publish aggregation and status messages back to the frontend via AWS IoT Core. This service fans out the messages to clients using topics.

When using this approach, note the Quality of Service (QoS) options available. By default, the SDK uses QoS level 0, which means the device does not confirm the message is received. This is intended for workloads that can lose messages occasionally without impacting performance. In Alleycat, if performance metrics are sometimes lost, this does not likely impact the overall end user experience.

For workloads requiring higher reliability, use QoS level 1, which causes the SDK to resend the message until an acknowledgement is received. While there is no additional charge for using QoS level 1, it generally increases the number of messages, which increases the overall cost. You are not charged for the PUBACK acknowledgement message – for more details, read more about AWS IoT Core pricing.

Frontend

In this scenario, the Alleycat frontend application is running on a physical exercise bike. The user selects a racer ID and exercise class and chooses Start Race to join the current virtual race for that class.

Start race UI

Every second, the frontend sends a message containing the cadence and resistance metrics and the current second in the race for the local racer. This message is created as a JSON object in the Home.vue component and sent to the ‘alleycat-publish’ topic:

      const message = {
        uuid: uuidv4(),
        event: this.event,
        deviceTimestamp: Date.now(),
        second: this.currentSecond,
        raceId: RACE_ID,
        name: this.racer.name,
        racerId: this.racer.id,
        classId: this.selectedClassId,
        cadence: this.racer.getCurrentCadence(),
        resistance: this.racer.getCurrentResistance
      }

The IoT.vue component contains the logic for this integration and uses the AWS IoT Device SDK to send and receive messages. On startup, the frontend connects to AWS IoT Core and publishes the messages using an MQTT client:

    bus.$on('publish', (data) => {
      console.log('Publish: ', data)
      mqttClient.publish(topics.publish, JSON.stringify(data))
    })

The SDK automatically attempts to retry in the event of a network disconnection and exposes an error handler to allow custom logic if other errors occur.

Backend

The resources used in the backend are defined using the AWS Serverless Application Model (AWS SAM) and configured in the core setup templates:

Reference architecture

Messages are published to topics in AWS IoT Core, which act as channels of interest. The message broker uses topic names and topic filters to route messages between publishers and subscribers. Incoming messages are routed using rules. Alleycat’s IoT rule routes all incoming messages to a Kinesis stream:

  IotTopicRule:
    Type: AWS::IoT::TopicRule
    Properties:
      RuleName: 'alleycatIngest'
      TopicRulePayload:
        RuleDisabled: 'false'
        Sql: "SELECT * FROM 'alleycat-publish'"
        Actions:
        - Kinesis:
            StreamName: 'alleycat'
            PartitionKey: "${timestamp()}"
            RoleArn: !GetAtt IoTKinesisRole.Arn

  IoTKinesisRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - iot.amazonaws.com
            Action:
              - 'sts:AssumeRole'
      Path: /
      Policies:
        - PolicyName: IoTKinesisPutPolicy
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action: 'kinesis:PutRecord'
                Resource: !GetAtt KinesisStream.Arn

Using the AWS::IoT::TopicRule resource, you can optionally define an error action. This allows you to store messages in a durable location, such as an Amazon S3 bucket, if an error occurs. Errors can occur if a rule does not have permission to access a destination or throttling occurs in a target.

Rules can route matching messages to up to 10 targets. For debugging purposes, you can also enable Amazon CloudWatch Logs, which can help in troubleshoot failed message deliveries. The AWS IoT Core Message Broker allows up to 20,000 publish requests per second – if you need a higher limit for your workload, submit a request to AWS Support.

Using the AWS SDK to ingest streaming data

The Alleycat frontend creates traffic for a single user but there is also a simulator application that can generate messages for up to 1,000 riders. Instead of routing messages using an MQTT client, the simulator uses the AWS SDK to put messages directly into the Kinesis data stream.

The SDK provides a service interface object for Kinesis and two API methods for putting messages into streams: putRecord and putRecords. The first option accepts only a single message but the second enables batching of up to 500 messages per request. This is the preferred option for adding multiple messages, compared with calling putRecord multiple times.

The putRecords API takes parameters as a JSON array of messages:

const params = {
   StreamName: 'alley-cat',
   [{
      "Data":"{\"event\":\"update\",\"deviceTimestamp\":1620824038331,\"second\":3,\"raceId\":5402746,\"name\":\"Hayden\",\"racerId\":0,\"classId\":1,\"cadence\":79.8,\"resistance\":79}",
      "PartitionKey":"1620824038331"
   },
   {
      "Data":"{\"event\":\"update\",\"deviceTimestamp\":1620824038331,\"second\":3,\"raceId\":5402746,\"name\":\"Hubert\",\"racerId\":1,\"classId\":1,\"cadence\":60.4,\"resistance\":60.6}",
      "PartitionKey":"1620824038331"
   }
]}

The SDK automatically base64 encodes the Data attribute, which in this case is the JSON string output from JSON.stringify. In the JavaScript SDK, the putRecords API can return a promise, allowing the code to await the operation:

const result = await kinesis.putRecords(params).promise()

Shards and partition keys

Kinesis data streams consist of one or more shards, which are sequences of data records with a fixed capacity. Each shard can support up to 1,000 records per second for writes, up to maximum total data write rate of 1MB per second. The total capacity of a stream is the total of its shards.

When you send messages to a stream, the partitionKey attribute determines which shard it is routed to. The example application configures a Kinesis data stream with a single shard so the partitionKey attribute has no effect – all messages are routed to the same shard. However, many production applications have more than one shard and use the partitionKey to assign messages to shards.

The partitionKey is hashed by the Kinesis service to route to a shard. This diagram shows how partitionKey values from data producers are hashed by an MD5 function and mapped to individual shards:

MD5 hash process

While you cannot designate a specific shard ID in a message, you can influence the assignment depending on your choice of partitionKey:

  • Random: Using a randomized value results in random hash so messages are randomly sent to different shards. This effectively load balances messages across all available shards.
  • Time-based: A timestamp value may cause groups of messages sent to a single shard, if the messages arrive at the same time. The identical timestamp results in an identical hash.
  • Application-specific: if Alleycat used the classID as a partitionKey, racers in each class would always be routed to the same shard. This could be useful for downstream aggregation logic but would limit the capacity of messages per classID.

Optimizing capacity in a shard

Each shard can ingest data at a rate of 1 MB per second or 1,000 records per second, whichever limit is reached first. Since the payload maximum is 1MB, this could equate to one 1MB message per second. If the payload is larger, you must divide it into smaller pieces to avoid an error. For 1,000 messages, each payload must be under 1 KB on average to fit within the allowed capacity.

The combination of the two payload limits can result in different capacity profiles for a shard:

Capacity profiles in a shard

  1. The data payloads are evenly sized and use the 1 MB per second capacity.
  2. Data payload sizes vary, so the number of messages that can be packed into 1 MB varies per second.
  3. There are a large number of very messages, consuming all 1,000 messages per second. However, the total data capacity used is significantly less than 1 MB.

In the Alleycat application, the average payload size is around 170 bytes. When producing 1,000 messages a second, the workload is only using about 20% of the 1 MB per second limit. Since PUT payload size is a factor in Kinesis pricing, messages that are much smaller than 25 KB are less cost-efficient. Compare these two messaging patterns for the Alleycat application:

Producer message patterns

  1. In this default mode, a smaller message is published once per second. This reduces overall latency but results in higher overall messaging cost.
  2. The client application batches outgoing messages and sends to Kinesis every 5 seconds. This results in lower cost and better packing of messages, but introduces additional latency.

There is a tradeoff between cost and latency when optimizing a shard’s capacity and the decision depends upon the needs of your workload. If the client buffers messages, this adds latency on the client side. This is acceptable in many workloads that collect metrics for archival or asynchronous reporting purchases. However, for low-latency applications like Alleycat, it provides a better experience for the application user to send messages as soon as they are available.

Conclusion

This post focuses on ingesting data into Kinesis Data Streams. I explain the two approaches used by the Alleycat frontend and the simulator application and highlight other approaches that you can use. I show how messages are routed to shards using partition keys. Finally, I explore additional factors to consider when ingesting data, to improve efficiency and reduce cost.

Part 3 covers using Amazon Kinesis Data Firehose for transforming, aggregating, and loading streaming data into data stores. This is used to provide the historical, second-by-second leaderboard for the frontend application.

For more serverless learning resources, visit Serverless Land.

Getting Started with serverless for developers: Part 4 – Local developer workflow

Post Syndicated from Benjamin Smith original https://aws.amazon.com/blogs/compute/getting-started-with-serverless-for-developers-part-4-local-developer-workflow/

This blog is part 4 of the “Getting started with serverless for developers” series, helping developers start building serverless applications from their IDE.

Many “getting started” guides demonstrate how to build serverless applications from within the AWS Management Console. However, most developers spend the majority of their time building from within their local integrated development environment (IDE).

The next two blog posts in this series focus on the serverless developer workflow. They describe how to check logs, test, and iterate on business logic while building locally, and how this differs from traditional applications.

This blog post explains how the developer workflow for building serverless applications differs from a traditional developer workflow. It shows the methods that I use to test business logic locally before deploying to a sandbox AWS developer account, and how to test against cloud services as you build. This eliminates the need to deploy to the AWS Cloud each time you want to test a code change.

Traditional developer workflow

Developers typically use the following workflow cycle before committing code to the main branch:

  1. Write code.
  2. Save code.
  3. Run code.
  4. Check results.

This is sometimes referred to as the inner loop, shown as follows:

With traditional (non-serverless) applications, developers commonly create a development environment on their local machine. This local development environment keeps parity with the staging and production environments. It allows developers to test their application locally end-to-end before committing code to the main branch.

Serverless developer workflow

Part 2: the business logic, explains how serverless applications use managed services that abstract away the need for developers to patch and scale their application infrastructure. This means that the code base in a serverless application is focused purely on business logic, allowing managed services to handle other important layers such as:

  • Authorization
  • Presentation
  • Database
  • Application integration
  • Notification

A good serverless developer workflow enables developers to test and iterate on business logic quickly. It allows them to check that the business logic runs correctly with the managed services that compose an application.

To achieve this, the best approach is not to try and emulate managed services on your local development machine. Instead, your local code should interact directly with real cloud services in a sandboxed AWS account.

This approach lets you rapidly test and iterate on business logic locally, deploying to the development environment to test for infrastructure, security, and environment configuration changes. Once the business logic is ready in the development environment, it can be deployed to other environments (test, staging, production) via CI/CD automation or manually triggered commands.

The following sections explain how I test business logic locally using a custom-written test harness. Each time I create a Lambda function I create a directory to hold:

  1. The function code.
  2. A relative package.json.
  3. A file called testharness.js.

The test harness file is used to run the Lambda function code on my local development machine. It is configured to mock environment variables loaded from a file named `env.json` and loads a JSON test event payload located in the events directory.

Testing Lambda function code locally with a test harness

Part 2: the business logic, explains the anatomy of a Lambda function and its handler:

A small Lambda function

Note that the function handler receives an input payload called an event object. To test the local function code effectively, use a test event object that represents the production event object.

The serverless application introduced in getting started with serverless part 1, shows that Amazon API Gateway invokes the Lambda function.

This invocation contains an event object with a JSON representation of the HTTP request. The event object has a defined structure. Follow the steps in this GitHub repository to see how to create a test event object.

Before you start

To deploy this stage of the application, follow the steps from post 1 to clone and deploy the sample application.

  1. Run the following commands from the root directory of the cloned repository:
    cd ./part_4/src_starred
    npm install

The following steps show how I configure my testharness.js file to run my function code:

// Mock event
const event = require('../events/testEvent.json')
// Mock environment variables
const environmentVars = require('../env.json')
process.env.AWS_REGION = environmentVars.AWS_REGION
process.env.localTest = true
process.env.slackEndpoint= environmentVars.slackEndpoint
process.env.bucket = environmentVars.bucket
// Lambda handler
const { handler } = require('./app')
const main = async () => {
  console.time('localTest')
  console.dir(await handler(event))
  console.timeEnd('localTest')
}
main().catch(error => console.error(error))
  1. A test event object is loaded into a variable called event.
  2. The environment variables required by the Lambda function are defined in env.json and loaded.
  3. The Lambda function code is loaded into a variable called handler
  4. The Lambda function code is run synchronously
  5. The console shows output from the Lambda function code, along with any errors that occur.

I run my test harness file by entering the following command in a terminal window:

$ node testharness.js

This produces the following output:

The output indicates that the function code ran without error, it returned a 200 status code and completed in 30.999 ms.

To generate an error in the function code,  I change the app.js file by commenting out the following line:

//const axios = require('axios');

I save the file and run the test harness again. I see the following response in the terminal window:

f

This indicates an error in my function code that I must resolve before deploying to my AWS account.

By iteratively updating and locally testing my code, I maintain a fast inner loop. This eliminates the need to deploy to the AWS Cloud each time I want to test a code change.

Testing against cloud resources

In many instances, a Lambda function interacts with other cloud resources. This could be via an SDK or some other native integration. In this case, you should deploy those resources to a sandboxed AWS developer account.

In the following example, I update a Lambda function to log each inbound HTTP request to a bucket in Amazon S3, a highly scalable object storage service running in the AWS cloud. To do this, I use the JavaScript SDK:

s3.putObject(params, function(err, data)

I update the AWS Serverless Application Model (AWS SAM) template to define a new S3 bucket:

SrcBucket:
Type: AWS::S3::Bucket

I change into the part_4 directory then build and deploy the application with the following commands:

cd ../part_4
sam build
sam deploy --guided --config-file ../samconfig.toml

After deploying, the output shows:

I copy the new bucket value to the environment variable defined in in /part_4/env.json .

"slackEndpoint": "Insert_Slack_Endpoint",
"bucket" : "Insert_S3_Bucket_Name"
}

Now I am ready to test the local Lambda function code against cloud resources in my sandbox AWS development account. I run a new local test with the following command:

node testHarness.js

The terminal returns:

This indicates that the Lambda function code completed without error. I can verify this by checking the contents of the S3 bucket using the AWS Command Line Interface (AWS CLI):

aws s3 ls s3://githubtoslackapp-srcbucket-ge4wkt9dljwa

This returns the following, confirming that the request object has been saved to S3 and the application code is running correctly:

Conclusion

Using managed services to build serverless applications helps developers focus on business logic. It also changes the developer workflow compared with building traditional (non-serverless) applications.

A good serverless developer workflow enables developers to test and iterate on business logic quickly while still being able to interact with cloud services. This blog post shows how I achieve this by using a test harness to run function code locally and deploying application resources to a sandboxed developer account.

In the next blog post, I show how to invoke Lambda functions deployed to sandboxed developer account, without leaving your IDE. This lets you test for infrastructure, security, and environment configuration changes while building.

Forwarding emails automatically based on content with Amazon Simple Email Service

Post Syndicated from Murat Balkan original https://aws.amazon.com/blogs/messaging-and-targeting/forwarding-emails-automatically-based-on-content-with-amazon-simple-email-service/

Introduction

Email is one of the most popular channels consumers use to interact with support organizations. In its most basic form, consumers will send their email to a catch-all email address where it is further dispatched to the correct support group. Often, this requires a person to inspect content manually. Some IT organizations even have a dedicated support group that handles triaging the incoming emails before assigning them to specialized support teams. Triaging each email can be challenging, and delays in email routing and support processes can reduce customer satisfaction. By utilizing Amazon Simple Email Service’s deep integration with Amazon S3, AWS Lambda, and other AWS services, the task of categorizing and routing emails is automated. This automation results in increased operational efficiencies and reduced costs.

This blog post shows you how a serverless application will receive emails with Amazon SES and deliver them to an Amazon S3 bucket. The application uses Amazon Comprehend to identify the dominant language from the message body.  It then looks it up in an Amazon DynamoDB table to find the support group’s email address specializing in the email subject. As the last step, it forwards the email via Amazon SES to its destination. Archiving incoming emails to Amazon S3 also enables further processing or auditing.

Architecture

By completing the steps in this post, you will create a system that uses the architecture illustrated in the following image:

Architecture showing how to forward emails by content using Amazon SES

The flow of events starts when a customer sends an email to the generic support email address like [email protected]. This email is listened to by Amazon SES via a recipient rule. As per the rule, incoming messages are written to a specified Amazon S3 bucket with a given prefix.

This bucket and prefix are configured with S3 Events to trigger a Lambda function on object creation events. The Lambda function reads the email object, parses the contents, and sends them to Amazon Comprehend for language detection.

Amazon DynamoDB looks up the detected language code from an Amazon DynamoDB table, which includes the mappings between language codes and support group email addresses for these languages. One support group could answer English emails, while another support group answers French emails. The Lambda function determines the destination address and re-sends the same email address by performing an email forward operation. Suppose the lookup does not return any destination address, or the language was not be detected. In that case, the email is forwarded to a catch-all email address specified during the application deployment.

In this example, Amazon SES hosts the destination email addresses used for forwarding, but this is not a requirement. External email servers will also receive the forwarded emails.

Prerequisites

To use Amazon SES for receiving email messages, you need to verify a domain that you own. Refer to the documentation to verify your domain with Amazon SES console. If you do not have a domain name, you will register one from Amazon Route 53.

Deploying the Sample Application

Clone this GitHub repository to your local machine and install and configure AWS SAM with a test AWS Identity and Access Management (IAM) user.

You will use AWS SAM to deploy the remaining parts of this serverless architecture.

The AWS SAM template creates the following resources:

  • An Amazon DynamoDB mapping table (language-lookup) contains information about language codes and associates them with destination email addresses.
  • An AWS Lambda function (BlogEmailForwarder) that reads the email content parses it, detects the language, looks up the forwarding destination email address, and sends it.
  • An Amazon S3 bucket, which will store the incoming emails.
  • IAM roles and policies.

To start the AWS SAM deployment, navigate to the root directory of the repository you downloaded and where the template.yaml AWS SAM template resides. AWS SAM also requires you to specify an Amazon Simple Storage Service (Amazon S3) bucket to hold the deployment artifacts. If you haven’t already created a bucket for this purpose, create one now. You will refer to the documentation to learn how to create an Amazon S3 bucket. The bucket should have read and write access by an AWS Identity and Access Management (IAM) user.

At the command line, enter the following command to package the application:

sam package --template template.yaml --output-template-file output_template.yaml --s3-bucket BUCKET_NAME_HERE

In the preceding command, replace BUCKET_NAME_HERE with the name of the Amazon S3 bucket that should hold the deployment artifacts.

AWS SAM packages the application and copies it into this Amazon S3 bucket.

When the AWS SAM package command finishes running, enter the following command to deploy the package:

sam deploy --template-file output_template.yaml --stack-name blogstack --capabilities CAPABILITY_IAM --parameter-overrides FromEmailAddress=info@ YOUR_DOMAIN_NAME_HERE CatchAllEmailAddress=catchall@ YOUR_DOMAIN_NAME_HERE

In the preceding command, change the YOUR_DOMAIN_NAME_HERE with the domain name you validated with Amazon SES. This domain also applies to other commands and configurations that will be introduced later.

This example uses “blogstack” as the stack name, you will change this to any other name you want. When you run this command, AWS SAM shows the progress of the deployment.

Configure the Sample Application

Now that you have deployed the application, you will configure it.

Configuring Receipt Rules

To deliver incoming messages to Amazon S3 bucket, you need to create a Rule Set and a Receipt rule under it.

Note: This blog uses Amazon SES console to create the rule sets. To create the rule sets with AWS CloudFormation, refer to the documentation.

  1. Navigate to the Amazon SES console. From the left navigation choose Rule Sets.
  2. Choose Create a Receipt Rule button at the right pane.
  3. Add info@YOUR_DOMAIN_NAME_HERE as the first recipient addresses by entering it into the text box and choosing Add Recipient.

 

 

Choose the Next Step button to move on to the next step.

  1. On the Actions page, select S3 from the Add action drop-down to reveal S3 action’s details. Select the S3 bucket that was created by the AWS SAM template. It is in the format of your_stack_name-inboxbucket-randomstring. You will find the exact name in the outputs section of the AWS SAM deployment under the key name InboxBucket or by visiting the AWS CloudFormation console. Set the Object key prefix to info/. This tells Amazon SES to add this prefix to all messages destined to this recipient address. This way, you will re-use the same bucket for different recipients.

Choose the Next Step button to move on to the next step.

In the Rule Details page, give this rule a name at the Rule name field. This example uses the name info-recipient-rule. Leave the rest of the fields with their default values.

Choose the Next Step button to move on to the next step.

  1. Review your settings on the Review page and finalize rule creation by choosing Create Rule

  1. In this example, you will be hosting the destination email addresses in Amazon SES rather than forwarding the messages to an external email server. This way, you will be able to see the forwarded messages in your Amazon S3 bucket under different prefixes. To host the destination email addresses, you need to create different rules under the default rule set. Create three additional rules for catchall@YOUR_DOMAIN_NAME_HERE , english@ YOUR_DOMAIN_NAME_HERE and french@YOUR_DOMAIN_NAME_HERE email addresses by repeating the steps 2 to 5. For Amazon S3 prefixes, use catchall/, english/, and french/ respectively.

 

Configuring Amazon DynamoDB Table

To configure the Amazon DynamoDB table that is used by the sample application

  1. Navigate to Amazon DynamoDB console and reach the tables view. Inspect the table created by the AWS SAM application.

language-lookup table is the table where languages and their support group mappings are kept. You need to create an item for each language, and an item that will hold the default destination email address that will be used in case no language match is found. Amazon Comprehend supports more than 60 different languages. You will visit the documentation for the supported languages and add their language codes to this lookup table to enhance this application.

  1. To start inserting items, choose the language-lookup table to open table overview page.
  2. Select the Items tab and choose the Create item From the dropdown, select Text. Add the following JSON content and choose Save to create your first mapping object. While adding the following object, replace Destination attribute’s value with an email address you own. The email messages will be forwarded to that address.

{

  “language”: “en”,

  “destination”: “english@YOUR_DOMAIN_NAME_HERE”

}

Lastly, create an item for French language support.

{

  “language”: “fr”,

  “destination”: “french@YOUR_DOMAIN_NAME_HERE”

}

Testing

Now that the application is deployed and configured, you will test it.

  1. Use your favorite email client to send the following email to the domain name info@ email address.

Subject: I need help

Body:

Hello, I’d like to return the shoes I bought from your online store. How can I do this?

After the email is sent, navigate to the Amazon S3 console to inspect the contents of the Amazon S3 bucket that is backing the Amazon SES Rule Sets. You will also see the AWS Lambda logs from the Amazon CloudWatch console to confirm that the Lambda function is triggered and run successfully. You should receive an email with the same content at the address you defined for the English language.

  1. Next, send another email with the same content, this time in French language.

Subject: j’ai besoin d’aide

Body:

Bonjour, je souhaite retourner les chaussures que j’ai achetées dans votre boutique en ligne. Comment puis-je faire ceci?

 

Suppose a message is not matched to a language in the lookup table. In that case, the Lambda function will forward it to the catchall email address that you provided during the AWS SAM deployment.

You will inspect the new email objects under english/, french/ and catchall/ prefixes to observe the forwarding behavior.

Continue experimenting with the sample application by sending different email contents to info@ YOUR_DOMAIN_NAME_HERE address or adding other language codes and email address combinations into the mapping table. You will find the available languages and their codes in the documentation. When adding a new language support, don’t forget to associate a new email address and Amazon S3 bucket prefix by defining a new rule.

Cleanup

To clean up the resources you used in your account,

  1. Navigate to the Amazon S3 console and delete the inbox bucket’s contents. You will find the name of this bucket in the outputs section of the AWS SAM deployment under the key name InboxBucket or by visiting the AWS CloudFormation console.
  2. Navigate to AWS CloudFormation console and delete the stack named “blogstack”.
  3. After the stack is deleted, remove the domain from Amazon SES. To do this, navigate to the Amazon SES Console and choose Domains from the left navigation. Select the domain you want to remove and choose Remove button to remove it from Amazon SES.
  4. From the Amazon SES Console, navigate to the Rule Sets from the left navigation. On the Active Rule Set section, choose View Active Rule Set button and delete all the rules you have created, by selecting the rule and choosing Action, Delete.
  5. On the Rule Sets page choose Disable Active Rule Set button to disable listening for incoming email messages.
  6. On the Rule Sets page, Inactive Rule Sets section, delete the only rule set, by selecting the rule set and choosing Action, Delete.
  7. Navigate to CloudWatch console and from the left navigation choose Logs, Log groups. Find the log group that belongs to the BlogEmailForwarderFunction resource and delete it by selecting it and choosing Actions, Delete log group(s).
  8. You will also delete the Amazon S3 bucket you used for packaging and deploying the AWS SAM application.

 

Conclusion

This solution shows how to use Amazon SES to classify email messages by the dominant content language and forward them to respective support groups. You will use the same techniques to implement similar scenarios. You will forward emails based on custom key entities, like product codes, or you will remove PII information from emails before forwarding with Amazon Comprehend.

With its native integrations with AWS services, Amazon SES allows you to enhance your email applications with different AWS Cloud capabilities easily.

To learn more about email forwarding with Amazon SES, you will visit documentation and AWS blogs.

Using API destinations with Amazon EventBridge

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/using-api-destinations-with-amazon-eventbridge/

Amazon EventBridge enables developers to route events between AWS services, integrated software as a service (SaaS) applications, and your own applications. It can help decouple applications and produce more extensible, maintainable architectures. With the new API destinations feature, EventBridge can now integrate with services outside of AWS using REST API calls.

API destinations architecture

This feature enables developers to route events to existing SaaS providers that integrate with EventBridge, like Zendesk, PagerDuty, TriggerMesh, or MongoDB. Additionally, you can use other SaaS endpoints for applications like Slack or Contentful, or any other type of API or webhook. It can also provide an easier way to ingest data from serverless workloads into Splunk without needing to modify application code or install agents.

This blog post explains how to use API destinations and walks through integration examples you can use in your workloads.

How it works

API destinations are third-party targets outside of AWS that you can invoke with an HTTP request. EventBridge invokes the HTTP endpoint and delivers the event as a payload within the request. You can use any preferred HTTP method, such as GET or POST. You can use input transformers to change the payload format to match your target.

An API destination uses a Connection to manage the authentication credentials for a target. This defines the authorization type used, which can be an API key, OAuth client credentials grant, or a basic user name and password.

Connection details

The service manages the secret in AWS Secrets Manager and the cost of storing the secret is included in the pricing for API destinations.

You create a connection to each different external API endpoint and share the connection with multiple endpoints. The API destinations console shows all configured connections, together with their authorization status. Any connections that cannot be established are shown here:

Connections list

To create an API destination, you provide a name, the HTTP endpoint and method, and the connection:

Create API destination

When you configure the destination, you must also set an invocation rate limit between 1 and 300 events per second. This helps protect the downstream endpoint from surges in traffic. If the number of arriving events exceeds the limit, the EventBridge service queues up events. It delivers to the endpoint as quickly as possible within the rate limit.

It continues to do this for 24 hours. To make sure you retain any events that cannot be delivered, set up a dead-letter queue on the event bus. This ensures that if the event is not delivered within this timeframe, it is stored durably in an Amazon SQS queue for further processing. This can also be useful if the downstream API experiences an outage for extended periods of time.

Throttling and retries

Once you have configured the API destination, it becomes available in the list of targets for rules. Matching events are sent to the HTTP endpoint with the event serialized as part of the payload.

Select targets

As with API Gateway targets in EventBridge, the maximum timeout for API destination is 5 seconds. If an API call exceeds this timeout, it is retried.

Debugging the payload from API destinations

You can send an event via API Destinations to debugging tools like Webhook.site to view the headers and payload of the API call:

  1. Create a connection with the credential, such as an API key.Connection details
  2. Create an API destination with the webhook URL endpoint and then create a rule to match and route events.Target configuration
  3. The Webhook.site testing service shows the headers and payload once the webhook is triggered. This can help you test rules if you are adding headers or manipulating the payload using an input transformer.Webhook.site testing service

Customizing the payload

Third-party APIs often require custom headers or payload formats when accepting data. EventBridge rules allow you to customize header parameters, query strings, and payload formats without the need for custom code. Header parameters and query strings can be configured with static values or attributes from the event:

Header parameters

To customize the payload, configure an input transformer, which consists of an Input Path and Input template. You use an Input Path to define variables and use JSONPath query syntax to identify the variable source in the event. For example, to extract two attributes from an Amazon S3 PutObject event, the Input Path is:

{
  "key" : "$.[0].s3.object.key", 
  "bucket" : " $.[0].s3.bucket.name "
}

Next, the Input template defines the structure of the data passed to the target, which references the variables. With this release you can now use variables inside quotes in the input transformer. As a result, you can pass these values as a string or JSON, for example:

{
  "filename" : "<key>", 
  "container" : "mycontainer-<bucket>"
}

Sending AWS events to DataDog

Using API destinations, you can send any AWS-sourced event to third-party services like DataDog. This approach uses the DataDog API to put data into the service. To do this, you must sign up for an account and create an API key. In this example, I send S3 events via CloudTrail to DataDog for further analysis.

  1. Navigate to the EventBridge console, select API destinations from the menu.
  2. Select the Connections tab and choose Create connection:Connections UI
  3. Enter a connection name, then select API Key for Authorization Type. Enter the API key name DD-API-KEY and paste your secret API key as the value. Choose Create.Create new connection UI
  4. In the API destinations tab, choose Create API destination.Create API destination UI
  5. Enter a name, set the API destination endpoint to https://http-intake.logs.datadoghq.com/v1/input, and the HTTP method to POST. Enter 300 for Invocation rate limit and select the DataDog connection from the dropdown. Choose Create.API destination detail UI
  6. From the EventBridge console, select Rules and choose Create rule. Enter a name, select the default bus, and enter this event pattern:
    {
      "source": ["aws.s3"]
    }
  7. In Select targets, choose API destination and select DataDog for the API destination. Expand, Configure Input.
  8. In the Input transformer section, enter {"detail":"$.detail"} in the Input Path field and enter {"message": <detail>} in the Input Template.Select targets UI
  9. You can optionally add a dead letter queue. To do this, open the Retry policy and dead-letter queue section. Under Dead-letter queue, select an existing SQS queue.
    Retry policy and DLQ
  10. Choose Create.
  11. Open AWS CloudShell and upload an object to an S3 bucket in your account to trigger an event:
    echo "test" > testfile.txt
    aws s3 cp testfile.txt s3://YOUR_BUCKET_NAME

    CloudShell output

  12. The logs appear in the DataDog Logs console, where you can process the raw data for further analysis:Datadog Logs console

Sending AWS events to Zendesk

Zendesk is a SaaS provider that provides customer support solutions. It can already send events to EventBridge using a partner integration. This post shows how you can consume ticket events from Zendesk and run a sentiment analysis using Amazon Comprehend.

With API destinations, you can now use events to call the Zendesk API to create and modify tickets and interact with chats and customer profiles.

To create an API destination for Zendesk:

  1. Log in with an existing Zendesk account or register for a trial account.
  2. Navigate to the EventBridge console, select API destinations from the menu and choose Create API destination.
  3. On the Create API destination, page:
    1. Enter a name for the destination (e.g. “SendToZendesk”).
    2. For API destination endpoint, enter https://<<your-subdomain>>.zendesk.com/api/v2/tickets.json.
    3. For HTTP method, select POST.
    4. For Invocation rate, enter 10.
  4. In the connection section:
    1. Select the Create a new connection radio button.
    2. For Connection name, enter ZendeskConnection.
    3. For Authorization type, select Basic (Username/Password).
    4. Enter your Zendesk username and password.
  5. Choose Create.
    Connection configuration with basic auth

When you create a rule to route to this API destination, use the Input transformer to build the defined JSON payload, as shown in the previous DataDog example. When an event matches the rule, EventBridge calls the Zendesk Create Ticket API. The new ticket appears in the Zendesk dashboard:

Zendesk dashboard

For more information on the Zendesk API, visit the Zendesk Developer Portal.

Building an integration with AWS CloudFormation and AWS SAM

To support this new feature, there are two new AWS CloudFormation resources available. These can also be used in AWS Serverless Application Model (AWS SAM) templates:

The connection resource defines the connection credential and optional invocation HTTP parameters:

Resources:
  TestConnection:
    Type: AWS::Events::Connection
    Properties:
      AuthorizationType: API_KEY
      Description: 'My connection with an API key'
      AuthParameters:
        ApiKeyAuthParameters:
          ApiKeyName: VHS
          ApiKeyValue: Testing
        InvocationHttpParameters:
          BodyParameters:
          - Key: 'my-integration-key'
            Value: 'ABCDEFGH0123456'

Outputs:
  TestConnectionName:
    Value: !Ref TestConnection
  TestConnectionArn:
    Value: !GetAtt TestConnection.Arn

The API destination resource provides the connection, endpoint, HTTP method, and invocation limit:

Resources:
  TestApiDestination:
    Type: AWS::Events::ApiDestination
    Properties:
      Name: 'datadog-target'
      ConnectionArn: arn:aws:events:us-east-1:123456789012:connection/datadogConnection/2
      InvocationEndpoint: 'https://http-intake.logs.datadoghq.com/v1/input'
      HttpMethod: POST
      InvocationRateLimitPerSecond: 300

Outputs:
  TestApiDestinationName:
    Value: !Ref TestApiDestination
  TestApiDestinationArn:
    Value: !GetAtt TestApiDestination.Arn
  TestApiDestinationSecretArn:
    Value: !GetAtt TestApiDestination.SecretArn

You can use the existing AWS::Events::Rule resource to configure an input transformer for API destination targets:

ApiDestinationDeliveryRule:
  Type: AWS::Events::Rule
  Properties:
    EventPattern:
      source:
        - "EventsForMyAPIdestination"
    State: "ENABLED"
    Targets:
      -
        Arn: !Ref TestApiDestinationArn
        InputTransformer:
          InputPathsMap:
            detail: $.detail
          InputTemplate: >
            {
                    "message": <detail>
            }

Conclusion

The API destinations feature of EventBridge enables developers to integrate workloads with third-party applications using REST API calls. This provides an easier way to build decoupled, extensible applications that work with applications outside of the AWS Cloud.

To use this feature, you configure a connection and an API destination. You can use API destinations in the same way as existing targets for rules, and also customize headers, query strings, and payloads in the API call.

Learn more about using API destinations with the following SaaS providers: DatadogFreshworks, MongoDB, TriggerMesh, and Zendesk.

For more serverless learning resources, visit Serverless Land.

Analyzing Freshdesk data using Amazon EventBridge and Amazon Athena

Post Syndicated from Benjamin Smith original https://aws.amazon.com/blogs/compute/analyzing-freshdesk-data-using-amazon-eventbridge-and-amazon-athena/

This post is written by Shashi Shankar, Application Architect, Shared Delivery Teams

Freshdesk is an omnichannel customer service platform by Freshworks. It provides automation services to help speed up customer support processes.

The Freshworks connector to Amazon EventBridge allows real time streaming of Freshdesk events with minimal configuration and setup. This integration provides real-time insights into customer support operations without the operational overhead of provisioning and maintaining any servers.

In this blog post, I walk through a serverless approach to ingest and analyze Freshdesk data. This solution uses EventBridge, Amazon Kinesis Data Firehose, Amazon S3, and Amazon Athena. I also look at examples of customer service questions that can be answered using this approach.

The following diagram shows a high-level architecture of the proposed solution:

  1. When a Freshdesk ticket is updated or created, the Freshworks connector pushes event data to the Amazon EventBridge partner event bus.
  2. A rule on the partner event bus pushes the event data to Kinesis Data Firehose.
  3. Kinesis Data Firehose batches data before sending to S3. An AWS Lambda function transforms the data by adding a new line to each record before sending.
  4. Kinesis Data Firehose delivers the batch of records to S3.
  5. Athena is used to query relevant data from S3 using standard SQL.

The walkthrough shows you how to:

  1. Add the EventBridge app to Freshdesk account.
  2. Configure a Freshworks partner event bus in EventBridge.
  3. Deploy a Kinesis Data Firehose stream, a Lambda function, and an S3 bucket.
  4. Set up a custom rule on the event bus to push data to Kinesis Data Firehose.
  5. Generate sample Freshdesk data to validate the ingestion process.
  6. Set up a table in Athena to query the S3 bucket.
  7. Query and analyze data

Pre-requisites

  • A Freshdesk account (which can be created here).
  • An AWS account.
  • AWS Serverless Application Model (AWS SAM CLI), installed and configured.

Adding the Amazon EventBridge app to a Freshdesk account

  1. Log in to your Freshdesk account and navigate to Admin Helpdesk Productivity Apps. Search for EventBridge:
  2. Choose the Amazon EventBridge icon and choose Install.
  • Enter your AWS account number in the AWS Account ID field.
  • Enter “OnTicketCreate”, “OnTicketUpdate” in the Events field.
  • Enter the AWS Region to send the Freshdesk events in the Region field. This walkthrough uses the us-east-1 Region.

Configuring a Freshworks partner event bus in EventBridge

Once previous step is completed, a partner event source is automatically created in the EventBridge console. Copy the partner event source name to a clipboard.

  1. Clone the GitHub repo and deploy the AWS SAM template:
    git clone https://github.com/aws-samples/amazon-eventbridge-freshdesk-example.git
    cd ./amazon-eventbridge-freshdesk-example
    sam deploy --guided
  2. PartnerEventSource – Enter partner event source name copied from the previous step.
  3. S3BucketName – Enter an S3 bucket name to store Freshdesk ticket event data.

The AWS SAM template creates an association between the partner event source and event bus:

    Type: AWS::Events::EventBus
    Properties:
      EventSourceName: !Ref PartnerEventSource
      Name: !Ref PartnerEventSource

The template creates a Kinesis Data Firehose delivery stream, Lambda function, and S3 bucket to process and store the events from Freshdesk tickets. It also adds a rule to the custom event bus with the Kinesis Data Firehose stream as the target:

  PushToFirehoseRule:
    Type: "AWS::Events::Rule"
    Properties:
      Description: Test Freshdesk Events Rule
      EventBusName: !Ref PartnerEventSource
      EventPattern:
        account: [!Ref AWS::AccountId]
      Name: freshdeskeventrule
      State: ENABLED
      Targets:
        - Arn:
            Fn::GetAtt:
              - "FirehoseDeliveryStream"
              - "Arn"
          Id: "idfreshdeskeventrule"
          RoleArn: !GetAtt EventRuleTargetIamRole.Arn

  EventRuleTargetIamRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Sid: ""
            Effect: "Allow"
            Principal:
              Service:
                - "events.amazonaws.com"
            Action:
              - "sts:AssumeRole"
      Path: "/"
      Policies:
        - PolicyName: Invoke_Firehose
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: "Allow"
                Action:
                  - "firehose:PutRecord"
                  - "firehose:PutRecordBatch"
                Resource:
                  - !GetAtt FirehoseDeliveryStream.Arn

Generating sample Freshdesk data to validate the ingestion process:

To generate sample Freshdesk data, login to the Freshdesk account and browse to the “Tickets” screen as shown:

Follow the steps to simulate two customer service operations:

  1. To create a ticket of type “Refund”. Choose the New button and enter the details:
  2. Update an existing ticket and change the priority to “Urgent”.
  3. Within a few minutes of updating the ticket, the data is pushed via the Freshworks connector to the S3 bucket created using the AWS SAM template. To verify this, browse to the S3 bucket and see that a new object with the ticket data is created:

You can also use the S3 Select option under object actions to view the raw JSON data that is sent from the partner system. You are now ready to analyze the data using Athena.

Setting up a table in Athena to query the S3 bucket

If you are familiar with Apache Hive, you may find creating tables on Athena helpful. You can create tables by writing the DDL statement in the query editor or by using the wizard or JDBC driver. To create a table in Athena:

  1. Copy and paste the following DDL statement in the Athena query editor to create a Freshdesk’s events table. For this example, the table is created in the default database.
  2. Replace S3_Bucket_Name in the following query with the name of the S3 bucket created by deploying the previous AWS SAM template:
CREATE EXTERNAL TABLE ` freshdeskevents`(
  `id` string COMMENT 'from deserializer', 
  `detail-type` string COMMENT 'from deserializer', 
  `source` string COMMENT 'from deserializer', 
  `account` string COMMENT 'from deserializer', 
  `time` string COMMENT 'from deserializer', 
  `region` string COMMENT 'from deserializer', 
  `detail` struct<ticket:struct<subject:string,description:string,is_description_truncated:boolean,description_text:string,is_description_text_truncated:boolean,due_by:string,fr_due_by:string,fr_escalated:boolean,is_escalated:boolean,fwd_emails:array<string>,reply_cc_emails:array<string>,email_config_id:string,id:int,group_id:bigint,product_id:string,company_id:string,requester_id:bigint,responder_id:bigint,status:int,priority:int,type:string,tags:array<string>,spam:boolean,source:int,tweet_id:string,cc_emails:array<string>,to_emails:string,created_at:string,updated_at:string,attachments:array<string>,custom_fields:string,changes:struct<responder_id:array<bigint>,ticket_type:array<string>,status:array<int>,status_details:array<struct<id:int,name:string>>,group_id:array<bigint>>>,requester:struct<id:bigint,name:string,email:string,mobile:string,phone:string,language:string,created_at:string>> COMMENT 'from deserializer')
ROW FORMAT SERDE 
  'org.openx.data.jsonserde.JsonSerDe' 
WITH SERDEPROPERTIES ( 
  'paths'='account,detail,detail-type,id,region,resources,source,time,version') 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION  's3://S3_Bucket_Name/'

The table is created on the data stored in S3 and is ready to be queried. Note that table freshdeskevents points at the bucket s3://S3_Bucket_Name/. As more data is added to the bucket, the table automatically grows, providing a near-real-time data analysis experience.

Querying and analyzing data

You can use the following examples to get started with querying the Athena table.

  1. To get all the events data, run:
SELECT * FROM default.freshdeskevents  limit 10

The preceding output has a detail column containing the details related to the ticket. Tickets can be filtered on nested notations to build more insightful queries. Also, the detail-type column provides classification of tickets as new (onTicketCreate) vs updated (onTicketUpdate).

  1. To show new tickets created today with the type “Refund”:
SELECT detail.ticket.subject,detail.ticket.description_text, detail.ticket.type  FROM default.freshdeskevents
where detail.ticket.type = 'Refund' and "detail-type" = 'onTicketCreate' and date(from_iso8601_timestamp(time)) = date(current_date)
  1. All tickets with an “Urgent” priority but not assigned to an agent:
SELECT "detail-type", detail.ticket.responder_id,detail.ticket.priority, detail.ticket.subject, detail.ticket.type  FROM default.freshdeskevents
where detail.ticket.responder_id is null and detail.ticket.priority = 4

Conclusion

In this blog post, you learn how to configure Freshworks partner event source from the Freshdesk console. Once a partner event source is configured, an AWS SAM template is deployed that creates a custom event bus by attaching the partner event source. A Kinesis Data Firehose, Lambda function, and S3 bucket is used to ingest Freshdesk’s ticket events data for analysis. An EventBridge rule is configured to route the event data to the S3 bucket.

Once event data starts flowing into the S3 bucket, an Amazon Athena table is created to run queries and analyze the ticket events data. Alternative customer service data analysis use cases can be built on the architecture shown in this blog.

To learn more about other partner integrations and the native capabilities of EventBridge, visit the AWS Compute Blog.

Using AWS X-Ray tracing with Amazon EventBridge

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/using-aws-x-ray-tracing-with-amazon-eventbridge/

AWS X-Ray allows developers to debug and analyze distributed applications. It can be useful for tracing transactions through microservices architectures, such as those typically used in serverless applications. Amazon EventBridge allows you to route events between AWS services, integrated software as a service (SaaS) applications, and your own applications. EventBridge can help decouple applications and produce more extensible, maintainable architectures.

EventBridge now supports trace context propagation for X-Ray, which makes it easier to trace transactions through event-based architectures. This means you can potentially trace a single request from an event producer through to final processing by an event consumer. These may be decoupled application stacks where the consumer has no knowledge of how the event is produced.

This blog post explores how to use X-Ray with EventBridge and shows how to implement tracing using the example application in this GitHub repo.

How it works

X-Ray works by adding a trace header to requests, which acts as a unique identifier. In the case of a serverless application using multiple AWS services, this allows X-Ray to group service interactions together as a single trace. X-Ray can then produce a service map of the transaction flow or provide the raw data for a trace:

X-Ray service map

When you send events to EventBridge, the service uses rules to determine how the events are routed from the event bus to targets. Any event that is put on an event bus with the PutEvents API can now support trace context propagation.

The trace header is provided as internal metadata to support X-Ray tracing. The header itself is not available in the event when it’s delivered to a target. For developers using the EventBridge archive feature, this means that a trace ID is not available for replay. Similarly, it’s not available on events sent to a dead-letter queue (DLQ).

Enabling tracing with EventBridge

To enable tracing, you don’t need to change the event structure to add the trace header. Instead, you wrap the AWS SDK client in a call to AWSXRay.captureAWSClient and grant IAM permissions to allow tracing. This enables X-Ray to instrument the call automatically with the X-Amzn-Trace-Id header.

For code using the AWS SDK for JavaScript, this requires changes to the way that the EventBridge client is instantiated. Without tracing, you declare the AWS SDK and EventBridge client with:

const AWS = require('aws-sdk')
const eventBridge = new AWS.EventBridge()

To use tracing, this becomes:

const AWSXRay = require('aws-xray-sdk')
const AWS = AWSXRay.captureAWS(require('aws-sdk'))
const eventBridge = new AWS.EventBridge()

The interaction with the EventBridge client remains the same but the calls are now instrumented by X-Ray. Events are put on the event bus programmatically using a PutEvents API call. In a Node.js Lambda function, the following code processes an event to send to an event bus, with tracing enabled:

const AWSXRay = require('aws-xray-sdk')
const AWS = AWSXRay.captureAWS(require('aws-sdk'))
const eventBridge = new AWS.EventBridge()

exports.handler = async (event) => {

  let myDetail = { "name": "Alice" }

  const myEvent = { 
    Entries: [{
      Detail: JSON.stringify({ myDetail }),
      DetailType: 'myDetailType',
      Source: 'myApplication',
      Time: new Date
    }]
  }

  // Send to EventBridge
  const result = await eventBridge.putEvents(myEvent).promise()

  // Log the result
  console.log('Result: ', JSON.stringify(result, null, 2))
}

You can also define a custom tracing header using the new TraceHeader attribute on the PutEventsRequestEntry API model. The unique value you provide overrides any trace header on the HTTP header. The value is also validated by X-Ray and discarded if it does not pass validation. See the X-Ray Developer Guide to learn about generating valid trace headers.

Deploying the example application

The example application consists of a webhook microservice that publishes events and target microservices that consume events. The generated event contains a target attribute to determine which target receives the event:

Example application architecture

To deploy these microservices, you must have the AWS SAM CLI and Node.js 12.x installed. to To complete the deployment, follow the instructions in the GitHub repo.

EventBridge can route events to a broad range of target services in AWS. Targets that support active tracing for X-Ray can create comprehensive traces from the event source. The services offering active tracing are AWS Lambda, AWS Step Functions, and Amazon API Gateway. In each case, you can trace a request from the producer to the consumer of the event.

The GitHub repo contains examples showing how to use active tracing with EventBridge targets. The webhook application uses a query string parameter called target to determine which events are routed to these targets.

For X-Ray to detect each service in the webhook, tracing must be enabled on both the API Gateway stage and the Lambda function. In the AWS SAM template, the Tracing: Active property turns on active tracing for the Lambda function. If an IAM role is not specified, the AWS SAM CLI automatically adds the arn:aws:iam::aws:policy/AWSXrayWriteOnlyAccess policy to the Lambda function’s execution role. For the API definition, adding TracingEnabled: True enables tracing for this API stage.

When you invoke the webhook’s API endpoint, X-Ray generates a trace map of the request, showing each of the services from the REST API call to putting the event on the bus:

X-Ray trace map with EventBridge

The CloudWatch Logs from the webhook’s Lambda function shows the event that has been put on the event bus:

CloudWatch Logs from a webhook

Tracing with a Lambda target

In the targets-lambda example application, the Lambda function uses the X-Ray SDK and has active tracing enabled in the AWS SAM template:

Resources:
  ConsumerFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/
      Handler: app.handler
      MemorySize: 128
      Timeout: 3
      Runtime: nodejs12.x
      Tracing: Active

With these two changes, the target Lambda function propagates the tracing header from the original webhook request. When the webhook API is invoked, the X-Ray trace map shows the entire request through to the Lambda target. X-Ray shows two nodes for Lambda – one is the Lambda service and the other is the Lambda function invocation:

Downstream service node in service map

Tracing with an API Gateway target

Currently, active tracing is only supported by REST APIs but not HTTP APIs. You can enable X-Ray tracing from the AWS CLI or from the Stages menu in the API Gateway console, in the Logs/Tracing tab:

Enable X-Ray tracing in API Gateway

You cannot currently create an API Gateway target for EventBridge using AWS SAM. To invoke an API endpoint from the EventBridge console, create a rule and select the API as a target. The console automatically creates the necessary IAM permissions for EventBridge to invoke the endpoint.

Setting API Gateway as an EventBridge target

If the API invokes downstream services with active tracing available, these services also appear as nodes in the X-Ray service graph. Using the webhook application to invoke the API Gateway target, the trace shows the entire request from the initial API call through to the second API target:

API Gateway node in X-Ray service map

Tracing with a Step Functions target

To enable tracing for a Step Functions target, the state machine must have tracing enabled and have permissions to write to X-Ray. The AWS SAM template can enable tracing, define the EventBridge rule and the AWSXRayDaemonWriteAccess policy in one resource:

  WorkFlowStepFunctions:
    Type: AWS::Serverless::StateMachine
    Properties:
      DefinitionUri: definition.asl.json
      DefinitionSubstitutions:
        LoggerFunctionArn: !GetAtt LoggerFunction.Arn
      Tracing:
        Enabled: True
      Events:
        UploadComplete:
          Type: EventBridgeRule
          Properties:
            Pattern:
              account: 
                - !Sub '${AWS::AccountId}'
              source:
                - !Ref EventSource
              detail:
                apiEvent:
                  target:
                    - 'sfn'

      Policies: 
        - AWSXRayDaemonWriteAccess
        - LambdaInvokePolicy:
            FunctionName: !Ref LoggerFunction

If the state machine uses services that support active tracing, these also appear in the trace map for individual requests. Using the webhook to invoke this target, X-Ray now shows the request trace to the state machine and the Lambda function it contains:

Step Functions in X-Ray service map

Adding X-Ray tracing to existing Lambda targets

To wrap the SDK client, you must enable active tracing and include the AWS X-Ray SDK in the Lambda function’s deployment package. Unlike the AWS SDK, the X-Ray SDK is not included in the Lambda execution environment.

Another option is to include the X-Ray SDK as a Lambda layer. You can build this layer by following the instructions in the GitHub repo. Once deployed, you can attach the X-Ray layer to any Lambda function either via the console or the CLI:

Adding X-Ray tracing a Lambda function

To learn more about using Lambda layers, read “Using Lambda layers to simplify your development process”.

Conclusion

X-Ray is a powerful tool for providing observability in serverless applications. With the launch of X-Ray trace context propagation in EventBridge, this allows you to trace requests across distributed applications more easily.

In this blog post, I walk through an example webhook application with three targets that support active tracing. In each case, I show how to enable tracing either via the console or using AWS SAM and show the resulting X-Ray trace map.

To learn more about how to use tracing with events, read the X-Ray Developer Guide or see the Amazon EventBridge documentation for this feature.

For more serverless learning resources, visit Serverless Land.

Caching data and configuration settings with AWS Lambda extensions

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/caching-data-and-configuration-settings-with-aws-lambda-extensions/

This post is written by Hari Ohm Prasath Rajagopal, Senior Modernization Architect and Vamsi Vikash Ankam, Technical Account Manager

In this post, I show how to build a flexible in-memory AWS Lambda caching layer using Lambda extensions. Lambda functions use REST API calls to access the data and configuration from the cache. This can reduce latency and cost when consuming data from AWS services such as Amazon DynamoDB, AWS Systems Manager Parameter Store, and AWS Secrets Manager.

Applications making frequent API calls to retrieve static data can benefit from a caching layer. This can reduce the function’s latency, particularly for synchronous requests, as the data is retrieved from the cache instead of an external service. The cache can also reduce costs by reducing the number of calls to downstream services.

There are two types of cache to consider in this situation:

Lambda extensions are a new way for tools to integrate more easily into the Lambda execution environment and control and participate in Lambda’s lifecycle. They use the Extensions API, a new HTTP interface, to register for lifecycle events during function initialization, invocation, and shutdown.

They can also use environment variables to add options and tools to the runtime, or use wrapper scripts to customize the runtime startup behavior. The Lambda cache uses Lambda extensions to run as a separate process.

To learn more about how to use extensions with your functions, read “Introducing AWS Lambda extensions”.

Creating a cache using Lambda extensions

To set up the example, visit the GitHub repo, and follow the instructions in the README.md file.

The demo uses AWS Serverless Application Model (AWS SAM) to deploy the infrastructure. The walkthrough requires AWS SAM CLI (minimum version 0.48) and an AWS account.

To install the example:

  1. Create an AWS account if you do not already have one and login.
  2. Clone the repo to your local development machine:
  3. git clone https://github.com/aws-samples/aws-lambda-extensions
    cd aws-lambda-extensions/cache-extension-demo/
  4. If you are not running in a Linux environment, ensure that your build architecture matches the Lambda execution environment by compiling with GOOS=linux and GOARCH=amd64.
  5. GOOS=linux GOARCH=amd64
  6. Build the Go binary extension with the following command:
  7. go build -o bin/extensions/cache-extension-demo main.go
  8. Ensure that the extensions files are executable:
  9. chmod +x bin/extensions/cache-extension-demo
  10. Update the parameters region value in ../example-function/config.yaml with the Region where you are deploying the function.
  11. parameters:
      - region: us-west-2
  12. Build the function dependencies.
  13. cd SAM
    sam build
    AWS SAM build

    AWS SAM build

  14. Deploy the AWS resources specified in the template.yml file:
  15. sam deploy --guided
  16. During the prompts:
  17. Enter a stack name cache-extension-demo.
  18. Enter the same AWS Region specified previously.
  19. Accept the default DatabaseName. You can specify a custom database name, and also update the ../example-function/config.yaml and index.js files with the new database name.
  20. Enter MySecret as the Secrets Manager secret.
  21. Accept the defaults for the remaining questions.
  22. AWS SAM Deploy

    AWS SAM Deploy

    AWS SAM deploys:

    • A DynamoDB table.
    • The Lambda function ExtensionsCache-DatabaseEntry, which puts a sample item into the DynamoDB table.
    • An AWS Systems Manager Parameter Store parameter called CacheExtensions_Parameter1 with a value of MyParameter.
    • An AWS Secrets Manager secret called secret_info with a value of MySecret.
    • A Lambda layer called Cache_Extension_Layer.
    • A Lambda function using Nodejs.12 called ExtensionsCache-SampleFunction. This reads the cached values via the extension from either the DynamoDB table, Parameter Store, or Secrets Manager.
    • IAM permissions

    The cache extension is delivered as a Lambda layer and added to ExtensionsCache-SampleFunction.

    It is written as a self-contained binary in Golang, which makes the extension compatible with all of the supported runtimes. The extension caches the data from DynamoDB, Parameter Store, and Secrets Manager, and then runs a local HTTP endpoint to service the data. The Lambda function retrieves the configuration data from the cache using a local HTTP REST API call.

    Here is the architecture diagram.

    Extensions cache architecture diagram

    Extensions cache architecture diagram

    Once deployed, the extension performs the following steps:

    1. On start-up, the extension reads the config.yaml file, which determines which resources to cache. The file is deployed as part of the Lambda function.
    2. The boolean CACHE_EXTENSION_INIT_STARTUP Lambda environment variable specifies whether to load into cache the items specified in config.yaml. If false, the extension initializes an empty map with the names.
    3. The extension retrieves the required data based on the resources in the config.yaml file. This includes the data from DynamoDB, the configuration from Parameter Store, and the secret from Secrets Manager. The data is stored in memory.
    4. The extension starts a local HTTP server using TCP port 4000, which serves the cache items to the function. The Lambda function accesses the local in-memory cache by invoking the following endpoint: http://localhost:4000/<cachetype>?name=<name>.
    5. If the data is not available in the cache, or has expired, the extension accesses the corresponding AWS service to retrieve the data. It is cached first and then returned to the Lambda function. The CACHE_EXTENSION_TTL Lambda environment variable defines the refresh interval (defined based on Go time format, for example: 30s, 3m, etc.)

    This sequence diagram explains the data flow:

    Extensions cache sequence diagram

    Extensions cache sequence diagram

    Testing the example application

    Once the AWS SAM template is deployed, navigate to the AWS Lambda console.

    1. Select the function starting with the name ExtensionsCache-SampleFunction. Within the function code, the options array specifies which data to return from the cache. This is initially set to path: '/dynamodb?name=DynamoDbTable-pKey1-sKey1'
    2. Choose Configure test events to configure a test event.
    3. Enter a name for the Event name, accept the default payload, and select Create.
    4. Select Test to invoke the function. This returns the cached data from DynamoDB and logs the output.
    5. Successfully retrieve DynamoDB data from cache

      Successfully retrieve DynamoDB data from cache

    6. In the index.js file, amend the path statement to retrieve the Parameter Store configuration:
    7. const options = {
        "hostname": "localhost",
        "port": 4000,
        "path": "/parameters?name=CacheExtensions_Parameter1",
        "method": "GET"
      }
    8. Select Deploy to save the function configuration and select Test. The function returns the Parameter Store configuration item:
    9. Successfully retrieve Parameter Store data from cache

      Successfully retrieve Parameter Store data from cache

    10. In the function code, amend the path statement to retrieve the Secrets Manager secret:
    11. const options = {
        "hostname": "localhost",
        "port": 4000,
        "path": "/parameters?name=/aws/reference/secretsmanager/secret_info",
        "method": "GET"
      }
    12. Select Deploy to save the function configuration and select Test. The function returns the secret:
    Successfully retrieve Secrets Manager data from cache

    Successfully retrieve Secrets Manager data from cache

    The benefits of using Lambda extensions

    There are a number of benefits to using a Lambda extension for this solution:

    1. Improved Lambda function performance as data is cached in memory by the extension during initialization.
    2. Fewer AWS API calls to external services, this can reduce costs and helps avoid throttling limits if services are accessed frequently.
    3. Cache data is stored in memory and not in a file within the Lambda execution environment. This means that no additional process is required to manage the lifecycle of the file. In-memory is also more secure, as data is not persisted to disk for subsequent function invocations.
    4. The function requires less code, as it only needs to communicate with the extension via HTTP to retrieve the data. The function does not have to have additional libraries installed to communicate with DynamoDB, Parameter Store, Secrets Manager, or the local file system.
    5. The cache extension is a Golang compiled binary and the executable can be shared with functions running other runtimes like Node.js, Python, Java, etc.
    6. Using a YAML template to store the details of what to cache makes it easier to configure and add additional services.

    Comparing the performance benefit

    To test the performance of the cache extension, I compare two tests:

    1. A Golang Lambda function that accesses a secret from AWS Secrets Manager for every invocation.
    2. The ExtensionsCache-SampleFunction, previously deployed using AWS SAM. This uses the cache extension to access the secrets from Secrets Manager, the function reads the value from the cache.

    Both functions are configured with 512 MB of memory and the function timeout is set to 30 seconds.

    I use Artillery to load test both Lambda functions. The load runs for 100 invocations over 2 minutes. I use Amazon CloudWatch metrics to view the function average durations.

    Test 1 shows a duration of 43 ms for the first invocation as a cold start. Subsequent invocations average 22 ms.

    Test 1 performance results

    Test 1 performance results

    Test 2 shows a duration of 16 ms for the first invocation as a cold start. Subsequent invocations average 3 ms.

    Test 2 performance results

    Test 2 performance results

    Using the Lambda extensions caching layer shows a significant performance improvement. Cold start invocation duration is reduced by 62% and subsequent invocations by 80%.

    In this example, the CACHE_EXTENSION_INIT_STARTUP environment variable flag is not configured. With the flag enabled for the extension, data is pre-fetched during extension initialization and the cold start time is further reduced.

    Conclusion

    Using Lambda extensions is an effective way to cache static data from external services in Lambda functions. This reduces function latency and costs. This post shows how to build both a data and configuration cache using DynamoDB, Parameter Store, and Secrets Manager.

    To set up the walkthrough demo in this post, visit the GitHub repo, and follow the instructions in the README.md file.

    The extension uses a local configuration file to determine which values to cache, and retrieves the items from the external services. A Lambda function retrieves the values from the local cache using an HTTP request, without having to communicate with the external services directly. In this example, this results in an 80% reduction in function invocation time.

    For more serverless learning resources, visit https://serverlessland.com.

Operating Lambda: Building a solid security foundation – Part 2

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/operating-lambda-building-a-solid-security-foundation-part-2/

In the Operating Lambda series, I cover important topics for developers, architects, and systems administrators who are managing AWS Lambda-based applications. This two-part series discusses core security concepts for Lambda-based applications.

Part 1 explains the Lambda execution environment and how to apply the principles of least privilege to your workload. This post covers securing workloads with public endpoints, encrypting data, and using AWS CloudTrail for governance, compliance, and operational auditing.

Securing workloads with public endpoints

For workloads that are accessible publicly, AWS provides a number of features and services that can help mitigate certain risks. This section covers authentication and authorization of application users and protecting API endpoints.

Authentication and authorization

Authentication relates to identity and authorization refers to actions. Use authentication to control who can invoke a Lambda function, and then use authorization to control what they can do. For many applications, AWS Identity & Access Management (IAM) is sufficient for managing both control mechanisms.

For applications with external users, such as web or mobile applications, it is common to use JSON Web Tokens (JWTs) to manage authentication and authorization. Unlike traditional, server-based password management, JWTs are passed from the client on every request. They are a cryptographically secure way to verify identity and claims using data passed from the client. For Lambda-based applications, this allows you to secure APIs for each microservice independently, without relying on a central server for authentication.

You can implement JWTs with Amazon Cognito, which is a user directory service that can handle registration, authentication, account recovery, and other common account management operations. For frontend development, Amplify Framework provides libraries to simplify integrating Cognito into your frontend application. You can also use third-party partner services like Auth0.

Given the critical security role of an identity provider service, it’s important to use professional tooling to safeguard your application. It’s not recommended that you write your own services to handle authentication or authorization. Any vulnerabilities in custom libraries may have significant implications for the security of your workload and its data.

Protecting API endpoints

For serverless applications, the preferred way to serve a backend application publicly is to use Amazon API Gateway. This can help you protect an API from malicious users or spikes in traffic.

For authenticated API routes, API Gateway offers both REST APIs and HTTP APIs for serverless developers. Both types support authorization using AWS Lambda, IAM or Amazon Cognito. When using IAM or Amazon Cognito, incoming requests are evaluated and if they are missing a required token or contain invalid authentication, the request is rejected. You are not charged for these requests and they do not count towards any throttling quotas.

Unauthenticated API routes may be accessed by anyone on the public internet so it’s recommended that you limit their use. If you must use unauthenticated APIs, it’s important to protect these against common risks, such as denial-of-service (DoS) attacks. Applying AWS WAF to these APIs can help protect your application from SQL injection and cross-site scripting (XSS) attacks. API Gateway also implements throttling at the AWS account-level and per-client level when API keys are used.

In some cases, the functionality provided by an unauthenticated API can be achieved with an alternative approach. For example, a web application may provide a list of customer retail stores from an Amazon DynamoDB table to users who are not logged in. This request may originate from a frontend web application or from any other source that calls the URL endpoint. This diagram compares three solutions:

Solutions for an unauthenticated API

  1. The unauthenticated API can be called by anyone on the internet. In a denial of service attack, it’s possible to exhaust API throttling limits, Lambda concurrency, or DynamoDB provisioned read capacity on an underlying table.
  2. An Amazon CloudFront distribution in front of the API endpoint with an appropriate time-to-live (TTL) configuration may help absorb traffic in a DoS attack, without changing the underlying solution for fetching the data.
  3. Alternatively, for static data that rarely changes, the CloudFront distribution could serve the data from an S3 bucket.

The AWS Well-Architected Tool provides a Serverless Lens that analyzes the security posture of serverless workloads.

Encrypting data in Lambda-based applications

Managing secrets

For applications handling sensitive data, AWS services provide a range of encryption options for data in transit and at rest. It’s important to identity and classify sensitive data in your workload, and minimize the storage of sensitive data to only what is necessary.

When protecting data at rest, use AWS services for key management and encryption of stored data, secrets and environment variables. Both the AWS Key Management Service and AWS Secrets Manager provide a robust approach to storing and managing secrets used in Lambda functions.

Do not store plaintext secrets or API keys in Lambda environment variables. Instead, use KMS to encrypt environment variables. Also ensure you do not embed secrets directly in function code, or commit these secrets to code repositories.

Using HTTPS securely

HTTPS is encrypted HTTP, using TLS (SSL) to encrypt the request and response, including headers and query parameters. While query parameters are encrypted, URLs may be logged by different services in plaintext, so you should not use these to store sensitive data such as credit card numbers.

AWS services make it easier to use HTTPS throughout your application and it is provided by default in services like API Gateway. Where you need an SSL/TLS certificate in your application, to support features like custom domain names, it’s recommended that you use AWS Certificate Manager (ACM). This provides free public certificates for ACM-integrated services and managed certificate renewal.

Governance controls with AWS CloudTrail

For compliance and operational auditing of application usage, AWS CloudTrail logs activity related to your AWS account usage. It tracks resource changes and usage, and provides analysis and troubleshooting tools. Enabling CloudTrail does not have any negative performance implications for your Lambda-based application, since the logging occurs asynchronously.

Separate from application logging (see chapter 4), CloudTrail captures two types of events:

  • Control plane: These events apply to management operations performed on any AWS resources. Individual trails can be configured to capture read or write events, or both.
  • Data plane: Events performed on the resources, such as when a Lambda function is invoked or an S3 object is downloaded.

For Lambda, you can log who creates and invokes functions, together with any changes to IAM roles. You can configure CloudTrail to log every single activity by user, role, service, and API within an AWS account. The service is critical for understanding the history of changes made to your account and also detecting any unintended changes or suspicious activity.

To research which AWS user interacted with a Lambda function, CloudTrail provides an audit log to find this information. For example, when a new permission is added to a Lambda function, it creates an AddPermission record. You can interpret the meaning of individual attributes in the JSON message by referring to the CloudTrail Record Contents documentation.

CloudTrail Record Contents documentation

CloudTrail data is considered sensitive so it’s recommended that you protect it with KMS encryption. For any service processing encrypted CloudTrail data, it must use an IAM policy with kms:Decrypt permission.

By integrating CloudTrail with Amazon EventBridge, you can create alerts in response to certain activities and respond accordingly. With these two services, you can quickly implement an automated detection and response pattern, enabling you to develop mechanisms to mitigate security risks. With EventBridge, you can analyze data in real-time, using event rules to filter events and forward to targets like Lambda functions or Amazon Kinesis streams.

CloudTrail can deliver data to Amazon CloudWatch Logs, which allows you to process multi-Region data in real time from one location. You can also deliver CloudTrail to Amazon S3 buckets, where you can create event source mappings to start data processing pipelines, run queries with Amazon Athena, or analyze activity with Amazon Macie.

If you use multiple AWS accounts, you can use AWS Organizations to manage and govern individual member accounts centrally. You can set an existing trail as an organization-level trail in a primary account that can collect events from all other member accounts. This can simplify applying consistent auditing rules across a large set of existing accounts, or automatically apply rules to new accounts. To learn more about this feature, see Creating a Trail for an Organization.

Conclusion

In this blog post, I explain how to secure workloads with public endpoints and the different authentication and authorization options available. I also show different approaches to exposing APIs publicly.

CloudTrail can provide compliance and operational auditing for Lambda usage. It provides logs for both the control plane and data plane. You can integrate CloudTrail with EventBridge to create alerts in response to certain activities. Customers with multiple AWS accounts can use AWS Organizations to manage trails centrally.

For more serverless learning resources, visit Serverless Land.

Gaining operational insights with AIOps using Amazon DevOps Guru

Post Syndicated from Nikunj Vaidya original https://aws.amazon.com/blogs/devops/gaining-operational-insights-with-aiops-using-amazon-devops-guru/

Amazon DevOps Guru offers a fully managed AIOps platform service that enables developers and operators to improve application availability and resolve operational issues faster. It minimizes manual effort by leveraging machine learning (ML) powered recommendations. Its ML models take advantage of AWS’s expertise in operating highly available applications for the world’s largest e-commerce business for over 20 years. DevOps Guru automatically detects operational issues, predicts impending resource exhaustion, details likely causes, and recommends remediation actions.

This post walks you through how to enable DevOps Guru for your account in a typical serverless environment and observe the insights and recommendations generated for various activities. These insights are generated for operational events that could pose a risk to your application availability. DevOps Guru uses AWS CloudFormation stacks as the application boundary to detect resource relationships and co-relate with deployment events.

Solution overview

The goal of this post is to demonstrate the insights generated for anomalies detected by DevOps Guru from DevOps operations. If you don’t have a test environment and want to build out infrastructure to test the generation of insights, then you can follow along through this post. However, if you have a test or production environment (preferably spawned from CloudFormation stacks), you can skip the first section and jump directly to section, Enabling DevOps Guru and injecting traffic.

The solution includes the following steps:

1. Deploy a serverless infrastructure.

2. Enable DevOps Guru and inject traffic.

3. Review the generated DevOps Guru insights.

4. Inject another failure to generate a new insight.

As depicted in the following diagram, we use a CloudFormation stack to create a serverless infrastructure, comprising of Amazon API Gateway, AWS Lambda, and Amazon DynamoDB, and inject HTTP requests at a high rate towards the API published to list records.

Serverless infrastructure monitored by DevOps Guru

When DevOps Guru is enabled to monitor your resources within the account, it uses a combination of vended Amazon CloudWatch metrics and specific patterns from its ML models to detect anomalies. When an anomaly is detected, it generates an insight with the recommendations.

The generation of insights is dependent upon several factors. Although this post provides a canned environment to reproduce insights, the results may vary depending upon traffic pattern, timings of traffic injection, and so on.

Prerequisites

To complete this tutorial, you should have access to an AWS Cloud9 environment and the AWS Command Line Interface (AWS CLI).

Deploying a serverless infrastructure

To deploy your serverless infrastructure, you complete the following high-level steps:

1.   Create your IDE environment.

2.   Launch the CloudFormation template to deploy the serverless infrastructure.

3.   Populate the DynamoDB table.

 

1. Creating your IDE environment

We recommend using AWS Cloud9 to create an environment to get access to the AWS CLI from a bash terminal. AWS Cloud9 is a browser-based IDE that provides a development environment in the cloud. While creating the new environment, ensure you choose Linux2 as the operating system. Alternatively, you can use your bash terminal in your favorite IDE and configure your AWS credentials in your terminal.

When access is available, run the following command to confirm that you can see the Amazon Simple Storage Service (Amazon S3) buckets in your account:

aws s3 ls

Install the following prerequisite packages and ensure you have Python3 installed:

sudo yum install jq -y

export AWS_REGION=$(curl -s \
169.254.169.254/latest/dynamic/instance-identity/document | jq -r '.region')

sudo pip3 install requests

Clone the git repository to download the required CloudFormation templates:

git clone https://github.com/aws-samples/amazon-devopsguru-samples
cd amazon-devopsguru-samples/generate-devopsguru-insights/

2. Launching the CloudFormation template to deploy your serverless infrastructure

To deploy your infrastructure, complete the following steps:

  • Run the CloudFormation template using the following command:
aws cloudformation create-stack --stack-name myServerless-Stack \
--template-body file:///$PWD/cfn-shops-monitoroper-code.yaml \
--capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM

The AWS CloudFormation deployment creates an API Gateway, a DynamoDB table, and a Lambda function with sample code.

  • When it’s complete, go to the Outputs tab of the stack on the AWS CloudFormation console.
  • Record the links for the two APIs: one of them to list the table contents and other one to populate the contents.

3. Populating the DynamoDB table

Run the following commands (simply copy-paste) to populate the DynamoDB table. The below three commands will identify the name of the DynamoDB table from the CloudFormation stack and populate the name in the populate-shops-dynamodb-table.json file.

dynamoDBTableName=$(aws cloudformation list-stack-resources \
--stack-name myServerless-Stack | \
jq '.StackResourceSummaries[]|select(.ResourceType == "AWS::DynamoDB::Table").PhysicalResourceId' | tr -d '"')
sudo sed -i s/"<YOUR-DYNAMODB-TABLE-NAME>"/$dynamoDBTableName/g \
populate-shops-dynamodb-table.json
aws dynamodb batch-write-item \
--request-items file://populate-shops-dynamodb-table.json

The command gives the following output:

{
"UnprocessedItems": {}
}

This populates the DynamoDB table with a few sample records, which you can verify by accessing the ListRestApiEndpointMonitorOper API URL published on the Outputs tab of the CloudFormation stack. The following screenshot shows the output.

Screenshot showing the API output

Enabling DevOps Guru and injecting traffic

In this section, you complete the following high-level steps:

1.   Enable DevOps Guru for the CloudFormation stack.

2.   Wait for the serverless stack to complete.

3.   Update the stack.

4.   Inject HTTP requests into your API.

 

1. Enabling DevOps Guru for the CloudFormation stack

To enable DevOps Guru for CloudFormation, complete the following steps:

  • Run the CloudFormation template to enable DevOps Guru for this CloudFormation stack:
aws cloudformation create-stack \
--stack-name EnableDevOpsGuruForServerlessCfnStack \
--template-body file:///$PWD/EnableDevOpsGuruForServerlessCfnStack.yaml \
--parameters ParameterKey=CfnStackNames,ParameterValue=myServerless-Stack \
--region ${AWS_REGION}
  • When the stack is created, navigate to the Amazon DevOps Guru console.
  • Choose Settings.
  • Under CloudFormation stacks, locate myServerless-Stack.

If you don’t see it, your CloudFormation stack has not been successfully deployed. You may remove and redeploy the EnableDevOpsGuruForServerlessCfnStack stack.

Optionally, you can configure Amazon Simple Notification Service (Amazon SNS) topics or enable AWS Systems Manager integration to create OpsItem entries for every insight created by DevOps Guru. If you need to deploy as a stack set across multiple accounts and Regions, see Easily configure Amazon DevOps Guru across multiple accounts and Regions using AWS CloudFormation StackSets.

2. Waiting for baselining of resources

This is a necessary step to allow DevOps Guru to complete baselining the resources and benchmark the normal behavior. For our serverless stack with 3 resources, we recommend waiting for 2 hours before carrying out next steps. When enabled in a production environment, depending upon the number of resources selected to monitor, it can take up to 24 hours for it to complete baselining.

Note: Unlike many monitoring tools, DevOps Guru does not expect the dashboard to be continuously monitored and thus under normal system health, the dashboard would simply show zero’ed counters for the ongoing insights. It is only when an anomaly is detected, it will raise an alert and display an insight on the dashboard.

3. Updating the CloudFormation stack

When enough time has elapsed, we will make a configuration change to simulate a typical operational event. As shown below, update the CloudFormation template to change the read capacity for the DynamoDB table from 5 to 1.

CloudFormation showing read capacity settings to modify

Run the following command to deploy the updated CloudFormation template:

aws cloudformation update-stack --stack-name myServerless-Stack \
--template-body file:///$PWD/cfn-shops-monitoroper-code.yaml \
--capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM

4. Injecting HTTP requests into your API

We now inject ingress traffic in the form of HTTP requests towards the ListRestApiEndpointMonitorOper API, either manually or using the Python script provided in the current directory (sendAPIRequest.py). Due to reduced read capacity, the traffic will oversubscribe the dynamodb tables, thus inducing an anomaly. However, before triggering the script, populate the url parameter with the correct API link for the  ListRestApiEndpointMonitorOper API, listed in the CloudFormation stack’s output tab.

After the script is saved, trigger the script using the following command:

python sendAPIRequest.py

Make sure you’re getting an output of status 200 with the records that we fed into the DynamoDB table (see the following screenshot). You may have to launch multiple tabs (preferably 4) of the terminal to run the script to inject a high rate of traffic.

Terminal showing script executing and injecting traffic to API

After approximately 10 minutes of the script running in a loop, an operational insight is generated in DevOps Guru.

 

Reviewing DevOps Guru insights

While these operations are being run, DevOps Guru monitors for anomalies, logs insights that provide details about the metrics indicating an anomaly, and prints actionable recommendations to mitigate the anomaly. In this section, we review some of these insights.

Under normal conditions, DevOps Guru dashboard will show the ongoing insights counter to be zero. It monitors a high number of metrics behind the scenes and offloads the operator from manually monitoring any counters or graphs. It raises an alert in the form of an insight, only when anomaly occurs.

The following screenshot shows an ongoing reactive insight for the specific CloudFormation stack. When you choose the insight, you see further details. The number under the Total resources analyzed last hour may vary, so for this post, you can ignore this number.

DevOps Guru dashboard showing an ongoing reactive insight

The insight is divided into four sections: Insight overview, Aggregated metrics, Relevant events, and Recommendations. Let’s take a closer look into these sections.

The following screenshot shows the Aggregated metrics section, where it shows metrics for all the resources that it detected anomalies in (DynamoDB, Lambda, and API Gateway). Note that depending upon your traffic pattern, lambda settings, baseline traffic, the list of metrics may vary. In the example below, the timeline shows that an anomaly for DynamoDB started first and was followed by API Gateway and Lambda. This helps us understand the cause and symptoms, and prioritize the right anomaly investigation.

The listing of metrics inside an Insight

Initially, you may see only two metrics listed, however, over time, it populates more metrics that showed anomalies. You can see the anomaly for DynamoDB started earlier than the anomalies for API Gateway and Lambda, thus indicating them as after effects. In addition to the information in the preceding screenshot, you may see Duration p90 and IntegrationLatency p90 (for Lambda and API Gateway respectively, due to increased backend latency) metrics also listed.

Now we check the Relevant events section, which lists potential triggers for the issue. The events listed here depend on the set of operations carried out on this CloudFormation stack’s resources in this Region. This makes it easy for the operator to be reminded of a change that may have caused this issue. The dots (representing events) that are near the Insight start portion of timeline are of particular interest.

Related Events shown inside the Insight

If you need to delve into any of these events, just click of any of these points, and it provides more details as shown below screenshot.

Delving into details of the related event listed in Insight

You can choose the link for an event to view specific details about the operational event (configuration change via CloudFormation update stack operation).

Now we move to the Recommendations section, which provides prescribed guidance for mitigating this anomaly. As seen in the following screenshot, it recommends to roll back the configuration change related to the read capacity for the DynamoDB table. It also lists specific metrics and the event as part of the recommendation for reference.

Recommendations provided inside the Insight

Going back to the metric section of the insight, when we select Graphed anomalies, it shows you the graphs of all related resources. Below screenshot shows a snippet showing anomaly for DynamoDB ReadThrottleEvents metrics. As seen in the below screenshot of the graph pattern, the read operations on the table are exceeding the provisioned throughput of read capacity. This clearly indicates an anomaly.

Graphed anomalies in DevOps Guru

Let’s navigate to the DynamoDB table and check our table configuration. Checking the table properties, we notice that our read capacity is reduced to 1. This is our root cause that led to this anomaly.

Checking the DynamoDB table capacity settings

If we change it to 5, we fix this anomaly. Alternatively, if the traffic is stopped, the anomaly moves to a Resolved state.

The ongoing reactive insight takes a few minutes after resolution to move to a Closed state.

Note: When the insight is active, you may see more metrics get populated over time as we detect further anomalies. When carrying out the preceding tests, if you don’t see all the metrics as listed in the screenshots, you may have to wait longer.

 

Injecting another failure to generate a new DevOps Guru insight

Let’s create a new failure and generate an insight for that.

1.   After you close the insight from the previous section, trigger the HTTP traffic generation loop from the AWS Cloud9 terminal.

We modify the Lambda functions’s resource-based policy by removing the permissions for API Gateway to access this function.

2.   On the Lambda console, open the function ScanFunctionMonitorOper.

3.   On the Permissions tab, access the policy.

Accessing the permissions tab for the Lambda

 

4.   Save a copy of the policy offline as a backup before making any changes.

5.   Note down the “Sid” values for the “AWS:SourceArn” that ends with prod/*/ and prod/*/*.

Checking the Resource-based policy for the Lambda

6.   Run the following command to remove the “Sid” JSON statements in your Cloud9 terminal:

aws lambda remove-permission --function-name ScanFunctionMonitorOper \
--statement-id <Sid-value-ending-with-prod/*/>

7.   Run the same command for the second Sid value:

aws lambda remove-permission --function-name ScanFunctionMonitorOper \
--statement-id <Sid-value-ending-with-prod/*/*>

You should see several 5XX errors, as in the following screenshot.

Terminal output now showing 500 errors for the script output

After less than 8 minutes, you should see a new ongoing reactive insight on the DevOps Guru dashboard.

Let’s take a closer look at the insight. The following screenshot shows the anomalous metric 5XXError Average of API Gateway and its duration. (This insight shows as closed because I had already restored permissions.)

Insight showing 5XX errors for API-Gateway and link to OpsItem

If you have configured to enable creating OpsItem in Systems Manager, you would see the link to OpsItem ID created in the insight, as shown above. This is an optional configuration, which will enable you to track the insights in the form of open tickets (OpsItems) in Systems Manager OpsCenter.

The recommendations provide guidance based upon the related events and anomalous metrics.

After the insight has been generated, reviewed, and verified, restore the permissions by running the following command:

aws lambda add-permission --function-name ScanFunctionMonitorOper  \
--statement-id APIGatewayProdPerm --action lambda:InvokeFunction \
--principal apigateway.amazonaws.com

If needed, you can insert the condition to point to the API Gateway ARN to allow only specific API Gateways to access the Lambda function.

 

Cleaning up

After you walk through this post, you should clean up and un-provision the resources to avoid incurring any further charges.

1.   To un-provision the CloudFormation stacks, on the AWS CloudFormation console, choose Stacks.

2.   Select each stack (EnableDevOpsGuruForServerlessCfnStack and myServerless-Stack) and choose Delete.

3.   Check to confirm that the DynamoDB table created with the stacks is cleaned up. If not, delete the table manually.

4.   Un-provision the AWS Cloud9 environment.

 

Conclusion

This post reviewed how DevOps Guru can continuously monitor the resources in your AWS account in a typical production environment. When it detects an anomaly, it generates an insight, which includes the vended CloudWatch metrics that breached the threshold, the CloudFormation stack in which the resource existed, relevant events that could be potential triggers, and actionable recommendations to mitigate the condition.

DevOps Guru generates insights that are relevant to you based upon the pre-trained machine-learning models, removing the undifferentiated heavy lifting of manually monitoring several events, metrics, and trends.

I hope this post was useful to you and that you would consider DevOps Guru for your production needs.

 

Building a serverless multi-player game that scales

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-a-serverless-multiplayer-game-that-scales/

This post is written by Tim Bruce, Sr. Solutions Architect, Developer Acceleration.

Game development is a highly iterative process with rapidly changing requirements. Many game developers want to maximize the time spent building features and less time configuring servers, managing infrastructure, and mastering scale.

AWS Serverless provides four key benefits for customers. First, it can help move from idea to market faster, by reducing operational overhead. Second, customers may realize lower costs with serverless by not over-provisioning hardware and software to operate. Third, serverless scales with user activity. Finally, serverless services provide built-in integration, allowing you to focus on your game instead of connecting pieces together.

For AWS Gaming customers, these benefits allow your teams to spend more time focusing on gameplay and content, instead of undifferentiated tasks such as setting up and maintaining servers and software. This can result in better gameplay and content, and a faster time-to-market.

This blog post introduces a game with a serverless-first architecture. Simple Trivia Service is a web-based game showing architectural patterns that you can apply in your own games.

Introducing the Simple Trivia Service

The Simple Trivia Service offers single- and multi-player trivia games with content created by players. There are many features in Simple Trivia Service found in games, such as user registration, chat, content creation, leaderboards, game play, and a marketplace.

Simple Trivia Service UI

Authenticated players can chat with other players, create and manage quizzes, and update their profile. They can play single- and multi-player quizzes, host quizzes, and buy and sell quizzes on the marketplace. The single- and multi-player game modes show how games with different connectivity and technical requirements can be delivered with serverless first architectures. The game modes and architecture solutions are covered in the Simple Trivia Service backend architecture section.

Simple Trivia Service front end

The Simple Trivia Service front end is a Vue.js single page application (SPA) that accesses backend services. The SPA app, accessed via a web browser, allows users to make requests to the game endpoints using HTTPS, secure WebSockets, and WebSockets over MQTT. These requests use integrations to access the serverless backend services.

Vue.js helps make this reference architecture more accessible. The front end uses AWS Amplify to build, deploy, and host the SPA without the need to provision and manage any resources.

Simple Trivia Service backend architecture

The backend architecture for Simple Trivia Service is defined in a set of AWS Serverless Application Model (AWS SAM) templates for portions of the game. A deployment guide is included in the README.md file in the GitHub repository. Here is a visual depiction of the backend architecture.

Reference architecture

Services used

Simple Trivia Service is built using AWS Lambda, Amazon API Gateway, AWS IoT, Amazon DynamoDB, Amazon Simple Notification Service (SNS), AWS Step Functions, Amazon Kinesis, Amazon S3, Amazon Athena, and Amazon Cognito:

  • Lambda enables serverless microservice features in Simple Trivia Service.
  • API Gateway provides serverless endpoints for HTTP/RESTful and WebSocket communication while IoT delivers a serverless endpoint for WebSockets over MQTT communication.
  • DynamoDB enables data storage and retrieval for internet-scale applications.
  • SNS provides microservice communications via publish/subscribe functionality.
  • Step Functions coordinates complex tasks to ensure appropriate outcomes.
  • Analytics for Simple Trivia Service are delivered via Kinesis and S3 with Athena providing a query/visualization capability.
  • Amazon Cognito provides secure, standards-based login and a user directory.

Two managed services that are not serverless, Amazon VPC NAT Gateway and Amazon ElastiCache for Redis, are also used. VPC NAT Gateway is required by VPC-enabled Lambda functions to reach services outside of the VPC, such as DynamoDB. ElastiCache provides an in-memory database suited for applications with submillisecond latency requirements.

User security and enabling communications to backend services

Players are required to register and log in before playing. Registration and login credentials are sent to Amazon Cognito using the Secure Remote Password protocol. Upon successfully logging in, Amazon Cognito returns a JSON Web Token (JWT) and an Amazon Cognito user session.

The JWT is included within requests to API Gateway, which validates the token before allowing the request to be forwarded to Lambda.

IoT requires additional security for users by using an AWS Identity and Access Management (IAM) policy. A policy attached to the Amazon Cognito user allows the player to connect, subscribe, and send messages to the IoT endpoint.

Game types and supporting architectures

Simple Trivia Service’s three game modes define how players interact with the backend services. These modes align to different architectures used within the game.

“Single Player” quiz architecture

“Single Player” quiz architecture

Single player quizzes have simple rules, short play sessions, and appeal to wide audiences. Single player game communication is player-to-endpoint only. This is accomplished with API Gateway via an HTTP API.

Four Lambda functions (ActiveGamesList, GamePlay, GameAnswer, and LeaderboardGet) enable single player games. These functions are integrated with API Gateway and respond to specific requests from the client. API Gateway forwards the request, including URI, body, and query string, to the appropriate Lambda function.

When a player chooses “Play”, a request is sent to API Gateway, which invokes the ActiveGamesList function. This function queries the ActiveGames DynamoDB table and returns the list of active games to the user.

The player selects a game, resulting in another request triggering the GamePlay function. GamePlay retrieves the game’s questions from the GamesDetail DynamoDB table. The front end maintains the state for the user during the game.

When all questions are answered, the SPA sends the player’s responses to API Gateway, invoking the GameAnswer function. This function scores the player’s responses against the GameDetails table. The score and answers are sent to the user.

Additionally, this function sends the player score for the leaderboard and player experience to two SNS topics (LeaderboardTopic and PlayerProgressTopic). The ScorePut and PlayerProgressPut functions subscribe to these topics. These two functions write the details to the Leaderboard and Player Progress DynamoDB tables.

This architecture processes these two actions asynchronously, resulting in the player receiving their score and answers without having to wait. This also allows for increased security for player progress, as only the PlayerProgressPut function is allowed to write to this table.

Finally, the player can view the game’s leaderboard, which is returned to the player as the response to the GetLeaderboard function. The function retrieves the top 10 scores and the current player’s score from the Leaderboard table.

“Multi-player – Casual and Competitive” architecture

“Multiplayer – Casual and Competitive” architecture

These game types require player-to-player and service-to-player communication. This is typically performed using TCP/UDP sockets or the WebSocket protocol. API Gateway WebSockets provides WebSocket communication and enables Lambda functions to send messages to and receive messages from game hosts and players.

Game hosts start games via the “Host” button, messaging the LiveAdmin function via API Gateway. The function adds the game to the LiveGames table, which allows players to find and join the game. A list of questions for the game is sent to the game host from the LiveAdmin function at this time. Additionally, the game host is added to the GameConnections table, which keeps track of which connections are related to a game. Players, via the LivePlayer function, are also added to this table when they join a game.

The game host client manages the state of the game for all players and controls the flow of the game, sending questions, correct answers, and leaderboards to players via API Gateway and the LiveAdmin function. The function only sends game messages to the players in the GameConnections table. Player answers are sent to the host via the LivePlayer function.

To end the game, the game host sends a message with the final leaderboard to all players via the LiveAdmin function. This function also stores the leaderboard in the Leaderboard table, removes the game from the ActiveGames table, and sends player progression messages to the Player Progress topic.

“Multi-player – Live Scoreboard” architecture

“Multiplayer – Live Scoreboard” architecture

This is an extension of other multi-player game types requiring similar communications. This uses IoT with WebSockets over MQTT as the transport. It enables the client to subscribe to a topic and act on messages it receives. IoT manages routing messages to clients based on their subscriptions.

This architecture moves the state management from the game host client to a data store on the backend. This change requires a database that can respond quickly to user actions. Simple Trivia Service uses ElastiCache for Redis for this database. Game questions, player responses, and the leaderboard are all stored and updated in Redis during the quiz. The ElastiCache instance is blocked from internet traffic by placing it in a VPC. A security group configures access for the Lambda functions in the same VPC.

Game hosts for this type of game start the game by hosting it, which sends a message to IoT, triggering the CacheGame function. This function adds the game to the ActiveGames table and caches the quiz details from DynamoDB into Redis. Players join the game by sending a message, which is delivered to the JoinGame function. This adds the user record to Redis and alerts the game host that a player has joined.

Game hosts can send questions to the players via a message that invokes the AskQuestion function. This function updates the current question number in Redis and sends the question to subscribed players via the AskQuestion function. The ReceiveAnswer function processes player responses. It validates the response, stores it in Redis, updates the scoreboard, and replies to all players with the updated scoreboard after the first correct answer. The game scoreboard is updated for players in real time.

When the game is over, the game host sends a message to the EndGame function via IoT. This function writes the game leaderboard to the Leaderboard table, sends player progress to the Player Progress SNS topic, deletes the game from cache, and removes the game from the ActiveGames table.

Conclusion

This post introduces the Simple Trivia Service, a single- and multi-player game built using a serverless-first architecture on AWS. I cover different solutions that you can use to enable connectivity from your game client to a serverless-first backend for both single- and multi-player games. I also include a walkthrough of the architecture for each of these solutions.

You can deploy the code for this solution to your own AWS account via instructions in the Simple Trivia Service GitHub repository.

For more serverless learning resources, visit Serverless Land.