All posts by James Beswick

Adding resiliency to AWS CloudFormation custom resource deployments

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/adding-resiliency-to-aws-cloudformation-custom-resource-deployments/

This post is written by Dathu Patil, Solutions Architect and Naomi Joshi, Cloud Application Architect.

AWS CloudFormation custom resources allow you to write custom provisioning logic in templates. These run anytime you create, update, or delete stacks. Using AWS Lambda-backed custom resources, you can associate a Lambda function with a CloudFormation custom resource. The function is invoked whenever the custom resource is created, updated, or deleted.

When CloudFormation asynchronously invokes the function, it passes the request data, such as the request type and resource properties to the function. The customizability of Lambda functions in combination with CloudFormation allow a wide range of scenarios. For example, you can dynamically look up Amazon Machine Image (AMI) IDs during stack creation or use utilities such as string reversal functions.

Unhandled exceptions or transient errors in the custom resource Lambda function can cause your code to exit without sending a response. CloudFormation requires an HTTPS response to confirm if the operation is successful or not. An unreported exception causes CloudFormation to wait until the operation times out before starting a stack rollback.

If the exception occurs again on rollback, CloudFormation waits for a timeout exception before ending in a rollback failure. During this time, your stack is unusable. You can learn more about this and best practices by reviewing Best Practices for CloudFormation Custom Resources.

In this blog, you learn how you can use Amazon SQS and Lambda to add resiliency to your Lambda-backed CloudFormation custom resource deployments. The example shows how to use CloudFormation custom resource to look up an AMI ID dynamically during Amazon EC2 creation.

Overview

CloudFormation templates that declare an EC2 instance must also specify an AMI ID. This includes an operating system and other software and configuration information used to launch the instance. The correct AMI ID depends on the instance type and Region in which you’re launching your stack. AMI ID can change regularly, such as when an AMI is updated with software updates.

Customers often implement a CloudFormation custom resource to look up an AMI ID while creating an EC2 instance. In this example, the lookup Lambda function calls the EC2 API. It fetches the available AMI IDs, uses the latest AMI ID, and checks for a compliance tag. This implementation assumes that there are separate processes for creating AMI and running compliance checks. The process that performs compliance and security checks creates a compliance tag on a successful scan.

This solution shows how you can use SQS and Lambda to add resiliency to handle an exception. In this case, the exception occurs in the AMI lookup custom resource due to a missing compliance tag. When the AMI lookup function fails processing, it uses the Lambda destination configuration to send the request to an SQS queue. The message is reprocessed using the SQS queue and Lambda function.

Solution architecture

  1. The CloudFormation custom resource asynchronously invokes the AMI lookup Lambda function to perform appropriate actions.
  2. The AMI lookup Lambda function calls the EC2 API to fetch the list of AMIs and checks for a compliance tag. If the tag is missing, it throws an unhandled exception.
  3. On failure, the Lambda destination configuration sends the request to the retry queue that is configured as a dead-letter queue (DLQ). SQS adds a custom delay between retry processing to support more than two retries.
  4. The retry Lambda function processes messages in the retry queue using Lambda with SQS. Lambda polls the queue and invokes the retry Lambda function synchronously with an event that contains queue messages.
  5. The retry function then synchronously invokes the AMI lookup function using the information from the request SQS message.

The AMI Lookup Lambda function

An AWS Serverless Application Model (AWS SAM) template is used to create the AMI lookup Lambda function. You can configure async event options such as number of retries on the Lambda function. The maximum retries allowed is 2 and there is no option to set a delay between the invocation attempts.

When a transient failure or unhandled error occurs, the request is forwarded to the retry queue. This part of the AWS SAM template creates AMI lookup Lambda function:

  AMILookupLambda:
    Type: AWS::Serverless::Function 
    Properties:
      CodeUri: amilookup/
      Handler: app.lambda_handler
      Runtime: python3.8
      Timeout: 300
      EventInvokeConfig:
          MaximumEventAgeInSeconds: 60
          MaximumRetryAttempts: 2
          DestinationConfig:
            OnFailure:
              Type: SQS
              Destination: !GetAtt RetryQueue.Arn
      Policies:
        - AMIDescribePolicy: {}

This function calls the EC2 API using the boto3 AWS SDK for Python. It calls the describe_images method to get a list of images with given filter conditions. The Lambda function iterates through the AMI list and checks for compliance tags. If the tag is not present, it raises an exception:

ec2_client = boto3.client('ec2', region_name=region)
         # Get AMI IDs with the specified name pattern and owner
         describe_response = ec2_client.describe_images(
            Filters=[{'Name': "name", 'Values': architectures},
                     {'Name': "tag-key", 'Values': ['ami-compliance-check']}],
            Owners=["amazon"]
        )

The queue and the retry Lambda function

The retry queue adds a 60-second delay before a message is available for the processing. The time delay between retry processing attempts provides time for transient errors to be corrected. This is the AWS SAM template for creating these resources:

RetryQueue:
  Type: AWS::SQS::Queue
  Properties:
    VisibilityTimeout: 60
    DelaySeconds: 60
    MessageRetentionPeriod: 600

RetryFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: retry/
      Handler: app.lambda_handler
      Runtime: python3.8
      Timeout: 60
      Events:
        MySQSEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt RetryQueue.Arn
            BatchSize: 1
      Policies:
        - LambdaInvokePolicy:
            FunctionName: !Ref AMILookupFunction

The retry Lambda function periodically polls for new messages in the retry queue. The function synchronously invokes the AMI lookup Lambda function. On success, a response is sent to CloudFormation. This process runs until the AMI lookup function returns a successful response or the message is deleted from the SQS queue. The deletion is based on the MessageRetentionPeriod, which is set to 600 seconds in this case.

for record in event['Records']:
        body = json.loads(record['body'])
        response = client.invoke(
            FunctionName=body['requestContext']['functionArn'],
            InvocationType='RequestResponse',
            Payload=json.dumps(body['requestPayload']).encode()
        )            

Deployment walkthrough

Prerequisites

To get started with this solution, you need:

  • AWS CLI and AWS SAM CLI installed to deploy the solution.
  • An existing Amazon EC2 public image. You can choose any of the AMIs from the AWS Management Console with Architecture = X86_64 and Owner = amazon for test purposes. Note the AMI ID.

Download the source code from the resilient-cfn-custom-resource GitHub repository. The template.yaml file is an AWS SAM template. It deploys the Lambda functions, SQS, and IAM roles required for the Lambda function. It uses Python 3.8 as the runtime and assigns 128 MB of memory for the Lambda functions.

  1. To build and deploy this application using the AWS SAM CLI build and guided deploy:
    sam build --use-container
    sam deploy --guided

The custom resource stack creation invokes the AMI lookup Lambda function. This fetches the AMI ID from all public EC2 images available in your account with the tag ami-compliance-check. Typically, the compliance tags are created by a process that performs security scans.

In this example, the security scan process is not running and the tag is not yet added to any AMIs. As a result, the custom resource throws an exception, which goes to the retry queue. This is retried by the retry function until it is successfully processed.

  1. Use the console or AWS CLI to add the tag to the chosen EC2 AMI. In this example, this is analogous to a separate governance process that checks for AMI compliance and adds the compliance tag if passed. Replace the $AMI-ID with the AMI ID captured in the prerequisites:
    aws ec2 create-tags –-resources $AMI-ID --tags Key=ami-compliance-check,Value=True
  2. After the tags are added, a response is sent successfully from the custom resource Lambda function to the CloudFormation stack. It includes your $AMI-ID and a test EC2 instance is created using that image. The stack creation completes successfully with all resources deployed.

Conclusion

This blog post demonstrates how to use SQS and Lambda to add resiliency to CloudFormation custom resources deployments. This solution can be customized for use cases where CloudFormation stacks have a dependency on a custom resource.

CloudFormation custom resource failures can happen due to unhandled exceptions. These are caused by issues with a dependent component, internal service, or transient system errors. Using this solution, you can handle the failures automatically without the need for manual intervention. To get started, download the code from the GitHub repo and start customizing.

For more serverless learning resources, visit Serverless Land.

Python 3.9 runtime now available in AWS Lambda

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/python-3-9-runtime-now-available-in-aws-lambda/

You can now use the Python 3.9 runtime to develop your AWS Lambda functions. You can do this in the AWS Management Console, AWS CLI, or AWS SDK, AWS Serverless Application Model (AWS SAM), or AWS Cloud Development Kit (AWS CDK). This post outlines some of the improvements to the Python runtime in version 3.9 and how to use this version in your Lambda functions.

New features and improvements to the Python language

Python 3.9 introduces new features for strings and dictionaries. There are new methods to remove prefixes and suffixes in strings. To remove a prefix, use str.removeprefix(prefix). To remove a suffix, use str.removesuffix(suffix). To learn more, read about PEP 616.

Dictionaries now offer two new operators (| and |=). The first is a union operator for merging dictionaries and the second allows developers to update the contents of a dictionary with another dictionary. To learn more, read about PEP 584.

You can alter the behavior of Python functions by using decorators. Previously, these could only consist of the @ symbol, a name, a dotted name, and an optional single call. Decorators can now consist of any valid expression, as explained in PEP 614.

There are also improvements for time zone handling. The zoneinfo module now supports the IANA time zone database. This can help remove boilerplate and brings improvements for code handling multiple timezones.

While existing Python 3 versions support TLS1.2, Python 3.9 now provides support for TLS1.3. This helps improve the performance of encrypted connections with features such as False Start and Zero Round Trip Time (0-RTT).

For a complete list of updates in Python 3.9, read the launch documentation on the Python website.

Performance improvements in Python 3.9

There are two important performance improvements in Python 3.9 that you can benefit from without making any code changes.

The first impacts code that uses the built-in Python data structures tuple, list, dict, set, or frozenset. In Python 3.9, these internally use the vectorcall protocol, which can make function calls faster by reducing the number of temporary objects used. Second, Python 3.9 uses a new parser that is more performant than previous versions. To learn more, read about PEP 617.

Changes to how Lambda works with the Python runtime

In Python, the presence of an __init__.py file in a directory causes it to be treated as a package. Frequently, __init__.py is an empty file that’s used to ensure that Python identifies the directory as a package. However, it can also contain initialization code for the package. Before Python 3.9, where you provided your Lambda function in a package, Lambda did not run the __init__.py code in the handler’s directory and parent directories during function initialization. From Python 3.9, Lambda now runs this code during the initialization phase. This ensures that imported packages are properly initialized if they make use of __init__.py. Note that __init__.py code is only run when the execution environment is first initialized.

Finally, there is a change to the error response in this new version. When previous Python versions threw errors, the formatting appeared as:

{"errorMessage": "name 'x' is not defined", "errorType": "NameError", "stackTrace": [" File \"/var/task/error_function.py\", line 2, in lambda_handler\n return x + 10\n"]}

From Python 3.9, the error response includes a RequestId:

{"errorMessage": "name 'x' is not defined", "errorType": "NameError", **"requestId"**: "<request id of function invoke>" "stackTrace": [" File \"/var/task/error_function.py\", line 2, in lambda_handler\n return x + 10\n"]}

Using Python 3.9 in Lambda

You can now use the Python 3.9 runtime to develop your AWS Lambda functions. To use this version, specify a runtime parameter value python3.9 when creating or updating Lambda functions. You can see the new version in the Runtime dropdown in the Create function page.

Create function page

To update an existing Lambda function to Python 3.9, navigate to the function in the Lambda console, then choose Edit in the Runtime settings panel. You see the new version in the Runtime dropdown:

Edit runtime settings

In the AWS Serverless Application Model (AWS SAM), set the Runtime attribute to python3.9 to use this version in your application deployments:

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Simple Lambda Function
  
Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Description: My Python Lambda Function
    Properties:
      CodeUri: my_function/
      Handler: lambda_function.lambda_handler
      Runtime: python3.9

Conclusion

You can now create new functions or upgrade existing Python functions to Python 3.9. Lambda’s support of the Python 3.9 runtime enables you to take advantage of improved performance and new features in this version. Additionally, the Lambda service now runs the __init_.py code before the handler, supports TLS 1.3, and provides enhanced logging for errors.

For more serverless learning resources, visit Serverless Land.

Choosing between AWS services for streaming data workloads

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/choosing-between-aws-services-for-streaming-data-workloads/

Traditionally, streaming data can be complex to manage due to the large amounts of data arriving from many separate sources. Managing fluctuations in traffic and durably persisting messages as they arrive is a non-trivial task. Using a serverless approach, AWS provides a number of services that help manage large numbers of messages, alleviating much of the infrastructure burden.

In this blog post, I compare several AWS services and how to choose between these options in streaming data workloads.

Comparing Amazon Kinesis Data Streams with Amazon SQS queues

While you can use both services to decouple data producers and consumers, each is suited to different types of workload. Amazon SQS is primarily used as a message queue to store messages durably between distributed services. Amazon Kinesis is primarily intended to manage streaming big data.

Kinesis supports ordering of records and the ability for multiple consumers to read messages from the same stream concurrently. It also allows consumers to replay messages from up to 7 days previously. Scaling in Kinesis is based upon shards and you must reshard to scale a data stream up or down.

With SQS, consumers pull data from a queue and it’s hidden from other consumers until processed successfully (known as a visibility timeout). Once a message is processed, it’s deleted from the queue. A queue may have multiple consumers but they all receive separate batches of messages. Standard queues do not provide an ordering guarantee but scaling in SQS is automatic.

Amazon Kinesis Data Streams

Amazon SQS

Ordering guarantee Yes, by shard No for standard queues; FIFO queues support ordering by group ID.
Scaling Resharding required to provision throughput Automatic for standard queues; up to 30,000 message per second for FIFO queues (more details).
Exactly-once delivery No No for standard queues; Yes for FIFO queues.
Consumer model Multiple concurrent Single consumer
Configurable message delay No Up to 15 minutes
Ability to replay messages Yes No
Encryption Yes Yes
Payload maximum 1 MB per record 256 KB per message
Message retention period 24 hours (default) to 365 days (additional charges apply) 1 minute to 14 days. 4 days is the default
Pricing model Per shard hour plus PUT payload unit per million. Additional charges for some features No minimum; $0.24-$0.595 per million messages, depending on Region and queue type
AWS Free Tier included No Yes, 1 million messages per month – see details
Typical use cases

Real-time metrics/reporting

Real-time data analytics

Log and data feed processing

Stream processing

Application integration

Asynchronous processing

Batching messages/smoothing throughput

Integration with Kinesis Data Analytics Yes No
Integration with Kinesis Data Firehose Yes No

While some functionality of both services is similar, Kinesis is often a better fit for many streaming workloads. Kinesis has a broader range of options for ingesting large amounts of data, such as the Kinesis Producer Library and Kinesis Aggregation Library. You can also use the PutRecords API to send up to 500 records (up to a maximum 5 MiB) per request.

Additionally, it has powerful integrations not available to SQS. Amazon Kinesis Data Analytics allows you to transform and analyze streaming data with Apache Flink. You can also use streaming SQL to run windowed queries on live data in near-real time. You can also use Amazon Kinesis Data Firehose as a consumer for Amazon Kinesis Data Streams, which is also not available to SQS queues.

Choosing between Kinesis Data Streams and Kinesis Data Firehose

Both of these services are part of Kinesis but they have different capabilities and target use-cases. Kinesis Data Firehose is a fully managed service that can ingest gigabytes of data from a variety of producers. When Kinesis Data Streams is the source, it automatically scales up or down to match the volume of data. It can optionally process records as they arrive with AWS Lambda and deliver batches of records to services like Amazon S3 or Amazon Redshift. Here’s how the service compares with Kinesis Data Streams:

Kinesis Data Streams

Kinesis Data Firehose

Scaling Resharding required Automatic
Supports compression No Yes (GZIP, ZIP, and SNAPPY)
Latency ~200 ms per consumer (~70 ms if using enhanced fan-out) Seconds (depends on buffer size configuration); minimum buffer window is 60 seconds
Retention 1–365 days None
Message replay Yes No
Quotas See quotas See quotas
Ingestion capacity Determined by number of shards (1,000 records or 1 MB/s per shard) No limit if source is Kinesis Data Streams; otherwise see quota page
Producer types

AWS SDK or AWS CLI

Kinesis Producer Library

Kinesis Agent

Amazon CloudWatch

Amazon EventBridge

AWS IoT Core

AWS SDK or AWS CLI

Kinesis Producer Library

Kinesis Agent

Amazon CloudWatch

Amazon EventBridge

AWS IoT Core

Kinesis Data Streams

Number of consumers Multiple, sharing 2 MB per second per shard throughput One per delivery stream
Consumers

AWS Lambda

Kinesis Data Analytics

Kinesis Data Firehose

Kinesis Client Library

Amazon S3

Amazon Redshift

Amazon Elasticsearch Service

Third-party providers

HTTP endpoints

Pricing Hourly charge plus data volume. Some features have additional charges – see pricing Based on data volume, format conversion and VPC delivery – see pricing

The needs of your workload determine the choice between the two services. To prepare and load data into a data lake or data store, Kinesis Data Firehose is usually the better choice. If you need low latency delivery of records and the ability to replay data, choose Kinesis Data Streams.

Using Kinesis Data Firehose to prepare and load data

Kinesis Data Firehose buffers data based on two buffer hints. You can configure a time-based buffer from 1-15 minutes and a volume-based buffer from 1-128 MB. Whichever limit is reached first causes the service to flush the buffer. These are called hints because the service can adjust the settings if data delivery falls behind writing to the stream. The service raises the buffer settings dynamically to allow the service to catch up.

This is the flow of data in Kinesis Data Firehose from a data source through to a destination, including optional settings for a delivery stream:

Kinesis Dat Firehose flow

  1. The service continuously loads from the data source as it arrives.
  2. The data transformation Lambda function processes individual records and returns these to the service.
  3. Transformed records are delivered to the destination once the buffer size or buffer window is reached.
  4. Any records that could not be delivered to the destination are written to an intermediate S3 bucket.
  5. Any records that cannot be transformed by the Lambda function are written to an intermediate S3 bucket.
  6. Optionally, the original, untransformed records are written to an S3 bucket.

Data transformation using a Lambda function

The data transformation process enables you to modify the contents of individual records. Kinesis Data Firehose synchronously invokes the Lambda function with a batch of records. Your custom code modifies the records and then returns an array of transformed records.

Transformed records

The incoming payload provides the data attribute in base64 encoded format. Once the transformation is complete, the returned array must include the following attributes per record:

  • recordId: This must match the incoming recordId to enable the service to map the new data to the record.
  • result: “Ok”, “Dropped”, or “ProcessingFailed”. Dropped means that your logic has intentionally removed the record whereas ProcessingFailed indicates that an error has occurred.
  • data: The transformed data must be base64 encoded.

The returned array must be the same length as the incoming array. The Alleycat example application uses the following code in the data transformation function to add a calculated field to the record:

exports.handler = async (event) => {
    const output = event.records.map((record) => {
        
      // Extract JSON record from base64 data
      const buffer = Buffer.from(record.data, 'base64').toString()
      const jsonRecord = JSON.parse(buffer)

	// Add the calculated field
	jsonRecord.output = ((jsonRecord.cadence + 35) * (jsonRecord.resistance + 65)) / 100

	// Convert back to base64 + add a newline
	const dataBuffer = Buffer.from(JSON.stringify(jsonRecord) + '\n', 'utf8').toString('base64')

       return {
            recordId: record.recordId,
            result: 'Ok',
            data: dataBuffer
        }
    })
    
    console.log(`Output records: ${output.length}`)
    return { records: output }
}

Comparing scaling and throughput with Kinesis Data Streams and Kinesis Data Firehose

Kinesis Data Firehose manages scaling automatically. If the data source is a Kinesis Data Stream, there is no limit to the amount of data the service can ingest. If the data source is a direct put using the PutRecordBatch API, there are soft limits of up to 500,000 records per second, depending upon the Region. See the Kinesis Data Firehose quota page for more information.

Kinesis Data Firehose invokes a Lambda transformation function synchronously and scales up the function as the number of records in the stream grows. When the destination is S3, Amazon Redshift, or the Amazon Elasticsearch Service, Kinesis Data Firehose allows up to five outstanding Lambda invocations per shard. When the destination is Splunk, the quota is 10 outstanding Lambda invocations per shard.

With Kinesis Data Firehose, the buffer hints are the main controls for influencing the rate of data delivery. You can decide between more frequent delivery of small batches of message or less frequent delivery of larger batches. This can impact the PUT cost when using a destination like S3. However, this service is not intended to provide real-time data delivery due to the transformation and batching processes.

With Kinesis Data Streams, the number of shards in a stream determines the ingestion capacity. Each shard supports ingesting up to 1,000 messages or 1 MB per second of data. Unlike Kinesis Data Firehose, this service does not allow you to transform records before delivery to a consumer.

Data Streams has additional capabilities for increasing throughput and reducing the latency of data delivery. The service invokes Lambda consumers every second with a configurable batch size of messages. If the consumers are falling behind data production in the stream, you can increase the parallelization factor. By default, this is set to 1, meaning that each shard has a single instance of a Lambda function it invokes. You can increase this up to 10 so that multiple instances of the consumer function process additional batches of messages.

Increase the parallelization factor

Data Streams consumers use a pull model over HTTP to fetch batches of records, operating in serial. A stream with five standard consumers averages 200 ms of latency each, taking up to 1 second in total. You can improve the overall latency by using enhanced fan-out (EFO). EFO consumers use a push model over HTTP/2 and are independent of each other.

With EFO, all five consumers in the previous example receive batches of messages in parallel using dedicated throughput. The overall latency averages 70 ms and typically data delivery speed is improved by up to 65%. Note that there is an additional charge for this feature.

Kinesis Data Streams EFO consumers

Conclusion

This blog post compares different AWS services for handling streaming data. I compare the features of SQS and Kinesis Data Streams, showing how ordering, ingestion throughput, and multiple consumers often make Kinesis the better choice for streaming workloads.

I compare Data Streams and Kinesis Data Firehose and show how Kinesis Data Firehose is the better option for many data loading operations. I show how the data transformation process works and the overall workflow of a Kinesis Data Firehose stream. Finally, I compare the scaling and throughput options for these two services.

For more serverless learning resources, visit Serverless Land.

Building a serverless multiplayer game that scales: Part 2

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-a-serverless-multiplayer-game-that-scales-part-2/

This post is written by Vito De Giosa, Sr. Solutions Architect and Tim Bruce, Sr. Solutions Architect, Developer Acceleration.

This series discusses solutions for scaling serverless games, using the Simple Trivia Service, a game that relies on user-generated content. Part 1 describes the overall architecture, how to deploy to your AWS account, and different communications methods.

This post discusses how to scale via automation and asynchronous processes. You can use automation to minimize the need to scale personnel to review player-generated content for acceptability. It also introduces asynchronous processing, which allows you to run non-critical processes in the background and batch data together. This helps to improve resource usage and game performance. Both scaling techniques can also reduce overall spend.

To set up the example, see the instructions in the GitHub repo and the README.md file. This example uses services beyond the AWS Free Tier and incurs charges. Instructions to remove the example application from your account are also in the README.md file.

Technical implementation

Games require a mechanism to support auto-moderated avatars. Specifically, this is an upload process to allow the player to send the content to the game. There is a content moderation process to remove unacceptable content and a messaging process to provide players with a status regarding their content.

Here is the architecture for this feature in Simple Trivia Service, which is combined within the avatar workflow:

Architecture diagram

This architecture processes images uploaded to Amazon S3 and notifies the user of the processing result via HTTP WebPush. This solution uses AWS Serverless services and the Amazon Rekognition moderation API.

Uploading avatars

Players start the process by uploading avatars via the game client. Using presigned URLs, the client allows players to upload images directly to S3 without sharing AWS credentials or exposing the bucket publicly.

The URL embeds all the parameters of the S3 request. It includes a SignatureV4 generated with AWS credentials from the backend allowing S3 to authorize the request.

S3 upload process

  1. The front end retrieves the presigned URL invoking an AWS Lambda function through an Amazon API Gateway HTTP API endpoint.
  2. The front end uses the URL to send a PUT request to S3 with the image.

Processing avatars

After the upload completes, the backend performs a set of activities. These include content moderation, generating the thumbnail variant, and saving the image URL to the player profile. AWS Step Functions orchestrates the workflow by coordinating tasks and integrating with AWS services, such as Lambda and Amazon DynamoDB. Step Functions enables creating workflows without writing code and handles errors, retries, and state management. This enables traffic control to avoid overloading single components when traffic surges.

The avatar processing workflow runs asynchronously. This allows players to play the game without being blocked and enables you to batch the requests. The Step Functions workflow is triggered from an Amazon EventBridge event. When the user uploads an image to S3, an event is published to EventBridge. The event is routed to the avatar processing Step Functions workflow.

The single avatar feature runs in seconds and uses Step Functions Express Workflows, which are ideal for high-volume event-processing use cases. Step Functions can also support longer running processes and manual steps, depending on your requirements.

To keep performance at scale, the solution adopts four strategies. First, it moderates content automatically, requiring no human intervention. This is done via Amazon Rekognition moderation API, which can discover inappropriate content in uploaded avatars. Developers do not need machine learning expertise to use this API. If it identifies unacceptable content, the Step Functions workflow deletes the uploaded picture.

Second, it uses avatar thumbnails on the top navigation bar and on leaderboards. This speeds up page loading and uses less network bandwidth. Image-editing software runs in a Lambda function to modify the uploaded file and store the result in S3 with the original.

Third, it uses Amazon CloudFront as a content delivery network (CDN) with the S3 bucket hosting images. This improves performance by implementing caching and serving static content from locations closer to the player. Additionally, using CloudFront allows you to keep the bucket private and provide greater security for the content stored within S3.

Finally, it stores profile picture URLs in DynamoDB and replicates the thumbnail URL in an Amazon Cognito user attribute named picture. This allows the game to retrieve the avatar URL as part of the login process, saving an HTTP GET request for the player profile.

The last step of the workflow publishes the result via an event to EventBridge for downstream systems to consume. The service routes the event to the notification component to inform the player about the moderation status.

Notifying users of the processing result

The result of the avatar workflow to the player is important but not urgent. Players want to know the result but not impact their gameplay experience. A solution for this challenge is to use HTTP web push. It uses the HTTP protocol and does not require a constant communication channel between backend and front end. This allows players to play games without being blocked or by introducing latency to the game communications channel.

Applications requiring low latency fully bidirectional communication, such as highly interactive multi-player games, typically use WebSockets. This creates a persistent two-way channel for front end and backend to exchange information. The web push mechanism can provide non-urgent data and messages to the player without interrupting the WebSockets channel.

The web push protocol describes how to use a consolidated push service as a broker between the web-client and the backend. It accepts subscriptions from the client and receives push message delivery requests from the backend. Each browser vendor provides a push service implementation that is compliant with the W3C Push API specification and is external to both client and backend.

The web client is typically a browser where a JavaScript application interacts with the push service to subscribe and listen for incoming notifications. The backend is the application that notifies the front end. Here is an overview of the protocol with all the parties involved.

Notification process

  1. A component on the client subscribes to the configured push service by sending an HTTP POST request. The client keeps a background connection waiting for messages.
  2. The push service returns a URL identifying a push resource that the client distributes to backend applications that are allowed to send notifications.
  3. Backend applications request a message delivery by sending an HTTP POST request to the previously distributed URL.
  4. The push service forwards the information to the client.

This approach has four advantages. First, it reduces the effort to manage the reliability of the delivery process by off-loading it to an external and standardized component. Second, it minimizes cost and resource consumption. This is because it doesn’t require the backend to keep a persistent communication channel or compute resources to be constantly available. Third, it keeps complexity to a minimum because it relies on HTTP only without requiring additional technologies. Finally, HTTP web push addresses concepts such as message urgency and time-to-live (TTL) by using a standard.

Serverless HTTP web push

The implementation of the web push protocol requires the following components, per the Push API specification. First, the front end is required to create a push subscription. This is implemented through a service worker, a script running in the origin of the application. The service worker exposes operations to access the push service either creating subscriptions or listening for push events.

Serverless HTTP web push

  1. The client uses the service worker to subscribe to the push service via the Push API.
  2. The push service responds with a payload including a URL, which is the client’s push endpoint. The URL is used to create notification delivery requests.
  3. The browser enriches the subscription with public cryptographic keys, which are used to encrypt messages ensuring confidentiality.
  4. The backend must receive and store the subscription for when a delivery request is made to the push service. This is provided by API Gateway, Lambda, and DynamoDB. API Gateway exposes an HTTP API endpoint that accepts POST requests with the push service subscription as payload. The payload is stored in DynamoDB alongside the player identifier.

This front end code implements the process:

//Once service worker is ready
navigator.serviceWorker.ready
  .then(function (registration) {
    //Retrieve existing subscription or subscribe
    return registration.pushManager.getSubscription()
      .then(async function (subscription) {
        if (subscription) {
          console.log('got subscription!', subscription)
          return subscription;
        }
        /*
         * Using Public key of our backend to make sure only our
         * application backend can send notifications to the returned
         * endpoint
         */
        const convertedVapidKey = self.vapidKey;
        return registration.pushManager.subscribe({
          userVisibleOnly: true,
          applicationServerKey: convertedVapidKey
        });
      });
  }).then(function (subscription) {
    //Distributing the subscription to the application backend
    console.log('register!', subscription);
    const body = JSON.stringify(subscription);
    const parms = {jwt: jwt, playerName: playerName, subscription: body};
    //Call to the API endpoint to save the subscription
    const res = DataService.postPlayerSubscription(parms);
    console.log(res);
  });

 

Next, the backend reacts to the avatar workflow completed custom event to create a delivery request. This is accomplished with EventBridge and Lambda.

Backend process after avater workflow completed

  1. EventBridge routes the event to a Lambda function.
  2. The function retrieves the player’s agent subscriptions, including push endpoint and encryption keys, from DynamoDB.
  3. The function sends an HTTP POST to the push endpoint with the encrypted message as payload.
  4. When the push service delivers the message, the browser activates the service worker updating local state and displaying the notification.

The push service allows creating delivery requests based on the knowledge of the endpoint and the front end allows the backend to deliver messages by distributing the endpoint. HTTPS provides encryption for data in transit while DynamoDB encrypts all your data at rest to provide confidentiality and security for the endpoint.

Security of WebPush can be further improved by using Voluntary Application Server Identification (VAPID). With WebPush, the clients authenticate messages at delivery time. VAPID allows the push service to perform message authentication on behalf of the web client avoiding denial-of-service risk. Without the additional security of VAPID, any application knowing the push service endpoint might successfully create delivery requests with an invalid payload. This can cause the player’s agent to accept messages from unauthorized services and, possibly, cause a denial-of-service to the client by overloading its capabilities.

VAPID requires backend applications to own a key pair. In Simple Trivia Service, a Lambda function, which is an AWS CloudFormation custom resource, generates the key pair when deploying the stack. It securely saves values in AWS System Manager (SSM) Parameter Store.

Here is a representation of VAPID in action:

VAPID process architecture

  1. The front end specifies which backend the push service can accept messages from. It does this by including the public key from VAPID in the subscription request.
  2. When requesting a message delivery, the backend self-identifies by including the public key and a token signed with the private key in the HTTP Authorization header. If the keys match and the client uses the public key at subscription, the message is sent. If not, the message is blocked by the push service.

The Lambda function that sends delivery requests to the push service reads the key values from SSM. It uses them to generate the Authorization header to include in the request, allowing for successful delivery to the client endpoint.

Conclusion

This post shows how you can add scaling support for a game via automation. The example uses Amazon Rekognition to check images for unacceptable content and uses asynchronous architecture patterns with Step Functions and HTTP WebPush. These scaling approaches can help you to maximize your technical and personnel investments.

For more serverless learning resources, visit Serverless Land.

Automating Amazon CloudWatch dashboards and alarms for Amazon Managed Workflows for Apache Airflow

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/automating-amazon-cloudwatch-dashboards-and-alarms-for-amazon-managed-workflows-for-apache-airflow/

This post is written by Mark Richman, Senior Solutions Architect.

Amazon Managed Workflows for Apache Airflow (MWAA) is a fully managed service that makes it easier to run open-source versions of Apache Airflow on AWS. It allows you to build workflows to run your extract-transform-load (ETL) jobs and data pipelines.

When working with MWAA, you often need to know more about the performance of your Airflow environment to achieve full insight and observability of your system. Airflow emits a number of useful metrics to Amazon CloudWatch, which are described in the product documentation. MWAA allows customers to define CloudWatch dashboards and alarms based upon the metrics and logs that Apache Airflow emits.

Airflow exposes metrics such as number of Directed Acyclic Graph (DAG) processes, DAG bag size, number of currently running tasks, task failures, and successes. Airflow is already set up to send metrics for an MWAA environment to CloudWatch.

This blog demonstrates a solution that automatically detects any deployed Airflow environments associated with the AWS account. It builds a CloudWatch dashboard and some useful alarms for each. All source code for this blog post is available on GitHub.

You automate the creation of a CloudWatch dashboard, which displays several of these key metrics together with CloudWatch alarms. These alarms receive notifications when the metrics fall outside of the thresholds that you configure, and allow you to perform actions in response.

Prerequisites

Deploying this solution requires:

Overview

Based on AWS serverless services, this solution includes:

  1. An Amazon EventBridge rule that runs on a schedule to invoke an AWS Step Functions workflow.
  2. Step Function orchestrates several AWS Lambda functions to query the existing MWAA environments.
  3. Lambda functions will update the CloudWatch dashboard definition to include metrics such as QueuedTasks, RunningTasks, SchedulerHeartbeat, TasksPending, and TotalParseTime.
  4. CloudWatch alarms are created for unhealthy workers and heartbeat failure across all MWAA environments. These alarms are removed for any nonexistent environments.
  5. Any MWAA environments that no longer exist have their respective CW dashboards removed.

Reference architecture

EventBridge, a serverless event service, is configured to invoke a Step Functions workflow every 10 minutes. You can configure this for your preferred interval. Step Functions invokes a number of Lambda functions in parallel. If a function throws an error, the failed step in the state machine transitions to its respective failed state and the entire workflow ends.

Each of the Lambda functions performs a single task, orchestrated by Step Functions. Each function has a descriptive name for the task it performs in the workflow.

Understanding the CreateDashboardFunction function

When you deploy the AWS SAM template, the SeedDynamoDBFunction Lambda function is invoked. The function populates a DynamoDB table called DashboardTemplateTable with a CloudWatch dashboard definition. This definition is a template for any new CloudWatch dashboard created by CreateDashboardFunction.

You can see this definition in the GitHub repo:

{
  "widgets": [{
      "type": "metric",
      "x": 0,
      "y": 0,
      "width": 12,
      "height": 6,
      "properties": {
        "view": "timeSeries",
        "stacked": true,
        "metrics": [
          [
            "AmazonMWAA",
            "QueuedTasks",
            "Function",
            "Executor",
            "Environment",
            "${EnvironmentName}"
          ]
        ],
        "region": "${AWS::Region}",
        "title": "QueuedTasks ${EnvironmentName}",
        "period": 300
      }
    },
    ...
}

When this Step Functions workflow runs, it runs CreateDashboardFunction. This iterates through all the MWAA environments in the account, creating or updating its corresponding CloudWatch dashboard. You can see the code in /functions/create_dashboard/app.py:

for env in mwaa_environments:
        dashboard_name = f"Airflow-{env}"

        dashboard_body = dashboard_template.replace(
            "${AWS::Region}", os.getenv("AWS_REGION", "us-east-1")
        ).replace("${EnvironmentName}", env)

        logger.info(f"Creating/updating dashboard: {dashboard_name}")
        logger.debug(dashboard_body)

        response = cloudwatch.put_dashboard(
            DashboardName=dashboard_name, DashboardBody=dashboard_body
        )

        logger.info(json.dumps(response, indent=2))

Step Functions state machine

Here is a visualization of the Step Functions state machine:

State machine visualization

Step Functions is based on state machines and tasks. A state machine is a workflow. A task is a state in a workflow that represents a single unit of work that another AWS service performs. Each step in a workflow is a state. The state machine in this solution is defined in JSON format in the repo.

Building and deploying the example application

To build and deploy this solution:

  1. Clone the repo from GitHub:
    git clone https://github.com/aws-samples/mwaa-dashboard
    cd mwaa-dashboard
    
  2. Build the application. Since Lambda functions may depend on packages that have natively compiled programs, use the --use-container flag. This flag compiles your functions locally in a Docker container that behaves like the Lambda environment:
    sam build --use-container
  3. Deploy the application to your AWS account:
    sam deploy --guided

This command packages and deploys the application to your AWS account. It provides a series of prompts:

  • Stack Name: The name of the stack to deploy to AWS CloudFormation. This should be unique to your account and Region. This walkthrough uses mwaa-dashboard throughout this project.
  • AWS Region: The AWS Region you want to deploy your app to.
  • Confirm changes before deploy: If set to yes, any change sets will be shown to you for manual review. If set to no, the AWS SAM CLI automatically deploys application changes.
  • Respond to the remaining prompts per the SAM CLI command reference.

Updating the CloudWatch dashboard template definition in DynamoDB

The CloudWatch dashboard template definition is stored in DynamoDB. This is a one-time setup step, performed by the functions/seed_dynamodb Lambda custom resource at stack deployment time.

To override the template, you can edit the data directly in DynamoDB using the AWS Management Console. Alternatively, modify the scripts/dashboard-template.json file and update DynamoDB using the scripts/seed.py script.

cd scripts
./seed.py -t <dynamodb-table-name>
cd ..

Here, <dynamodb-table-name> is the name of the DynamoDB table created during deployment. For example:

./seed.py -t mwaa-dashboard-DashboardTemplateTable-VA2M5945RCF1

Viewing the CloudWatch dashboards and alarms

If you have any existing MWAA environments, or create new ones, a dashboard for each appears with the Airflow- prefix. If you delete an MWAA environment, the corresponding dashboard is also deleted. No CloudWatch metrics are deleted.

Upon successful completion of the Step Functions workflow, you see a list of custom dashboards. There is one for each of your MWAA environments:

Custom dashboards view

Choosing the dashboard name displays the widgets defined in the JSON described previously. Each widget corresponds to an Airflow key performance indicator (KPI). The dashboards can be customized through the AWS Management Console without any code changes.

Example dashboard

These are the metrics:

  • QueuedTasks: The number of tasks with queued state. Corresponds to the executor.queued_tasks Airflow metric.
  • TasksPending: The number of tasks pending in executor. Corresponds to the scheduler.tasks.pending Airflow metric.
  • RunningTasks: The number of tasks running in executor. Corresponds to the executor.running_tasks Airflow metric.
  • SchedulerHeartbeat: The number of check-ins Airflow performs on the scheduler job. Corresponds to the scheduler_heartbeat Airflow metrics.
  • TotalParseTime: Number of seconds taken to scan and import all DAG files once. Corresponds to the dag_processing.total_parse_time Airflow metric.

More information on all MWAA metrics available in CloudWatch can be found in the documentation. CloudWatch alarms are also created for each MWAA environment:

CloudWatch alarms list

By default, you create two alarms:

  • {environment name}-UnhealthyWorker: This alarm triggers if the number of QueuedTasks is greater than the number of RunningTasks, and the number of RunningTasks is zero, for a period of 15 minutes.
  • {environment name}-HeartbeatFail: This alarm triggers if SchedulerHeartbeat is zero for a period of five minutes.

You can configure actions in response to these alarms, such as an Amazon SNS notification to email or a Slack message.

Cleaning up

After testing this application, delete the resources created to avoid ongoing charges. You can use the AWS CLI, AWS Management Console, or the AWS APIs to delete the CloudFormation stack deployed by AWS SAM.

To delete the stack via the AWS CLI, run the following command:

aws cloudformation delete-stack --stack-name mwaa-dashboard

The log groups all share the prefix /aws/lambda/mwaa-dashboard. Delete these with the command:

aws logs delete-log-group --log-group-name <log group>

Conclusion

With Amazon MWAA, you can spend more time building workflows and less time managing and scaling infrastructure. This article shows a serverless example that automatically creates CloudWatch dashboards and alarms for all existing and new MWAA environments. With this example, you can achieve better observability for your MWAA environments.

To get started with MWAA, visit the user guide. To deploy this solution in your own AWS account, visit the GitHub repo for this article.

For more serverless learning resources, visit Serverless Land.

Creating a single-table design with Amazon DynamoDB

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/creating-a-single-table-design-with-amazon-dynamodb/

Amazon DynamoDB is a highly performant NoSQL database that provides data storage for many serverless applications. Unlike traditional SQL databases, it does not use table joins and other relational database constructs. However, you can model many common relational designs in a single DynamoDB table but the process is different using a NoSQL approach.

This blog post uses the Alleycat racing application to explain the benefits of a single-table DynamoDB table. It also shows how to approach modeling data access requirements in a DynamoDB table. Alleycat is a home fitness system that allows users to compete in an intense series of 5-minute virtual bicycle races. Up to 1,000 racers at a time take the saddle and push the limits of cadence and resistance to set personal records and rank on virtual leaderboards.

Alleycat requirements

In the Alleycat example, the application offers a number of exercise classes. Each class has multiple races, and there are multiple racers in each race. The system logs the output for each racer per second of the race. An entity-relationship diagram in a traditional relational database shows how you could use normalized tables and relationships to store this data:

Relational model for Alleycat

In a relational database, often each table has a key that relates to a foreign key in another table. By joining multiple tables, you can query related tables and return the results in a single table view. While this is flexible and convenient, it’s also computationally expensive and difficult to scale horizontally.

Many serverless architectures are built for scale and the relational database paradigm often does not scale as efficiently as a workload demands. DynamoDB scales to almost any level of traffic but one of the tradeoffs is the lack of joins. Fortunately, it offers alternative ways to model the data to meet Alleycat’s requirements.

DynamoDB terminology and concepts

Unlike traditional databases, there is no limit to how much data can be stored in a DynamoDB table. The service is also designed to provide predictable performance at any scale, so you can expect similar query latency regardless of the level of traffic.

The most important operational aspect of running DynamoDB in production is setting and managing throughput. There is a provisioned mode, where you set the throughput, and on-demand, which is managed by the service. In the provisioned mode, you can also use automatic scaling to let the service set the throughput between lower and upper limits you define.

The choice here is determined by the traffic patterns in your workload. For applications with predictable traffic with gradual changes, provisioned mode is the better choice and is more cost effective. If traffic patterns are unknown or you prefer to have capacity managed automatically, choose on-demand. To learn more about the capacity modes, visit the documentation page.

Within each table, you must have a partition key, which is a string, numeric, or binary value. This key is a hash value used to locate items in constant time regardless of table size. It is conceptually different to an ID or primary key field in a SQL-based database and does not relate to data in other tables. When there is only a partition key, these values must be unique across items in a table.

Each table can optionally have a sort key. This allows you to search and sort within items that match a given primary key. While you must search on exact single values in the partition key, you can pattern search on sort keys. It’s common to use a numeric sort key with timestamps to find items within a date range, or use string search operators to find data in hierarchical relationships.

With only partition key and sort keys, this limits the possible types of query without duplicating data in a table. To solve this issue, DynamoDB also offers two types of indexes:

  • Local secondary indexes (LSIs): these must be created at the same time the table is created and effectively enable another sort key using the same partition key.
  • Global secondary indexes (GSIs): create and delete these at any time, and optionally use a different partition key from the existing table.

There are other important differences between the two index types:

LSI

GSI

Create At table creation Anytime
Delete At table deletion Anytime
Size Up to 10 GB per partition Unlimited
Throughput Shared with table Separate throughput
Key type Primary key only or composite key (partition key and sort key) Composite key only
Consistency model Both eventual and strong consistency Eventual consistency only

Determining data access requirements

Relational database design focuses on the normalization process without regard to data access patterns. However, designing NoSQL data schemas starts with the list of questions the application must answer. It’s important to develop a list of data access patterns before building the schema, since NoSQL databases offer less dynamic query flexibility than their SQL equivalents.

To determine data access patterns in new applications, user stories and use-cases can help identify the types of query. If you are migrating an existing application, use the query logs to identify the typical queries used. In the Alleycat example, the frontend application has the following queries:

  1. Get the results for each race by racer ID.
  2. Get a list of races by class ID.
  3. Get the best performance by racer for a class ID.
  4. Get the list of top scores by race ID.
  5. Get the second-by-second performance by racer for all races.

While it’s possible to implement the design with multiple DynamoDB tables, it’s unnecessary and inefficient. A key goal in querying DynamoDB data is to retrieve all the required data in a single query request. This is one of the more difficult conceptual ideas when working with NoSQL databases but the single-table design can help simplify data management and maximize query throughput.

Modeling many-to-many relationships with DynamoDB

In traditional SQL, a many-to-many relationship is classically represented with three tables. In the earlier diagram for the Alleycat application, these tables are racers, raceResults, and races. Populated with sample data, the tables look like this:

Relational tables with data

In DynamoDB, the adjacency list design pattern enables you to combine multiple SQL-type tables into a single NoSQL table. It has multiple uses but in this case can model many-to-many relationships. To do this, the partition key contains both types of item – races and racers. The key value contains the type of data expected in the item (for example, “race-1” or “racer-2”):

Equivalent data structure in DynamoDB

With this table design, you can query by racer ID or by race ID. For a single race, you can query by partition key to return all results for a single race, or use the sort key to limit by a single racer or for the overall results. For per racer results, the second-by-second data is stored in a nested JSON structure.

To allow sorting by output to create leaderboard results, the output value must be a sort key. However, the sort key cannot be updated once it is set. Using the main sort key, the application would only be able to write a final race result per racer to query and sort on this data.

To resolve this problem, use an index. The index can use a separate sort key where the value can be updated. This allows Alleycat to store the latest results in this field, and then for queries to sort by output to create a leaderboard.

The preceding table does not represent the races table in the normalized view, so you cannot query by class ID to retrieve a list of races. Depending on your design, you can solve this by adding a second index to the table to enable querying by class ID and returning a list of partition keys (race IDs). However, you can also overload GSIs to contain multiple types of value.

The AlleyCat application uses both an LSI and GSI to accommodate all the data access patterns. This table shows how this is modeled, although the results attribute names are shorter in the application:

Data modeled with LSI and GSI

  • Main composite key: PK and SK.
  • Local secondary index: Partition key is PK and sort key is Numeric.
  • Global secondary index: Partition key is SK and sort key is Numeric.

Reviewing the data access patterns for Alleycat

Before creating the DynamoDB table, test the proposed schema against the list of data access patterns. In this section, I review Alleycat’s list of queries to ensure that each is supported by the table schema. I use the Item explorer feature to run queries against a test table, after running the Alleycat simulator for multiple races.

1. Get the results for each race by racer ID

Use the table’s partition key, searching for PK = racer ID. This returns a list of all races (PK) for a given racer. See the updateRaceResults function for an example of how this is used:

Results by racer ID

2. Get a list of races by class ID

Use the local secondary index, searching for partition key = class ID. This results in a list of races (PK) for a given class ID. See the getRaces function code for an example of this query:

Results by class ID

3. Get the best performance by racer for a class ID.

Use the table’s partition key, searching for PK = class ID. This returns a list of racers and their best outputs for the given class ID. See the getLeaderboard function code for an example of this query:

Best performance by racer for a class ID

4. Get the list of top scores by race ID.

Use the global secondary index, searching for PK = race ID, sorting by the GSI sort key (descending) to rank the results. This returns a sorted list of results for a race. See the updateRaceResults function for an example of how this is used:

Top scored by race ID

5. Get the second-by-second performance by racer for all races.

Use the main table index, searching for PK = racer ID. Optionally use the sort key to restrict to a single race. This returns items with second-by-second performance stored in a nested JSON attribute. See the loadRealtimeHistory function for an example of how this is used:

Second-by-second performance for all racers

Optimizing items and capacity

In the Alleycat application, races are only 5 minutes long so the results attribute only contains 300 separate data points (once per second). By using a nested JSON structure in the items, the schema flattens data that otherwise would use 300 rows in the earlier SQL-based design.

The maximum item size in DynamoDB is 400 KB, which includes attribute names. If you have many more data points, you may reach this limit. To work around this, split the data across multiple items and provide the item order in the sort key. This way, when your application retrieves the items, it can reassemble the attributes to create the original dataset.

For example, if races in Alleycat were an hour long, there would be 3,600 data points. These may be stored in six rows containing 600 second-by-second results each:

Data set split across multiple items

Additionally, to maximize the storage per row, choose short attribute names. You can also compress data in attributes by storing as GZIP output instead of raw JSON, and using a binary data type for the attribute. This increases processing for the producing and consuming applications, which must compress and decompress the items. However, it can significantly increase the amount of data stored per row.

To learn more, read Best practices for storing large items and attributes.

Conclusion

This post looks at implementing common relational database patterns using DynamoDB. Instead of using multiple tables, the single-table design pattern can use adjacency lists to provide many-to-many relational functionality.

Using the Alleycat example, I show how to list the data access patterns required by an application, and then model the data using composite keys and indexes to return the relevant data using single queries. Finally, I show how to optimize items and capacity for workloads storing large amounts of data.

For more serverless learning resources, visit Serverless Land.

Understanding data streaming concepts for serverless applications

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/understanding-data-streaming-concepts-for-serverless-applications/

Amazon Kinesis is a suite of managed services that can help you collect, process, and analyze streaming data in near-real time. It consists of four separate services that are designed for common tasks with streaming data: This blog post focuses on Kinesis Data Streams.

One of the main benefits of processing streaming data is that an application can react as new data is generated, instead of waiting for batches. This real-time capability enables new functionality for applications. For example, payment processors can analyze payments in real time to detect fraudulent transactions. Ecommerce websites can use streams of clickstream activity to determine site engagement metrics in near-real time.

Kinesis can be used with Amazon EC2-based and container-based workloads. However, its integration with AWS Lambda can make it a useful data source for serverless applications. Using Lambda as a stream consumer can also help minimize the amount of operational overhead for managing streaming applications.

In this post, I explain important streaming concepts and how they affect the design of serverless applications. This post references the Alleycat racing application. Alleycat is a home fitness system that allows users to compete in an intense series of 5-minute virtual bicycle races. Up to 1,000 racers at a time take the saddle and push the limits of cadence and resistance to set personal records and rank on leaderboards. The Alleycat software connects the stationary exercise bike with a backend application that processes the data from thousands of remote devices.

The Alleycat frontend allows users to configure their races and view real-time leaderboard and historical rankings. The frontend could wait until the end of each race and collect the total output from each racer. Once the batch is ready, it could rank the results and provide a leaderboard once the race is completed. However, this is not very engaging for competitors. By using streaming data instead of a batch, the application show the racers a view of who is winning during the race. This makes the virtual environment more like a real-life cycling race.

Producers and consumers

In streaming data workloads, producers are the applications that produce data and consumers are those that process it. In a serverless streaming application, a consumer is usually a Lambda function, Amazon Kinesis Data Firehose, or Amazon Kinesis Data Analytics.

Kinesis producers and consumers

There are a number of ways to put data into a Kinesis stream in serverless applications, including direct service integrations, client libraries, and the AWS SDK.

Producer

Kinesis Data Streams

Kinesis Data Firehose

Amazon CloudWatch Logs Yes, using subscription filters Yes, using subscription filters
AWS IoT Core Yes, using IoT rule actions Yes, using IoT rule actions
AWS Database Migration Service Yes – set stream as target Not directly.
Amazon API Gateway Yes, via REST API direct service integration Yes, via REST API direct service integration
AWS Amplify Yes – via JavaScript library Not directly
AWS SDK Yes Yes

A single stream may have tens of thousands of producers, which could be web or mobile applications or IoT devices. The Alleycat application uses the AWS IoT SDK for JavaScript to publish messages to an IoT topic. An IoT rule action then uses a direct integration with Kinesis Data Streams to push the data to the stream. This configuration is ideal at the device level, especially since the device may already use AWS IoT Core to receive messages.

The Alleycat simulator uses the AWS SDK to send a large number of messages to the stream. The SDK provides two methods: PutRecord and PutRecords. The first allows you to send a single record, while the second supports up to 500 records per request (or up to 5 MB in total). The simulator uses the putRecords JavaScript API to batch messages to the stream.

A producer can put records directly on a stream, for example via the AWS SDK, or indirectly via other services such as Amazon API Gateway or AWS IoT Core. If direct, the producer must have appropriate permission to write data to the stream. If indirect, the producer must have permission to invoke the proxy service, and then the service must have permission to put data onto the stream.

While there may be many producers, there are comparatively fewer consumers. You can register up to 20 consumers per data stream, which share the outgoing throughout limit per shard. Consumers receive batches of records sequentially, which means processing latency increases as you add more consumers to a stream. For latency-sensitive applications, Kinesis offers enhanced fan-out which gives consumers 2 MB per second dedicated throughput and uses a push model to reduce latency.

Shards, streams, and partition keys

A shard is a sequence of data records in a stream with a fixed capacity. Part of Kinesis billing is based upon the number of shards. A single shard can process up to 1 MB per second or 1,000 records of incoming data. One shard can also send up to 2 MB per second of outgoing data to downstream consumers. These are hard limits on the throughputs of a shard and as your application approaches these limits, you must add more shards to avoid exceeding these limits.

A stream is a collection of these shards and is often a grouping at the workload or project level. Adding another shard to a stream effectively doubles the throughput, though it also doubles the cost. When there is only one shard in a stream, all records sent to that sent are routed to the same shard. With multiple shards, the routing of incoming messages to shards is determined by a partition key.

The data producer adds the partition key before sending the record to Kinesis. The service calculates an MD5 hash of the key, which maps to one of the shards in the stream. Each shard is assigned a range of non-overlapping hash values, so each partition key maps to only one shard.

MD5 hash function

The partition key exists as an alternative to specifying a shard ID directly, since it’s common in production applications to add and remove shards depending upon traffic. How you use the partition key determines the shard-mapping behavior. For example:

  • Same value: If you specify the same string as the partition key, every message is routed to a single shard, regardless of the number of shards in the stream. This is called overheating a shard.
  • Random value: Using a pseudo-random value, such as a UUID, evenly distributes messages between all the shards available.
  • Time-based: Using a timestamp as a partition key may result in a preference for a single shard if multiple messages arrive at the same time.
  • Applicationspecific: The Alleycat application uses the raceId as a partition key to ensure that all messages from a single race are processed by the same shard consumer.

A Lambda function is a consumer application for a data stream and processes one batch of records for each shard. Since Alleycat uses a tumbling window to calculate aggregates between batches, this use of the partition key ensures that all messages for each raceId are processed by the same function. The downside to this architecture is that it is limited to 1,000 incoming messages per second with the same raceId since it is bound to a single shard.

Deciding on a partition key strategy depends upon the specific needs of your workload. In most cases, a random value partition key is often the best approach.

Streaming payloads in serverless applications

When using the SDK to put messages to a stream, the Data attribute can be a buffer, typed array, blob, or string. Combined with the partition key value, the maximum record size is 1 MB. The Data value is base64 encoded when serialized by Kinesis and delivered as an encoded value to downstream consumers. When using a Lambda function consumer, Kinesis delivers batches in a Records array. Each record contains the encoded data attribute, partition key, and additional metadata in a JSON envelope:

JSON transformation from producer to consumer

Ordering and idempotency

Records in a Kinesis stream are delivered to consuming applications in the same order that they arrive at the Kinesis service. The service assigns a sequence number to the record when it is received and this is delivered as part of the payload to a Kinesis consumer:

Sequence number in payload

When using Lambda as a consuming application for Kinesis, by default each shard has a single instance of the function processing records. In this case, ordering is guaranteed as Kinesis invokes the function serially, one batch of records at a time.

Parallelization factor of 1

You can increase the number of concurrent function invocations by setting the ParallelizationFactor on the event source mapping. This allows you to set a concurrency of between 1 and 10, which provides a way to increase Lambda throughout if the IteratorAge metric is increasing. However, one side effect is that ordering per shard is no longer guaranteed, since the shard’s messages are split into multiple subgroups based upon an internal hash.

Parallelization factor is 2

Kinesis guarantees that every record is delivered “at least once”, but occasionally messages are delivered more than once. This is caused by producers that retry messages, network-related timeouts, and consumer retries, which can occur when worker processes restart. In both cases, these are normal activities and you should design your application to handle infrequent duplicate records.

To prevent the duplicate messages causing unintentional side effects, such as charging a payment twice, it’s important to design your application with idempotency in mind. By using transaction IDs appropriately, your code can determine if a given message has been processed previously, and ignore any duplicates. In the Alleycat application, the aggregation and processes of messages is idempotent. If two identical messages are received, processing completes with the same end result.

To learn more about implementing idempotency in serverless applications, read the “Serverless Application Lens: AWS Well-Architected Framework”.

Conclusion

In this post, I introduce some of the core streaming concepts for serverless applications. I explain some of the benefits of streaming architectures and how Kinesis works with producers and consumers. I compare different ways to ingest data, how streams are composed of shards, and how partition keys determine which shard is used. Finally, I explain the payload formats at the different stages of a streaming workload, how message ordering works with shards, and why idempotency is important to handle.

To learn more about building serverless web applications, visit Serverless Land.

Developing evolutionary architecture with AWS Lambda

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/developing-evolutionary-architecture-with-aws-lambda/

This post was written by Luca Mezzalira, Principal Solutions Architect, Media and Entertainment.

Agility enables you to evolve a workload quickly, adding new features, or introducing new infrastructure as required. The key characteristics for achieving agility in a code base are loosely coupled components and strong encapsulation.

Loose coupling can help improve test coverage and create atomic refactoring. With encapsulation, you expose only what is needed to interact with a service without revealing the implementation logic.

Evolutionary architectures can help achieve agility in your design. In the book “Building Evolutionary Architectures”, this architecture is defined as one that “supports guided, incremental change across multiple dimensions”.

This blog post focuses on how to structure code for AWS Lambda functions in a modular fashion. It shows how to embrace the evolutionary aspect provided by the hexagonal architecture pattern and apply it to different use cases.

Introducing ports and adapters

Hexagonal architecture is also known as the ports and adapters architecture. It is an architectural pattern used for encapsulating domain logic and decoupling it from other implementation details, such as infrastructure or client requests.

Ports and adapters

  1. Domain logic: Represents the task that the application should perform, abstracting any interaction with the external world.
  2. Ports: Provide a way for the primary actors (on the left) to interact with the application, via the domain logic. The domain logic also uses ports for interacting with secondary actors (on the right) when needed.
  3. Adapters: A design pattern for transforming one interface into another interface. They wrap the logic for interacting with a primary or secondary actor.
  4. Primary actors: Users of the system such as a webhook, a UI request, or a test script.
  5. Secondary actors: used by the application, these services are either a Repository (for example, a database) or a Recipient (such as a message queue).

Hexagonal architecture with Lambda functions

Lambda functions are units of compute logic that accomplish a specific task. For example, a function could manipulate data in a Amazon Kinesis stream, or process messages from an Amazon SQS queue.

In Lambda functions, hexagonal architecture can help you implement new business requirements and improve the agility of a workload. This approach can help create separation of concerns and separate the domain logic from the infrastructure. For development teams, it can also simplify the implementation of new features and parallelize the work across different developers.

The following example introduces a service for returning a stock value. The service supports different currencies for a frontend application that displays the information in a dashboard. The translation of a stock value between currencies happens in real time. The service must retrieve the exchange rates with every request made by the client.

The architecture for this service uses an Amazon API Gateway endpoint that exposes a REST API. When the client calls the API, it triggers a Lambda function. This gets the stock value from a DynamoDB table and the currency information from a third-party endpoint. The domain logic uses the exchange rate to convert the stock value to other currencies before responding to the client request.

The full example is available in the AWS GitHub samples repository. Here is the architecture for this service:

Hexagonal architecture example

  1. A client makes a request to the API Gateway endpoint, which invokes the Lambda function.
  2. The primary adapter receives the request. It captures the stock ID and pass it to the port:
    exports.lambdaHandler = async (event) => {
        try{
    	// retrieve the stockID from the request
            const stockID = event.pathParameters.StockID;
    	// pass the stockID to the port
            const response = await getStocksRequest(stockID);
            return response
        } 
    };
  3. The port is an interface for communicating with the domain logic. It enforces the separation between an adapter and the domain logic. With this approach, you can change and test the infrastructure and domain logic in isolation without impacting another part of the code base:
    const retrieveStock = async (stockID) => {
        try{
    	//use the port “stock” to access the domain logic
            const stockWithCurrencies = await stock.retrieveStockValues(stockID)
            return stockWithCurrencies;
        }
    }
    
  4. The port passing the stock ID invokes the domain logic entry point. The domain logic fetches the stock value from a DynamoDB table, then it requests the exchange rates. It returns the computed values to the primary adapter via the port. The domain logic always uses a port to interact with an adapter because the ports are the interfaces with the external world:
    const CURRENCIES = [“USD”, “CAD”, “AUD”]
    const retrieveStockValues = async (stockID) => {
    try {
    //retrieve the stock value from DynamoDB using a port
            const stockValue = await Repository.getStockData(stockID);
    //fetch the currencies value using a port
            const currencyList = await Currency.getCurrenciesData(CURRENCIES);
    //calculate the stock value in different currencies
            const stockWithCurrencies = {
                stock: stockValue.STOCK_ID,
                values: {
                    "EUR": stockValue.VALUE
                }
            };
            for(const currency in currencyList.rates){
                stockWithCurrencies.values[currency] =  (stockValue.VALUE * currencyList.rates[currency]).toFixed(2)
            }
    // return the final computation to the port
            return stockWithCurrencies;
        }
    }
    

 

This is how the domain logic interacts with the DynamoDB table:

DynamoDB interaction

  1. The domain logic uses the Repository port for interacting with the database. There is not a direct connection between the domain and the adapter:
    const getStockData = async (stockID) => {
        try{
    //the domain logic pass the request to fetch the stock ID value to this port
            const data = await getStockValue(stockID);
            return data.Item;
        } 
    }
    
  2. The secondary adapter encapsulates the logic for reading an item from a DynamoDB table. All the logic for interacting with DynamoDB is encapsulated in this module:
    const getStockValue = async (stockID) => {
        let params = {
            TableName : DB_TABLE,
            Key:{
                'STOCK_ID': stockID
            }
        }
        try {
            const stockData = await documentClient.get(params).promise()
            return stockData
        }
    }
    

 

The domain logic uses an adapter for fetching the exchange rates from the third-party service. It then processes the data and responds to the client request:

 

  1. Currencies API interactionThe second operation in the business logic is retrieving the currency exchange rates. The domain logic requests the operation via a port that proxies the request to the adapter:
    const getCurrenciesData = async (currencies) => {
        try{
            const data = await getCurrencies(currencies);
            return data
        } 
    }
    
  2. The currencies service adapter fetches the data from a third-party endpoint and returns the result to the domain logic.
    const getCurrencies = async (currencies) => {
        try{        
            const res = await axios.get(`http://api.mycurrency.io?symbols=${currencies.toString()}`)
            return res.data
        } 
    }
    

These eight steps show how to structure the Lambda function code using a hexagonal architecture.

Adding a cache layer

In this scenario, the production stock service experiences traffic spikes during the day. The external endpoint for the exchange rates cannot support the level of traffic. To address this, you can implement a caching strategy with Amazon ElastiCache using a Redis cluster. This approach uses a cache-aside pattern for offloading traffic to the external service.

Typically, it can be challenging to evolve code to implement this change without the separation of concerns in the code base. However, in this example, there is an adapter that interacts with the external service. Therefore, you can change the implementation to add the cache-aside pattern and maintain the same API contract with the rest of the application:

const getCurrencies = async (currencies) => {
    try{        
// Check the exchange rates are available in the Redis cluster
        let res = await asyncClient.get("CURRENCIES");
        if(res){
// If present, return the value retrieved from Redis
            return JSON.parse(res);
        }
// Otherwise, fetch the data from the external service
        const getCurr = await axios.get(`http://api.mycurrency.io?symbols=${currencies.toString()}`)
// Store the new values in the Redis cluster with an expired time of 20 seconds
        await asyncClient.set("CURRENCIES", JSON.stringify(getCurr.data), "ex", 20);
// Return the data to the port
        return getCurr.data
    } 
}

This is a low-effort change only affecting the adapter. The domain logic and port interacting with the adapter are untouched and maintain the same API contract. The encapsulation provided by this architecture helps to evolve the code base. It also preserves many of the tests in place, considering only an adapter is modified.

Moving domain logic from a container to a Lambda function

In this example, the team working on this workload originally wrap all the functionality inside a container using AWS Fargate with Amazon ECS. In this case, the developers define a route for the GET method for retrieving the stock value:

// This web application uses the Fastify framework 
  fastify.get('/stock/:StockID', async (request, reply) => {
    try{
        const stockID = request.params.StockID;
        const response = await getStocksRequest(stockID);
        return response
    } 
})

In this case, the route’s entry point is exactly the same for the Lambda function. The team does not need to change anything else in the code base, thanks to the characteristics provided by the hexagonal architecture.

This pattern can help you more easily refactor code from containers or virtual machines to multiple Lambda functions. It introduces a level of code portability that can be more challenging with other solutions.

Benefits and drawbacks

As with any pattern, there are benefits and drawbacks to using hexagonal architecture.

The main benefits are:

  • The domain logic is agnostic and independent from the outside world.
  • The separation of concerns increases code testability.
  • It may help reduce technical debt in workloads.

The drawbacks are:

  • The pattern requires an upfront investment of time.
  • The domain logic implementation is not opinionated.

Whether you should use this architecture for developing Lambda functions depends upon the needs of your application. With an evolving workload, the extra implementation effort may be worthwhile.

The pattern can help improve code testability because of the encapsulation and separation of concerns provided. This approach can also be used with compute solutions other than Lambda, which may be useful in code migration projects.

Conclusion

This post shows how you can evolve a workload using hexagonal architecture. It explains how to add new functionality, change underlying infrastructure, or port the code base between different compute solutions. The main characteristics enabling this are loose coupling and strong encapsulation.

To learn more about hexagonal architecture and similar patterns, read:

For more serverless learning resources, visit Serverless Land.

Translating content dynamically by using Amazon S3 Object Lambda

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/translating-content-dynamically-by-using-amazon-s3-object-lambda/

This post is written by Sandeep Mohanty, Senior Solutions Architect.

The recent launch of Amazon S3 Object Lambda creates many possibilities to transform data in S3 buckets dynamically. S3 Object Lambda can be used with other AWS serverless services to transform content stored in S3 in many creative ways. One example is using S3 Object Lambda with Amazon Translate to translate and serve content from S3 buckets on demand.

Amazon Translate is a serverless machine translation service that delivers fast and customizable language translation. With Amazon Translate, you can localize content such as websites and applications to serve a diverse set of users.

Using S3 Object Lambda with Amazon Translate, you do not need to translate content in advance for all possible permutations of source to target languages. Instead, you can transform content in near-real time using a data driven model. This can serve multiple language-specific applications simultaneously.

S3 Object Lambda enables you to process and transform data using Lambda functions as objects are being retrieved from S3 by a client application. S3 GET object requests invoke the Lambda function and you can customize it to transform the content to meet specific requirements.

For example, if you run a website or mobile application with global visitors, you must provide translations in multiple languages. Artifacts such as forms, disclaimers, or product descriptions can be translated to serve a diverse global audience using this approach.

Solution architecture

This is the high-level architecture diagram for the example application that translates dynamic content on demand:

Solution architecture

In this example, you create an S3 Object Lambda that intercepts S3 GET requests for an object. It then translates the file to a target language, passed as an argument appended to the S3 object key. At a high level, the steps can be summarized as follows:

  1. Create a Lambda function to translate data from a source language to a target language using Amazon Translate.
  2. Create an S3 Object Lambda Access Point from the S3 console.
  3. Select the Lambda function created in step 1.
  4. Provide a supporting S3 Access Point to give S3 Object Lambda access to the original object.
  5. Retrieve a file from S3 by invoking the S3 GetObject API, and pass the Object Lambda Access Point ARN as the bucket name instead of the actual S3 bucket name.

Creating the Lambda function

In the first step, you create the Lambda function, DynamicFileTranslation. This Lambda function is invoked by an S3 GET Object API call and it translates the requested object. The target language is passed as an argument appended to the S3 object key, corresponding to the object being retrieved.

For example, for the object key passed in the S3 GetObject API call is customized to look something like “ContactUs/contact-us.txt#fr”, the characters after the pound sign represent the code for the target language. In this case, ‘fr’ is French. The full list of supported languages and language codes can be found here.

This Lambda function dynamically translates the content of an object in S3 to a target language:

import json
import boto3
from urllib.parse import urlparse, unquote
from pathlib import Path
def lambda_handler(event, context):
    print(event)

   # Extract the outputRoute and outputToken from the object context
    object_context = event["getObjectContext"]
    request_route = object_context["outputRoute"]
    request_token = object_context["outputToken"]

   # Extract the user requested URL and the supporting access point arn
    user_request_url = event["userRequest"]["url"]
    supporting_access_point_arn = event["configuration"]["supportingAccessPointArn"]

    print("USER REQUEST URL: ", user_request_url)
   
   # The S3 object key is after the Host name in the user request URL.
   # The user request URL looks something like this, 
   # https://<User Request Host>/ContactUs/contact-us.txt#fr.
   # The target language code in the S3 GET request is after the "#"
    
    user_request_url = unquote(user_request_url)
    result = user_request_url.split("#")
    user_request_url = result[0]
    targetLang = result[1]
    
   # Extract the S3 Object Key from the user requested URL
    s3Key = str(Path(urlparse(user_request_url).path).relative_to('/'))
       
   # Get the original object from S3
    s3 = boto3.resource('s3')
    
   # To get the original object from S3,use the supporting_access_point_arn 
    s3Obj = s3.Object(supporting_access_point_arn, s3Key).get()
    srcText = s3Obj['Body'].read()
    srcText = srcText.decode('utf-8')

   # Translate original text
    translateClient = boto3.client('translate')
    response = translateClient.translate_text(
                                                Text = srcText,
                                                SourceLanguageCode='en',
                                                TargetLanguageCode=targetLang)
    
  # Write object back to S3 Object Lambda
    s3client = boto3.client('s3')
    s3client.write_get_object_response(
                                        Body=response['TranslatedText'],
                                        RequestRoute=request_route,
                                        RequestToken=request_token )
    
    return { 'statusCode': 200 }

The code in the Lambda function:

  • Extracts the outputRoute and outputToken from the object context. This defines where the WriteGetObjectResponse request is delivered.
  • Extracts the user-requested url from the event object.
  • Parses the S3 object key and the target language that is appended to the S3 object key.
  • Calls S3 GetObject to fetch the raw text of the source object.
  • Invokes Amazon Translate with the raw text extracted.
  • Puts the translated output back to S3 using the WriteGetObjectResponse API.

Configuring the Lambda IAM role

The Lambda function needs permissions to call back to the S3 Object Lambda access point with the WriteGetObjectResponse. It also needs permissions to call S3 GetObject and Amazon Translate. Add the following permissions to the Lambda execution role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "S3ObjectLambdaAccess",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3-object-lambda:WriteGetObjectResponse"
            ],
            "Resource": [
                "<arn of your S3 access point/*>”,
                "<arn of your Object Lambda accesspoint>"
            ]
        },
        {
            "Sid": "AmazonTranslateAccess",
            "Effect": "Allow",
            "Action": "translate:TranslateText",
            "Resource": "*"
        }
    ]
}

Deploying the Lambda using an AWS SAM template

Alternatively, deploy the Lambda function with the IAM role by using an AWS SAM template. The code for the Lambda function and the AWS SAM template is available for download from GitHub.

Creating the S3 Access Point

  1. Navigate to the S3 console and create a bucket with a unique name.
  2. In the S3 console, select “Access Points” and choose “Create access point”. Enter a name for the access point.
  3. For Bucket name, enter the S3 bucket name you entered in step 1.Create access point

This access point is the supporting access point for the Object Lambda Access Point you create in the next step. Keep all other settings on this page as default.

After creating the S3 access point, create the S3 Object Lambda Access Point using the supporting Access Point. The Lambda function you created earlier uses the supporting Access Point to download the original untransformed objects from S3.

Create Object Lambda Access Point

In the S3 console, go to the Object Lambda Access Point configuration and create an Object Lambda Access Point. Enter a name.

Create Object Lambda Access Point

For the Lambda function configurations, associate this with the Lambda function created earlier. Select the latest version of the Lambda function and keep all other settings as default.

Select Lambda function

To understand how to use the other settings of the S3 Object Lambda configuration, refer to the product documentation.

Testing dynamic content translation

In this section, you create a Python script and invoke the S3 GetObject API twice. First, against the S3 bucket and then against the Object Lambda Access Point. You can then compare the output to see how content is transformed using Object Lambda:

  1. Upload a text file to the S3 bucket using the Object Lambda Access Point you configured. For example, upload a sample “Contact Us” file in English to S3.
  2. To use the object Lambda Access Point, locate its ARN from the Properties tab of the Object Lambda Access Point.Object Lambda Access Point
  3. Create a local file called s3ol_client.py that contains the following Python script:
    import json
    import boto3
    import sys, getopt
      
    def main(argv):
    	
      try:	
        targetLang = sys.argv[1]
        print("TargetLang = ", targetLang)
        
        s3 = boto3.client('s3')
        s3Bucket = "my-s3ol-bucket"
        s3Key = "ContactUs/contact-us.txt"
    
        # Call get_object using the S3 bucket name
        response = s3.get_object(Bucket=s3Bucket,Key=s3Key)
        print("Original Content......\n")
        print(response['Body'].read().decode('utf-8'))
    
        print("\n")
    
        # Call get_object using the S3 Object Lambda access point ARN
        s3Bucket = "arn:aws:s3-object-lambda:us-west-2:123456789012:accesspoint/my-s3ol-access-point"
        s3Key = "ContactUs/contact-us.txt#" + targetLang
        response = s3.get_object(Bucket=s3Bucket,Key=s3Key)
        print("Transformed Content......\n")
        print(response['Body'].read().decode('utf-8'))
    
        return {'Success':200}             
      except:
        print("\n\nUsage: s3ol_client.py <Target Language Code>")
    
    #********** Program Entry Point ***********
    if __name__ == '__main__':
        main(sys.argv[1: ])
    
  4. Run the client program from the command line, passing the target language code as an argument. The full list of supported languages and codes in Amazon Translate can be found here.python s3ol_client.py "fr"

The output looks like this:

Example output

The first output is the original content that was retrieved when calling GetObject with the S3 bucket name. The second output is the transformed content when calling GetObject against the Object Lambda access point. The content is transformed by the Lambda function as it is being retrieved, and translated to French in near-real time.

Conclusion

This blog post shows how you can use S3 Object Lambda with Amazon Translate to simplify dynamic content translation by using a data driven approach. With user-provided data as arguments, you can dynamically transform content in S3 and generate a new object.

It is not necessary to create a copy of this new object in S3 before returning it to the client. We also saw that it is not necessary for the object with the same name to exist in the S3 bucket when using S3 Object Lambda. This pattern can be used to address several real world use cases that can benefit from the ability to transform and generate S3 objects on the fly.

For more serverless learning resources, visit Serverless Land.

ICYMI: Serverless Q2 2021

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/icymi-serverless-q2-2021/

Welcome to the 14th edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share all of the most recent product launches, feature enhancements, blog posts, webinars, Twitch live streams, and other interesting things that you might have missed!

Q2 calendar

In case you missed our last ICYMI, check out what happened last quarter here.

AWS Step Functions

Step Functions launched Workflow Studio, a new visual tool that provides a drag-and-drop user interface to build Step Functions workflows. This exposes all the capabilities of Step Functions that are available in Amazon States Language (ASL). This makes it easier to build and change workflows and build definitions in near-real time.

For more:

Workflow Studio

The new data flow simulator in the Step Functions console helps you evaluate the inputs and outputs passed through your state machine. It allows you to simulate each of the fields used to process data and updates in real time. It can help accelerate development with workflows and help visualize JSONPath processing.

For more:

Data flow simulator

Also, Amazon API Gateway can now invoke synchronous Express Workflows using REST APIs.

Amazon EventBridge

EventBridge now supports cross-Region event routing from any commercial AWS Region to a list of supported Regions. This feature allows you to centralize global events for auditing and monitoring or replicate events across Regions.

EventBridge cross-Region routing

The service now also supports bus-to-bus event routing in the same Region and in the same AWS account. This can be useful for centralizing events related to a single project, application, or team within your organization.

EventBridge bus-to-bus

You can now use EventBridge as a resource within Step Functions workflows. This provides a direct service integration for both standard and Express Workflows. You can publish events directly to a specified event bus using either a request-response or wait-for-callback pattern.

EventBridge added a new target for rules – Amazon SageMaker Pipelines. This allows you to use a rule to trigger a continuous integration and continuous deployment (CI/CD) service for your machine learning workloads.

AWS Lambda

Lambda Extensions

AWS Lambda extensions are now generally available including some performance and functionality improvements. Lambda extensions provide a new way to integrate your chosen monitoring, observability, security, and governance tools with AWS Lambda. These use the Lambda Runtime Extensions API to integrate with the execution environment and provide hooks into the Lambda lifecycle.

To help build your own extensions, there is an updated GitHub repository with example code.

To learn more:

  • Watch a Tech Talk with Julian Wood.
  • Watch the 8-episode Learning Path series covering all aspects of extensions.

Extensions available today

Amazon CloudWatch Lambda Insights support for Lambda container images is now generally available.

Amazon SNS

Amazon SNS has expanded the set of filter operators available to include IP address matching, existence of an attribute key, and “anything-but” matching.

The service has also introduced an SMS sandbox to help developers testing workloads that send text messages.

To learn more:

Amazon DynamoDB

DynamoDB announced CloudFormation support for several features. First, it now supports configuring Kinesis Data Streams using CloudFormation. This allows you to use infrastructure as code to set up Kinesis Data Streams instead of DynamoDB streams.

The service also announced that NoSQL Workbench now supports CloudFormation, so you can build data models and configure table capacity settings directly from the tool. Finally, you can now create and manage global tables with CloudFormation.

Learn how to use the recently launched Serverless Patterns Collection to configure DynamoDB as an event source for Lambda.

AWS Amplify

Amplify Hosting announced support for server-side rendered (SSR) apps built with the Next.js framework. This provides a zero configuration option for developers to deploy and host their Next.js-based applications.

The Amplify GLI now allows developers to make multiple DynamoDB GSI updates in a single deployment. This can help accelerate data model iterations. Additionally, the data management experience in the Amplify Admin UI launched at AWS re:Invent 2020 is now generally available.

AWS Serverless Application Model (AWS SAM)

AWS SAM has a public preview of support for local development and testing of AWS Cloud Development Kit (AWS CDK) projects.

To learn more:

Serverless blog posts

Operating Lambda

The “Operating Lambda” blog series includes the following posts in this quarter:

Streaming data

The “Building serverless applications with streaming data” blog series shows how to use Lambda with Kinesis.

Getting started with serverless for developers

Learn how to build serverless applications from your local integrated development environment (IDE).

April

May

June

Tech Talks & Events

We hold AWS Online Tech Talks covering serverless topics throughout the year. These are listed in the Serverless section of the AWS Online Tech Talks page. We also regularly deliver talks at conferences and events around the world, speak on podcasts, and record videos you can find to learn in bite-sized chunks.

Here are some from Q2:

Serverless Live was a day of talks held on May 19, featuring the serverless developer advocacy team, along with Adrian Cockroft and Jeff Barr. You can watch a replay of all the talks on the AWS Twitch channel.

Videos

YouTube ServerlessLand channel

Serverless Office Hours – Tues 10 AM PT / 1PM EST

Weekly live virtual office hours. In each session we talk about a specific topic or technology related to serverless and open it up to helping you with your real serverless challenges and issues. Ask us anything you want about serverless technologies and applications.

YouTube: youtube.com/serverlessland
Twitch: twitch.tv/aws

April

May

June

DynamoDB Office Hours

Are you an Amazon DynamoDB customer with a technical question you need answered? If so, join us for weekly Office Hours on the AWS Twitch channel led by Rick Houlihan, AWS principal technologist and Amazon DynamoDB expert. See upcoming and previous shows

Learning Path – AWS Lambda Extensions: The deep dive

Are you looking for a way to more easily integrate AWS Lambda with your favorite monitoring, observability, security, governance, and other tools? Welcome to AWS Lambda extensions: The deep dive, a learning path video series that shows you everything about augmenting Lambda functions using Lambda extensions.

There are also other helpful videos covering serverless available on the Serverless Land YouTube channel.

Still looking for more?

The Serverless landing page has more information. The Lambda resources page contains case studies, webinars, whitepapers, customer stories, reference architectures, and even more Getting Started tutorials.

You can also follow the Serverless Developer Advocacy team on Twitter to see the latest news, follow conversations, and interact with the team.

Monitoring and troubleshooting serverless data analytics applications

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/monitoring-and-troubleshooting-serverless-data-analytics-applications/

This series is about building serverless solutions in streaming data workloads. The application example used in this series is Alleycat, which allows bike racers to compete with each other virtually on home exercise bikes.

The first four posts have explored the architecture behind the application, which is enabled by Amazon Kinesis, Amazon DynamoDB, and AWS Lambda. This post explains how to monitor and troubleshoot issues that are common in streaming applications.

To set up the example, visit the GitHub repo and follow the instructions in the README.md file. Note that this walkthrough uses services that are not covered by the AWS Free Tier and incur cost.

Monitoring the Alleycat application

The business requirements for Alleycat state that it must handle up to 1,000 simultaneous racers. With each racer emitting a message every second, each 5-minute race results in 300,000 messages.

Reference architecture

While the architecture can support this throughput, the settings for each service determine how the workload scales up. The deployment templates in the GitHub repo do not use sufficiently high settings to handle this amount of data. In the section, I show how this results in errors and what steps you can take to resolve the issues. To start, I run the simulator for several races with the maximum racers configuration set to 1,000.

Monitoring the Kinesis stream

The monitoring tab of the Kinesis stream provides visualizations of stream metrics. This immediately shows that there is a problem in the application when running at full capacity:

Monitoring the Kinesis stream

  1. The iterator age is growing, indicating that the data consumers are falling behind the data producers. The Get records graph also shows the number of records in the stream growing.
  2. The Incoming data (count) metric shows the number of separate records ingested by the stream. The red line indicates the maximum capacity of this single-shard stream. With 1,000 active racers, this is almost at full capacity.
  3. However, the Incoming data – sum (bytes) graph shows that the total amount of data ingested by the stream is currently well under the maximum level shown by the red line.

There are two solutions for improving the capacity on the stream. First, the data producer application (the Alleycat frontend) could combine messages before sending. It’s currently reaching the total number of messages per second but the total byte capacity is significantly below the maximum. This action improves message packing but increases latency since the frontend waits to group messages.

Alternatively, you can add capacity by resharding. This enables you to increase (or decrease) the number of shards in a stream to adapt to the rate of data flowing through the application. You can do this with the UpdateShardCount API action. The existing stream goes into an Updating status and the stream scales by splitting shards. This creates two new child shards that split the partition keyspace of the parent. It also results in another, separate Lambda consumer for the new shard.

Monitoring the Lambda function

The monitoring tab of the consuming Lambda function provides visualization of metrics that can highlight problems in the workload. At full capacity, the monitoring highlights issues to resolve:

Monitoring the Lambda function

  1. The Duration chart shows that the function is exceeding its 15-second timeout, when the function normally finishes in under a second. This typically indicates that there are too many records to process in a single batch or throttling is occurring downstream.
  2. The Error count metric is growing, which highlights either logical errors in the code or errors from API calls to downstream resources.
  3. The IteratorAge metric appears for Lambda functions that are consuming from streams. In this case, the growing metric confirms that data consumption is falling behind data production in the stream.
  4. Concurrent executions remain at 1 throughout. This is set by the parallelization factor in the event source mapping and can be increased up to 10.

Monitoring the DynamoDB table

The metric tab on the application’s table in the DynamoDB console provides visualizations for the performance of the service:

Monitoring the DynamoDB table

  1. The consumed Read usage is well within the provisioned maximum and there is no read throttling on the table.
  2. Consumed Write usage, shown in blue, is frequently bursting through the provisioned capacity.
  3. The number of Write throttled requests confirms that the DynamoDB service is throttling requests since the table is over capacity.

You can resolve this issue by increasing the provisioned throughput on the table and related global secondary indexes. Write capacity units (WCUs) provide 1 KB of write throughput per second. You can set this value manually, use automatic scaling to match varying throughout, or enable on-demand mode. Read more about the pricing models for each to determine the best approach for your workload.

Monitoring Kinesis Data Streams

Kinesis Data Streams ingests data into shards, which are fixed capacity sequences of records, up to 1,000 records or 1 MB per second. There is no limit to the amount of data held within a stream but there is a configurable retention period. By default, Kinesis stores records for 24 hours but you can increase this up to 365 days as needed.

Kinesis is integrated with Amazon CloudWatch. Basic metrics are published every minute, and you can optionally enable enhanced metrics for an additional charge. In this section, I review the most commonly used metrics for monitoring the health of streams in your application.

Metrics for monitoring data producers

When data producers are throttled, they cannot put new records onto a Kinesis stream. Use the WriteProvisionedThroughputExceeded metric to detect if producers are throttled. If this is more than zero, you won’t be able to put records to the stream. Monitoring the Average for this statistic can help you determine if your producers are healthy.

When producers succeed in sending data to a stream, the PutRecord.Success and PutRecords.Success are incremented. Monitoring for spikes or drops in these metrics can help you monitor the health of producers and catch problems early. There are two separate metrics for each of the API calls, so watch the Average statistic for whichever of the two calls your application uses.

Metrics for monitoring data consumers

When data consumers are throttled or start to generate errors, Kinesis continues to accept new records from producers. However, there is growing latency between when records are written and when they are consumed for processing.

Using the GetRecords.IteratorAgeMilliseconds metric, you can measure the difference between the age of the last record consumed and the latest record put to the stream. It is important to monitor the iterator age. If the age is high in relation to the stream’s retention period, you can lose data as records expire from the stream. This value should generally not exceed 50% of the stream’s retention period – when the value reaches 100% of the stream retention period, data is lost.

If the iterator age is growing, one temporary solution is to increase the retention time of the stream. This gives you more time to resolve the issue before losing data. A more permanent solution is to add more consumers to keep up with data production, or resolve any errors that are slowing consumers.

When consumers exceed the ReadProvisionedThroughputExceeded metric, they are throttled and you cannot read from the stream. This results in a growth of records in the stream waiting for processing. Monitor the Average statistic for this metric and aim for values as close to 0 as possible.

The GetRecords.Success metric is the consumer-side equivalent of PutRecords.Success. Monitor this value for spikes or drops to ensure that your consumers are healthy. The Average is usually the most useful statistic for this purpose.

Increasing data processing throughput for Kinesis Data Streams

Adjusting the parallelization factor

Kinesis invokes Lambda consumers every second with a configurable batch size of messages. It’s important that the processing in the function keeps pace with the rate of traffic to avoid a growing iterator age. For compute intensive functions, you can increase the memory allocated in the function, which also increases the amount of virtual CPU available. This can help reduce the duration of a processing function.

If this is not possible or the function is falling behind data production in the stream, consider increasing the parallelization factor. By default, this is set to 1, meaning that each shard has a single instance of a Lambda function it invokes. You can increase this up to 10, which results in multiple instances of the consumer function processing additional batches of messages.

Adjusting the parallelization factor

Using enhanced fan-out to reduce iterator age

Standard consumers use a pull model over HTTP to fetch batches of records. Each consumer operates in serial. A stream with five consumers averages 200 ms of latency each, meaning it takes up to 1 second for all five to receive batches of records.

You can improve the overall latency by removing any unnecessary data consumers. If you use Kinesis Data Firehose and Kinesis Data Analytics on a stream, these count as consumers too. If you can remove subscribers, this helps with over data consumption throughput.

If the workload needs all of the existing subscribers, use enhanced fan-out (EFO). EFO consumers use a push model over HTTP/2 and are independent of each other. With EFO, the same five consumers in the previous example would receive batches of messages in parallel, using dedicated throughput. Overall latency averages 70 ms and typically data delivery speed is improved by up to 65%. There is an additional charge for this feature.

Enhanced fan-out

To learn more about processing streaming data with Lambda, see this AWS Online Tech Talk presentation.

Conclusion

In this post, I show how the existing settings in the Alleycat application are not sufficient for handling the expected amount of traffic. I walk through the metrics visualizations for Kinesis Data Streams, Lambda, and DynamoDB to find which quotas should be increased.

I explain which CloudWatch metrics can be used with Kinesis Data Stream to ensure that data producers and data consumers are healthy. Finally, I show how you can use the parallelization factor and enhanced fan-out features to increase the throughput of data consumers.

For more serverless learning resources, visit Serverless Land.

Building leaderboard functionality with serverless data analytics

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-serverless-applications-with-streaming-data-part-4/

This series is about building serverless solutions in streaming data workloads. The application example used in this series is Alleycat, which allows bike racers to compete with each other virtually on home exercise bikes.

Part 1 explains the application’s functionality, how to deploy to your AWS account, and provides an architectural review. Part 2 compares different ways to ingest streaming data into Amazon Kinesis Data Streams and shows how to optimize shard capacity. Part 3 uses Amazon Kinesis Data Firehose with AWS Lambda to implement the all-time rankings functionality.

This post walks through the leaderboard functionality in the application. This solution uses Kinesis Data Streams, Lambda, and Amazon DynamoDB to provide both real-time and historical rankings.

To set up the example, visit the GitHub repo and follow the instructions in the README.md file. Note that this walkthrough uses services that are not covered by the AWS Free Tier and incur cost.

Overview of leaderboards in Alleycat

Alleycat races are 5 minutes long and run continuously throughout the day for each class. Competitors select a class type and automatically join the current race by choosing Start Race. There are up to 1,000 competitors per race. To track performance, there are two types of leaderboard in Alleycat: Race results for completed races and the Realtime rankings for active races.

Alleycat frontend

  1. The user selects a completed race from the dropdown to a leaderboard ranking of results. This table is not real time and only reflects historical results.
  2. The user selects the Here now option to see live results for all competitors in the current virtual race for the chosen class.

Architecture overview

The backend microservice supporting these features is located in the 1-streaming-kds directory in the GitHub repo. It uses the following architecture:

Solution architecture

  1. The tumbling window function receives results every second from Kinesis Data Streams. This function aggregates metrics and saves the latest results to the application’s DynamoDB table.
  2. DynamoDB Streams invoke the publishing Lambda function every time an item is changed in the table. This reformats the message and publishes to the application’s IoT topic in AWS IoT Core to update the frontend.
  3. The final results function is an additional consumer on Kinesis Data Streams. It filters for only the last result in each competitor’s race and publishes the results to the DynamoDB table.
  4. The frontend calls an Amazon API Gateway endpoint to fetch the historical results for completed races via a Lambda function.

Configuring the tumbling window function

This Lambda function aggregates data from the Kinesis stream and stores the result in the DynamoDB table:

Tumbling window architecture

This function receives batches of 1,000 messages per invocation. The messages may contain results for multiple races and competitors. The function groups the results by race and racer ID and then flattens the data structure. It writes an item to the DynamoDB table in this format:

{
 "PK": "race-5403307",
 "SK": "results",
 "ts": 1620992324960,
 "results": "{\"0\":167.04,\"1\":136,\"2\":109.52,\"3\":167.14,\"4\":129.69,\"5\":164.97,\"6\":149.86,\"7\":123.6,\"8\":154.29,\"9\":89.1,\"10\":137.41,\"11\":124.8,\"12\":131.89,\"13\":117.18,\"14\":143.52,\"15\":95.04,\"16\":109.34,\"17\":157.38,\"18\":81.62,\"19\":165.76,\"20\":181.78,\"21\":140.65,\"22\":112.35,\"23\":112.1,\"24\":148.4,\"25\":141.75,\"26\":173.24,\"27\":131.72,\"28\":133.77,\"29\":118.44}",
 "GSI": 2
}

This transformation significantly reduces the number of writes to the DynamoDB table per function invocation. Each time a write occurs, this triggers the publishing process and notifies the frontend via the IoT topic.

DynamoDB can handle almost any number of writes to the table. You can set up to 40,000 write capacity units with default limits and can request even higher limits via an AWS Support ticket. However, you may want to limit the throughput to reduce the WCU cost.

The tumbling window feature of Lambda allows invocations from a streaming source to pass state between invocations. You can specify a window interval of up to 15 minutes. During the window, a state is passed from one invocation to the next, until a final invocation at the end of the window. Alleycat uses this feature to buffer aggregated results and only write the output at the end of the tumbling window:

Tumbling window process

For a tumbling window period of 5 seconds, this means that the Lambda function is invoked multiple times, passing an intermediate state from invocation to invocation. Once the window ends, it then writes the final aggregated result to DynamoDB. The tradeoff in this solution is that it reduces the number of real-time notifications to the frontend since these are published from the table’s stream. This increases the latency of live results in the frontend application.

Buffer by the Lambda function

Implementing tumbling windows in Lambda functions

The template.yaml file describes the tumbling Lambda function. The event definition specifies the tumbling window duration in the TumblingWindowInSeconds attribute:

  TumblingWindowFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: tumblingFunction/    
      Handler: app.handler
      Runtime: nodejs14.x
      Timeout: 15
      MemorySize: 256
      Environment:
        Variables:
          DDB_TABLE: !Ref DynamoDBtableName
      Policies:
        DynamoDBCrudPolicy:
          TableName: !Ref DynamoDBtableName
      Events:
        Stream:
          Type: Kinesis
          Properties:
            Stream: !Sub "arn:aws:kinesis:${AWS::Region}:${AWS::AccountId}:stream/${KinesisStreamName}"
            BatchSize: 1000
            StartingPosition: TRIM_HORIZON      
            TumblingWindowInSeconds: 15  

When you enable tumbling windows, the function’s event payload contains several new attributes:

{
    "Records": [
        {
 	    ...
        }
    ],
    "shardId": "shardId-000000000000",
    "eventSourceARN": "arn:aws:kinesis:us-east-2:123456789012:stream/alleycat",
    "window": {
        "start": "2021-05-05T18:51:00Z",
        "end": "2021-05-05T18:51:15Z"
    },
    "state": {},
    "isFinalInvokeForWindow": false,
    "isWindowTerminatedEarly": false
}

These include:

  • Window start and end: The beginning and ending timestamps for the current tumbling window.
  • State: An object containing the state returned from the previous invocation, which is initially empty in a new window. The state object can contain up to 1 MB of data.
  • isFinalInvokeForWindow: Indicates if this is the last invocation for the current window. This only occurs once per window period.
  • isWindowTerminatedEarly: A window ends early if the state exceeds the maximum allowed size of 1 MB.

The event handler in app.js uses tumbling windows if they are defined in the AWS SAM template:

// Main Lambda handler
exports.handler = async (event) => {

	// Retrieve existing state passed during tumbling window	
	let state = event.state || {}
	
	// Process the results from event
	let jsonRecords = getRecordsFromPayload(event)
	jsonRecords.map((record) => raceMap[record.raceId] = record.classId)
	state = getResultsByRaceId(state, jsonRecords)

	// If tumbling window is not configured, save and exit
	if (event.window === undefined) {
		return await saveCurrentRaces(state) 
	}

	// If tumbling window is configured, save to DynamoDB on the 
	// final invocation in the window
	if (event.isFinalInvokeForWindow) {
		await saveCurrentRaces(state) 
	} else {
		return { state }
	}
}

The final results function and API

The final results function filters for the last event in each competitor’s race, which contains the final score. The function writes each score to the DynamoDB table. Since there may be many write events per invocation, this function uses the Promise.all construct in Node.js to complete the database operations in parallel:

const saveFinalScores = async (raceResults) => {
	let paramsArr = []

	raceResults.map((result) => {
		paramsArr.push({
		  TableName : process.env.DDB_TABLE,
		  Item: {
		    PK: `race-${result.raceId}`,
		    SK: `racer-${result.racerId}`,
		    GSI: result.output,
		    ts: Date.now()
		  }
		})
	})
	// Save to DDB in parallel
	await Promise.all(paramsArr.map((params) => documentClient.put (params).promise()))
}

Using this approach, each call to documentClient.put is made without waiting for a response from the SDK. The call returns a Promise object, which is in a pending state until the database operation returns with a status. Promise.all waits for all promises to resolve or reject before code execution continues. Comparing the serial and concurrent approach, this reduces the overall time for multiple database writes. The tradeoff is that it increases the number of writes to DynamoDB and the number of WCUs consumed.

Serial vs concurrent writes

For a large number of put operations to DynamoDB, you can also use the DocumentClient’s batchWrite operation. This delegates to the underlying DynamoDB BatchWriteItem operation in the AWS SDK and can accept up to 25 separate put requests in a single call. To handle more than 25, you can make multiple batchWrite requests and still use the parallelized invocation method shown above.

The DynamoDB table maintains a list of race results with one item per racer per race:

DynamoDB table items

The frontend calls an API Gateway endpoint that invokes the getLeaderboard function. This function uses the DocumentClient’s query API to return results for a selected race, sorted by a global secondary index containing the final score:

const AWS = require('aws-sdk')
AWS.config.region = process.env.AWS_REGION 
const documentClient = new AWS.DynamoDB.DocumentClient()

// Main Lambda handler
exports.handler = async (event) => {

    const classId = parseInt(event.queryStringParameters.classId)

    const params = {
        TableName: process.env.DDB_TABLE,
        IndexName: 'GSI_PK_Index',
        KeyConditionExpression: 'PK = :ID',
        ExpressionAttributeValues: {
          ':ID': `class-${classId}`
        },
        ScanIndexForward: false,
        Limit: 1000
    }   

    const result = await documentClient.query(params).promise()   
    return result.Items
}

By default, this returns the top 1,000 places by using the Limit parameter. You can customize this or use pagination to implement fetching large result sets more efficiently.

Conclusion

In this post, I explain the all-time leaderboard logic in the Alleycat application. This is an asynchronous, eventually consistent process that checks batches of incoming records for new personal best records. This uses Kinesis Data Firehose to provide a zero-administration way to deliver and process large batches of records continuously.

This post shows the architecture in Alleycat and how this is defined in AWS SAM. Finally, I walk through how to build a data transformation Lambda function that decodes a payload and returns records back to Kinesis.

Part 5 discusses how to troubleshoot issues in Alleycat and how to monitor streaming applications generally.

For more serverless learning resources, visit Serverless Land.

Building serverless applications with streaming data: Part 3

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-serverless-applications-with-streaming-data-part-3/

This series is about building serverless solutions in streaming data workloads. These are traditionally challenging to build, since data can be streamed from thousands or even millions of devices continuously. The application example used in this series is Alleycat, which allows bike racers to compete with each other virtually on home exercise bikes.

Part 1 explains the application’s functionality, how to deploy to your AWS account, and provides an architectural review. Part 2 compares different ways to ingest streaming data into Amazon Kinesis Data Streams and shows how to optimize shard capacity.

In this post, I explain how the all-time rankings functionality is implemented. This uses Amazon Kinesis Data Firehose to deliver hundreds of thousands of data points for processing by AWS Lambda functions.

To set up the example, visit the GitHub repo and follow the instructions in the README.md file. Note that this walkthrough uses services that are not covered by the AWS Free Tier and incur cost.

Overview of real time rankings in Alleycat

In the example scenario, there are 40,000 users and up to 1,000 competitors may race at any given time. While a competitor is racing, there is a real time rankings display that shows their performance in the selected class:

Alleycat front end

The racer can select Here now to compare against racers in the current virtual race. Alternatively, selecting All time compares their performance against the best performance of all races who have ever competed in the race. The rankings board is dynamic and shows the rankings for the current second on the display for the local racer:

Leaderboard changes over time

In Alleycat, races occur every five minutes continuously, and the all-time data is gathered from races that are completed. For this to work, the application must do the following:

  • Continuously aggregate race data from each of the six exercise classes and deliver to an Amazon S3 bucket.
  • Compare the incoming race data with the personal best records for every competitor in the same class.
  • If the current performance is a record, update the all-time best records dataset.
  • Compress the dataset, since there are thousands of racers, each with 5 minutes of personal racing history.

The resulting dataset contains second-by-second personal records for every racer in a selected race. This is saved in the application’s history bucket in S3 and loaded by the Alleycat front end before the beginning of each race:

        // From Home.vue's loadRealtimeHistory method

        // Load gz from S3 history bucket, unzip and parse to JSON
        const URL = `https://${this.$appConfig.historyBucket}.s3.${this.$appConfig.region}.amazonaws.com/class-${this.selectedClassId}.gz`
        const response = await axios.get(URL, { responseType: 'arraybuffer' })
        const buffer = Buffer.from(response.data, 'base64')
        const dezipped = await gunzip(buffer)
        const history = JSON.parse(dezipped.toString())

Architecture overview

The backend microservice handling this feature is located in the 2-streaming-kdf directory in the GitHub repo. It uses the following architecture:

Solution architecture

  1. Amazon Kinesis Data Firehose is a consumer of Kinesis Data Streams. Records are delivered from the application’s main stream as they arrive.
  2. Kinesis Data Firehose invokes a data transformation Lambda function. This calculates the output for each racer’s data points and returns the modified records to Kinesis.
  3. Kinesis Data Firehose delivers batches of records to an S3 bucket.
  4. When objects are written to the S3 bucket, this event triggers the S3 processor Lambda function. This compares incoming racer performance with historical records.
  5. The Lambda function saves the new all-time records in a compressed format to the application’s history bucket in S3.

The process happens continuously providing that new records are delivered to the application’s main Kinesis stream.

Using Kinesis Data Firehose

Kinesis Data Firehose can consume records from Kinesis Data Streams, or directly from the AWS SDK, AWS IoT Core, and other data producers. In Alleycat, there are multiple consumers that process incoming data, so Kinesis Data Firehose is configured as a consumer on the main stream.

The process of updating historical records is not a real-time process in Alleycat. Processing individual racer messages and comparing against all-time records is possible but would be computationally more complex. In practice, only a few incoming data points are all-time records. As a result, Alleycat asynchronously processes batches of records to find and update all-time records. The process is eventually consistent and the tradeoff is latency. There may be up to 1 minute between recording a personal record and updating the historical dataset in S3.

Kinesis Data Firehose provides several key functions for this process. First, it batches groups of messages based upon the batching hints provided in the AWS Serverless Application Model (AWS SAM) template. This application buffers records for up to 60 seconds or until 1 MB of records are available, whichever is reached first. You can adjust these settings and batch for up to 900 seconds or 128 MB of data. Note that these settings are hints and not absolute – the service can dynamically adjust these if data delivery falls behind data writing in the stream. As a result, hints should be treated as guidance and the actual settings may change.

Kinesis Data Firehose also enables data compression before delivery in various common formats. Since the S3 delivery bucket is an intermediary storage location in this application, using data compression reduces the storage cost. Kinesis can also encrypt data before delivery but this feature is not used in Alleycat. Finally, Kinesis Data Firehose can transform the incoming records by invoking a Lambda function. This enables the application to calculate the racer’s output before delivering the records.

The configuration for Kinesis Data Firehose, the S3 buckets, and the Lambda function is located in the template.yaml file in the GitHub repo. This AWS SAM template shows how to define the complete integration:

  DeliveryStream:
    Type: AWS::KinesisFirehose::DeliveryStream
    DependsOn:
      - DeliveryStreamPolicy    
    Properties:
      DeliveryStreamName: "alleycat-data-firehose"
      DeliveryStreamType: "KinesisStreamAsSource"
      KinesisStreamSourceConfiguration: 
        KinesisStreamARN: !Sub "arn:aws:kinesis:${AWS::Region}:${AWS::AccountId}:stream/${KinesisStreamName}"
        RoleARN: !GetAtt DeliveryStreamRole.Arn
      ExtendedS3DestinationConfiguration: 
        BucketARN: !GetAtt DeliveryBucket.Arn
        BufferingHints: 
          SizeInMBs: 1
          IntervalInSeconds: 60
        CloudWatchLoggingOptions: 
          Enabled: true
          LogGroupName: "/aws/kinesisfirehose/alleycat-firehose"
          LogStreamName: "S3Delivery"
        CompressionFormat: "GZIP"
        EncryptionConfiguration: 
          NoEncryptionConfig: "NoEncryption"
        Prefix: ""
        RoleARN: !GetAtt DeliveryStreamRole.Arn
        ProcessingConfiguration: 
          Enabled: true
          Processors: 
            - Type: "Lambda"
              Parameters: 
                - ParameterName: "LambdaArn"
                  ParameterValue: !GetAtt FirehoseProcessFunction.Arn

Once Kinesis Data Firehose is set up, there is no ongoing administration. The service scales automatically to adjust to the amount of the data available in the application’s Kinesis data stream.

How the Lambda data transformer works

Kinesis Data Firehose buffers up to 3 MB before invoking the data transformation function (you can configure this setting with the ProcessingConfiguration API). The service delivers records as a JSON array:

{
    "invocationId": "d8ce3cc6-abcd-407a-abcd-d9bc2ce58e72",
    "sourceKinesisStreamArn": "arn:aws:kinesis:us-east-2:012345678912:stream/alleycat",
    "deliveryStreamArn": "arn:aws:firehose:us-east-2:012345678912:deliverystream/alleycat-firehose",
    "region": "us-east-2",
    "records": [
        {
            "recordId": "49617596128959546022271021234567...",
            "approximateArrivalTimestamp": 1619185789847,
            "data": "eyJ1dWlkIjoiYjM0MGQzMGYtjI1Yi00YWM4LThjY2QtM2ZhNThiMWZmZjNlIiwiZXZlbnQiOiJ1cGRhdGUiLCJkZXZpY2VUaW1lc3RhbXiOjE2MTkxODU3ODk3NUsInNlY29uZCI6MwicmFjZUlkIjoxNjE5MTg1NjIwMj1LCJuYW1lIjoiSHViZXJ0IiwicmFjZXJJZCI6MSwiY2xhc3NJZCI6MSwiY2FkZW5jZSI6ODAsInJlc2lzdGFuY2UiOjc4fQ==",
            "kinesisRecordMetadata": {
                "sequenceNumber": "49617596128959546022271020452",
                "subsequenceNumber": 0,
                "partitionKey": "1619185789798",
                "shardId": "shardId-000000000000",
                "approximateArrivalTimestamp": 1619185789847
            }
        }, 
   ...

The payload contains metadata about the invocation and data source and each record contains a recordId and a base64 encoded data attribute. The Lambda function can then decode and modify the data attribute for each record as needed. The Lambda function must finally return a JSON array of the same length as the incoming event.records array, with the following attributes:

  • recordId: this must match the incoming recordId, so Kinesis can map the modified data payload back to the original record.
  • result: this must be “Ok”, “Dropped”, or “ProcessingFailed”. “Dropped” means that the function has intentionally removed a payload from processing. Any records with “ProcessingFailed” are delivered to the S3 bucket in a folder called processing-failed. This includes metadata indicating the number of delivery attempts, the timestamp of the last attempt, and the Lambda function’s ARN.
  • data: the returned data payload must be base64 encoded and the modified record must be within the 1 MB limit per record.

The transformer function in Alleycat shows how to implement this process in Node.js:

exports.handler = (event) => {
  const output = event.records.map((record) => {
    // Extract JSON record from base64 data
    const buffer = Buffer.from(record.data, "base64").toString()
    const jsonRecord = JSON.parse(buffer)

    // Add calculated field
    jsonRecord.output = ((jsonRecord.cadence + 35) * (jsonRecord.resistance + 65)) / 100

    // Convert back to base64 + add a newline
    const dataBuffer = Buffer.from(JSON.stringify(jsonRecord) + "\n", "utf8").toString("base64")

    return {
      recordId: record.recordId,
      result: "Ok",
      data: dataBuffer,
    }
  })

  console.log(`{ recordsTotal: ${output.length} }`)
  return { records: output }
}

If you are processing the data in downstream services in JSON format, adding a newline at the end of the data buffer for each record can simplify the import process.

The data transformation Lambda function can run for up to 5 minutes and can use other AWS services for data processing. The Kinesis Data Firehose service scales up the Lambda function automatically if traffic increases, up to 5 outstanding invocations per shard (or 10 if the destination is Splunk).

Processing the Kinesis Data Firehose output

Kinesis Data Firehose puts a series of objects in the intermediary S3 bucket. Each put event invokes the S3 processor Lambda function. This function compares each data point to historical best performance, per racer ID, per class ID, at the same second of the race.

Data points that do not beat existing records are discarded. Otherwise, the function merges new historical records into the dataset and saves the result into the final S3 history bucket. This object is compressed in gzip format and contains the history for a single class ID.

When the frontend downloads this dataset from the S3 bucket, it decompresses the object. The resulting JSON structure shows personal output records per racer ID for each second of the race. The frontend uses this data to update the leaderboard on the local device during each second of the active race.

JSON output

Conclusion

In this post, I explain the all-time leaderboard logic in the Alleycat application. This is an asynchronous, eventually consistent process that checks batches of incoming records for new personal records. This uses Kinesis Data Firehose to provide a zero-administration way to deliver and process large batches of records continuously.

This post shows the architecture in Alleycat and how this is defined in AWS SAM. Finally, I walk through how to build a data transformation Lambda function that correctly decodes a payload and returns records back to Kinesis.

Part 4 show how to combine Kinesis with Amazon DynamoDB to support queries for streaming data. Alleycat uses this architecture to provide real-time rankings for competitors in the same virtual race.

For more serverless learning resources, visit Serverless Land.

Announcing migration of the Java 8 runtime in AWS Lambda to Amazon Corretto

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/announcing-migration-of-the-java-8-runtime-in-aws-lambda-to-amazon-corretto/

This post is written by Jonathan Tuliani, Principal Product Manager, AWS Lambda.

What is happening?

Beginning July 19, 2021, the Java 8 managed runtime in AWS Lambda will migrate from the current Open Java Development Kit (OpenJDK) implementation to the latest Amazon Corretto implementation.

To reflect this change, the AWS Management Console will change how Java 8 runtimes are displayed. The display name for runtimes using the ‘java8’ identifier will change from ‘Java 8’ to ‘Java 8 on Amazon Linux 1’. The display name for runtimes using the ‘java8.al2’ identifier will change from ‘Java 8 (Corretto)’ to ‘Java 8 on Amazon Linux 2’. The ‘java8’ and ‘java8.al2’ identifiers themselves, as used by tools such as the AWS CLI, CloudFormation, and AWS SAM, will not change.

Why are you making this change?

This change enables customers to benefit from the latest innovations and extended support of the Amazon Corretto JDK distribution. Amazon Corretto is a no-cost, multiplatform, production-ready distribution of the OpenJDK. Corretto is certified as compatible with the Java SE standard and used internally at Amazon for many production services.

Amazon is committed to Corretto, and provides regular updates that include security fixes and performance enhancements. With this change, these benefits are available to all Lambda customers. For more information on improvements provided by Amazon Corretto 8, see Amazon Corretto 8 change logs.

How does this affect existing Java 8 functions?

Amazon Corretto 8 is designed as a drop-in replacement for OpenJDK 8. Most functions benefit seamlessly from the enhancements in this update without any action from you.

In rare cases, switching to Amazon Corretto 8 introduces compatibility issues. See below for known issues and guidance on how to verify compatibility in advance of this change.

When will this happen?

This migration to Amazon Corretto takes place in several stages:

  • June 15, 2021: Availability of Lambda layers for testing the compatibility of functions with the Amazon Corretto runtime. Start of AWS Management Console changes to java8 and java8.al2 display names.
  • July 19, 2021: Any new functions using the java8 runtime will use Amazon Corretto. If you update an existing function, it will transition to Amazon Corretto automatically. The public.ecr.aws/lambda/java:8 container base image is updated to use Amazon Corretto.
  • August 16, 2021: For functions that have not been updated since June 28, AWS will begin an automatic transition to the new Corretto runtime.
  • September 10, 2021: Migration completed.

These changes are only applied to functions not using the arn:aws:lambda:::awslayer:Java8Corretto or arn:aws:lambda:::awslayer:Java8OpenJDK layers described below.

Which of my Lambda functions are affected?

Lambda supports two versions of the Java 8 managed runtime: the java8 runtime, which runs on Amazon Linux 1, and the java8.al2 runtime, which runs on Amazon Linux 2. This change only affects functions using the java8 runtime. Functions the java8.al2 runtime are already using the Amazon Corretto implementation of Java 8 and are not affected.

The following command shows how to use the AWS CLI to list all functions in a specific Region using the java8 runtime. To find all such functions in your account, repeat this command for each Region:

aws lambda list-functions --function-version ALL --region us-east-1 --output text --query "Functions[?Runtime=='java8'].FunctionArn"

What do I need to do?

If you are using the java8 runtime, your functions will be updated automatically. For production workloads, we recommend that you test functions in advance for compatibility with Amazon Corretto 8.

For Lambda functions using container images, the existing public.ecr.aws/lambda/java:8 container base image will be updated to use the Amazon Corretto Java implementation. You must manually update your functions to use the updated container base image.

How can I test for compatibility with Amazon Corretto 8?

If you are using the java8 managed runtime, you can test functions with the new version of the runtime by adding the layer reference arn:aws:lambda:::awslayer:Java8Corretto to the function configuration. This layer instructs the Lambda service to use the Amazon Corretto implementation of Java 8. It does not contain any data or code.

If you are using container images, update the JVM in your image to Amazon Corretto for testing. Here is an example Dockerfile:

FROM public.ecr.aws/lambda/java:8

# Update the JVM to the latest Corretto version
## Import the Corretto public key
rpm --import https://yum.corretto.aws/corretto.key

## Add the Corretto yum repository to the system list
curl -L -o /etc/yum.repos.d/corretto.repo https://yum.corretto.aws/corretto.repo

## Install the latest version of Corretto 8
yum install -y java-1.8.0-amazon-corretto-devel

# Copy function code and runtime dependencies from Gradle layout
COPY build/classes/java/main ${LAMBDA_TASK_ROOT}
COPY build/dependency/* ${LAMBDA_TASK_ROOT}/lib/

# Set the CMD to your handler
CMD [ "com.example.LambdaHandler::handleRequest" ]

Can I continue to use the OpenJDK version of Java 8?

You can continue to use the OpenJDK version of Java 8 by adding the layer reference arn:aws:lambda:::awslayer:Java8OpenJDK to the function configuration. This layer tells the Lambda service to use the OpenJDK implementation of Java 8. It does not contain any data or code.

This option gives you more time to address any code incompatibilities with Amazon Corretto 8. We do not recommend that you use this option to continue to use Lambda’s OpenJDK Java implementation in the long term. Following this migration, it will no longer receive bug fix and security updates. After addressing any compatibility issues, remove this layer reference so that the function uses the Lambda-Amazon Corretto managed implementation of Java 8.

What are the known differences between OpenJDK 8 and Amazon Corretto 8 in Lambda?

Amazon Corretto caches TCP sessions for longer than OpenJDK 8. Functions that create new connections (for example, new AWS SDK clients) on each invoke without closing them may experience an increase in memory usage. In the worst case, this could cause the function to consume all the available memory, which results in an invoke error and a subsequent cold start.

We recommend that you do not create AWS SDK clients in your function handler on every function invocation. Instead, create SDK clients outside the function handler as static objects that can be used by multiple invocations. For more information, see static initialization in the Lambda Operator Guide.

If you must use a new client on every invocation, make sure it is shut down at the end of every invocation. This avoids TCP session caches using unnecessary resources.

What if I need additional help?

Contact AWS Support, the AWS Lambda discussion forums, or your AWS account team if you have any questions or concerns.

For more serverless learning resources, visit Serverless Land.

Building serverless applications with streaming data: Part 2

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-serverless-applications-with-streaming-data-part-2/

Part 1 introduces the Alleycat application that allows bike racers to compete with each other virtually on home exercise bikes. I explain the application’s functionality, how to deploy to your AWS account, and provide an architectural review.

This series is about building serverless solutions in streaming data workloads. These are traditionally challenging to build, since data can be streamed from thousands or even millions of devices continuously.

In the example scenario, there are 40,000 users and up to 1,000 competitors may race at any given time. The workload must continuously ingest and buffer this data, then process and analyze the information to provide analytics and leaderboard content for the frontend application.

In this post, I focus on data ingestion. I compare the two different methods used in Alleycat, and discuss other approaches available. This post refers to Amazon Kinesis Data Streams, the AWS SDK, and AWS IoT Core in the solutions.

To set up the example, visit the GitHub repo and follow the instructions in the README.md file. Note that this walkthrough uses services that are not covered by the AWS Free Tier and incur cost.

Using AWS IoT Core to ingest streaming data

AWS IoT Core enables publish-subscribe capabilities for large numbers of client applications. Clients can send data to the backend using the AWS IoT Device SDK, which uses the MQTT standard for IoT messaging. After processing, the backend can publish aggregation and status messages back to the frontend via AWS IoT Core. This service fans out the messages to clients using topics.

When using this approach, note the Quality of Service (QoS) options available. By default, the SDK uses QoS level 0, which means the device does not confirm the message is received. This is intended for workloads that can lose messages occasionally without impacting performance. In Alleycat, if performance metrics are sometimes lost, this does not likely impact the overall end user experience.

For workloads requiring higher reliability, use QoS level 1, which causes the SDK to resend the message until an acknowledgement is received. While there is no additional charge for using QoS level 1, it generally increases the number of messages, which increases the overall cost. You are not charged for the PUBACK acknowledgement message – for more details, read more about AWS IoT Core pricing.

Frontend

In this scenario, the Alleycat frontend application is running on a physical exercise bike. The user selects a racer ID and exercise class and chooses Start Race to join the current virtual race for that class.

Start race UI

Every second, the frontend sends a message containing the cadence and resistance metrics and the current second in the race for the local racer. This message is created as a JSON object in the Home.vue component and sent to the ‘alleycat-publish’ topic:

      const message = {
        uuid: uuidv4(),
        event: this.event,
        deviceTimestamp: Date.now(),
        second: this.currentSecond,
        raceId: RACE_ID,
        name: this.racer.name,
        racerId: this.racer.id,
        classId: this.selectedClassId,
        cadence: this.racer.getCurrentCadence(),
        resistance: this.racer.getCurrentResistance
      }

The IoT.vue component contains the logic for this integration and uses the AWS IoT Device SDK to send and receive messages. On startup, the frontend connects to AWS IoT Core and publishes the messages using an MQTT client:

    bus.$on('publish', (data) => {
      console.log('Publish: ', data)
      mqttClient.publish(topics.publish, JSON.stringify(data))
    })

The SDK automatically attempts to retry in the event of a network disconnection and exposes an error handler to allow custom logic if other errors occur.

Backend

The resources used in the backend are defined using the AWS Serverless Application Model (AWS SAM) and configured in the core setup templates:

Reference architecture

Messages are published to topics in AWS IoT Core, which act as channels of interest. The message broker uses topic names and topic filters to route messages between publishers and subscribers. Incoming messages are routed using rules. Alleycat’s IoT rule routes all incoming messages to a Kinesis stream:

  IotTopicRule:
    Type: AWS::IoT::TopicRule
    Properties:
      RuleName: 'alleycatIngest'
      TopicRulePayload:
        RuleDisabled: 'false'
        Sql: "SELECT * FROM 'alleycat-publish'"
        Actions:
        - Kinesis:
            StreamName: 'alleycat'
            PartitionKey: "${timestamp()}"
            RoleArn: !GetAtt IoTKinesisRole.Arn

  IoTKinesisRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - iot.amazonaws.com
            Action:
              - 'sts:AssumeRole'
      Path: /
      Policies:
        - PolicyName: IoTKinesisPutPolicy
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action: 'kinesis:PutRecord'
                Resource: !GetAtt KinesisStream.Arn

Using the AWS::IoT::TopicRule resource, you can optionally define an error action. This allows you to store messages in a durable location, such as an Amazon S3 bucket, if an error occurs. Errors can occur if a rule does not have permission to access a destination or throttling occurs in a target.

Rules can route matching messages to up to 10 targets. For debugging purposes, you can also enable Amazon CloudWatch Logs, which can help in troubleshoot failed message deliveries. The AWS IoT Core Message Broker allows up to 20,000 publish requests per second – if you need a higher limit for your workload, submit a request to AWS Support.

Using the AWS SDK to ingest streaming data

The Alleycat frontend creates traffic for a single user but there is also a simulator application that can generate messages for up to 1,000 riders. Instead of routing messages using an MQTT client, the simulator uses the AWS SDK to put messages directly into the Kinesis data stream.

The SDK provides a service interface object for Kinesis and two API methods for putting messages into streams: putRecord and putRecords. The first option accepts only a single message but the second enables batching of up to 500 messages per request. This is the preferred option for adding multiple messages, compared with calling putRecord multiple times.

The putRecords API takes parameters as a JSON array of messages:

const params = {
   StreamName: 'alley-cat',
   [{
      "Data":"{\"event\":\"update\",\"deviceTimestamp\":1620824038331,\"second\":3,\"raceId\":5402746,\"name\":\"Hayden\",\"racerId\":0,\"classId\":1,\"cadence\":79.8,\"resistance\":79}",
      "PartitionKey":"1620824038331"
   },
   {
      "Data":"{\"event\":\"update\",\"deviceTimestamp\":1620824038331,\"second\":3,\"raceId\":5402746,\"name\":\"Hubert\",\"racerId\":1,\"classId\":1,\"cadence\":60.4,\"resistance\":60.6}",
      "PartitionKey":"1620824038331"
   }
]}

The SDK automatically base64 encodes the Data attribute, which in this case is the JSON string output from JSON.stringify. In the JavaScript SDK, the putRecords API can return a promise, allowing the code to await the operation:

const result = await kinesis.putRecords(params).promise()

Shards and partition keys

Kinesis data streams consist of one or more shards, which are sequences of data records with a fixed capacity. Each shard can support up to 1,000 records per second for writes, up to maximum total data write rate of 1MB per second. The total capacity of a stream is the total of its shards.

When you send messages to a stream, the partitionKey attribute determines which shard it is routed to. The example application configures a Kinesis data stream with a single shard so the partitionKey attribute has no effect – all messages are routed to the same shard. However, many production applications have more than one shard and use the partitionKey to assign messages to shards.

The partitionKey is hashed by the Kinesis service to route to a shard. This diagram shows how partitionKey values from data producers are hashed by an MD5 function and mapped to individual shards:

MD5 hash process

While you cannot designate a specific shard ID in a message, you can influence the assignment depending on your choice of partitionKey:

  • Random: Using a randomized value results in random hash so messages are randomly sent to different shards. This effectively load balances messages across all available shards.
  • Time-based: A timestamp value may cause groups of messages sent to a single shard, if the messages arrive at the same time. The identical timestamp results in an identical hash.
  • Application-specific: if Alleycat used the classID as a partitionKey, racers in each class would always be routed to the same shard. This could be useful for downstream aggregation logic but would limit the capacity of messages per classID.

Optimizing capacity in a shard

Each shard can ingest data at a rate of 1 MB per second or 1,000 records per second, whichever limit is reached first. Since the payload maximum is 1MB, this could equate to one 1MB message per second. If the payload is larger, you must divide it into smaller pieces to avoid an error. For 1,000 messages, each payload must be under 1 KB on average to fit within the allowed capacity.

The combination of the two payload limits can result in different capacity profiles for a shard:

Capacity profiles in a shard

  1. The data payloads are evenly sized and use the 1 MB per second capacity.
  2. Data payload sizes vary, so the number of messages that can be packed into 1 MB varies per second.
  3. There are a large number of very messages, consuming all 1,000 messages per second. However, the total data capacity used is significantly less than 1 MB.

In the Alleycat application, the average payload size is around 170 bytes. When producing 1,000 messages a second, the workload is only using about 20% of the 1 MB per second limit. Since PUT payload size is a factor in Kinesis pricing, messages that are much smaller than 25 KB are less cost-efficient. Compare these two messaging patterns for the Alleycat application:

Producer message patterns

  1. In this default mode, a smaller message is published once per second. This reduces overall latency but results in higher overall messaging cost.
  2. The client application batches outgoing messages and sends to Kinesis every 5 seconds. This results in lower cost and better packing of messages, but introduces additional latency.

There is a tradeoff between cost and latency when optimizing a shard’s capacity and the decision depends upon the needs of your workload. If the client buffers messages, this adds latency on the client side. This is acceptable in many workloads that collect metrics for archival or asynchronous reporting purchases. However, for low-latency applications like Alleycat, it provides a better experience for the application user to send messages as soon as they are available.

Conclusion

This post focuses on ingesting data into Kinesis Data Streams. I explain the two approaches used by the Alleycat frontend and the simulator application and highlight other approaches that you can use. I show how messages are routed to shards using partition keys. Finally, I explore additional factors to consider when ingesting data, to improve efficiency and reduce cost.

Part 3 covers using Amazon Kinesis Data Firehose for transforming, aggregating, and loading streaming data into data stores. This is used to provide the historical, second-by-second leaderboard for the frontend application.

For more serverless learning resources, visit Serverless Land.

Using API destinations with Amazon EventBridge

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/using-api-destinations-with-amazon-eventbridge/

Amazon EventBridge enables developers to route events between AWS services, integrated software as a service (SaaS) applications, and your own applications. It can help decouple applications and produce more extensible, maintainable architectures. With the new API destinations feature, EventBridge can now integrate with services outside of AWS using REST API calls.

API destinations architecture

This feature enables developers to route events to existing SaaS providers that integrate with EventBridge, like Zendesk, PagerDuty, TriggerMesh, or MongoDB. Additionally, you can use other SaaS endpoints for applications like Slack or Contentful, or any other type of API or webhook. It can also provide an easier way to ingest data from serverless workloads into Splunk without needing to modify application code or install agents.

This blog post explains how to use API destinations and walks through integration examples you can use in your workloads.

How it works

API destinations are third-party targets outside of AWS that you can invoke with an HTTP request. EventBridge invokes the HTTP endpoint and delivers the event as a payload within the request. You can use any preferred HTTP method, such as GET or POST. You can use input transformers to change the payload format to match your target.

An API destination uses a Connection to manage the authentication credentials for a target. This defines the authorization type used, which can be an API key, OAuth client credentials grant, or a basic user name and password.

Connection details

The service manages the secret in AWS Secrets Manager and the cost of storing the secret is included in the pricing for API destinations.

You create a connection to each different external API endpoint and share the connection with multiple endpoints. The API destinations console shows all configured connections, together with their authorization status. Any connections that cannot be established are shown here:

Connections list

To create an API destination, you provide a name, the HTTP endpoint and method, and the connection:

Create API destination

When you configure the destination, you must also set an invocation rate limit between 1 and 300 events per second. This helps protect the downstream endpoint from surges in traffic. If the number of arriving events exceeds the limit, the EventBridge service queues up events. It delivers to the endpoint as quickly as possible within the rate limit.

It continues to do this for 24 hours. To make sure you retain any events that cannot be delivered, set up a dead-letter queue on the event bus. This ensures that if the event is not delivered within this timeframe, it is stored durably in an Amazon SQS queue for further processing. This can also be useful if the downstream API experiences an outage for extended periods of time.

Throttling and retries

Once you have configured the API destination, it becomes available in the list of targets for rules. Matching events are sent to the HTTP endpoint with the event serialized as part of the payload.

Select targets

As with API Gateway targets in EventBridge, the maximum timeout for API destination is 5 seconds. If an API call exceeds this timeout, it is retried.

Debugging the payload from API destinations

You can send an event via API Destinations to debugging tools like Webhook.site to view the headers and payload of the API call:

  1. Create a connection with the credential, such as an API key.Connection details
  2. Create an API destination with the webhook URL endpoint and then create a rule to match and route events.Target configuration
  3. The Webhook.site testing service shows the headers and payload once the webhook is triggered. This can help you test rules if you are adding headers or manipulating the payload using an input transformer.Webhook.site testing service

Customizing the payload

Third-party APIs often require custom headers or payload formats when accepting data. EventBridge rules allow you to customize header parameters, query strings, and payload formats without the need for custom code. Header parameters and query strings can be configured with static values or attributes from the event:

Header parameters

To customize the payload, configure an input transformer, which consists of an Input Path and Input template. You use an Input Path to define variables and use JSONPath query syntax to identify the variable source in the event. For example, to extract two attributes from an Amazon S3 PutObject event, the Input Path is:

{
  "key" : "$.[0].s3.object.key", 
  "bucket" : " $.[0].s3.bucket.name "
}

Next, the Input template defines the structure of the data passed to the target, which references the variables. With this release you can now use variables inside quotes in the input transformer. As a result, you can pass these values as a string or JSON, for example:

{
  "filename" : "<key>", 
  "container" : "mycontainer-<bucket>"
}

Sending AWS events to DataDog

Using API destinations, you can send any AWS-sourced event to third-party services like DataDog. This approach uses the DataDog API to put data into the service. To do this, you must sign up for an account and create an API key. In this example, I send S3 events via CloudTrail to DataDog for further analysis.

  1. Navigate to the EventBridge console, select API destinations from the menu.
  2. Select the Connections tab and choose Create connection:Connections UI
  3. Enter a connection name, then select API Key for Authorization Type. Enter the API key name DD-API-KEY and paste your secret API key as the value. Choose Create.Create new connection UI
  4. In the API destinations tab, choose Create API destination.Create API destination UI
  5. Enter a name, set the API destination endpoint to https://http-intake.logs.datadoghq.com/v1/input, and the HTTP method to POST. Enter 300 for Invocation rate limit and select the DataDog connection from the dropdown. Choose Create.API destination detail UI
  6. From the EventBridge console, select Rules and choose Create rule. Enter a name, select the default bus, and enter this event pattern:
    {
      "source": ["aws.s3"]
    }
  7. In Select targets, choose API destination and select DataDog for the API destination. Expand, Configure Input.
  8. In the Input transformer section, enter {"detail":"$.detail"} in the Input Path field and enter {"message": <detail>} in the Input Template.Select targets UI
  9. You can optionally add a dead letter queue. To do this, open the Retry policy and dead-letter queue section. Under Dead-letter queue, select an existing SQS queue.
    Retry policy and DLQ
  10. Choose Create.
  11. Open AWS CloudShell and upload an object to an S3 bucket in your account to trigger an event:
    echo "test" > testfile.txt
    aws s3 cp testfile.txt s3://YOUR_BUCKET_NAME

    CloudShell output

  12. The logs appear in the DataDog Logs console, where you can process the raw data for further analysis:Datadog Logs console

Sending AWS events to Zendesk

Zendesk is a SaaS provider that provides customer support solutions. It can already send events to EventBridge using a partner integration. This post shows how you can consume ticket events from Zendesk and run a sentiment analysis using Amazon Comprehend.

With API destinations, you can now use events to call the Zendesk API to create and modify tickets and interact with chats and customer profiles.

To create an API destination for Zendesk:

  1. Log in with an existing Zendesk account or register for a trial account.
  2. Navigate to the EventBridge console, select API destinations from the menu and choose Create API destination.
  3. On the Create API destination, page:
    1. Enter a name for the destination (e.g. “SendToZendesk”).
    2. For API destination endpoint, enter https://<<your-subdomain>>.zendesk.com/api/v2/tickets.json.
    3. For HTTP method, select POST.
    4. For Invocation rate, enter 10.
  4. In the connection section:
    1. Select the Create a new connection radio button.
    2. For Connection name, enter ZendeskConnection.
    3. For Authorization type, select Basic (Username/Password).
    4. Enter your Zendesk username and password.
  5. Choose Create.
    Connection configuration with basic auth

When you create a rule to route to this API destination, use the Input transformer to build the defined JSON payload, as shown in the previous DataDog example. When an event matches the rule, EventBridge calls the Zendesk Create Ticket API. The new ticket appears in the Zendesk dashboard:

Zendesk dashboard

For more information on the Zendesk API, visit the Zendesk Developer Portal.

Building an integration with AWS CloudFormation and AWS SAM

To support this new feature, there are two new AWS CloudFormation resources available. These can also be used in AWS Serverless Application Model (AWS SAM) templates:

The connection resource defines the connection credential and optional invocation HTTP parameters:

Resources:
  TestConnection:
    Type: AWS::Events::Connection
    Properties:
      AuthorizationType: API_KEY
      Description: 'My connection with an API key'
      AuthParameters:
        ApiKeyAuthParameters:
          ApiKeyName: VHS
          ApiKeyValue: Testing
        InvocationHttpParameters:
          BodyParameters:
          - Key: 'my-integration-key'
            Value: 'ABCDEFGH0123456'

Outputs:
  TestConnectionName:
    Value: !Ref TestConnection
  TestConnectionArn:
    Value: !GetAtt TestConnection.Arn

The API destination resource provides the connection, endpoint, HTTP method, and invocation limit:

Resources:
  TestApiDestination:
    Type: AWS::Events::ApiDestination
    Properties:
      Name: 'datadog-target'
      ConnectionArn: arn:aws:events:us-east-1:123456789012:connection/datadogConnection/2
      InvocationEndpoint: 'https://http-intake.logs.datadoghq.com/v1/input'
      HttpMethod: POST
      InvocationRateLimitPerSecond: 300

Outputs:
  TestApiDestinationName:
    Value: !Ref TestApiDestination
  TestApiDestinationArn:
    Value: !GetAtt TestApiDestination.Arn
  TestApiDestinationSecretArn:
    Value: !GetAtt TestApiDestination.SecretArn

You can use the existing AWS::Events::Rule resource to configure an input transformer for API destination targets:

ApiDestinationDeliveryRule:
  Type: AWS::Events::Rule
  Properties:
    EventPattern:
      source:
        - "EventsForMyAPIdestination"
    State: "ENABLED"
    Targets:
      -
        Arn: !Ref TestApiDestinationArn
        InputTransformer:
          InputPathsMap:
            detail: $.detail
          InputTemplate: >
            {
                    "message": <detail>
            }

Conclusion

The API destinations feature of EventBridge enables developers to integrate workloads with third-party applications using REST API calls. This provides an easier way to build decoupled, extensible applications that work with applications outside of the AWS Cloud.

To use this feature, you configure a connection and an API destination. You can use API destinations in the same way as existing targets for rules, and also customize headers, query strings, and payloads in the API call.

Learn more about using API destinations with the following SaaS providers: DatadogFreshworks, MongoDB, TriggerMesh, and Zendesk.

For more serverless learning resources, visit Serverless Land.

Using AWS X-Ray tracing with Amazon EventBridge

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/using-aws-x-ray-tracing-with-amazon-eventbridge/

AWS X-Ray allows developers to debug and analyze distributed applications. It can be useful for tracing transactions through microservices architectures, such as those typically used in serverless applications. Amazon EventBridge allows you to route events between AWS services, integrated software as a service (SaaS) applications, and your own applications. EventBridge can help decouple applications and produce more extensible, maintainable architectures.

EventBridge now supports trace context propagation for X-Ray, which makes it easier to trace transactions through event-based architectures. This means you can potentially trace a single request from an event producer through to final processing by an event consumer. These may be decoupled application stacks where the consumer has no knowledge of how the event is produced.

This blog post explores how to use X-Ray with EventBridge and shows how to implement tracing using the example application in this GitHub repo.

How it works

X-Ray works by adding a trace header to requests, which acts as a unique identifier. In the case of a serverless application using multiple AWS services, this allows X-Ray to group service interactions together as a single trace. X-Ray can then produce a service map of the transaction flow or provide the raw data for a trace:

X-Ray service map

When you send events to EventBridge, the service uses rules to determine how the events are routed from the event bus to targets. Any event that is put on an event bus with the PutEvents API can now support trace context propagation.

The trace header is provided as internal metadata to support X-Ray tracing. The header itself is not available in the event when it’s delivered to a target. For developers using the EventBridge archive feature, this means that a trace ID is not available for replay. Similarly, it’s not available on events sent to a dead-letter queue (DLQ).

Enabling tracing with EventBridge

To enable tracing, you don’t need to change the event structure to add the trace header. Instead, you wrap the AWS SDK client in a call to AWSXRay.captureAWSClient and grant IAM permissions to allow tracing. This enables X-Ray to instrument the call automatically with the X-Amzn-Trace-Id header.

For code using the AWS SDK for JavaScript, this requires changes to the way that the EventBridge client is instantiated. Without tracing, you declare the AWS SDK and EventBridge client with:

const AWS = require('aws-sdk')
const eventBridge = new AWS.EventBridge()

To use tracing, this becomes:

const AWSXRay = require('aws-xray-sdk')
const AWS = AWSXRay.captureAWS(require('aws-sdk'))
const eventBridge = new AWS.EventBridge()

The interaction with the EventBridge client remains the same but the calls are now instrumented by X-Ray. Events are put on the event bus programmatically using a PutEvents API call. In a Node.js Lambda function, the following code processes an event to send to an event bus, with tracing enabled:

const AWSXRay = require('aws-xray-sdk')
const AWS = AWSXRay.captureAWS(require('aws-sdk'))
const eventBridge = new AWS.EventBridge()

exports.handler = async (event) => {

  let myDetail = { "name": "Alice" }

  const myEvent = { 
    Entries: [{
      Detail: JSON.stringify({ myDetail }),
      DetailType: 'myDetailType',
      Source: 'myApplication',
      Time: new Date
    }]
  }

  // Send to EventBridge
  const result = await eventBridge.putEvents(myEvent).promise()

  // Log the result
  console.log('Result: ', JSON.stringify(result, null, 2))
}

You can also define a custom tracing header using the new TraceHeader attribute on the PutEventsRequestEntry API model. The unique value you provide overrides any trace header on the HTTP header. The value is also validated by X-Ray and discarded if it does not pass validation. See the X-Ray Developer Guide to learn about generating valid trace headers.

Deploying the example application

The example application consists of a webhook microservice that publishes events and target microservices that consume events. The generated event contains a target attribute to determine which target receives the event:

Example application architecture

To deploy these microservices, you must have the AWS SAM CLI and Node.js 12.x installed. to To complete the deployment, follow the instructions in the GitHub repo.

EventBridge can route events to a broad range of target services in AWS. Targets that support active tracing for X-Ray can create comprehensive traces from the event source. The services offering active tracing are AWS Lambda, AWS Step Functions, and Amazon API Gateway. In each case, you can trace a request from the producer to the consumer of the event.

The GitHub repo contains examples showing how to use active tracing with EventBridge targets. The webhook application uses a query string parameter called target to determine which events are routed to these targets.

For X-Ray to detect each service in the webhook, tracing must be enabled on both the API Gateway stage and the Lambda function. In the AWS SAM template, the Tracing: Active property turns on active tracing for the Lambda function. If an IAM role is not specified, the AWS SAM CLI automatically adds the arn:aws:iam::aws:policy/AWSXrayWriteOnlyAccess policy to the Lambda function’s execution role. For the API definition, adding TracingEnabled: True enables tracing for this API stage.

When you invoke the webhook’s API endpoint, X-Ray generates a trace map of the request, showing each of the services from the REST API call to putting the event on the bus:

X-Ray trace map with EventBridge

The CloudWatch Logs from the webhook’s Lambda function shows the event that has been put on the event bus:

CloudWatch Logs from a webhook

Tracing with a Lambda target

In the targets-lambda example application, the Lambda function uses the X-Ray SDK and has active tracing enabled in the AWS SAM template:

Resources:
  ConsumerFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/
      Handler: app.handler
      MemorySize: 128
      Timeout: 3
      Runtime: nodejs12.x
      Tracing: Active

With these two changes, the target Lambda function propagates the tracing header from the original webhook request. When the webhook API is invoked, the X-Ray trace map shows the entire request through to the Lambda target. X-Ray shows two nodes for Lambda – one is the Lambda service and the other is the Lambda function invocation:

Downstream service node in service map

Tracing with an API Gateway target

Currently, active tracing is only supported by REST APIs but not HTTP APIs. You can enable X-Ray tracing from the AWS CLI or from the Stages menu in the API Gateway console, in the Logs/Tracing tab:

Enable X-Ray tracing in API Gateway

You cannot currently create an API Gateway target for EventBridge using AWS SAM. To invoke an API endpoint from the EventBridge console, create a rule and select the API as a target. The console automatically creates the necessary IAM permissions for EventBridge to invoke the endpoint.

Setting API Gateway as an EventBridge target

If the API invokes downstream services with active tracing available, these services also appear as nodes in the X-Ray service graph. Using the webhook application to invoke the API Gateway target, the trace shows the entire request from the initial API call through to the second API target:

API Gateway node in X-Ray service map

Tracing with a Step Functions target

To enable tracing for a Step Functions target, the state machine must have tracing enabled and have permissions to write to X-Ray. The AWS SAM template can enable tracing, define the EventBridge rule and the AWSXRayDaemonWriteAccess policy in one resource:

  WorkFlowStepFunctions:
    Type: AWS::Serverless::StateMachine
    Properties:
      DefinitionUri: definition.asl.json
      DefinitionSubstitutions:
        LoggerFunctionArn: !GetAtt LoggerFunction.Arn
      Tracing:
        Enabled: True
      Events:
        UploadComplete:
          Type: EventBridgeRule
          Properties:
            Pattern:
              account: 
                - !Sub '${AWS::AccountId}'
              source:
                - !Ref EventSource
              detail:
                apiEvent:
                  target:
                    - 'sfn'

      Policies: 
        - AWSXRayDaemonWriteAccess
        - LambdaInvokePolicy:
            FunctionName: !Ref LoggerFunction

If the state machine uses services that support active tracing, these also appear in the trace map for individual requests. Using the webhook to invoke this target, X-Ray now shows the request trace to the state machine and the Lambda function it contains:

Step Functions in X-Ray service map

Adding X-Ray tracing to existing Lambda targets

To wrap the SDK client, you must enable active tracing and include the AWS X-Ray SDK in the Lambda function’s deployment package. Unlike the AWS SDK, the X-Ray SDK is not included in the Lambda execution environment.

Another option is to include the X-Ray SDK as a Lambda layer. You can build this layer by following the instructions in the GitHub repo. Once deployed, you can attach the X-Ray layer to any Lambda function either via the console or the CLI:

Adding X-Ray tracing a Lambda function

To learn more about using Lambda layers, read “Using Lambda layers to simplify your development process”.

Conclusion

X-Ray is a powerful tool for providing observability in serverless applications. With the launch of X-Ray trace context propagation in EventBridge, this allows you to trace requests across distributed applications more easily.

In this blog post, I walk through an example webhook application with three targets that support active tracing. In each case, I show how to enable tracing either via the console or using AWS SAM and show the resulting X-Ray trace map.

To learn more about how to use tracing with events, read the X-Ray Developer Guide or see the Amazon EventBridge documentation for this feature.

For more serverless learning resources, visit Serverless Land.

Operating Lambda: Building a solid security foundation – Part 2

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/operating-lambda-building-a-solid-security-foundation-part-2/

In the Operating Lambda series, I cover important topics for developers, architects, and systems administrators who are managing AWS Lambda-based applications. This two-part series discusses core security concepts for Lambda-based applications.

Part 1 explains the Lambda execution environment and how to apply the principles of least privilege to your workload. This post covers securing workloads with public endpoints, encrypting data, and using AWS CloudTrail for governance, compliance, and operational auditing.

Securing workloads with public endpoints

For workloads that are accessible publicly, AWS provides a number of features and services that can help mitigate certain risks. This section covers authentication and authorization of application users and protecting API endpoints.

Authentication and authorization

Authentication relates to identity and authorization refers to actions. Use authentication to control who can invoke a Lambda function, and then use authorization to control what they can do. For many applications, AWS Identity & Access Management (IAM) is sufficient for managing both control mechanisms.

For applications with external users, such as web or mobile applications, it is common to use JSON Web Tokens (JWTs) to manage authentication and authorization. Unlike traditional, server-based password management, JWTs are passed from the client on every request. They are a cryptographically secure way to verify identity and claims using data passed from the client. For Lambda-based applications, this allows you to secure APIs for each microservice independently, without relying on a central server for authentication.

You can implement JWTs with Amazon Cognito, which is a user directory service that can handle registration, authentication, account recovery, and other common account management operations. For frontend development, Amplify Framework provides libraries to simplify integrating Cognito into your frontend application. You can also use third-party partner services like Auth0.

Given the critical security role of an identity provider service, it’s important to use professional tooling to safeguard your application. It’s not recommended that you write your own services to handle authentication or authorization. Any vulnerabilities in custom libraries may have significant implications for the security of your workload and its data.

Protecting API endpoints

For serverless applications, the preferred way to serve a backend application publicly is to use Amazon API Gateway. This can help you protect an API from malicious users or spikes in traffic.

For authenticated API routes, API Gateway offers both REST APIs and HTTP APIs for serverless developers. Both types support authorization using AWS Lambda, IAM or Amazon Cognito. When using IAM or Amazon Cognito, incoming requests are evaluated and if they are missing a required token or contain invalid authentication, the request is rejected. You are not charged for these requests and they do not count towards any throttling quotas.

Unauthenticated API routes may be accessed by anyone on the public internet so it’s recommended that you limit their use. If you must use unauthenticated APIs, it’s important to protect these against common risks, such as denial-of-service (DoS) attacks. Applying AWS WAF to these APIs can help protect your application from SQL injection and cross-site scripting (XSS) attacks. API Gateway also implements throttling at the AWS account-level and per-client level when API keys are used.

In some cases, the functionality provided by an unauthenticated API can be achieved with an alternative approach. For example, a web application may provide a list of customer retail stores from an Amazon DynamoDB table to users who are not logged in. This request may originate from a frontend web application or from any other source that calls the URL endpoint. This diagram compares three solutions:

Solutions for an unauthenticated API

  1. The unauthenticated API can be called by anyone on the internet. In a denial of service attack, it’s possible to exhaust API throttling limits, Lambda concurrency, or DynamoDB provisioned read capacity on an underlying table.
  2. An Amazon CloudFront distribution in front of the API endpoint with an appropriate time-to-live (TTL) configuration may help absorb traffic in a DoS attack, without changing the underlying solution for fetching the data.
  3. Alternatively, for static data that rarely changes, the CloudFront distribution could serve the data from an S3 bucket.

The AWS Well-Architected Tool provides a Serverless Lens that analyzes the security posture of serverless workloads.

Encrypting data in Lambda-based applications

Managing secrets

For applications handling sensitive data, AWS services provide a range of encryption options for data in transit and at rest. It’s important to identity and classify sensitive data in your workload, and minimize the storage of sensitive data to only what is necessary.

When protecting data at rest, use AWS services for key management and encryption of stored data, secrets and environment variables. Both the AWS Key Management Service and AWS Secrets Manager provide a robust approach to storing and managing secrets used in Lambda functions.

Do not store plaintext secrets or API keys in Lambda environment variables. Instead, use KMS to encrypt environment variables. Also ensure you do not embed secrets directly in function code, or commit these secrets to code repositories.

Using HTTPS securely

HTTPS is encrypted HTTP, using TLS (SSL) to encrypt the request and response, including headers and query parameters. While query parameters are encrypted, URLs may be logged by different services in plaintext, so you should not use these to store sensitive data such as credit card numbers.

AWS services make it easier to use HTTPS throughout your application and it is provided by default in services like API Gateway. Where you need an SSL/TLS certificate in your application, to support features like custom domain names, it’s recommended that you use AWS Certificate Manager (ACM). This provides free public certificates for ACM-integrated services and managed certificate renewal.

Governance controls with AWS CloudTrail

For compliance and operational auditing of application usage, AWS CloudTrail logs activity related to your AWS account usage. It tracks resource changes and usage, and provides analysis and troubleshooting tools. Enabling CloudTrail does not have any negative performance implications for your Lambda-based application, since the logging occurs asynchronously.

Separate from application logging (see chapter 4), CloudTrail captures two types of events:

  • Control plane: These events apply to management operations performed on any AWS resources. Individual trails can be configured to capture read or write events, or both.
  • Data plane: Events performed on the resources, such as when a Lambda function is invoked or an S3 object is downloaded.

For Lambda, you can log who creates and invokes functions, together with any changes to IAM roles. You can configure CloudTrail to log every single activity by user, role, service, and API within an AWS account. The service is critical for understanding the history of changes made to your account and also detecting any unintended changes or suspicious activity.

To research which AWS user interacted with a Lambda function, CloudTrail provides an audit log to find this information. For example, when a new permission is added to a Lambda function, it creates an AddPermission record. You can interpret the meaning of individual attributes in the JSON message by referring to the CloudTrail Record Contents documentation.

CloudTrail Record Contents documentation

CloudTrail data is considered sensitive so it’s recommended that you protect it with KMS encryption. For any service processing encrypted CloudTrail data, it must use an IAM policy with kms:Decrypt permission.

By integrating CloudTrail with Amazon EventBridge, you can create alerts in response to certain activities and respond accordingly. With these two services, you can quickly implement an automated detection and response pattern, enabling you to develop mechanisms to mitigate security risks. With EventBridge, you can analyze data in real-time, using event rules to filter events and forward to targets like Lambda functions or Amazon Kinesis streams.

CloudTrail can deliver data to Amazon CloudWatch Logs, which allows you to process multi-Region data in real time from one location. You can also deliver CloudTrail to Amazon S3 buckets, where you can create event source mappings to start data processing pipelines, run queries with Amazon Athena, or analyze activity with Amazon Macie.

If you use multiple AWS accounts, you can use AWS Organizations to manage and govern individual member accounts centrally. You can set an existing trail as an organization-level trail in a primary account that can collect events from all other member accounts. This can simplify applying consistent auditing rules across a large set of existing accounts, or automatically apply rules to new accounts. To learn more about this feature, see Creating a Trail for an Organization.

Conclusion

In this blog post, I explain how to secure workloads with public endpoints and the different authentication and authorization options available. I also show different approaches to exposing APIs publicly.

CloudTrail can provide compliance and operational auditing for Lambda usage. It provides logs for both the control plane and data plane. You can integrate CloudTrail with EventBridge to create alerts in response to certain activities. Customers with multiple AWS accounts can use AWS Organizations to manage trails centrally.

For more serverless learning resources, visit Serverless Land.

Building a serverless multi-player game that scales

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-a-serverless-multiplayer-game-that-scales/

This post is written by Tim Bruce, Sr. Solutions Architect, Developer Acceleration.

Game development is a highly iterative process with rapidly changing requirements. Many game developers want to maximize the time spent building features and less time configuring servers, managing infrastructure, and mastering scale.

AWS Serverless provides four key benefits for customers. First, it can help move from idea to market faster, by reducing operational overhead. Second, customers may realize lower costs with serverless by not over-provisioning hardware and software to operate. Third, serverless scales with user activity. Finally, serverless services provide built-in integration, allowing you to focus on your game instead of connecting pieces together.

For AWS Gaming customers, these benefits allow your teams to spend more time focusing on gameplay and content, instead of undifferentiated tasks such as setting up and maintaining servers and software. This can result in better gameplay and content, and a faster time-to-market.

This blog post introduces a game with a serverless-first architecture. Simple Trivia Service is a web-based game showing architectural patterns that you can apply in your own games.

Introducing the Simple Trivia Service

The Simple Trivia Service offers single- and multi-player trivia games with content created by players. There are many features in Simple Trivia Service found in games, such as user registration, chat, content creation, leaderboards, game play, and a marketplace.

Simple Trivia Service UI

Authenticated players can chat with other players, create and manage quizzes, and update their profile. They can play single- and multi-player quizzes, host quizzes, and buy and sell quizzes on the marketplace. The single- and multi-player game modes show how games with different connectivity and technical requirements can be delivered with serverless first architectures. The game modes and architecture solutions are covered in the Simple Trivia Service backend architecture section.

Simple Trivia Service front end

The Simple Trivia Service front end is a Vue.js single page application (SPA) that accesses backend services. The SPA app, accessed via a web browser, allows users to make requests to the game endpoints using HTTPS, secure WebSockets, and WebSockets over MQTT. These requests use integrations to access the serverless backend services.

Vue.js helps make this reference architecture more accessible. The front end uses AWS Amplify to build, deploy, and host the SPA without the need to provision and manage any resources.

Simple Trivia Service backend architecture

The backend architecture for Simple Trivia Service is defined in a set of AWS Serverless Application Model (AWS SAM) templates for portions of the game. A deployment guide is included in the README.md file in the GitHub repository. Here is a visual depiction of the backend architecture.

Reference architecture

Services used

Simple Trivia Service is built using AWS Lambda, Amazon API Gateway, AWS IoT, Amazon DynamoDB, Amazon Simple Notification Service (SNS), AWS Step Functions, Amazon Kinesis, Amazon S3, Amazon Athena, and Amazon Cognito:

  • Lambda enables serverless microservice features in Simple Trivia Service.
  • API Gateway provides serverless endpoints for HTTP/RESTful and WebSocket communication while IoT delivers a serverless endpoint for WebSockets over MQTT communication.
  • DynamoDB enables data storage and retrieval for internet-scale applications.
  • SNS provides microservice communications via publish/subscribe functionality.
  • Step Functions coordinates complex tasks to ensure appropriate outcomes.
  • Analytics for Simple Trivia Service are delivered via Kinesis and S3 with Athena providing a query/visualization capability.
  • Amazon Cognito provides secure, standards-based login and a user directory.

Two managed services that are not serverless, Amazon VPC NAT Gateway and Amazon ElastiCache for Redis, are also used. VPC NAT Gateway is required by VPC-enabled Lambda functions to reach services outside of the VPC, such as DynamoDB. ElastiCache provides an in-memory database suited for applications with submillisecond latency requirements.

User security and enabling communications to backend services

Players are required to register and log in before playing. Registration and login credentials are sent to Amazon Cognito using the Secure Remote Password protocol. Upon successfully logging in, Amazon Cognito returns a JSON Web Token (JWT) and an Amazon Cognito user session.

The JWT is included within requests to API Gateway, which validates the token before allowing the request to be forwarded to Lambda.

IoT requires additional security for users by using an AWS Identity and Access Management (IAM) policy. A policy attached to the Amazon Cognito user allows the player to connect, subscribe, and send messages to the IoT endpoint.

Game types and supporting architectures

Simple Trivia Service’s three game modes define how players interact with the backend services. These modes align to different architectures used within the game.

“Single Player” quiz architecture

“Single Player” quiz architecture

Single player quizzes have simple rules, short play sessions, and appeal to wide audiences. Single player game communication is player-to-endpoint only. This is accomplished with API Gateway via an HTTP API.

Four Lambda functions (ActiveGamesList, GamePlay, GameAnswer, and LeaderboardGet) enable single player games. These functions are integrated with API Gateway and respond to specific requests from the client. API Gateway forwards the request, including URI, body, and query string, to the appropriate Lambda function.

When a player chooses “Play”, a request is sent to API Gateway, which invokes the ActiveGamesList function. This function queries the ActiveGames DynamoDB table and returns the list of active games to the user.

The player selects a game, resulting in another request triggering the GamePlay function. GamePlay retrieves the game’s questions from the GamesDetail DynamoDB table. The front end maintains the state for the user during the game.

When all questions are answered, the SPA sends the player’s responses to API Gateway, invoking the GameAnswer function. This function scores the player’s responses against the GameDetails table. The score and answers are sent to the user.

Additionally, this function sends the player score for the leaderboard and player experience to two SNS topics (LeaderboardTopic and PlayerProgressTopic). The ScorePut and PlayerProgressPut functions subscribe to these topics. These two functions write the details to the Leaderboard and Player Progress DynamoDB tables.

This architecture processes these two actions asynchronously, resulting in the player receiving their score and answers without having to wait. This also allows for increased security for player progress, as only the PlayerProgressPut function is allowed to write to this table.

Finally, the player can view the game’s leaderboard, which is returned to the player as the response to the GetLeaderboard function. The function retrieves the top 10 scores and the current player’s score from the Leaderboard table.

“Multi-player – Casual and Competitive” architecture

“Multiplayer – Casual and Competitive” architecture

These game types require player-to-player and service-to-player communication. This is typically performed using TCP/UDP sockets or the WebSocket protocol. API Gateway WebSockets provides WebSocket communication and enables Lambda functions to send messages to and receive messages from game hosts and players.

Game hosts start games via the “Host” button, messaging the LiveAdmin function via API Gateway. The function adds the game to the LiveGames table, which allows players to find and join the game. A list of questions for the game is sent to the game host from the LiveAdmin function at this time. Additionally, the game host is added to the GameConnections table, which keeps track of which connections are related to a game. Players, via the LivePlayer function, are also added to this table when they join a game.

The game host client manages the state of the game for all players and controls the flow of the game, sending questions, correct answers, and leaderboards to players via API Gateway and the LiveAdmin function. The function only sends game messages to the players in the GameConnections table. Player answers are sent to the host via the LivePlayer function.

To end the game, the game host sends a message with the final leaderboard to all players via the LiveAdmin function. This function also stores the leaderboard in the Leaderboard table, removes the game from the ActiveGames table, and sends player progression messages to the Player Progress topic.

“Multi-player – Live Scoreboard” architecture

“Multiplayer – Live Scoreboard” architecture

This is an extension of other multi-player game types requiring similar communications. This uses IoT with WebSockets over MQTT as the transport. It enables the client to subscribe to a topic and act on messages it receives. IoT manages routing messages to clients based on their subscriptions.

This architecture moves the state management from the game host client to a data store on the backend. This change requires a database that can respond quickly to user actions. Simple Trivia Service uses ElastiCache for Redis for this database. Game questions, player responses, and the leaderboard are all stored and updated in Redis during the quiz. The ElastiCache instance is blocked from internet traffic by placing it in a VPC. A security group configures access for the Lambda functions in the same VPC.

Game hosts for this type of game start the game by hosting it, which sends a message to IoT, triggering the CacheGame function. This function adds the game to the ActiveGames table and caches the quiz details from DynamoDB into Redis. Players join the game by sending a message, which is delivered to the JoinGame function. This adds the user record to Redis and alerts the game host that a player has joined.

Game hosts can send questions to the players via a message that invokes the AskQuestion function. This function updates the current question number in Redis and sends the question to subscribed players via the AskQuestion function. The ReceiveAnswer function processes player responses. It validates the response, stores it in Redis, updates the scoreboard, and replies to all players with the updated scoreboard after the first correct answer. The game scoreboard is updated for players in real time.

When the game is over, the game host sends a message to the EndGame function via IoT. This function writes the game leaderboard to the Leaderboard table, sends player progress to the Player Progress SNS topic, deletes the game from cache, and removes the game from the ActiveGames table.

Conclusion

This post introduces the Simple Trivia Service, a single- and multi-player game built using a serverless-first architecture on AWS. I cover different solutions that you can use to enable connectivity from your game client to a serverless-first backend for both single- and multi-player games. I also include a walkthrough of the architecture for each of these solutions.

You can deploy the code for this solution to your own AWS account via instructions in the Simple Trivia Service GitHub repository.

For more serverless learning resources, visit Serverless Land.

Operating Lambda: Building a solid security foundation – Part 1

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/operating-lambda-building-a-solid-security-foundation-part-1/

In the Operating Lambda series, I cover important topics for developers, architects, and systems administrators who are managing AWS Lambda-based applications. This two-part series discusses core security concepts for Lambda-based applications.

In the AWS Cloud, the most important foundational security principle is the shared responsibility model. This broadly shares security responsibilities between AWS and our customers. AWS is responsible for “security of the cloud”, such as the underlying physical infrastructure and facilities providing the services. Customers are responsible for “security in the cloud”, which includes applying security best practices, controlling access, and taking measures to protect data.

One of the main reasons for the popularity of Lambda-based applications is that AWS manages even more of the security operations compared with traditional cloud-based compute. For example, Lambda customers using zip file deployments do not need to patch underlying operating systems or apply security patches – these tasks are managed automatically by the Lambda service.

This post explains the Lambda execution environment and mechanisms used by the service to protect customer data. It also covers applying the principles of least privilege to your application and what this means in terms of permissions and Lambda function scope.

Understanding the Lambda execution environment

When your functions are invoked, the Lambda service runs your code inside an execution environment. Lambda scrubs the memory before it is assigned to an execution environment. Execution environments are run on hardware virtualized virtual machines (MicroVMs) which are dedicated to a single AWS account. Execution environments are never shared across functions and MicroVMs are never shared across AWS accounts. This is the isolation model for the Lambda service:

Isolation model for the Lambda service

A single execution environment may be reused by subsequent function invocations. This helps improve performance since it reduces the time taken to prepare and environment. Within your code, you can take advantage of this behavior to improve performance further, by caching locally within the function or reusing long-lived connections. All of these invocations are handled by a single process, so any process-wide state (such as static state in Java) is available across all invocations within the same execution environment.

There is also a local file system available at /tmp for all Lambda functions. This is local to each function but shared across invocations within the same execution environment. If your function must access large libraries or files, these can be downloaded here first and then used by all subsequent invocations. This mechanism provides a way to amortize the cost and time of downloading this data across multiple invocations.

While data is never shared across AWS customers, it is possible for data from one Lambda function to be shared with another invocation of the same function instance. This can be useful for caching common values or sharing libraries. However, if you have information only intended for a single invocation, you should:

  • Ensure that data is only used in a local variable scope.
  • Delete any /tmp files before exiting, and use a UUID name to prevent different instances from accessing the same temporary files.
  • Ensure that any callbacks are complete before exiting.

For applications requiring the highest levels of security, you may also implement your own memory encryption and wiping process before a function exits. At the function level, the Lambda service does not inspect or scan your code. Many of the best practices in security for software development continue to apply in serverless software development.

The security posture of an application is determined by the use-case but developers should always take precautions against common risks such as misconfiguration, injection flaws, and handling user input. Developers should be familiar with common security concepts and security risks, such as those listed in the OWASP Top 10 Web Application Security Risks and the OWASP Serverless Top 10. The use of static code analysis tools, unit tests, and regression tests are still valid in a serverless compute environment.

To learn more, read “Amazon Web Services: Overview of Security Processes”, “Compliance validation for AWS Lambda”, and “Security Overview of AWS Lambda”.

Applying the principles of least privilege

AWS Identity and Access Management (IAM) is the service used to manage access to AWS services. Before using IAM, it’s important to review security best practices that apply across AWS, to ensure that your user accounts are secured appropriately.

Lambda is fully integrated with IAM, allowing you to control precisely what each Lambda function can do within the AWS Cloud. There are two important policies that define the scope of permissions in Lambda functions. The event source uses a resource policy that grants permission to invoke the Lambda function, whereas the Lambda service uses an execution role to constrain what the function is allowed to do. In many cases, the console configures both of these policies with default settings.

As you start to build Lambda-based applications with frameworks such as AWS SAM, you describe both policies in the application’s template.

Resource and execution role policy

By default, when you create a new Lambda function, a specific IAM role is created for only that function.

IAM role for a Lambda function

This role has permissions to create an Amazon CloudWatch log group in the current Region and AWS account, and create log streams and put events to those streams. The policy follows the principle of least privilege by scoping precise permissions to specific resources, AWS services, and accounts.

Developing least privilege IAM roles

As you develop a Lambda function, you expand the scope of this policy to enable access to other resources. For example, for a function that processes objects put into an Amazon S3 bucket, it requires read access to objects stored in that bucket. Do not grant the function broader permissions to write or delete data, or operate in other buckets.

Determining the exact permissions can be challenging, since IAM permissions are granular and they control access to both the data plane and control plane. The following references are useful for developing IAM policies:

One of the fastest ways to scope permissions appropriately is to use AWS SAM policy templates. You can reference these templates directly in the AWS SAM template for your application, providing custom parameters as required:

SAM policy templates

In this example, the S3CrudPolicy template provides full create, read, update, and delete permissions to one bucket, and the S3ReadPolicy template provides only read access to another bucket. AWS SAM named templates expand into more verbose AWS CloudFormation policy definitions that show how the principle of least privilege is applied. The S3ReadPolicy is defined as:

        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "s3:GetObject",
              "s3:ListBucket",
              "s3:GetBucketLocation",
              "s3:GetObjectVersion",
              "s3:GetLifecycleConfiguration"
            ],
            "Resource": [
              {
                "Fn::Sub": [
                  "arn:${AWS::Partition}:s3:::${bucketName}",
                  {
                    "bucketName": {
                      "Ref": "BucketName"
                    }
                  }
                ]
              },
              {
                "Fn::Sub": [
                  "arn:${AWS::Partition}:s3:::${bucketName}/*",
                  {
                    "bucketName": {
                      "Ref": "BucketName"
                    }
                  }
                ]
              }
            ]
          }
        ]

It includes the necessary, minimal permissions to retrieve the S3 object, including getting the bucket location, object version, and lifecycle configuration.

Access to CloudWatch Logs

To log output, Lambda roles must provide access to CloudWatch Logs. If you are building a policy manually, ensure that it includes:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "logs:CreateLogGroup",
            "Resource": "arn:aws:logs:region:accountID:*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": [
                "arn:aws:logs:region:accountID:log-group:/aws/lambda/functionname:*"
            ]
        }
    ]
}

If the role is missing these permissions, the function still runs but it is unable to log any output to the CloudWatch service.

Avoiding wildcard permissions in IAM policies

The granularity of IAM permissions means that developers may choose to use overly broad permissions when they are testing or developing code.

IAM supports the “*” wildcard in both the resources and actions attributes, making it easier to select multiple matching items automatically. These may be useful when developing and testing functions in specific development AWS accounts with no access to production data. However, you should ensure that “star permissions” are never used in production environments.

Wildcard permissions grant broad permissions, often for many permissions or resources. Many AWS managed policies, such as AdministratorAccess, provide broad access intended only for user roles. Do not apply these policies to Lambda functions, since they do not specify individual resources.

In Application design and Service Quotas – Part 1, the section Using multiple AWS accounts for managing quotas shows a multiple account example. This approach provisions a separate AWS account for each developer in a team, and separates accounts for beta and production. This can help prevent developers from unintentionally transferring overly broad permissions to beta or production accounts.

For developers using the Serverless Framework, the Safeguards plugin is a policy-as-code framework to check deployed templates for compliance with security.

Specialized Lambda functions compared with all-purpose functions

In the post on Lambda design principles, I discuss architectural decisions in choosing between specialized functions and all-purpose functions. From a security perspective, it can be more difficult to apply the principles of least privilege to all-purpose functions. This is partly because of the broad capabilities of these functions and also because developers may grant overly broad permissions to these functions.

When building smaller, specialized functions with single tasks, it’s often easier to identify the specific resources and access requirements, and grant only those permissions. Additionally, since new features are usually implemented by new functions in this architectural design, you can specifically grant permissions in new IAM roles for these functions.

Avoid sharing IAM roles with multiple Lambda functions. As permissions are added to the role, these are shared across all functions using this role. By using one dedicated IAM role per function, you can control permissions more intentionally. Every Lambda function should have a 1:1 relationship with an IAM role. Even if some functions have the same policy initially, always separate the IAM roles to ensure least privilege policies.

To learn more, the series of posts for “Building well-architected serverless applications: Controlling serverless API access” – part 1, part 2, and part 3.

Conclusion

This post explains the Lambda execution environment and how the service protects customer data. It covers important steps you should take to prevent data leakage between invocations and provides additional security resources to review.

The principles of least privilege also apply to Lambda-based applications. I show how you can develop IAM policies and practices to ensure that IAM roles are scoped appropriately, and why you should avoid wildcard permissions. Finally, I explain why using smaller, specialized Lambda functions can help maintain least privilege.

Part 2 will discuss security workloads with public endpoints and how to use AWS CloudTrail for governance, compliance, and operational auditing of Lambda usage.

For more serverless learning resources, visit Serverless Land.