Filtering event sources for AWS Lambda functions

Post Syndicated from Benjamin Smith original https://aws.amazon.com/blogs/compute/filtering-event-sources-for-aws-lambda-functions/

This post is written by Heeki Park, Principal Specialist Solutions Architect – Serverless.

When an AWS Lambda function is configured with an event source, the Lambda service triggers a Lambda function for each message or record. The exact behavior depends on the choice of event source and the configuration of the event source mapping. The event source mapping defines how the Lambda service handles incoming messages or records from the event source.

Today, AWS announces the ability to filter messages before the invocation of a Lambda function. Filtering is supported for the following event sources: Amazon Kinesis Data Streams, Amazon DynamoDB Streams, and Amazon SQS. This helps reduce requests made to your Lambda functions, may simplify code, and can reduce overall cost.

Overview

Consider a logistics company with a fleet of vehicles in the field. Each vehicle is enabled with sensors and 4G/5G connectivity to emit telemetry data into Kinesis Data Streams:

  • In one scenario, they use machine learning models to infer the health of vehicles based on each payload of telemetry data, which is outlined in example 2 on the Lambda pricing page.
  • In another scenario, they want to invoke a function, but only when tire pressure is low on any of the tires.

If tire pressure is low, the company notifies the maintenance team to check the tires when the vehicle returns. The process checks if the warehouse has enough spare replacements. Optionally, it notifies the purchasing team to buy additional tires.

The application responds to the stream of incoming messages and runs business logic if tire pressure is below 32 psi. Each vehicle in the field emits telemetry as follows:

{
    "time": "2021-11-09 13:32:04",
    "fleet_id": "fleet-452",
    "vehicle_id": "a42bb15c-43eb-11ec-81d3-0242ac130003",
    "lat": 47.616226213162406,
    "lon": -122.33989110734133,
    "speed": 43,
    "odometer": 43519,
    "tire_pressure": [41, 40, 31, 41],
    "weather_temp": 76,
    "weather_pressure": 1013,
    "weather_humidity": 66,
    "weather_wind_speed": 8,
    "weather_wind_dir": "ne"
}

To process all messages from a fleet of vehicles, you configure a filter matching the fleet id in the following example. The Lambda service applies the filter pattern against the full payload that it receives.

The schema of the payload for Kinesis and DynamoDB Streams is shown under the “kinesis” attribute in the example Kinesis record event. When building filters for Kinesis or DynamoDB Streams, you filter the payload under the “data” attribute. The schema of the payload for SQS is shown in the array of records in the example SQS message event. When working with SQS, you filter the payload under the “body” attribute:

{
    "data": {
        "fleet_id": ["fleet-452"]
    }
}

To process all messages associated with a specific vehicle, configure a filter on only that vehicle id. The fleet id is kept in the example to show that it matches on both of those filter criteria:

{
    "data": {
        "fleet_id": ["fleet-452"],
        "vehicle_id": ["a42bb15c-43eb-11ec-81d3-0242ac130003"]
    }
}

To process all messages associated with that fleet but only if tire pressure is below 32 psi, you configure the following rule pattern. This pattern searches the array under tire_pressure to match values less than 32:

{
    "data": {
        "fleet_id": ["fleet-452"],
        "tire_pressure": [{"numeric": ["<", 32]}]
    }
}

To create the event source mapping with this filter criteria with an AWS CLI command, run the following command.

aws lambda create-event-source-mapping \
--function-name fleet-tire-pressure-evaluator \
--batch-size 100 \
--starting-position LATEST \
--event-source-arn arn:aws:kinesis:us-east-1:0123456789012:stream/fleet-telemetry \
--filter-criteria '{"Filters": [{"Pattern": "{\"tire_pressure\": [{\"numeric\": [\"<\", 32]}]}"}]}'

For the CLI, the value for Pattern in the filter criteria requires the double quotes to be escaped in order to be properly captured.

Alternatively, to create the event source mapping with this filter criteria with an AWS Serverless Application Model (AWS SAM) template, use the following snippet.

Events: 
  TirePressureEvent: 
    Type: Kinesis    
    Properties: 
      BatchSize: 100
      StartingPosition: LATEST
      Stream: "arn:aws:kinesis:us-east-1:0123456789012:stream/fleet-telemetry"
      Filters: 
        - Pattern: "{\"data\": {\"tire_pressure\": [{\"numeric\": [\"<\", 32]}]}}"

For the AWS SAM template, the value for Pattern in the filter criteria does not require escaped double quotes.

For more information on how to create filters, refer to examples of event pattern rules in EventBridge, as Lambda filters messages in the same way.

Reducing costs with event filtering

By configuring the event source with this filter criteria, you can reduce the number of messages that are used to invoke your Lambda function.

Using the example from the Lambda pricing page, with a fleet of 10,000 vehicles in the field, each is emitting telemetry once an hour. Each month, the vehicles emit 10,000 * 24 * 31 = 7,440,000 messages, which trigger the same number of Lambda invocations. You configure the function with 256 MB of memory and the average duration of the function is 100 ms. In this example, vehicles emit low-pressure telemetry once every 31 days.

Without filtering, the cost of the application is:

  • Monthly request charges → 7.44M * $0.20/million = $1.49
  • Monthly compute duration (seconds) → 7.44M * 0.1 seconds = 0.744M seconds
  • Monthly compute (GB-s) → 256MB/1024MB * 0.744M seconds = 0.186M GB-s
  • Monthly compute charges → 0.186M GB-s * $0.0000166667 = $3.10
  • Monthly total charges = $1.49 + $3.10 = $4.59

With filtering, the cost of the application is:

  • Monthly request charges → (7.44M / 31)* $0.20/million = $0.05
  • Monthly compute duration (seconds) → (7.44M / 31) * 0.1 seconds = 0.024M seconds
  • Monthly compute (GB-s) → 256MB/1024MB * 0.024M seconds = 0.006M GB-s
  • Monthly compute charges → 0.006M GB-s * $0.0000166667 = $0.10
  • Monthly total charges = $0.05 + $0.10 = $0.15

By using filtering, the cost is reduced from $4.59 to $0.15, a 96.7% cost reduction.

Designing and implementing event filtering

In addition to reducing cost, the functions now operate more efficiently. This is because they no longer iterate through arrays of messages to filter out messages. The Lambda service filters the messages that it receives from the source before batching and sending them as the payload for the function invocation. This is the order of operations:

Event flow with filtering

Event flow with filtering

As you design filter criteria, keep in mind a few additional properties. The event source mapping allows up to five patterns. Each pattern can be up to 2048 characters. As the Lambda service receives messages and filters them with the pattern, it fills the batch per the normal event source behavior.

For example, if the maximum batch size is set to 100 records and the maximum batching window is set to 10 seconds, the Lambda service filters and accumulates records in a batch until one of those two conditions is satisfied. In the case where 100 records that meet the filter criteria come during the batching window, the Lambda service triggers a function with those filtered 100 records in the payload.

If fewer than 100 records meeting the filter criteria arrive during the batch window, Lambda triggers a function with the filtered records that came during the batch window at the end of the 10-second batch window. Be sure to configure the batch window to match your latency requirements.

The Lambda service ignores filtered messages and treats them as successfully processed. For Kinesis Data Streams and DynamoDB Streams, the iterator advances past the records that were sent via the event source mapping.

For SQS, the messages are deleted from the queue without any additional processing. With SQS, be sure that the messages that are filtered out are not required. For example, you have an Amazon SNS topic with multiple SQS queues subscribed. The Lambda functions consuming each of those SQS queues process different subsets of messages. You could use filters on SNS but that would require the message publisher to add attributes to the messages that it sends. You could instead use filters on the event source mapping for SQS. Now the publisher does not need to make any changes, as the filter is applied on the messages payload directly.

Conclusion

Lambda now supports the ability to filter messages based on a criteria that you define. This can reduce the number of messages that your functions process, may reduce cost, and can simplify code.

You can now build applications for specific use cases that use only a subset of the messages that flow through your event-driven architectures. This can help optimize the compute efficiency of your functions.

Learn more about this capability in our AWS Lambda Developer Guide.