Optimizing your AWS Lambda costs – Part 2

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/optimizing-your-aws-lambda-costs-part-2/

This post is written by Chris Williams, Solutions Architect and Thomas Moore, Solutions Architect, Serverless.

Part 1 of this blog series looks at optimizing AWS Lambda costs through right-sizing a function’s memory, and code tuning. We also explore how using Graviton2, Provisioned Concurrency and Compute Savings Plans can offer a reduction in per-millisecond billing.

Part 2 continues to explore cost optimization techniques for Lambda with a focus on architectural improvements and cost-effective logging.

Event filtering

A common serverless architecture pattern is Lambda reading events from a queue or a stream, such as Amazon SQS or Amazon Kinesis Data Streams. This uses an event source mapping, which defines how the Lambda service handles incoming messages or records from the event source.

Event filtering

Sometimes you don’t want to process every message in the queue or stream because the data is not relevant. For example, if IoT vehicle data is sent to a Kinesis Stream and you only want to process events where tire_pressure is < 32, then the Lambda code may look like this:

def lambda_handler(event, context):
   if(event[“tire_pressure”] >=32):
      return

# business logic goes here

This is inefficient as you are paying for Lambda invocations and execution time when there is no business value beyond filtering.

Lambda now supports the ability to filter messages before invocation, simplifying your code and reducing costs. You only pay for Lambda when the event matches the filter criteria and triggers an invocation.

Filtering is supported for Kinesis Streams, Amazon DynamoDB Streams and SQS by specifying filter criteria when setting up the event source mapping. For example, using the following AWS CLI command:

aws lambda create-event-source-mapping \
--function-name fleet-tire-pressure-evaluator \
--batch-size 100 \
--starting-position LATEST \
--event-source-arn arn:aws:kinesis:us-east-1:123456789012:stream/fleet-telemetry \
--filter-criteria '{"Filters": [{"Pattern": "{\"tire_pressure\": [{\"numeric\": [\"<\", 32]}]}"}]}'

After applying the filter, Lambda is only invoked when tire_pressure is less than 32 in messages received from the Kinesis Stream. In this example, it may indicate a problem with the vehicle and require attention.

For more information on how to create filters, refer to examples of event pattern rules in EventBridge, as Lambda filters messages in the same way. Event filtering is explored in greater detail in the Lambda event filtering launch blog.

Avoid idle wait time

Lambda function duration is one dimension used for calculating billing. When function code makes a blocking call, you are billed for the time that it waits to receive a response.

This idle wait time can grow when Lambda functions are chained together, or a function is acting as an orchestrator for other functions. For customers who have workflows such as batch operations or order delivery systems, this adds management overhead. Additionally, it may not be possible to complete all workflow logic and error handling within the maximum Lambda timeout of 15 minutes.

Instead of handling this logic in function code, re-architect your solution to use AWS Step Functions as an orchestrator of the workflow. When using a standard workflow, you are billed for each state transition within the workflow rather than the total duration of the workflow. In addition, you can move support for retries, wait conditions, error workflows and callbacks into the state condition allowing your Lambda functions to focus on business logic.

The following example shows an example Step Functions state machine, where a single Lambda function is split into multiple states. During the wait period, there is no charge. You are only billed on state transition.

State machine

Direct integrations

If a Lambda function is not performing custom logic when it integrates with other AWS services, it may be unnecessary and could be replaced by a lower-cost direct integration.

For example, you may be using API Gateway together with a Lambda function to read from a DynamoDB table:

With Lambda

This could be replaced using a direct integration, removing the Lambda function:

Without Lambda

API Gateway supports transformations to return the output response in a format the client expects. This avoids having to use a Lambda function to do the transformation. You can find more detailed instructions on creating an API Gateway with an AWS service integration in the documentation.

You can also benefit from direct integration when using Step Functions. Today, Step Functions supports over 200 AWS services and 9,000 API actions. This gives greater flexibility for direct service integration and in many cases removes the need for a proxy Lambda function. This can simplify Step Function workflows and may reduce compute costs.

Reduce logging output

Lambda automatically stores logs that the function code generates through Amazon CloudWatch Logs. This may be useful for understanding what is happening within your application in near real-time. CloudWatch Logs includes a charge for the total data ingested throughout the month. Therefore, reducing output to include only necessary information can help reduce costs.

When you deploy workloads into production, review the logging level of your application. For example, in a pre-production environment, debug logs can be beneficial in providing additional information to tune the function. Within your production workloads, you may disable debug level logs and use a logging library (such as the Lambda Powertools Python Logger). You can define a minimum logging level to output by using an environment variable, which allows configuration outside of the function code.

Structuring your log format enforces a standard set of information through a defined schema, instead of allowing variable formats or large volumes of text. Defining structures such as error codes and adding accompanying metrics leads to a reduction in the volume of text that repeats throughout your logs. This also improves the ability to filter logs for specific error types and reduces the risk of a mistyped character in a log message.

Use Cost-Effective Storage for Logs

Once CloudWatch Logs ingests data, by default it is persisted forever with a per-GB monthly storage fee. As log data ages, it typically becomes less valuable in the immediate time frame, and is instead reviewed historically on an ad-hoc basis. However, the storage pricing within CloudWatch Logs remains the same.

To avoid this, set retention policies on your CloudWatch Logs log groups to delete old log data automatically. This retention policy applies to both your existing and future log data.

Some applications logs may need to persist for months or years for compliance or regulatory requirements. Instead of keeping the logs in CloudWatch Logs, export them to Amazon S3. By doing this, you can take advantage of lower-cost storage object classes while factoring in any expected usage patterns for how or when data is accessed.

Conclusion

Cost optimization is an important part of creating well-architected solutions and this is no different when using serverless. This blog series explores some best practice techniques to help reduce your Lambda bill.

If you are already running AWS Lambda applications in production today, some techniques are easier to implement than others. For example, you can purchase Savings Plans with zero code or architecture changes, whereas avoiding idle wait time will require new services and code changes. Evaluate which technique is right for your workload in a development environment before applying changes to production.

If you are still in the design and development stages, use this blog series as a reference to incorporate these cost optimization techniques at an early stage. This ensures that your solutions are optimized from day one.

To get hands-on implementing some of the techniques discussed, take the Serverless Optimization Workshop.

For more serverless learning resources, visit Serverless Land.