All posts by Brian Maguire

Serverless architecture for optimizing Amazon Connect call-recording archival costs

2022-06-24 Brian Maguire

Post Syndicated from Brian Maguire original https://aws.amazon.com/blogs/architecture/serverless-architecture-for-optimizing-amazon-connect-call-recording-archival-costs/

In this post, we provide a serverless solution to cost-optimize the storage of contact-center call recordings. The solution automates the scheduling, storage-tiering, and resampling of call-recording files, resulting in immediate cost savings. The solution is an asynchronous architecture built using AWS Step Functions, Amazon Simple Queue Service (Amazon SQS), and AWS Lambda.

Amazon Connect provides an omnichannel cloud contact center with the ability to maintain call recordings for compliance and gaining actionable insights using Contact Lens for Amazon Connect and AWS Contact Center Intelligence Partners. The storage required for call recordings can quickly increase as customers meet compliance retention requirements, often spanning six or more years. This can lead to hundreds of terabytes in long-term storage.

Solution overview

When an agent completes a customer call, Amazon Connect sends the call recording to an Amazon Simple Storage Solution (Amazon S3) bucket with: a date and contact ID prefix, the file stored in the .WAV format and encoded using bitrate 256 kb/s, pcm_s16le, 8000 Hz, two channels, and 256 kb/s. The call-recording files are approximately 2 Mb/minute optimized for high-quality processing, such as machine learning analysis (see Figure 1).

Figure 1. Asynchronous architecture for batch resampling for call-recording files on Amazon S3

When a call recording is sent to Amazon S3, downstream post-processing is often performed to generate analytics reports for agents and quality auditors. The downstream processing can include services that provide transcriptions, quality-of-service metrics, and sentiment analysis to create reports and trigger actionable events.

While this processing is often completed within minutes, the downstream applications could require processing retries. As audio resampling reduces the quality of the audio files, it is essential to delay resampling until after processing is completed. As processed call recordings are infrequently accessed days after a call is completed, with only a small percentage accessed by agents and call quality auditors, call recordings can benefit from resampling and transitioning to long-term Amazon S3 storage tiers.

In Figure 2, multiple AWS services work together to provide an end-to-end cost-optimization solution for your contact center call recordings.

Figure 2. AWS Step Function orchestrates the batch resampling of call recordings

An Amazon EventBridge schedule rule triggers the step function to perform the batch resampling process for all call recordings from the previous 7 days.

In the first step function task, the Lambda function task iterates the S3 bucket using the ListObjectsV2 API, obtaining the call recordings (1000 objects per iteration) with the date prefix from 7 days ago.

The next task invokes a Lambda function inserting the call recording objects into the Amazon SQS queue. The audio-conversion Lambda function receives the Amazon SQS queue events via the event source mapping Lambda integration. Each concurrent Lambda invocation downloads a stored call recording from Amazon S3, resampling the .WAV with ffmpeg and tagging the S3 object with a “converted=True” tag.

Finally, the conversion function uploads the resampled file to Amazon S3, overwriting the original call recording with the resampled recording using a cost-optimized storage class, such as S3 Glacier Instant Retrieval. S3 Glacier Instant Retrieval provides the lowest cost for long-lived data that is rarely accessed and requires milliseconds retrieval, such as for contact-center call-recording playback. By default, Amazon Connect stores call recordings with S3 Versioning enabled, maintaining the original file as a version. You can use lifecycle policies to delete object versions from a version-enabled bucket to permanently remove the original version, as this will minimize the storage of the original call recording.

This solution captures failures within the step function workflow with logging and a dead-letter queue, such as when an error occurs with resampling a recording file. A Step Function task monitors the Amazon SQS queue using the AWS Step Function integration with AWS SDK with SQS and ending the workflow when the queue is emptied. Table 1 demonstrates the default and resampled formats.

Figure 3. Detailed AWS Step Functions state machine diagram

Resampling

Table 1. Default and resampled call recording audio formats

Audio sampling formats	File size/minute	Notes
Bitrate 256 kb/s, pcm_s16le, 8000 Hz, 2 channels, 256 kb/s	2 MB	The default for Amazon Connect call recordings. Sampled for audio quality and call analytics processing.
Bitrate 64 kb/s, pcm_alaw, 8000 Hz, 1 channel, 64 kb/s	0.5 MB	Resampled to mono channel 8 bit. This resampling is not reversible and should only be performed after all call analytics processing has been completed.

Cost assessment

For pricing information for the primary services used in the solution, visit:

The costs incurred by the solution are based on usage and are AWS Free Tier eligible. After the AWS Free Tier allowance is consumed, usage costs are approximately $0.11 per 1000 minutes of call recordings. S3 Standard starts at $0.023 per GB/month; and S3 Glacier Instant Retrieval is $0.004 per GB/month, with $0.003 per GB of data retrieval. During a 6-year compliance retention term, the schedule-based resampling and storage tiering results in significant cost savings.

In the 6-year example detailed in Table 2, the S3 Standard storage costs would be approximately $356,664 for 3 million call-recording minutes/month. The audio resampling and S3 Glacier Instant Retrieval tiering reduces the 6-year cost to approximately $41,838.

Table 2. Multi-year costs savings scenario (3 million minutes/month) in USD

Year	Total minutes (3 million/month)	Total storage (TB)	Cost of storage, S3 Standard (USD)	Cost of running the resampling (USD)	Cost of resampling solution with S3 Glacier Instant Retrieval (USD)
1	36,000,000	72	10,764	3,960	4,813
2	72,000,000	108	30,636	3,960	5,677
3	108,000,000	144	50,508	3,960	6,541
4	144,000,000	180	70,380	3,960	7,405
5	180,000,000	216	90,252	3,960	8,269
6	216,000,000	252	110,124	3,960	9,133
Total	1,008,000,000	972	356,664	23,760	41,838

To explore PCA costs for yourself, use AWS Cost Explorer or choose Bill Details on the AWS Billing Dashboard to see your month-to-date spend by service.

Deploying the solution

The code and documentation for this solution are available by cloning the git repository and can be deployed with AWS Cloud Development Kit (AWS CDK).

Bash
# clone repository
git clone https://github.com/aws-samples/amazon-connect-call-recording-cost-optimizer.git
# navigate the project directory
cd amazon-connect-call-recording-cost-optimizer

Modify the cdk.context.json with your environment’s configuration setting, such as the bucket_name. Next, install the AWS CDK dependencies and deploy the solution:

:# ensure you are in the root directory of the repository

./cdk-deploy.sh

Once deployed, you can test the resampling solution by waiting for the EventBridge schedule rule to execute based on the num_days_age setting that is applied. You can also manually run the AWS Step Function with a specified date, for example {"specific_date":"01/01/2022"}.

The AWS CDK deployment creates the following resources:

AWS Step Function
AWS Lambda function
Amazon SQS queues
Amazon EventBridge rule

The solution handles the automation of transitioning a storage tier, such as S3 Glacier Instant Retrieval. In addition, Amazon S3 Lifecycles can be set manually to transition the call recordings after resampling to alternative Amazon S3 Storage Classes.

Cleanup

When you are finished experimenting with this solution, cleanup your resources by running the command:

cdk destroy

This command deletes the AWS CDK-deployed resources. However, the S3 bucket containing your call recordings and CloudWatch log groups are retained.

Conclusion

This call recording resampling solution offers an automated, cost-optimized, and scalable architecture to reduce long-term compliance call recording archival costs.

Scaling Data Analytics Containers with Event-based Lambda Functions

2021-08-30 Brian Maguire

Post Syndicated from Brian Maguire original https://aws.amazon.com/blogs/architecture/scaling-data-analytics-containers-with-event-based-lambda-functions/

The marketing industry collects and uses data from various stages of the customer journey. When they analyze this data, they establish metrics and develop actionable insights that are then used to invest in customers and generate revenue.

If you’re a data scientist or developer in the marketing industry, you likely often use containers for services like collecting and preparing data, developing machine learning models, and performing statistical analysis. Because the types and amount of marketing data collected are quickly increasing, you’ll need a solution to manage the scale, costs, and number of required data analytics integrations.

In this post, we provide a solution that can perform and scale with dynamic traffic and is cost optimized for on-demand consumption. It uses synchronous container-based data science applications that are deployed with asynchronous container-based architectures on AWS Lambda. This serverless architecture automates data analytics workflows utilizing event-based prompts.

Synchronous container applications

Data science applications are often deployed to dedicated container instances, and their requests are routed by an Amazon API Gateway or load balancer. Typically, an Amazon API Gateway routes HTTP requests as synchronous invocations to instance-based container hosts.

The target of the requests is a container-based application running a machine learning service (SciLearn). The service container is configured with the required dependency packages such as scikit-learn, pandas, NumPy, and SciPy.

Containers are commonly deployed on different targets such as on-premises, Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Compute Cloud (Amazon EC2), and AWS Elastic Beanstalk. These services run synchronously and scale through Amazon Auto Scaling groups and a time-based consumption pricing model.

Figure 1. Synchronous container applications diagram

Challenges with synchronous architectures

When using a synchronous architecture, you will likely encounter some challenges related to scale, performance, and cost:

Operation blocking. The sender does not proceed until Lambda returns, and failure processing, such as retries, must be handled by the sender.
No native AWS service integrations. You cannot use several native integrations with other AWS services such as Amazon Simple Storage Service (Amazon S3), Amazon EventBridge, and Amazon Simple Queue Service (Amazon SQS).
Increased expense. A 24/7 environment can result in increased expenses if resources are idle, not sized appropriately, and cannot automatically scale.

The New for AWS Lambda – Container Image Support blog post offers a serverless, event-based architecture to address these challenges. This approach is explained in detail in the following section.

Benefits of using Lambda Container Image Support

In our solution, we refactored the synchronous SciLearn application that was deployed on instance-based hosts as an asynchronous event-based application running on Lambda. This solution includes the following benefits:

Easier dependency management with Dockerfile. Dockerfile allows you to install native operating system packages and language-compatible dependencies or use enterprise-ready container images.
Similar tooling. The asynchronous and synchronous solutions use Amazon Elastic Container Registry (Amazon ECR) to store application artifacts. Therefore, they have the same build and deployment pipeline tools to inspect Dockerfiles. This means your team will likely spend less time and effort learning how to use a new tool.
Performance and cost. Lambda provides sub-second autoscaling that’s aligned with demand. This results in higher availability, lower operational overhead, and cost efficiency.
Integrations. AWS provides more than 200 service integrations to deploy functions as container images on Lambda, without having to develop it yourself so it can be deployed faster.
Larger application artifact up to 10 GB. This includes larger application dependency support, giving you more room to host your files and packages in oppose to hard limit of 250 MB of unzipped files for deployment packages.

Scaling with asynchronous events

AWS offers two ways to asynchronously scale processing independently and automatically: Elastic Beanstalk worker environments and asynchronous invocation with Lambda. Both options offer the following:

They put events in an SQS queue.
They can be designed to take items from the queue only when they have the capacity available to process a task. This prevents them from becoming overwhelmed.
They offload tasks from one component of your application by sending them to a queue and process them asynchronously.

These asynchronous invocations add default, tunable failure processing and retry mechanisms through “on failure” and “on success” event destinations, as described in the following section.

Integrations with multiple destinations

“On failure” and “on success” events can be logged in an SQS queue, Amazon Simple Notification Service (Amazon SNS) topic, EventBridge event bus, or another Lambda function. All four are integrated with most AWS services.

“On failure” events are sent to an SQS dead-letter queue because they cannot be delivered to their destination queues. They will be reprocessed them as needed, and any problems with message processing will be isolated.

Figure 2 shows an asynchronous Amazon API Gateway that has placed an HTTP request as a message in an SQS queue, thus decoupling the components.

The messages within the SQS queue then prompt a Lambda function. This runs the machine learning service SciLearn container in Lambda for data analysis workflows, which are integrated with another SQS dead letter queue for failures processing.

Figure 2. Example asynchronous-based container applications diagram

When you deploy Lambda functions as container images, they benefit from the same operational simplicity, automatic scaling, high availability, and native integrations. This makes it an appealing architecture for our data analytics use cases.

Design considerations

The following can be considered when implementing Docker Container Images with Lambda:

Lambda supports container images that have manifest files that follow these formats:
- Docker image manifest V2, schema 2 (Docker version 1.10 and newer)
- Open Container Initiative (OCI) specifications (v1.0.0 and up)
Storage in Lambda:
- Memory (128 MB to 10,240 MB)
- /tmp space (512 MB)
- Container images up to10 GB
- Amazon S3 or Amazon Elastic File System (Amazon EFS) as external storage
On create/update, Lambda will cache the image to speed up the cold start of functions during execution. Cold starts occur on an initial request to Lambda, which can lead to longer startup times. The first request will maintain an instance of the function for only a short time period. If Lambda has not been called during that period, the next invocation will create a new instance
Fine grain role policies are highly recommended for security purposes
Container images can use the Lambda Extensions API to integrate monitoring, security and other tools with the Lambda execution environment.

Conclusion

We were able to architect this synchronous service based on a previously deployed on instance-based hosts and design it to become asynchronous on Amazon Lambda.

By using the new support for container-based images in Lambda and converting our workload into an asynchronous event-based architecture, we were able to overcome these challenges:

Performance and security. With batch requests, you can scale asynchronous workloads and handle failure records using SQS Dead Letter Queues and Lambda destinations. Using Lambda to integrate with other services (such as EventBridge and SQS) and using Lambda roles simplifies maintaining a granular permission structure. When Lambda uses an Amazon SQS queue as an event source, it can scale up to 60 more instances per minute, with a maximum of 1,000 concurrent invocations.
Cost optimization. Compute resources are a critical component of any application architecture. Overprovisioning computing resources and operating idle resources can lead to higher costs. Because Lambda is serverless, it only incurs costs on when you invoke a function and the resources allocated for each request.

Noise

All posts by Brian Maguire

Serverless architecture for optimizing Amazon Connect call-recording archival costs

Solution overview

Resampling

Cost assessment

Deploying the solution

Cleanup

Conclusion

Scaling Data Analytics Containers with Event-based Lambda Functions

Synchronous container applications

Challenges with synchronous architectures

Benefits of using Lambda Container Image Support

Scaling with asynchronous events

Integrations with multiple destinations

Design considerations

Conclusion

The collective thoughts of the interwebz