Pay as you go machine learning inference with AWS Lambda

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/pay-as-you-go-machine-learning-inference-with-aws-lambda/

This post is courtesy of Eitan Sela, Senior Startup Solutions Architect.

Many customers want to deploy machine learning models for real-time inference, and pay only for what they use. Using Amazon EC2 instances for real-time inference may not be cost effective to support sporadic inference requests throughout the day.

AWS Lambda is a serverless compute service with pay-per-use billing. However, ML frameworks like XGBoost are too large to fit into the 250 MB application artifact size limit, or the 512 MB /tmp space limit. While you can store the packages in Amazon S3 and download to Lambda (up to 3 GB), this can increase the cost.

To address this, Lambda functions can now mount an Amazon Elastic File System (EFS). This is a scalable and elastic NFS file system storing data within and across multiple Availability Zones (AZ) for high availability and durability.

With this new capability, it’s now easier to use Python packages in Lambda that require storage space to load models and other dependencies.

In this blog post, I walk through how to:

Create an EFS file system and an Access Point as an application-specific entry point.
Provision an EC2 instance, mount EFS using the Access Point, and train a breast cancer XGBoost ML model. XGBoost, Python packages, and the model are saved on the EFS file system.
Create a Lambda function that loads the Python packages and model from EFS, and performs the prediction based on a test event.

Create an Amazon EFS file system with an Access Point

Configuring EFS for Lambda is straight-forward. I show how to do this in the AWS CloudFormation but you can also use the AWS CLI, AWS SDK, and AWS Serverless Application Model (AWS SAM).

EFS file systems are created within a customer VPC, so Lambda functions using the EFS file system must have access to the same VPC.

You can deploy the AWS CloudFormation stack located on this GitHub repository.

The stack includes the following:

Create a VPC with public subnet.
Create an EFS file system
Create an EFS Access Point
Create an EC2 in the VPC

It can take up to 10 minutes for the CloudFormation stack to create the resources. After the resource creation is complete, navigate to the EFS console to see the new file system.

Navigate to the Access Points panel to see a new Access Point with the File system ID from the previous page.

Note the Access Point ID and File System ID for the following sections.

Launch an Amazon EC2 instance to train a breast cancer model

In this section, you install Python packages on the EFS file system, after mounting it to EC2. You then train the breast cancer model, and save the model in the EFS file system used by the Lambda function.

The machine learning framework you use for this function is XGBoost. This is an optimized distributed gradient boosting library designed to be highly efficient, flexible, and portable. XGBoost is one of the most popular machine learning algorithms.

Navigate to the EC2 console to see the new EC2 instance created from the CloudFormation stack. This is an Amazon Linux 2 c5.large EC2 instance named ‘xgboost-for-serverless-inference-cfn-ec2’. In the instance details, you see that the security group is configured to allow inbound SSH access (for connecting to the instance).

Mount the EFS file system on the EC2

Connect to the instance using SSH and mount the EFS file system previously created by using the Access Point:

Install amazon-efs-utils tools:
sudo yum -y install amazon-efs-utils
Create a directory to mount EFS into:
mkdir efs
Mount the EFS file system using the Access Point:
sudo mount -t efs -o tls,accesspoint=<Access point ID> <File system ID>:/ efs

Install Python, pip and required packages

Install Python and pip:
sudo yum -y install python37
curl -O https://bootstrap.pypa.io/get-pip.py
python3 get-pip.py --user
Verify the installation:
python3 --version
pip3 --version
Create a requirements.txt file containing the dependencies:
xgboost==1.1.1
pandas
sklearn
joblib
Install the Python packages using the requirements file:
pip3 install -t efs/lib/ -r requirements.txt
Note: using bursting throughput mode with EFS File system, this action can take up to 10 minutes.
Set the Python path to refer to the installed packages directory of EFS file system:
export PYTHONPATH=/home/ec2-user/efs/lib/

Train the breast cancer model

The breast cancer model predicts whether the breast mass is a malignant tumor or benign by looking at features computed from a digitized image of a fine needle aspirate of a breast mass.

The data used to train the model consists of the diagnosis in addition to the 10 real-valued features that are computed for each cell nucleus. Such features include radius, texture, perimeter, area, smoothness, compactness, concavity, concave points, symmetry, and fractal dimension. The prediction returned by the model is either “B” for benign or “M” for malignant. This sample project uses the public Breast Cancer Wisconsin (Diagnostic) dataset.

After installing the required Python packages, train a XGBoost model on the breast cancer dataset:

Create a bc_xgboost_train.py file containing the Python code needed to train a breast cancer XGBoost model. Download the code here.
Start the training of the model:python3 bc_xgboost_train.pyYou see the following message:The model file bc-xgboost-model is created in the root directory.
Create a new directory on the EFS file system and copy the XGBoost breast cancer model:
mkdir efs/model
cp bc-xgboost-model efs/model/
Check you have the required Python packages and the model on the EFS file system:
ls efs/model/ efs/lib/
You see all the Python packages installed previously in the lib directory, and the model file in the model directory.
Review the total size of lib Python packages directory:
du -sh efs/lib/

You can see that the total size of lib directory is 534 MB. This is a larger package size than was allowed before EFS for Lambda.

Building a serverless machine learning inference using Lambda

In this section, you use the EFS file system previously configured for the Lambda function to import the required libraries and load the model.

Using EFS with Lambda

The AWS SAM template creates the Lambda function, mount the EFS Access Point created earlier, and both IAM roles required.

It takes several minutes for the AWS SAM CLI to create the Lambda function. After, navigate to the Lambda console to see the created Lambda function.

In the Lambda function configuration, you see the environment variables, and basic settings, such as runtime, memory, and timeout.

Further down, you see that the Lambda function has the VPC access configured, and the file system is mounted.

Test your Lambda function

In the Lambda console, select Configure test events from the Test events dropdown.
For Event Name, enter InferenceTestEvent.
Copy the event JSON from here and paste in the dialog box.
Choose Create. After saving, you see InferenceTestEvent in the Test list. Now choose Test.

You see the Lambda function inference result, log output, and duration:

Conclusion

In this blog post, you train an XGBoost breast cancer model using Python packages installed on an Amazon EFS file system. You create an AWS Lambda function that loads the Python packages and the model from EFS file system, and perform the predictions.

Now you know how to call a machine learning model inference using a Lambda function. To learn more about other real-world examples, see:

Noise