Creating a serverless face blurring service for photos in Amazon S3

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/creating-a-serverless-face-blurring-service-for-photos-in-amazon-s3/

Many workloads process photos or imagery from web applications or mobile applications. For privacy reasons, it can be useful to identify and blur faces in these photos. This blog post shows how to build a serverless face blurring service for photos uploaded to an Amazon S3 bucket.

The example application uses the AWS Serverless Application Model (AWS SAM), enabling you to deploy the application more easily in your own AWS account. This walkthrough creates resources covered in the AWS Free Tier but usage beyond the Free Tier allowance may incur cost. To set up the example, visit the GitHub repo and follow the instructions in the README.md file.

Overview

Using a serverless approach, this face blurring microservice runs on demand in response to new photos being uploaded to S3. The solution uses the following architecture:

When an image is uploaded to the source S3 bucket, S3 sends a notification event to an Amazon SQS queue.
The Lambda service polls the SQS queue and invokes an AWS Lambda function when messages are available.
The Lambda function uses Amazon Rekognition to detect faces in the source image. The service returns the coordinates of faces to the function.
After blurring the faces in the source image, the function stores the resulting image in the output S3 bucket.

Deploying the solution

Before deploying the solution, you need:

An AWS account (sign up for an account if you don’t have one).
The AWS SAM CLI installed.
Node.js installed (version 14 minimum).

To deploy:

From a terminal window, clone the GitHub repo:
git clone https://github.com/aws-samples/serverless-face-blur-service
Change directory:
cd ./serverless-face-blur-service
Download and install dependencies:
sam build
Deploy the application to your AWS account:
sam deploy --guided
During the guided deployment process, enter unique names for the two S3 buckets. These names must be globally unique.

To test the application, upload a JPG file containing at least one face into the source S3 bucket. After a few seconds, the destination bucket contains the output file, with the same name. The output file shows blur content when the faces are detected:

How the face blurring Lambda function works

The Lambda function receives messages from the SQS queue when available. These messages contain metadata about the JPG object uploaded to S3:

{
    "Records": [
        {
            "messageId": "e9a12dd2-1234-1234-1234-123456789012",
            "receiptHandle": "AQEBnjT2rUH+kmEXAMPLE",
            "body": "{\"Records\":[{\"eventVersion\":\"2.1\",\"eventSource\":\"aws:s3\",\"awsRegion\":\"us-east-1\",\"eventTime\":\"2021-06-21T19:48:14.418Z\",\"eventName\":\"ObjectCreated:Put\",\"userIdentity\":{\"principalId\":\"AWS:AROA3DTKMEXAMPLE:username\"},\"requestParameters\":{\"sourceIPAddress\":\"73.123.123.123\"},\"responseElements\":{\"x-amz-request-id\":\"AZ39QWJFVEQJW9RBEXAMPLE\",\"x-amz-id-2\":\"MLpNwwQGQtrNai/EXAMPLE\"},\"s3\":{\"s3SchemaVersion\":\"1.0\",\"configurationId\":\"5f37ac0f-1234-1234-82f12343-cbc8faf7a996\",\"bucket\":{\"name\":\"s3-face-blur-source\",\"ownerIdentity\":{\"principalId\":\"EXAMPLE\"},\"arn\":\"arn:aws:s3:::s3-face-blur-source\"},\"object\":{\"key\":\"face.jpg\",\"size\":3541,\"eTag\":\"EXAMPLE\",\"sequencer\":\"123456789\"}}}]}",
            "attributes": {
                "ApproximateReceiveCount": "6",
                "SentTimestamp": "1624304902103",
                "SenderId": "AIDAJHIPREXAMPLE",
                "ApproximateFirstReceiveTimestamp": "1624304902103"
            },
            "messageAttributes": {},
            "md5OfBody": "12345",
            "eventSource": "aws:sqs",
            "eventSourceARN": "arn:aws:sqs:us-east-1:123456789012:s3-lambda-face-blur-S3EventQueue-ABCDEFG01234",
            "awsRegion": "us-east-1"
        }
    ]
}

The body attribute contained a serialized JSON object with an array of records, containing the S3 bucket name and object keys. The Lambda handler in app.js uses the JSON.parse method to create a JSON object from the string:

  const s3Event = JSON.parse(event.Records[0].body)

The handler extracts the bucket and key information. Since the S3 key attribute is URL encoded, it must be decoded before further processing:

const Bucket = s3Event.Records[0].s3.bucket.name
const Key = decodeURIComponent(s3Event.Records[0].s3.object.key.replace(/\+/g, " "))

There are three steps in processing each image: detecting faces in the source image, blurring faces, then storing the output in the destination bucket.

Detecting faces in the source image

The detectFaces.js file contains the detectFaces function. This accepts the bucket name and key as parameters, then uses the AWS SDK for JavaScript to call the Amazon Rekognition service:

const AWS = require('aws-sdk')
AWS.config.region = process.env.AWS_REGION 
const rekognition = new AWS.Rekognition()

const detectFaces = async (Bucket, Name) => {

  const params = {
    Image: {
      S3Object: {
       Bucket,
       Name
      }
     }    
  }

  console.log('detectFaces: ', params)

  try {
    const result = await rekognition.detectFaces(params).promise()
    return result.FaceDetails
  } catch (err) {
    console.error('detectFaces error: ', err)
    return []
  }  
}

The detectFaces method of the Amazon Rekognition API accepts a parameter object defining a reference to the source S3 bucket and key. The service returns a data object with an array called FaceDetails:

{
    "BoundingBox": {
        "Width": 0.20408163964748383,
        "Height": 0.4340078830718994,
        "Left": 0.727995753288269,
        "Top": 0.3109045922756195
    },
    "Landmarks": [
        {
            "Type": "eyeLeft",
            "X": 0.784351646900177,
            "Y": 0.46120116114616394
        },
        {
            "Type": "eyeRight",
            "X": 0.8680923581123352,
            "Y": 0.5227685570716858
        },
        {
            "Type": "mouthLeft",
            "X": 0.7576283812522888,
            "Y": 0.617080807685852
        },
        {
            "Type": "mouthRight",
            "X": 0.8273565769195557,
            "Y": 0.6681531071662903
        },
        {
            "Type": "nose",
            "X": 0.8087539672851562,
            "Y": 0.5677543878555298
        }
    ],
    "Pose": {
        "Roll": 23.821317672729492,
        "Yaw": 1.4818285703659058,
        "Pitch": 2.749311685562134
    },
    "Quality": {
        "Brightness": 83.74250793457031,
        "Sharpness": 89.85481262207031
    },
    "Confidence": 99.9793472290039
}

The Confidence score is the percentage confidence that the image contains a face. This example uses the BoundingBox coordinates to find the location of the face in the image. The response also includes positional data for facial features like the mouth, nose, and eyes.

Blurring faces in the source image

In the blurFaces.js file, the blurFaces function uses the open source GraphicsMagick library to process the source image. The function takes the bucket and key as parameters with the metadata returned by the Amazon Rekognition service:

const AWS = require('aws-sdk')
AWS.config.region = process.env.AWS_REGION 
const s3 = new AWS.S3()
const gm = require('gm').subClass({imageMagick: process.env.localTest})

const blurFaces = async (Bucket, Key, faceDetails) => {

  const object = await s3.getObject({ Bucket, Key }).promise()
  let img = gm(object.Body)

  return new Promise ((resolve, reject) => {
    img.size(function(err, dimensions) {
        if (err) reject(err)
        console.log('Image size', dimensions)

        faceDetails.map((faceDetail) => {
            const box = faceDetail.BoundingBox
            const width  = box.Width * dimensions.width
            const height = box.Height * dimensions.height
            const left = box.Left * dimensions.width
            const top = box.Top * dimensions.height

            img.region(width, height, left, top).blur(0, 70)
        })

        img.toBuffer((err, buffer) => resolve(buffer))
    })
  })
}

The function loads the source object from S3 using the getObject method of the S3 API. In the response, the Body attribute contains a buffer with the image data – this is used to instantiate a ‘gm’ object for processing.

Amazon Rekognition’s bounding box coordinates are percentage-based relative to the size of the image. This code converts these percentages to X- and Y-based coordinates and uses the region method to identify a portion of the image. The blur method uses a Gaussian operator based on the inputs provided. Once the transformation is complete, the function returns a buffer with the new image.

Using GraphicsMagick with Lambda functions

The GraphicsMagick package contains operating system-specific binaries. Depending on the operating system of your development machine, you may install binaries locally that are not compatible with Lambda. The Lambda service uses Amazon Linux 2 (AL2).

To simplify local testing and deployment, the sample application uses Lambda layers to package this library. This open source Lambda layer repo shows how to build, deploy, and test GraphicsMagick as a Lambda layer. It also publishes public layers to help you use the library in your Lambda functions.

When testing this function locally with the test.js script, the GM npm package uses the binaries on the local development machine. When the function is deployed to the Lambda service, the package uses the Lambda layer with the AL2-compatible binaries.

Limiting throughput with Amazon Rekognition

Both S3 and Lambda are highly scalable services and in this example can handle thousands of image uploads a second. In this configuration, S3 sends Event Notifications to an SQS queue each time an object is uploaded. The Lambda function processes events from this queue.

When using downstream services in Lambda functions, it’s important to note the quotas and throughputs in place for those services. This can help avoid throttling errors or overwhelming non-serverless services that may not be able to handle the same level of traffic.

The Amazon Rekognition service sets default transaction per second (TPS) rates for AWS accounts. For the DetectFaces API, the default is between 5-50 TPS depending upon the AWS Region. If you need a higher throughput, you can request an increase in the Service Quotas console.

In the AWS SAM template of the example application, the definition of the Lambda function uses two attributes to control the throughput. The ReservedConcurrentExecutions attribute is set to 1, which prevents the Lambda service from scaling beyond one instance of the function. The BatchSize in event source mapping is also set to 1, so each invocation contains only a single S3 event from the SQS queue:

  BlurFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/
      Handler: app.handler
      Runtime: nodejs14.x
      Timeout: 10
      MemorySize: 2048
      ReservedConcurrentExecutions: 1
      Policies:
        - S3ReadPolicy:
            BucketName: !Ref SourceBucketName
        - S3CrudPolicy:
            BucketName: !Ref DestinationBucketName
        - RekognitionDetectOnlyPolicy: {}
      Environment:
        Variables:
          DestinationBucketName: !Ref DestinationBucketName
      Events:
        MySQSEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt S3EventQueue.Arn
            BatchSize: 1

The combination of these two values means that this function processes images one at a time, regardless of how many images are uploaded to S3. By increasing these values, you can change the scaling behavior and number of messages processed per invocation. This allows you to control the throughput of the number of the messages sent to Amazon Rekognition for processing.

Conclusion

A serverless face blurring service can provide a simpler way to process photos in workloads with large amounts of traffic. This post introduces an example application that blurs faces when images are saved in an S3 bucket. The S3 PutObject event invokes a Lambda function that uses Amazon Rekognition to detect faces and GraphicsMagick to process the images.

This blog post shows how to deploy the example application and walks through the functions that process the images. It explains how to use GraphicsMagick and how to control throughput in the SQS event source mapping.

For more serverless learning resources, visit Serverless Land.

Noise