Tag Archives: serverless

Manage your data warehouse cost allocations with Amazon Redshift Serverless tagging

Post Syndicated from Sandeep Bajwa original https://aws.amazon.com/blogs/big-data/manage-your-data-warehouse-cost-allocations-with-amazon-redshift-serverless-tagging/

Amazon Redshift Serverless makes it simple to run and scale analytics without having to manage your data warehouse infrastructure. Developers, data scientists, and analysts can work across databases, data warehouses, and data lakes to build reporting and dashboarding applications, perform real-time analytics, share and collaborate on data, and even build and train machine learning (ML) models with Redshift Serverless.

Tags allow you to assign metadata to your AWS resources. You can define your own key and value for your resource tag, so that you can easily manage and filter your resources. Tags can also improve transparency and map costs to specific teams, products, or applications. This way, you can raise cost awareness and also make teams and users accountable for their own cost and usage.

You can now use tagging in Redshift Serverless to categorize the following resources based on your grouping needs:

  • Namespace – A collection of database objects and users
  • Workgroup – A collection of compute resources
  • Snapshot – Point-in-time backups of a cluster
  • Recovery point – Recovery points in Redshift Serverless are created every 30 minutes and saved for 24 hours

When using Redshift Serverless, you may have to manage data across many business departments, environments, and billing groups. In doing so, you’re usually faced with one of the following tasks:

  • Cost allocation and financial management – You want to know what you’re spending on AWS for a given project, line of business, or environment
  • Operations support and incident management – You want to send issues to the right teams and users
  • Access control – You want to constrain user access to certain resources
  • Security risk management – You want to group resources based on their level of security or data sensitivity and make sure proper controls are in place

In this post, we focus on tagging Redshift Serverless resources for cost allocation and reporting purposes. Knowing where you have incurred costs at the resource, workload, team, and organization level enhances your ability to budget and manage cost.

Solution overview

Let’s say that your company has two departments: marketing and finance. Each department has multiple cost centers and environments, as illustrated in the following figure. In AWS Cost Explorer, you want to create cost reports for Redshift Serverless by department, environment, and cost center.

We start by creating and applying user-defined tags to Amazon Redshift Serverless workgroups for the respective departments, environments, and cost centers. You can use both the AWS Command Line Interface (AWS CLI) and the Redshift Serverless console to tag serverless resources.

The high-level steps are as follows:

  1. Create tags.
  2. View and edit tags.
  3. Set up cost allocation tags.
  4. Create cost reports.

Create tags

To create tags, complete the following steps:

  1. On the Amazon Redshift console, choose Manage tags in the navigation pane.
  2. For Filter by resource type, you can filter by Workgroup, Namespace, Snapshot, and Recovery Point.
  3. Optionally, you can search for resources by an existing tag by entering values for Tag key or Tag value. For this post, we don’t include any tag filters, so we can view all the resources across our account.
  4. Select your resource from the search results and choose Manage tags to customize the tag key and value parameters.

Here, you can add new tags, remove tags, save changes, and cancel your changes if needed.

  5. Because we want to allocate cost across the various departments, we add a new key called department and a new value called marketing.
  6. Choose Save changes.
  7. Confirm the changes by choosing Apply changes.

For more details on tagging, refer to Tagging resources overview.
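
You can apply the same tags programmatically with the AWS CLI or an AWS SDK. The following is a minimal boto3 sketch, not the post's own code: the workgroup ARN is a placeholder, and the tag keys mirror the ones used in this post.

import boto3

# Hypothetical workgroup ARN; replace with the ARN of your own resource
workgroup_arn = (
    "arn:aws:redshift-serverless:us-east-1:123456789012:workgroup/example-workgroup-id"
)

client = boto3.client("redshift-serverless")

# Apply the same department tag that we created in the console above
client.tag_resource(
    resourceArn=workgroup_arn,
    tags=[
        {"key": "department", "value": "marketing"},
        {"key": "cost-center", "value": "mkt-123"},
        {"key": "environment", "value": "dev"},
    ],
)

# Verify the tags on the resource
response = client.list_tags_for_resource(resourceArn=workgroup_arn)
print(response["tags"])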

View and edit tags

If you already have resources such as workgroups (listed on the Workgroup configuration page) or snapshots (listed on the Data backup page), you can create new tags or edit existing tags on the given resource. In the following example, we manage tags on an existing workgroup.

  1. On the Amazon Redshift console, choose Workgroup configuration in the navigation pane.
  2. Select your workgroup and on the Actions menu, choose Manage tags.

Now we can remove existing tags or add new tags. For our use case, let’s assume that the marketing department is no longer using the default workgroup, so we want to remove the current tag.

  3. Choose Remove next to the marketing tag.

We are given the option to choose Undo if needed.

  4. Choose Save changes and then Apply changes to confirm.

After we apply the tags, we can view the full list of resources. The number of tags applied to each resource is found in the Tags column.

Set up cost allocation tags

After you create and apply the user-defined tags to your Redshift Serverless workgroups, it can take up to 24 hours for the tags to appear on your cost allocation tags page for activation. You can activate tags by using the AWS Billing console for cost allocation tracking with the following steps:

  1. On the AWS Billing console, choose Cost allocation tags in the navigation pane.
  2. Under User-defined cost allocation tags, select the tags you created and applied (for this example, cost-center).
  3. Choose Activate.

After we activate all the tags we created, we can view the full list by choosing Active on the drop-down menu.
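
If you manage many tag keys, you can also activate them programmatically with the Cost Explorer API instead of the console. The following is a minimal boto3 sketch, assuming the user-defined tags from this post have already appeared on your cost allocation tags page:

import boto3

# Cost Explorer APIs are served from us-east-1
ce = boto3.client("ce", region_name="us-east-1")

# Activate the user-defined tag keys used in this post
ce.update_cost_allocation_tags_status(
    CostAllocationTagsStatus=[
        {"TagKey": "department", "Status": "Active"},
        {"TagKey": "cost-center", "Status": "Active"},
        {"TagKey": "environment", "Status": "Active"},
    ]
)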

Create cost reports

After you activate the cost allocation tags, they appear on your cost allocation reports in Cost Explorer.

Cost Explorer helps you manage your AWS costs by giving you detailed insights into the line items in your bill. In Cost Explorer, you can visualize daily, monthly, and forecasted spend by combining an array of available filters. Filters allow you to narrow down costs according to AWS service type, linked accounts, and tags.

The following screenshot shows the preconfigured reports in Cost Explorer.

To create custom reports for your cost and usage data, complete the following steps:

  1. On the AWS Cost Management console, choose Reports in the navigation pane.
  2. Choose Create new report.
  3. Select the report type you want to create (for this example, we select Cost and usage).
  4. Choose Create Report.
  5. To view weekly Redshift Serverless cost by cost center, choose the applicable settings in the Report parameters pane. For this post, we group data by the cost-center tag and filter data by the department tag.
  6. Save the report for later use by choosing Save to report library.
  7. Enter a name for your report, then choose Save report.
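
The same cost data is also available programmatically through the Cost Explorer API. The following boto3 sketch groups Redshift Serverless cost by the cost-center tag and filters by the department tag; the date range and the service name filter are illustrative assumptions, so adjust them to match how the costs appear in your account.

import boto3

ce = boto3.client("ce", region_name="us-east-1")

# Daily cost grouped by cost center, filtered to the marketing department
response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2023-03-01", "End": "2023-03-31"},
    Granularity="DAILY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "TAG", "Key": "cost-center"}],
    Filter={
        "And": [
            {"Dimensions": {"Key": "SERVICE", "Values": ["Amazon Redshift"]}},
            {"Tags": {"Key": "department", "Values": ["marketing"]}},
        ]
    },
)

for result in response["ResultsByTime"]:
    for group in result["Groups"]:
        print(
            result["TimePeriod"]["Start"],
            group["Keys"],
            group["Metrics"]["UnblendedCost"]["Amount"],
        )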

The following screenshot shows a sample report for daily Redshift Serverless cost by department.


The following screenshot shows an example report of weekly Redshift Serverless cost by environment.


Conclusion

Tagging resources in Amazon Redshift gives you a central place to organize and view resources across the service for billing management. This feature saves you the hours of manual work you would otherwise spend grouping your Amazon Redshift resources in a spreadsheet or another manual alternative.

For more tagging best practices, refer to Tagging AWS resources.


About the Authors

Sandeep Bajwa is a Sr. Analytics Specialist based out of Northern Virginia, specializing in the design and implementation of analytics and data lake solutions.

Michael Yitayew is a Product Manager for Amazon Redshift based out of New York. He works with customers and engineering teams to build new features that enable data engineers and data analysts to more easily load data, manage data warehouse resources, and query their data. He has supported AWS customers for over 3 years in both product marketing and product management roles.

AWS Week in Review – March 27, 2023

Post Syndicated from Marcia Villalba original https://aws.amazon.com/blogs/aws/aws-week-in-review-march-27-2023/

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

In Finland, where I live, spring has arrived. The snow has melted, and the trees have grown their first buds. But I don’t get my hopes up, as usually around Easter we have what is called takatalvi. Takatalvi is a Finnish word that means the winter returns unexpectedly in the spring.

Last Week’s Launches
Here are some launches that got my attention during the previous week.

AWS SAM CLI – Now the sam sync command will compare your local Serverless Application Model (AWS SAM) template with your deployed AWS CloudFormation template and skip the deployment if there are no changes. For more information, check the latest version of the AWS SAM CLI.

IAM – AWS Identity and Access Management (IAM) has launched two new global condition context keys. With these new condition keys, you can write service control policies (SCPs) or IAM policies that restrict the VPCs and private IP addresses from which your Amazon Elastic Compute Cloud (Amazon EC2) instance credentials can be used, without hard-coding VPC IDs or IP addresses in the policy. To learn more about this launch and how to get started, see How to use policies to restrict where EC2 instance credentials can be used from.

Amazon SNS – Amazon Simple Notification Service (Amazon SNS) now supports setting content-type request headers for HTTP/S notifications, such as application/json, application/xml, or text/plain. With this new feature, applications can receive their notifications in a more predictable format.

AWS Batch – AWS Batch now allows you to configure ephemeral storage of up to 200 GiB on AWS Fargate type jobs. With this launch, you no longer need to limit the size of your data sets or the size of the Docker images used to run machine learning inference.

Application Load Balancer – Application Load Balancer (ALB) now supports Transport Layer Security (TLS) protocol version 1.3, enabling you to optimize the performance of your application while keeping it secure. TLS 1.3 on ALB works by offloading encryption and decryption of TLS traffic from your application server to the load balancer.

Amazon IVS – Amazon Interactive Video Service (IVS) now supports combining videos from multiple hosts into the source of a live stream. For a demo, refer to Add multiple hosts to live streams with Amazon IVS.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
Some other updates and news that you may have missed:

I read the post Implementing an event-driven serverless story generation application with ChatGPT and DALL-E a few days ago, and since then I have been reading my child a lot of AI-generated stories. In this post, David Boyne explains step by step how you can create an event-driven serverless story generation application. This application produces a brand-new story every day at bedtime, with images, and the story can also be played in audio format.

Podcast Charlas Técnicas de AWS – If you understand Spanish, this podcast is for you. Podcast Charlas Técnicas is one of the official AWS podcasts in Spanish, and every other week there is a new episode. The podcast is meant for builders, and it shares stories about how customers have implemented and learned AWS services, how to architect applications, and how to use new services. You can listen to all the episodes directly from your favorite podcast app or at AWS Podcasts en español.

AWS open-source news and updates – The open source newsletter is curated by my colleague Ricardo Sueiras to bring you the latest open-source projects, posts, events, and more.

Upcoming AWS Events
Check your calendars and sign up for the AWS Summit closest to your city. AWS Summits are free events that bring the local community together, where you can learn about different AWS services.

Here are the ones coming up in the next months:

That’s all for this week. Check back next Monday for another Week in Review!

— Marcia

AWS Week in Review – March 20, 2023

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/aws-week-in-review-march-20-2023/

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

A new week starts, and Spring is almost here! If you’re curious about AWS news from the previous seven days, I got you covered.

Last Week’s Launches
Here are the launches that got my attention last week:

Amazon S3 – Last week was AWS Pi Day 2023, celebrating 17 years of innovation since Amazon S3 was introduced on March 14, 2006. For the occasion, the team released many new capabilities:

Amazon Linux 2023 – Our new Linux-based operating system is now generally available. Sébastien’s post is full of tips and info.

Application Auto Scaling – You can now use arithmetic operations and mathematical functions to customize the metrics used with target tracking policies, which lets you scale based on your own application-specific metrics. Read how it works with Amazon ECS services.

AWS Data Exchange for Amazon S3 is now generally available – You can now share and find data files directly from S3 buckets, without the need to create or manage copies of the data.

Amazon Neptune – Now offers a graph summary API to help understand important metadata about property graphs (PG) and resource description framework (RDF) graphs. Neptune added support for Slow Query Logs to help identify queries that need performance tuning.

Amazon OpenSearch Service – The team introduced security analytics that provides new threat monitoring, detection, and alerting features. The service now supports OpenSearch version 2.5 that adds several new features such as support for Point in Time Search and improvements to observability and geospatial functionality.

AWS Lake Formation and Apache Hive on Amazon EMR – Introduced fine-grained access controls that allow data administrators to define and enforce table- and column-level security for customers accessing data via Apache Hive running on Amazon EMR.

Amazon EC2 M1 Mac Instances – You can now update guest environments to a specific or the latest macOS version without having to tear down and recreate the existing macOS environments.

AWS Chatbot – Now integrates with Microsoft Teams to simplify the way you troubleshoot and operate your AWS resources.

Amazon GuardDuty RDS Protection for Amazon Aurora – Now generally available to help profile and monitor access activity to Aurora databases in your AWS account without impacting database performance.

AWS Database Migration Service – Now supports validation to ensure that data is migrated accurately to S3 and can now generate an AWS Glue Data Catalog when migrating to S3.

AWS Backup – You can now back up and restore virtual machines running on VMware vSphere 8 and with multiple vNICs.

Amazon Kendra – There are new connectors to index documents and search for information across these new content sources: Confluence Server, Confluence Cloud, Microsoft SharePoint OnPrem, and Microsoft SharePoint Cloud. This post shows how to use the Amazon Kendra connector for Microsoft Teams.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
A few more blog posts you might have missed:

Women founders Q&A – We’re talking to six women founders and leaders about how they’re making impacts in their communities, industries, and beyond.

What you missed at the 2023 IMAGINE: Nonprofit conference – Where hundreds of nonprofit leaders, technologists, and innovators gathered to learn and share how AWS can drive a positive impact for people and the planet.

Monitoring load balancers using Amazon CloudWatch anomaly detection alarms – The metrics emitted by load balancers provide crucial and unique insight into service health, service performance, and end-to-end network performance.

Extend geospatial queries in Amazon Athena with user-defined functions (UDFs) and AWS Lambda – Using a solution based on Uber’s Hexagonal Hierarchical Spatial Index (H3) to divide the globe into equally-sized hexagons.

How cities can use transport data to reduce pollution and increase safety – A guest post by Rikesh Shah, outgoing head of open innovation at Transport for London.

For AWS open-source news and updates, here’s the latest newsletter curated by Ricardo to bring you the most recent updates on open-source projects, posts, events, and more.

Upcoming AWS Events
Here are some opportunities to meet:

AWS Public Sector Day 2023 (March 21, London, UK) – An event dedicated to helping public sector organizations use technology to achieve more with less through the current challenging conditions.

Women in Tech at Skills Center Arlington (March 23, VA, USA) – Let’s celebrate the history and legacy of women in tech.

The AWS Summits season is warming up! You can sign up here to know when registration opens in your area.

That’s all from me for this week. Come back next Monday for another Week in Review!

Danilo

New – Use Amazon S3 Object Lambda with Amazon CloudFront to Tailor Content for End Users

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-use-amazon-s3-object-lambda-with-amazon-cloudfront-to-tailor-content-for-end-users/

With S3 Object Lambda, you can use your own code to process data retrieved from Amazon S3 as it is returned to an application. Over time, we added new capabilities to S3 Object Lambda, like the ability to add your own code to S3 HEAD and LIST API requests, in addition to the support for S3 GET requests that was available at launch.

Today, we are launching aliases for S3 Object Lambda Access Points. Aliases are now automatically generated when S3 Object Lambda Access Points are created and are interchangeable with bucket names anywhere you use a bucket name to access data stored in Amazon S3. Therefore, your applications don’t need to know about S3 Object Lambda and can consider the alias to be a bucket name.
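
For example, a client that previously read objects from a bucket can switch to an S3 Object Lambda Access Point simply by swapping in the alias. The sketch below assumes a made-up alias value; in practice you copy the alias from the S3 console or API after creating the access point.

import boto3

s3 = boto3.client("s3")

# Hypothetical alias; aliases are generated automatically and look like a bucket name
alias = "my-access-point-abcdefgh12345678-ol-s3"

# The application treats the alias exactly like a bucket name
obj = s3.get_object(Bucket=alias, Key="s3.txt")
print(obj["Body"].read().decode("utf-8"))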

Architecture diagram.

You can now use an S3 Object Lambda Access Point alias as an origin for your Amazon CloudFront distribution to tailor or customize data for end users. You can use this to implement automatic image resizing or to tag or annotate content as it is downloaded. Many images still use older formats like JPEG or PNG, and you can use a transcoding function to deliver images in more efficient formats like WebP, BPG, or HEIC. Digital images contain metadata, and you can implement a function that strips metadata to help satisfy data privacy requirements.

Architecture diagram.

Let’s see how this works in practice. First, I’ll show a simple example using text that you can follow along with using just the AWS Management Console. After that, I’ll implement a more advanced use case processing images.

Using an S3 Object Lambda Access Point as the Origin of a CloudFront Distribution
For simplicity, I am using the same application as in the launch post, which changes all text in the original file to uppercase. This time, I use the S3 Object Lambda Access Point alias to set up a public distribution with CloudFront.

I follow the same steps as in the launch post to create the S3 Object Lambda Access Point and the Lambda function. Because the Lambda runtimes for Python 3.8 and later do not include the requests module, I update the function code to use urlopen from the Python Standard Library:

import boto3
from urllib.request import urlopen

s3 = boto3.client('s3')

def lambda_handler(event, context):
  print(event)

  object_get_context = event['getObjectContext']
  request_route = object_get_context['outputRoute']
  request_token = object_get_context['outputToken']
  s3_url = object_get_context['inputS3Url']

  # Get object from S3
  response = urlopen(s3_url)
  original_object = response.read().decode('utf-8')

  # Transform object
  transformed_object = original_object.upper()

  # Write object back to S3 Object Lambda
  s3.write_get_object_response(
    Body=transformed_object,
    RequestRoute=request_route,
    RequestToken=request_token)

  return

To test that this is working, I open the same file from the bucket and through the S3 Object Lambda Access Point. In the S3 console, I select the bucket and a sample file (called s3.txt) that I uploaded earlier and choose Open.

Console screenshot.

A new browser tab is opened (you might need to disable the pop-up blocker in your browser), and its content is the original file with mixed-case text:

Amazon Simple Storage Service (Amazon S3) is an object storage service that offers...

I choose Object Lambda Access Points from the navigation pane and select the AWS Region I used before from the dropdown. Then, I search for the S3 Object Lambda Access Point that I just created. I select the same file as before and choose Open.

Console screenshot.

In the new tab, the text has been processed by the Lambda function and is now all in uppercase:

AMAZON SIMPLE STORAGE SERVICE (AMAZON S3) IS AN OBJECT STORAGE SERVICE THAT OFFERS...

Now that the S3 Object Lambda Access Point is correctly configured, I can create the CloudFront distribution. Before I do that, in the list of S3 Object Lambda Access Points in the S3 console, I copy the Object Lambda Access Point alias that has been automatically created:

Console screenshot.

In the CloudFront console, I choose Distributions in the navigation pane and then Create distribution. In the Origin domain, I use the S3 Object Lambda Access Point alias and the Region. The full syntax of the domain is:

ALIAS.s3.REGION.amazonaws.com

Console screenshot.

S3 Object Lambda Access Points cannot be public, and I use CloudFront origin access control (OAC) to authenticate requests to the origin. For Origin access, I select Origin access control settings and choose Create control setting. I write a name for the control setting and select Sign requests and S3 in the Origin type dropdown.

Console screenshot.

Now, my Origin access control settings use the configuration I just created.

Console screenshot.

To reduce the number of requests going through S3 Object Lambda, I enable Origin Shield and choose the closest Origin Shield Region to the Region I am using. Then, I select the CachingOptimized cache policy and create the distribution. As the distribution is being deployed, I update permissions for the resources used by the distribution.

Setting Up Permissions to Use an S3 Object Lambda Access Point as the Origin of a CloudFront Distribution
First, the S3 Object Lambda Access Point needs to give access to the CloudFront distribution. In the S3 console, I select the S3 Object Lambda Access Point and, in the Permissions tab, I update the policy with the following:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "cloudfront.amazonaws.com"
            },
            "Action": "s3-object-lambda:Get*",
            "Resource": "arn:aws:s3-object-lambda:REGION:ACCOUNT:accesspoint/NAME",
            "Condition": {
                "StringEquals": {
                    "aws:SourceArn": "arn:aws:cloudfront::ACCOUNT:distribution/DISTRIBUTION-ID"
                }
            }
        }
    ]
}

The supporting access point also needs to allow access to CloudFront when called via S3 Object Lambda. I select the access point and update the policy in the Permissions tab:

{
    "Version": "2012-10-17",
    "Id": "default",
    "Statement": [
        {
            "Sid": "s3objlambda",
            "Effect": "Allow",
            "Principal": {
                "Service": "cloudfront.amazonaws.com"
            },
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:REGION:ACCOUNT:accesspoint/NAME",
                "arn:aws:s3:REGION:ACCOUNT:accesspoint/NAME/object/*"
            ],
            "Condition": {
                "ForAnyValue:StringEquals": {
                    "aws:CalledVia": "s3-object-lambda.amazonaws.com"
                }
            }
        }
    ]
}

The S3 bucket needs to allow access to the supporting access point. I select the bucket and update the policy in the Permissions tab:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "*"
            },
            "Action": "*",
            "Resource": [
                "arn:aws:s3:::BUCKET",
                "arn:aws:s3:::BUCKET/*"
            ],
            "Condition": {
                "StringEquals": {
                    "s3:DataAccessPointAccount": "ACCOUNT"
                }
            }
        }
    ]
}

Finally, CloudFront needs to be able to invoke the Lambda function. In the Lambda console, I choose the Lambda function used by S3 Object Lambda, and then, in the Configuration tab, I choose Permissions. In the Resource-based policy statements section, I choose Add permissions and select AWS Account. I enter a unique Statement ID. Then, I enter cloudfront.amazonaws.com as Principal, select lambda:InvokeFunction from the Action dropdown, and choose Save. We are working to simplify this step in the future. I’ll update this post when that’s available.
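
If you prefer to script this step, an equivalent permission can be added with the Lambda API. The following boto3 sketch uses a hypothetical function name; scoping the permission to the distribution with SourceArn is optional but narrows who can invoke the function.

import boto3

lambda_client = boto3.client("lambda")

# Hypothetical function name and distribution ARN; replace with your own values
lambda_client.add_permission(
    FunctionName="s3-object-lambda-uppercase",
    StatementId="AllowCloudFrontInvoke",
    Action="lambda:InvokeFunction",
    Principal="cloudfront.amazonaws.com",
    SourceArn="arn:aws:cloudfront::ACCOUNT:distribution/DISTRIBUTION-ID",
)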

Testing the CloudFront Distribution
When the distribution has been deployed, I test that the setup is working with the same sample file I used before. In the CloudFront console, I select the distribution and copy the Distribution domain name. I can use the browser and enter https://DISTRIBUTION_DOMAIN_NAME/s3.txt in the navigation bar to send a request to CloudFront and get the file processed by S3 Object Lambda. To quickly get all the info, I use curl with the -i option to see the HTTP status and the headers in the response:

curl -i https://DISTRIBUTION_DOMAIN_NAME/s3.txt

HTTP/2 200 
content-type: text/plain
content-length: 427
x-amzn-requestid: a85fe537-3502-4592-b2a9-a09261c8c00c
date: Mon, 06 Mar 2023 10:23:02 GMT
x-cache: Miss from cloudfront
via: 1.1 a2df4ad642d78d6dac65038e06ad10d2.cloudfront.net (CloudFront)
x-amz-cf-pop: DUB56-P1
x-amz-cf-id: KIiljCzYJBUVVxmNkl3EP2PMh96OBVoTyFSMYDupMd4muLGNm2AmgA==

AMAZON SIMPLE STORAGE SERVICE (AMAZON S3) IS AN OBJECT STORAGE SERVICE THAT OFFERS...

It works! As expected, the content processed by the Lambda function is all uppercase. Because this is the first invocation for the distribution, it has not been returned from the cache (x-cache: Miss from cloudfront). The request went through S3 Object Lambda to process the file using the Lambda function I provided.

Let’s try the same request again:

curl -i https://DISTRIBUTION_DOMAIN_NAME/s3.txt

HTTP/2 200 
content-type: text/plain
content-length: 427
x-amzn-requestid: a85fe537-3502-4592-b2a9-a09261c8c00c
date: Mon, 06 Mar 2023 10:23:02 GMT
x-cache: Hit from cloudfront
via: 1.1 145b7e87a6273078e52d178985ceaa5e.cloudfront.net (CloudFront)
x-amz-cf-pop: DUB56-P1
x-amz-cf-id: HEx9Fodp184mnxLQZuW62U11Fr1bA-W1aIkWjeqpC9yHbd0Rg4eM3A==
age: 3

AMAZON SIMPLE STORAGE SERVICE (AMAZON S3) IS AN OBJECT STORAGE SERVICE THAT OFFERS...

This time the content is returned from the CloudFront cache (x-cache: Hit from cloudfront), and there was no further processing by S3 Object Lambda. By using S3 Object Lambda as the origin, the CloudFront distribution serves content that has been processed by a Lambda function and can be cached to reduce latency and optimize costs.

Resizing Images Using S3 Object Lambda and CloudFront
As I mentioned at the beginning of this post, one of the use cases that can be implemented using S3 Object Lambda and CloudFront is image transformation. Let’s create a CloudFront distribution that can dynamically resize an image by passing the desired width and height as query parameters (w and h respectively). For example:

https://DISTRIBUTION_DOMAIN_NAME/image.jpg?w=200&h=150

For this setup to work, I need to make two changes to the CloudFront distribution. First, I create a new cache policy to include query parameters in the cache key. In the CloudFront console, I choose Policies in the navigation pane. In the Cache tab, I choose Create cache policy. Then, I enter a name for the cache policy.

Console screenshot.

In the Query settings of the Cache key settings, I select the option to Include the following query parameters and add w (for the width) and h (for the height).

Console screenshot.

Then, in the Behaviors tab of the distribution, I select the default behavior and choose Edit.

There, I update the Cache key and origin requests section:

  • In the Cache policy, I use the new cache policy to include the w and h query parameters in the cache key.
  • In the Origin request policy, I use the AllViewerExceptHostHeader managed policy to forward query parameters to the origin.

Console screenshot.

Now I can update the Lambda function code. To resize images, this function uses the Pillow module that needs to be packaged with the function when it is uploaded to Lambda. You can deploy the function using a tool like the AWS SAM CLI or the AWS CDK. Compared to the previous example, this function also handles and returns HTTP errors, such as when content is not found in the bucket.

import io
import boto3
from urllib.request import urlopen, HTTPError
from PIL import Image

from urllib.parse import urlparse, parse_qs

s3 = boto3.client('s3')

def lambda_handler(event, context):
    print(event)

    object_get_context = event['getObjectContext']
    request_route = object_get_context['outputRoute']
    request_token = object_get_context['outputToken']
    s3_url = object_get_context['inputS3Url']

    # Get object from S3
    try:
        original_image = Image.open(urlopen(s3_url))
    except HTTPError as err:
        s3.write_get_object_response(
            StatusCode=err.code,
            ErrorCode='HTTPError',
            ErrorMessage=err.reason,
            RequestRoute=request_route,
            RequestToken=request_token)
        return

    # Get width and height from query parameters
    user_request = event['userRequest']
    url = user_request['url']
    parsed_url = urlparse(url)
    query_parameters = parse_qs(parsed_url.query)

    try:
        width, height = int(query_parameters['w'][0]), int(query_parameters['h'][0])
    except (KeyError, ValueError):
        width, height = 0, 0

    # Transform object
    if width > 0 and height > 0:
        transformed_image = original_image.resize((width, height), Image.ANTIALIAS)
    else:
        transformed_image = original_image

    transformed_bytes = io.BytesIO()
    transformed_image.save(transformed_bytes, format='JPEG')

    # Write object back to S3 Object Lambda
    s3.write_get_object_response(
        Body=transformed_bytes.getvalue(),
        RequestRoute=request_route,
        RequestToken=request_token)

    return

I upload a picture I took of the Trevi Fountain in the source bucket. To start, I generate a small thumbnail (200 by 150 pixels).

https://DISTRIBUTION_DOMAIN_NAME/trevi-fountain.jpeg?w=200&h=150

Picture of the Trevi Fountain with size 200x150 pixels.

Now, I ask for a slightly larger version (400 by 300 pixels):

https://DISTRIBUTION_DOMAIN_NAME/trevi-fountain.jpeg?w=400&h=300

Picture of the Trevi Fountain with size 400x300 pixels.

It works as expected. The first invocation with a specific size is processed by the Lambda function. Further requests with the same width and height are served from the CloudFront cache.

Availability and Pricing
Aliases for S3 Object Lambda Access Points are available today in all commercial AWS Regions. There is no additional cost for aliases. With S3 Object Lambda, you pay for the Lambda compute and request charges required to process the data, and for the data S3 Object Lambda returns to your application. You also pay for the S3 requests that are invoked by your Lambda function. For more information, see Amazon S3 Pricing.

Aliases are now automatically generated when an S3 Object Lambda Access Point is created. For existing S3 Object Lambda Access Points, aliases are automatically assigned and ready for use.

It’s now easier to use S3 Object Lambda with existing applications, and aliases open many new possibilities. For example, you can use aliases with CloudFront to create a website that converts content in Markdown to HTML, resizes and watermarks images, or masks personally identifiable information (PII) from text, images, and documents.

Customize content for your end users using S3 Object Lambda with CloudFront.

Danilo

Implementing an event-driven serverless story generation application with ChatGPT and DALL-E

Post Syndicated from David Boyne original https://aws.amazon.com/blogs/compute/implementing-an-event-driven-serverless-story-generation-application-with-chatgpt-and-dall-e/

This post demonstrates how to integrate AWS serverless services with artificial intelligence (AI) technologies, ChatGPT, and DALL-E. This full stack event-driven application showcases a method of generating unique bedtime stories for children by using predetermined characters and scenes as a prompt for ChatGPT.

Every night at bedtime, the serverless scheduler triggers the application, initiating an event-driven workflow to create and store new unique AI-generated stories with AI-generated images and supporting audio.

These datasets are used to showcase the story on a custom website built with Next.js hosted with AWS App Runner. After the story is created, a notification is sent to the user containing a URL to view and read the story to the children.

Example notification of a story hosted with Next.js and App Runner

By integrating AWS services with AI technologies, you can now create new and innovative ideas that were previously unimaginable.

The application mentioned in this blog post demonstrates examples of point-to-point messaging with Amazon EventBridge pipes, publish/subscribe patterns with Amazon EventBridge and reacting to change data capture events with DynamoDB Streams.

Understanding the architecture

The following image shows the serverless architecture used to generate stories:

Architecture diagram for Serverless bed time story generation with ChatGPT and DALL-E

A new children’s story is generated every day at a configured time using Amazon EventBridge Scheduler (Step 1). EventBridge Scheduler is a service capable of scaling to millions of schedules, with over 200 AWS services and over 6,000 API operations available as targets. This example application uses EventBridge Scheduler to trigger an AWS Lambda function every night at the same time (7:15pm). The Lambda function is triggered to start the generation of the story.

EventBridge scheduler triggers Lambda function every day at 7:15pm (bed time)
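
The sample provisions this schedule through its infrastructure-as-code template, but the following boto3 sketch shows how an equivalent schedule could be created directly with the EventBridge Scheduler API. The names, ARNs, and time zone are placeholders for illustration.

import boto3

scheduler = boto3.client("scheduler")

# Hypothetical names and ARNs; the sample application creates this schedule
# through its own deployment template rather than API calls
scheduler.create_schedule(
    Name="nightly-story-generation",
    ScheduleExpression="cron(15 19 * * ? *)",   # every day at 7:15pm
    ScheduleExpressionTimezone="Europe/London",
    FlexibleTimeWindow={"Mode": "OFF"},
    Target={
        "Arn": "arn:aws:lambda:eu-west-1:123456789012:function:generate-story",
        "RoleArn": "arn:aws:iam::123456789012:role/scheduler-invoke-story-lambda",
    },
)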

The “Scenes” and “Characters” Amazon DynamoDB tables contain the characters involved in the story and a scene that is randomly selected during its creation. As a result, ChatGPT receives a unique prompt each time. An example of the prompt may look like this:

    Write a title and a rhyming story on 2 main characters called Parker and Jackson. The story needs to be set within the scene haunted woods and be at least 200 words long

After the story is created, it is then saved in the “Stories” DynamoDB table (Step 2).

Scheduler triggering Lambda function to generate the story and store story into DynamoDB

Once the story is created, this initiates a change data capture event using DynamoDB Streams (Step 3). This event flows through point-to-point messaging with EventBridge Pipes and directly into EventBridge. Input transforms are then used to convert the DynamoDB Streams event into a custom EventBridge event that downstream consumers can comprehend. Adopting this pattern is beneficial because it separates the contract from the DynamoDB event schema, so downstream consumers don’t have to conform to that schema structure; this mapping keeps the application decoupled from implementation details.

EventBridge Pipes connecting DynamoDB streams directly into EventBridge.

Upon triggering the StoryCreated event in EventBridge, three targets are triggered to carry out several processes (Step 4). Firstly, AI Images are processed, followed by the creation of audio for the story. Finally, the end user is notified of the completed story through Amazon SNS and email subscriptions. This fan-out pattern enables these tasks to be run asynchronously and in parallel, allowing for faster processing times.

EventBridge pub/sub pattern used to start async processing of notifications, audio, and images.

An SNS topic is triggered by the `StoryCreated` event to send an email to the end user using email subscriptions (Step 5). The email consists of a URL with the id of the story that has been created. Clicking on the URL takes the user to the frontend application that is hosted with App Runner.

Using SNS to notify the user of a new story

Example email sent to the user

Amazon Polly is used to generate the audio files for the story (Step 6). Upon triggering the `StoryCreated` event, a Lambda function is triggered, and the story description is used and given to Amazon Polly. Amazon Polly then creates an audio file of the story, which is stored in Amazon S3. A presigned URL is generated and saved in DynamoDB against the created story. This allows the frontend application and browser to retrieve the audio file when the user views the page. The presigned URL has a validity of two days, after which it can no longer be accessed or listened to.

Lambda function to generate audio using Amazon Polly, store in S3 and update story with presigned URL
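
The full implementation is in the sample repository; the following is a simplified boto3 sketch of that flow, with a hypothetical bucket, key, voice, and story text.

import boto3

polly = boto3.client("polly")
s3 = boto3.client("s3")

# Hypothetical bucket and key; the real application derives these
# from the StoryCreated event
bucket, key = "story-audio-bucket", "stories/story-123.mp3"

# Convert the story text to speech
result = polly.synthesize_speech(
    Text="Once upon a time...",
    OutputFormat="mp3",
    VoiceId="Joanna",
)

# Store the audio stream in S3
s3.put_object(Bucket=bucket, Key=key, Body=result["AudioStream"].read())

# Presigned URL valid for two days (expressed in seconds)
audio_url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": bucket, "Key": key},
    ExpiresIn=2 * 24 * 60 * 60,
)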

The `StoryCreated` event also triggers another Lambda function, which uses the OpenAI API to generate an AI image using DALL-E based on the generated story (Step 7). Once the image is generated, the image is downloaded and stored in Amazon S3. Similar to the audio file, the system generates a presigned URL for the image and saves it in DynamoDB against the story. The presigned URL is only valid for two days, after which it becomes inaccessible for download or viewing.

Lambda function to generate images, store in S3 and update story with presigned URL.

In the event of a failure in audio or image generation, the frontend application still loads the story, but does not display the missing image or audio at that moment. This ensures that the frontend can continue working and provide value. If you want more control and only want to trigger the user’s notification event once all parallel tasks are complete, consider the aggregator messaging pattern.

Hosting the frontend Next.js application with AWS App Runner

The frontend application uses Next.js to render server-side rendered (SSR) pages that access the stories from the DynamoDB table. The application is containerized and hosted with AWS App Runner.

Next.js application hosted with App Runner, with permissions into DynamoDB table.

AWS App Runner enables you to deploy containerized web applications and APIs securely, without needing any prior knowledge of containers or infrastructure. With App Runner, developers can concentrate on their application, while the service handles container startup, running, scaling, and load balancing. After deployment, App Runner provides a secure URL for clients to begin making HTTP requests against.

With App Runner, you have two primary options for deploying your container: source code connections or source images. Using source code connections grants App Runner permission to pull the image file directly from your source code, and with Automatic deployment configured, it can redeploy the application when changes are made. Alternatively, source images provide App Runner with the image’s location in an image registry, and this image is deployed by App Runner.

In this example application, CDK deploys the application using the DockerImageAsset construct with the App Runner construct. Once deployed, App Runner builds and uploads the frontend image to Amazon Elastic Container Registry (ECR) and deploys it. Downstream consumers can access the application using the secure URL provided by App Runner. In this example, the URL is used when the SNS notification is sent to the user when the story is ready to be viewed.

Giving the frontend container permission to DynamoDB table

To grant the Next.js application permission to obtain stories from the Stories DynamoDB table, App Runner instance roles are configured. These roles are optional and can provide the necessary permissions for the container to access AWS services required by the compute service.

If you want to learn more about AWS App Runner, you can explore the free workshop.

Design choices and assumptions

The DynamoDB Time to Live (TTL) feature is ideal for the short-lived nature of daily generated stories. DynamoDB handles the deletion of stories after two days by setting the TTL attribute on each story. Once a story is deleted, it becomes inaccessible through the generated story URLs.

Using Amazon S3 presigned URLs is a method to grant temporary access to a file in S3. This application creates presigned URLs for the audio file and generated images that last for 2 days, after which the URLs for the S3 items become invalid.

Input transforms are used between DynamoDB Streams and EventBridge events to decouple the schemas and events consumed by downstream targets. Consuming the events as they are is known as the “conformist” pattern, and couples downstream EventBridge consumers to the implementation details of DynamoDB Streams. Using input transforms instead allows the application to remain decoupled from implementation details and stay flexible.

Conclusion

The adoption of artificial intelligence (AI) technology has significantly increased in various industries. ChatGPT, a large language model that can understand and generate human-like responses in natural language, and DALL-E, an image generation system that can create realistic images based on textual descriptions, are examples of such technology. These systems have demonstrated the potential for AI to provide innovative solutions and transform the way we interact with technology.

This blog post explores ways in which you can use AWS serverless services with ChatGPT and DALL-E to create a story generation application fronted by a Next.js application hosted with App Runner. EventBridge Scheduler triggers the story creation process, the application reacts to change data capture events with DynamoDB Streams and EventBridge Pipes, and Amazon EventBridge fans out compute tasks to process notifications, images, and audio files.

You can find the documentation and the source code for this application in GitHub.

For more serverless learning resources, visit Serverless Land.

Managing sessions of anonymous users in WebSocket API-based applications

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/managing-sessions-of-anonymous-users-in-websocket-api-based-applications/

This post is written by Alexey Paramonov, Solutions Architect, ISV.

This blog post demonstrates how to reconnect anonymous users to WebSocket API without losing their session context. You learn how to link WebSocket API connection IDs to the logical user, what to do when the connection fails, and how to store and recover session information when the user reconnects.

WebSocket APIs are common in modern interactive applications. For example, for watching stock prices, following live chat feeds, collaborating with others in productivity tools, or playing online games. Another example described in “Implementing reactive progress tracking for AWS Step Functions” blog post uses WebSocket APIs to send progress back to the client.

The backend is aware of the client’s connection ID to find the right client to send data to. But if the connection temporarily fails, the client reconnects with a new connection ID. If the backend does not have a mechanism to associate a new connection ID with the same client, the interaction context is lost and the user must start over again. In a live chat feed that could mean skipping to the most recent message with a loss of some previous messages. And in case of AWS Step Functions progress tracking the user would lose progress updates.

Overview

The sample uses Amazon API Gateway WebSocket APIs. It uses AWS Lambda for WebSocket connection management and for mocking a teleprinter to generate a stateful stream of characters for testing purposes. There are two Amazon DynamoDB tables for storing connection IDs and user sessions, as well as an optional MediaWiki API for retrieving an article.

The following diagram outlines the architecture:

Reference architecture

  1. The browser generates a random user ID and stores it locally in the session storage. The client sends the user ID inside the Sec-WebSocket-Protocol header to WebSocket API.
  2. The default WebSocket API route OnConnect invokes the OnConnect Lambda function and passes the connection ID and the user ID to it. The OnConnect Lambda function determines if the user ID exists in the DynamoDB table and either creates a new item or updates the existing one.
  3. When the connection is open, the client sends the Read request to the Teleprinter Lambda function.
  4. The Teleprinter Lambda function downloads an article about WebSocket APIs from Wikipedia and stores it as a string in memory.
  5. The Teleprinter Lambda function checks if there is a previous state stored in the Sessions table in DynamoDB. If it is a new user, the Teleprinter Lambda function starts sending the article from the beginning character by character back to the client via WebSocket API. If it is a returning user, the Teleprinter Lambda function retrieves the last cursor position (the last successfully sent character) from the DynamoDB table and continues from there.
  6. The Teleprinter Lambda function sends 1 character every 100ms back to the client.
  7. The client receives the article and prints it out character by character as each character arrives.
  8. If the connection breaks, the WebSocket API calls the OnDisconnect Lambda function automatically. The function marks the connection ID for the given user ID as inactive.
  9. If the user does not return within 5 minutes, Amazon EventBridge scheduler invokes the OnDelete Lambda function, which deletes items with more than 5 minutes of inactivity from Connections and Sessions tables.

A teleprinter returns data one character at a time instead of a single response. When the Lambda function fetches an article, it feeds it character by character inside the for-loop with a delay of 100ms on every iteration. This demonstrates how the user can continue reading the article after reconnecting without starting the feed all over again. The pattern could be useful for traffic limiting by slowing down user interactions with the backend.
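
A simplified sketch of that loop is shown below. The endpoint URL and function name are illustrative, and the real sample also persists the cursor position when the connection goes stale, as described later in this post.

import time
import boto3

# Hypothetical endpoint; the real function builds it from the connection's
# domain name and stage stored in the Connections table
api_client = boto3.client(
    "apigatewaymanagementapi",
    endpoint_url="https://EXAMPLE.execute-api.eu-west-1.amazonaws.com/prod",
)

def teletype(connection_id, article, start_pos=0):
    # Send one character every 100 ms, starting from the saved cursor position
    for pos in range(start_pos, len(article)):
        api_client.post_to_connection(
            ConnectionId=connection_id,
            Data=bytes(article[pos], "utf-8"),
        )
        time.sleep(0.1)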

Understanding sample code functionality

When the user connects to the WebSocket API, the client generates a user ID and saves it in the browser’s session storage. The user ID is a random string. When the client opens a WebSocket connection, the frontend sends the user ID inside the Sec-WebSocket-Protocol header, which is standard for WebSocket APIs. When using the JavaScript WebSocket library, there is no need to specify the header manually. To initialize a new connection, use:

const newWebsocket = new WebSocket(wsUri, userId);

wsUri is the deployed WebSocket API URL and userId is a random string generated in the browser. The userId goes to the Sec-WebSocket-Protocol header because WebSocket APIs generally offer less flexibility in choosing headers compared to RESTful APIs. Another way to pass the user ID to the backend could be a query string parameter.

The backend receives the connection request with the user ID and connection ID. The WebSocket API generates the connection ID automatically. It’s important to include the Sec-WebSocket-Protocol in the backend response, otherwise the connection closes immediately:

    return {
        'statusCode': 200,
        'headers': {
            'Sec-WebSocket-Protocol': userId
        }
    }

The Lambda function stores this information in the DynamoDB Connections table:

    ddb.put_item(
        TableName=table_name,
        Item={
            'userId': {'S': userId},
            'connectionId': {'S': connection_id},
            'domainName': {'S': domain_name},
            'stage': {'S': stage},
            'active': {'S': True}
        }
    )

The active attribute indicates if the connection is up. In the case of inactivity over a specified time limit, the OnDelete Lambda function deletes the item automatically. The Put operation comes in handy here because you don’t need to query the DB to check if the user already exists. If it is a new user, Put creates a new item. If it is a reconnection, Put updates the connectionId and sets active to True again.

The primary key for the Connections table is userId, which helps locate users faster as they reconnect. The application relies on a global secondary index (GSI) for locating connectionId. This is necessary for marking the connection inactive when WebSocket API calls OnDisconnect function automatically.

Now that the application has connection management, you can retrieve session data. The Teleprinter Lambda function receives a connection ID from WebSocket API. You can query the GSI of the Connections table and find if the user exists:

    def get_user_id(connection_id):
        response = ddb.query(
            TableName=connections_table_name,
            IndexName='connectionId-index',
            KeyConditionExpression='connectionId = :c',
            ExpressionAttributeValues={
                ':c': {'S': connection_id}
            }
        )

        items = response['Items']

        if len(items) == 1:
            return items[0]['userId']['S']

        return None

Another DynamoDB table called Sessions is used to check if the user has an existing session context. If yes, the Lambda function retrieves the cursor position and resumes teletyping. If it is a brand-new user, the Lambda function starts sending characters from the beginning. The function stores a new cursor position if the connection breaks. This makes it easier to detect stale connections and store the current cursor position in the Sessions table.

    try:
        api_client.post_to_connection(
            ConnectionId=connection_id,
            Data=bytes(ch, 'utf-8')
        )
    except api_client.exceptions.GoneException as e:
        print(f"Found stale connection, persisting state")
        store_cursor_position(user_id, pos)
        return {
            'statusCode': 410
        }

    def store_cursor_position(user_id, pos):
        ddb.put_item(
            TableName=sessions_table_name,
            Item={
                'userId': {'S': user_id},
                'cursorPosition': {'N': str(pos)}
            }
        )

After this, the Teleprinter Lambda function ends.

Serving authenticated users

The same mechanism also works for authenticated users. The main difference is that it takes a given ID from a JWT token or other form of authentication and uses it instead of a randomly generated user ID. The backend relies on unambiguous user identification and may require only minor changes for handling authenticated users.
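
As an illustration, the following sketch extracts a stable user ID from a token's subject claim instead of generating a random one. It uses the PyJWT library, which is an assumption for this example and not part of the sample application.

import jwt  # PyJWT, assumed for this sketch

def extract_user_id(token):
    # In production, verify the signature against your identity provider's keys
    # instead of skipping verification as done here for illustration
    claims = jwt.decode(token, options={"verify_signature": False})
    return claims["sub"]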

Deleting inactive users

The OnDisconnect function marks user connections as inactive and adds a timestamp to the ‘lastSeen’ attribute. If the user never reconnects, the application should purge inactive items from the Connections and Sessions tables. Depending on your operational requirements, there are two options.

Option 1: Using EventBridge Scheduler

The sample application uses EventBridge to schedule OnDelete function execution every 5 minutes. The following code shows how AWS SAM adds the scheduler to the OnDelete function:

  OnDeleteSchedulerFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: app.handler
      Runtime: python3.9
      CodeUri: handlers/onDelete/
      MemorySize: 128
      Environment:
        Variables:
          CONNECTIONS_TABLE_NAME: !Ref ConnectionsTable
          SESSIONS_TABLE_NAME: !Ref SessionsTable
      Policies:
      - DynamoDBCrudPolicy:
          TableName: !Ref ConnectionsTable
      - DynamoDBCrudPolicy:
          TableName: !Ref SessionsTable
      Events:
        Schedule:
          Type: ScheduleV2
          Properties:
            ScheduleExpression: rate(5 minute)

The Events section of the function definition is where AWS SAM sets up the scheduled execution. Change values in ScheduleExpression to meet your scheduling requirements.

The OnDelete function relies on the GSI to find inactive user IDs. The following code snippet shows how to query connections with more than 5 minutes of inactivity:

    five_minutes_ago = int((datetime.now() - timedelta(minutes=5)).timestamp())

    stale_connection_items = table_connections.query(
        IndexName='lastSeen-index',
        KeyConditionExpression='active = :hk and lastSeen < :rk',
        ExpressionAttributeValues={
            ':hk': 'False',
            ':rk': five_minutes_ago
        }
    )

After that, the function iterates through the list of user IDs and deletes them from the Connections and Sessions tables:

    # remove inactive connections
    with table_connections.batch_writer() as batch:
        for item in inactive_users:
            batch.delete_item(Key={'userId': item['userId']})

    # remove inactive sessions
    with table_sessions.batch_writer() as batch:
        for item in inactive_users:
            batch.delete_item(Key={'userId': item['userId']})

The sample uses batch_writer() to avoid requests for each user ID.

Option 2: Using DynamoDB TTL

DynamoDB provides a built-in mechanism for expiring items called Time to Live (TTL). This option is easier to implement. With TTL, there is no need for the EventBridge scheduler, the OnDelete Lambda function, or the additional GSI to track the time span. To set up TTL, use the ‘lastSeen’ attribute as the object creation time. TTL deletes or archives the item after a specified period of time. When using AWS SAM or AWS CloudFormation templates, add TimeToLiveSpecification to your code.

The TTL process typically deletes expired items within 48 hours. If your operational requirements demand faster and more predictable timing, use option 1. For example, if your application aggregates data while the user is offline, use option 1. But if your application stores static session data, like the cursor position used in this sample, option 2 can be an easier alternative.
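
As an illustration of option 2, the following boto3 sketch enables TTL on the Connections table and writes an expiry timestamp with the connection item. The ‘expireAt’ attribute name and the item values are assumptions for this example; the post itself suggests reusing the ‘lastSeen’ attribute.

import time
import boto3

ddb = boto3.client("dynamodb")

# One-time setup: enable TTL on the table using an 'expireAt' attribute (assumed name)
ddb.update_time_to_live(
    TableName="Connections",
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "expireAt"},
)

# When writing the connection item, store the epoch time after which
# DynamoDB may delete it (here, 5 minutes of inactivity); note that TTL
# deletion is not immediate
expire_at = int(time.time()) + 5 * 60
ddb.put_item(
    TableName="Connections",
    Item={
        "userId": {"S": "user-123"},
        "connectionId": {"S": "abc="},
        "active": {"S": "False"},
        "expireAt": {"N": str(expire_at)},
    },
)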

Deploying the sample

Prerequisites

Ensure you can manage AWS resources from your terminal.

  • AWS CLI and AWS SAM CLI installed.
  • You have an AWS account. If not, visit this page.
  • Your user has enough permissions to manage AWS resources.
  • Git is installed.
  • NPM is installed (only for local frontend deployment).

You can find the source code and README here.

The repository contains both the frontend and the backend code. To deploy the sample, follow this procedure:

  1. Open a terminal and clone the repository:
    git clone https://github.com/aws-samples/websocket-sessions-management.git
  2. Navigate to the root of the repository.
  3. Build and deploy the AWS SAM template:
    sam build && sam deploy --guided
  4. Note the value of WebSocketURL in the output. You need it later.

With the backend deployed, test it using the hosted frontend.

Testing the application

You can watch an animated demo here.

Notice that the app has generated a random user ID on startup. The app shows the user ID in the header.

  1. Paste the WebSocket URL into the text field. You can find the URL in the console output after you have successfully deployed your AWS SAM template. Alternatively, you can navigate to the AWS Management Console (make sure you are in the right Region), select the API you have recently deployed, go to “Stages”, select the deployed stage, and copy the “WebSocket URL” value.
  2. Choose Connect. The app opens a WebSocket connection.
  3. Choose Tele-read to start receiving the Wikipedia article character by character. New characters appear in the second half of the screen as they arrive.
  4. Choose Disconnect to close the WebSocket connection. Reconnect and choose Tele-read. Your session resumes from where you stopped.
  5. Choose Reset Identity. The app closes the WebSocket connection and changes the user ID. Choose Connect and Tele-read again and your character feed starts from the beginning.

Cleaning up

To clean up the resources, in the root directory of the repository run:

sam delete

This removes all resources provisioned by the template.yml file.

Conclusion

In this post, you learn how to keep track of user sessions when using WebSocket APIs and how to avoid losing session context when the user reconnects. Apply the learnings from this example to improve your user experience when using WebSocket APIs for web-frontend and mobile applications, where internet connections may be unstable.

For more serverless learning resources, visit  Serverless Land.

Building serverless Java applications with the AWS SAM CLI

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-serverless-java-applications-with-the-aws-sam-cli/

This post was written by Mehmet Nuri Deveci, Sr. Software Development Engineer, Steven Cook, Sr.Solutions Architect, and Maximilian Schellhorn, Solutions Architect.

When using Java in the serverless environment, the AWS Serverless Application Model Command Line Interface (AWS SAM CLI) offers an easier way to build and deploy AWS Lambda functions. You can either use the default AWS SAM build mechanism or tailor the build behavior to your application needs.

Since Java offers a variety of plugins and tools for building your application, builders usually have custom requirements for their build setup. In addition, when targeting GraalVM or non-LTS versions of the JVM, the build behavior requires additional configuration to build a Lambda custom runtime.

This blog post provides an overview of the common ways to build Java applications for Lambda with the AWS SAM CLI. This allows you to make well-informed decisions based on your projects’ requirements. This post focuses on Apache Maven; however, the same concepts apply to Gradle.

You can find the source code for these examples in the GitHub repo.

Overview

The following diagram provides an overview of the build and deployment process with AWS SAM CLI. The default behavior includes the following steps:

Architecture overview

  1. Define your infrastructure resources such as Lambda functions, Amazon DynamoDB Tables, Amazon S3 buckets, or an Amazon API Gateway endpoint in a template.yaml file.
  2. The CLI command “sam build” builds the application based on the chosen runtime and configuration within the template.
  3. The sam build command populates the .aws-sam/build folder with the built artifacts (for example, class files or jars).
  4. The sam deploy command uploads your template and function code from the .aws-sam/build folder and starts an AWS CloudFormation deployment.
  5. After a successful deployment, you can find the provisioned resources in your AWS account.

Using the default Java build mechanism in AWS SAM CLI

AWS SAM CLI supports building Serverless Java functions with Maven or Gradle. You must use one of the supported Java runtimes (java8, java8.al2, java11) and your function’s resource CodeUri property must point to a source folder.

Default Java build mechanism

AWS SAM CLI ships with default build mechanisms provided by the aws-lambda-builders project. It is therefore not required to package or build your application in advance and any customized build steps in pom.xml will not be used. For example, by default the AWS SAM Maven Lambda Builder triggers the following steps:

  1. mvn clean install to build the function.
  2. mvn dependency:copy-dependencies -DincludeScope=runtime -Dmdep.prependGroupId=true to prepare dependency jar files.
  3. Class files in target/classes and dependency jar archives in target/dependency are copied to the final build artifact location in .aws-sam/build/{ResourceLogicalId}.

Start the build process by running the following command from the directory where the template.yaml resides:

sam build

This results in the following outputs:

Outputs

The .aws-sam build folder contains the necessary classes and libraries to run your application. The transformed template.yaml file points to the build artifacts directory (instead of pointing to the original source directory).

Run the following command to deploy the resources to AWS:

sam deploy --guided

This zips the HelloWorldFunction directory in .aws-sam/build and uploads it to the Lambda service.

Building Uber-Jars with AWS SAM CLI

A popular way of building and packaging Java projects, especially when using frameworks such as Micronaut, Quarkus, and Spring Boot, is to create an Uber-jar or Fat-jar. This is a jar file that contains the application class files together with all the dependency class files in a single jar file. This simplifies the deployment and management of the application artifact.

Frameworks typically provide a Maven or Gradle setup that produces an Uber-jar by default. For example, by using the Apache Maven Shade plugin for Maven, you can configure Uber-jar packaging:

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>3.4.1</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>

The maven-shade-plugin modifies the package phase of the Maven build to produce the Uber-jar file in the target directory.

To build and package an Uber-jar with AWS SAM, you must customize the AWS SAM CLI build process. You can do this by using a Makefile to replace the default steps discussed earlier. Within the template.yaml, declare a Metadata resource attribute with a BuildMethod entry. The sam build process then looks for a Makefile within the CodeUri directory.

Makefile metadata

The Makefile is responsible for building the artifacts required by AWS SAM to deploy the Lambda function. AWS SAM runs the build-{ResourceLogicalId} target in the Makefile. This is an example Makefile:

build-HelloWorldFunctionUberJar:
	mvn clean package
	mkdir -p $(ARTIFACTS_DIR)/lib
	cp ./target/HelloWorld*.jar $(ARTIFACTS_DIR)/lib/

The Makefile runs the Maven clean and package goals that build the Uber-jar in the target directory via the Apache Maven Shade Plugin. As the Lambda Java runtime loads jar files from the lib directory, you can copy the Uber-jar file to the $ARTIFACTS_DIR/lib directory as part of the build steps.

To build the application, run:

sam build

This triggers the customized build step and creates the following output resources:

Output response

The deployment step is identical to the previous example.

Running the build process inside a container

AWS SAM provides a mechanism to run the application build process inside a Docker container. This provides the benefit of not requiring your build dependencies (such as Maven and Java) to be installed locally or in your CI/CD environment.

To run the build process inside a container, use the command line options --use-container and optionally --build-image. The following diagram outlines the modified build process with this option:

Modified build process with this option

  1. Similar to the previous examples, the directory with the application sources or the Makefile is referenced.
  2. To run the build process inside a Docker container, provide the command line option:
    sam build --use-container
  3. By default, AWS SAM uses images for container builds that are maintained in the GitHub repo aws/aws-sam-build-images. AWS SAM pulls the container image depending on the defined runtime. For this example, it pulls the public.ecr.aws/sam/build-java11 image, which has prerequisites such as Maven and Java11 pre-installed to build the application. In addition, the source code folder is mounted in the container and the Makefile target or default build mechanism is run within the container.
  4. The final artifacts are delivered to the .aws-sam/build directory on the local file system. If you are using a Makefile, you can copy the final artifact to the $ARTIFACTS_DIR.
  5. The sam deploy command uploads the template and function code from the .aws-sam/build directory and starts a CloudFormation deployment.
  6. After a successful deployment, you can find the provisioned resources in your AWS account.

To test the behavior, run the previous examples with the --use-container option. No additional changes are needed.

Using your own base build images for creating custom runtimes

When you are targeting a non-supported Lambda runtime, such as a non-LTS Java version or natively compiled GraalVM native images, you can create your own build image with the necessary dependencies installed.

For example, to build a native image with GraalVM, you must have GraalVM and the native image tool installed when building your application.

To create a custom image:

  1. Create a Dockerfile with the needed dependencies:
    #Use the official AWS SAM base image or Amazon Linux 2 as a starting point
    FROM public.ecr.aws/sam/build-java11:latest-x86_64
    
    #Install GraalVM dependencies
    ENV GRAAL_VERSION 22.2.0
    ENV GRAAL_FOLDERNAME graalvm-ce-java11-${GRAAL_VERSION}
    ENV GRAAL_FILENAME graalvm-ce-java11-linux-amd64-${GRAAL_VERSION}.tar.gz
    RUN curl -4 -L https://github.com/graalvm/graalvm-ce-builds/releases/download/vm-22.2.0/graalvm-ce-java11-linux-amd64-22.2.0.tar.gz | tar -xvz
    RUN mv $GRAAL_FOLDERNAME /usr/lib/graalvm
    RUN rm -rf $GRAAL_FOLDERNAME
    
    #Install Native Image dependencies
    RUN /usr/lib/graalvm/bin/gu install native-image
    RUN ln -s /usr/lib/graalvm/bin/native-image /usr/bin/native-image
    RUN ln -s /usr/lib/maven/bin/mvn /usr/bin/mvn
    
    #Set GraalVM as default
    ENV JAVA_HOME /usr/lib/graalvm
    
  2. Build your image locally or upload it to a container registry:
    docker build . -t sam/custom-graal-image
  3. Use AWS SAM build with the build image argument to provide your custom image:
    sam build --use-container --build-image sam/custom-graal-image

You can find the source code and an example Dockerfile, Makefile, and pom.xml for GraalVM native images in the GitHub repo.

When you use the official AWS SAM build images as a base image, you have all the necessary tooling such as Maven, Java11 and the Lambda builders installed. If you want to use a more customized approach and use a different base image, you must install these dependencies.

For an example, check the GraalVM with Java 17 cookiecutter template. In addition, there are multiple additional components involved when building custom runtimes, which are outlined in “Build a custom Java runtime for AWS Lambda”.

To avoid providing the command line options on every build, include them in the samconfig.toml file:

[default.build.parameters]
use_container = true
build_image = ["public.ecr.aws/sam/build-java11:latest-x86_64"]

For additional information, refer to the official AWS SAM CLI documentation.

Deploying the application without building with AWS SAM

There might be scenarios where you do not want to rely on the build process offered by AWS SAM. For example, when you have highly customized or established build processes, or advanced dependency caching or visibility requirements.

In this case, you can still use the AWS SAM CLI commands such as sam local and sam deploy. But you must point your CodeUri property directly to the pre-built artifact (instead of the source code directory):

HelloWorldFunctionSkipBuild:
  Type: AWS::Serverless::Function
  Properties:
    CodeUri: HelloWorldFunction/target/HelloWorld-1.0.jar
    Handler: helloworld.App::handleRequest

In this case, there is no need to use the sam build command, since the build logic is outside of AWS SAM CLI:

Build process

Here, the sam build command fails because it looks for a source folder to build the application. However, there might be cases where you have a mixed setup that includes some functions that point to pre-built artifacts and others that are built by AWS SAM CLI. In this scenario, you can mark those functions to explicitly skip the build process by adding the following SkipBuild flag in the Metadata section of your resource definition:

HelloWorldFunctionSkipBuild:
  Type: AWS::Serverless::Function
  Properties:
    CodeUri: HelloWorldFunction/target/HelloWorld-1.0.jar
    Handler: helloworld.App::handleRequest
    Runtime: java11
  Metadata:
    SkipBuild: True

Conclusion

This blog post shows how to build Java applications with the AWS SAM CLI. You learnt about the default build mechanisms, and how to customize the build behavior and abstract the build process inside a container environment. Visit the GitHub repository for the example code templates referenced in the examples.

To learn more, dive deep into the AWS SAM documentation. For more serverless learning resources, visit Serverless Land.

Server-side rendering micro-frontends – UI composer and service discovery

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/server-side-rendering-micro-frontends-ui-composer-and-service-discovery/

This post is written by Luca Mezzalira, Principal Specialist Solutions Architect, Serverless.

The previous blog post describes the architecture for creating a server-side rendering micro-frontend in AWS. This and subsequent posts explain the different parts that compose this architecture in detail. The code for the example is available in an AWS Samples GitHub repository.

For context, this post covers the infrastructure related to the UI composer, and why you need an Amazon S3 bucket for storing static assets:

Architecture overview

The rest of the series explores the micro-frontends composition, how to design micro-frontends using serverless services, different caching and performance optimization strategies, and the organization structure implications associated with frontend distributed systems.

A user’s request journey

The best way to navigate through this distributed system is by simulating a user request that touches all the parts implemented in the architecture.

The application example shows a product details page of a hypothetical ecommerce platform:

Building micro-frontends

When a user selects an article from the catalog page, the DNS resolves the URL to an Amazon CloudFront distribution that is the reference CDN for this project.

The request is immediately fulfilled if the page is cached. Therefore, no additional logic runs in the cloud infrastructure, and the response is fast (less than the 500 ms shown in this example).

When the page is not available in the CloudFront points of presence (PoPs), the request is forwarded to the Application Load Balancer (ALB). It arrives at the AWS Fargate cluster where the UI Composer generates the page for fulfilling the request.

Using CloudFront in the architecture

CDNs are known for accelerating application delivery thanks to caching static files from nearby PoPs. CloudFront can also accelerate uncacheable content such as dynamic APIs or personalized content.

With a network of over 450 points of presence, CloudFront terminates user TCP/TLS connections within 20-30 milliseconds on average. Traffic to origin servers is carried over the AWS global network instead of the public internet. This infrastructure is a purpose-built, highly available, and low-latency private infrastructure built on a global, fully redundant, metro fiber network that is linked via terrestrial and trans-oceanic cables across the world. In addition to terminating connections close to users, CloudFront accelerates dynamic content thanks to modern internet protocols such as QUIC and TLS1.3, and persisting TCP connections to the origin servers.

CloudFront also has security benefits, offering protection in AWS against infrastructure DDoS attacks. It integrates with AWS Web Application Firewall and AWS Shield Advanced, giving you controls to block application-level DDoS attacks. CloudFront also offers native security controls such as HTTP to HTTPS redirections, CORS management, geo-blocking, tokenization, and managing security response headers.

UI Composer application logic

When the request is not fulfilled by the CloudFront cache, it is routed to the Fargate cluster. Here, multiple tasks compute and serve the page requested.

This example uses Fastify, a fast Node.js framework that is gaining popularity among the Node.js community. When the web server initializes, it loads external parameters and the template for composing a page.

const start = async () => {
  try {
    //load parameters
    MFElist = await init();
    //load catalog template
    catalogTemplate = await loadFromS3(MFElist.template, MFElist.templatesBucket)
    await fastify.listen({ port: PORT, host: '0.0.0.0' })
  } catch (err) {
    fastify.log.error(err)
    process.exit(1)
  }
}

To maintain team independence and avoid redeploying the UI composer for every application change, the HTML templates are loaded from an S3 bucket. All teams responsible for micro-frontends in the same page can position their micro-frontends into the right place of the HTML template and delegate the composition task to the UI composer.

In this demo, the initial parameters and the catalog template are retrieved once. However, in a real scenario, it’s more likely you retrieve the parameters at initialization and then refresh them at a regular cadence. The template might be loaded at runtime for every request, or a background routine might refresh it in a similar way.

When the request reaches the product details route, the web application logic calls a transformTemplate function. It passes the catalog template, retrieved from the S3 bucket at the server initialization. It returns a 200 response if the page is composed without any issues.

fastify.get('/productdetails', async(request, reply) => {
  try{
    const catalogDetailspage = await transformTemplate(catalogTemplate)
    responseStream(catalogDetailspage, 200, reply)
  } catch(err){
    console.log(err)
    throw new Error(err)
  }
})

The page composition is the key responsibility of the UI composer. There are several viable approaches for composing micro-frontends in a server-side rendering system, covered in the next post.

Micro-frontends discovery

To decouple workloads owned by multiple teams, you must use architectural patterns that support it. In a microservices architecture, the service discovery pattern allows a service to evolve independently without coupling consumers to the DNS name or IP address of any microservice.

In this example, AWS Systems Manager Parameter Store acts as a service registry. Every micro-frontend available in the workload registers itself once the infrastructure is provisioned.

In this way, the UI composer can look up the micro-frontend ID found inside the HTML template and retrieve the correct way to consume the micro-frontend API, for instance an ARN or a remote HTTP URL.

AWS Systems Manager Parameter Store

Using an ARN instead of an HTTP request inside the workload network can help you reduce latency thanks to fewer network hops. Moreover, security is delegated to IAM policies, providing a robust security implementation.

The UI composer retrieves the micro-frontend endpoints at runtime before loading them into the HTML template. This is a simple yet powerful approach for maintaining the boundaries within your organization and allowing independent teams to evolve their architecture autonomously.
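
The sample’s UI composer is a Node.js application, so the following Python snippet is only an illustrative sketch of the registry lookup; the /micro-frontends/{id}/endpoint parameter naming convention is an assumption, not part of the sample.

import boto3

ssm = boto3.client('ssm')

def resolve_micro_frontend(mfe_id: str) -> str:
    """Return the endpoint (ARN or HTTP URL) registered for a micro-frontend ID."""
    # Hypothetical naming convention for the registry entries in Parameter Store
    parameter = ssm.get_parameter(Name=f'/micro-frontends/{mfe_id}/endpoint')
    return parameter['Parameter']['Value']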

Micro-frontends discovery evolution

Using Parameter Store as a service discovery system, you can deploy a new micro-frontend by adding a new key-value into the service discovery.

A more sophisticated option could be creating a service that acts as a registry and also shapes the traffic towards different micro-frontends versions using deployment strategies like canary releases or blue/green deployments.

You can start iteratively with a simple key-value store system and evolve the architecture with a more complex approach when the workload requires it, providing a robust way to roll out micro-frontend services in your system.

When this is in place, it’s likely to increase the release cadence of your micro-frontends. This is because developers often feel safer releasing in production without affecting the entire user base and they can run tests alongside real traffic.

Performance considerations

This architecture uses Fargate for composing the micro-frontends instead of Lambda functions. This allows incremental rendering offered by browsers, displaying the HTML page partially before it’s completely returned.

Consider a scenario where a micro-frontend takes longer to render due to a downstream dependency or a faulty version deployed into production. Without the streaming capability, you must wait until all the micro-frontends responses arrive, buffer them in memory, compose the page and then send the final output to the browser.

Instead, by using the streaming API offered by Node.js frameworks, you can send a partial HTML page (for example, the head tag and subsequently the rest of the page), to be rendered by a browser.

Streaming also improves server overhead, because the servers don’t have to buffer entire pages. By incrementally flushing data to browsers, servers keep memory pressure low, which lets them process more requests and save overhead costs.

However, if your workload doesn’t require these capabilities, one or multiple Lambda functions might be suitable for your project as well, reducing the infrastructure management complexity you have to handle.

Conclusion

This post looks at how to use the UI Composer and micro-frontends discoverability. Once this part is developed, it won’t need to change regularly. This represents the foundation for building server-side rendering micro-frontends using HTML-over-the-wire. There might be other approaches to follow for other frameworks such as Next.js due to the architectural implementation of the framework itself.

The next post will cover how the UI composer includes micro-frontends output inside an HTML template.

For more serverless learning resources, visit Serverless Land.

How gaming companies can use Amazon Redshift Serverless to build scalable analytical applications faster and easier

Post Syndicated from Satesh Sonti original https://aws.amazon.com/blogs/big-data/how-gaming-companies-can-use-amazon-redshift-serverless-to-build-scalable-analytical-applications-faster-and-easier/

This post provides guidance on how to build scalable analytical solutions for gaming industry use cases using Amazon Redshift Serverless. It covers how to use a conceptual, logical architecture for some of the most popular gaming industry use cases like event analysis, in-game purchase recommendations, measuring player satisfaction, telemetry data analysis, and more. This post also discusses the art of the possible with newer innovations in AWS services around streaming, machine learning (ML), data sharing, and serverless capabilities.

Our gaming customers tell us that their key business objectives include the following:

  • Increased revenue from in-app purchases
  • High average revenue per user and lifetime value
  • Improved stickiness with better gaming experience
  • Improved event productivity and high ROI

Our gaming customers also tell us that while building analytics solutions, they want the following:

  • Low-code or no-code model – Out-of-the-box solutions are preferred to building customized solutions.
  • Decoupled and scalable – Serverless, auto scaled, and fully managed services are preferred over manually managed services. Each service should be easy to replace or enhance with little or no dependency on the others. Solutions should be flexible to scale up and down.
  • Portability to multiple channels – Solutions should be compatible with most endpoint channels, like PC, mobile, and gaming platforms.
  • Flexible and easy to use – The solutions should provide less restrictive, easy-to-access, and ready-to-use data. They should also provide optimal performance with low or no tuning.

Analytics reference architecture for gaming organizations

In this section, we discuss how gaming organizations can use a data hub architecture to address the analytical needs of an enterprise, which requires the same data at multiple levels of granularity and different formats, and is standardized for faster consumption. A data hub is a center of data exchange that constitutes a hub of data repositories and is supported by data engineering, data governance, security, and monitoring services.

A data hub contains data at multiple levels of granularity and is often not integrated. It differs from a data lake by offering data that is pre-validated and standardized, allowing for simpler consumption by users. Data hubs and data lakes can coexist in an organization, complementing each other. Data hubs are more focused around enabling businesses to consume standardized data quickly and easily. Data lakes are more focused around storing and maintaining all the data in an organization in one place. And unlike data warehouses, which are primarily analytical stores, a data hub is a combination of all types of repositories—analytical, transactional, operational, reference, and data I/O services, along with governance processes. A data warehouse is one of the components in a data hub.

The following diagram is a conceptual analytics data hub reference architecture. This architecture resembles a hub-and-spoke approach. Data repositories represent the hub. External processes are the spokes feeding data to and from the hub. This reference architecture partly combines a data hub and data lake to enable comprehensive analytics services.

Let’s look at the components of the architecture in more detail.

Sources

Data can be loaded from multiple sources, such as systems of record, data generated from applications, operational data stores, enterprise-wide reference data and metadata, data from vendors and partners, machine-generated data, social sources, and web sources. The source data is usually in either structured or semi-structured formats, which are highly and loosely formatted, respectively.

Data inbound

This section consists of components to process and load the data from multiple sources into data repositories. It can be in batch mode, continuous, pub/sub, or any other custom integration. ETL (extract, transform, and load) technologies, streaming services, APIs, and data exchange interfaces are the core components of this pillar. Unlike ingestion processes, data can be transformed as per business rules before loading. You can apply technical or business data quality rules and load raw data as well. Essentially, it provides the flexibility to get the data into repositories in its most usable form.

Data repositories

This section consists of a group of data stores, which includes data warehouses, transactional or operational data stores, reference data stores, domain data stores housing purpose-built business views, and enterprise datasets (file storage). The file storage component is usually a common component between a data hub and a data lake to avoid data duplication and provide comprehensiveness. Data can also be shared among all these repositories without physically moving it, using features such as data sharing and federated queries. However, data copy and duplication are allowed considering various consumption needs in terms of formats and latency.

Data outbound

Data is often consumed using structured queries for analytical needs. Also, datasets are accessed for ML, data exporting, and publishing needs. This section consists of components to query the data, export, exchange, and APIs. In terms of implementation, the same technologies may be used for both inbound and outbound, but the functions are different. However, it’s not mandatory to use the same technologies. These processes aren’t transformation heavy because the data is already standardized and almost ready to consume. The focus is on the ease of consumption and integration with consuming services.

Consumption

This pillar consists of various consumption channels for enterprise analytical needs. It includes business intelligence (BI) users, canned and interactive reports, dashboards, data science workloads, Internet of Things (IoT), web apps, and third-party data consumers. Popular consumption entities in many organizations are queries, reports, and data science workloads. Because there are multiple data stores maintaining data at different granularity and formats to service consumer needs, these consumption components depend on data catalogs for finding the right source.

Data governance

Data governance is key to the success of a data hub reference architecture. It constitutes components like metadata management, data quality, lineage, masking, and stewardship, which are required for organized maintenance of the data hub. Metadata management helps organize the technical and business metadata catalog, and consumers can reference this catalog to know what data is available in which repository and at what granularity, format, owners, refresh frequency, and so on. Along with metadata management, data quality is important to increase confidence for consumers. This includes data cleansing, validation, conformance, and data controls.

Security and monitoring

Users and application access should be controlled at multiple levels. It starts with authentication, then authorizing who and what should be accessed, policy management, encryption, and applying data compliance rules. It also includes monitoring components to log the activity for auditing and analysis.

Analytics data hub solution architecture on AWS

The following reference architecture provides an AWS stack for the solution components.

Let’s look at each component again and the relevant AWS services.

Data inbound services

AWS Glue and Amazon EMR services are ideal for batch processing. They scale automatically and are able to process most industry-standard data formats. Amazon Kinesis Data Streams, Amazon Kinesis Data Firehose, and Amazon Managed Streaming for Apache Kafka (Amazon MSK) enable you to build streaming processing applications. These streaming services integrate well with the Amazon Redshift streaming ingestion feature. This helps you process real-time sources, IoT data, and data from online channels. You can also ingest data with third-party tools like Informatica, dbt, and Matillion.

You can build RESTful APIs and WebSocket APIs using Amazon API Gateway and AWS Lambda, which will enable real-time two-way communication with web sources, social, and IoT sources. AWS Data Exchange helps with subscribing to third-party data in AWS Marketplace. Data subscription and access is fully managed with this service. Refer to the respective service documentation for further details.

Data repository services

Amazon Redshift is the recommended data storage service for OLAP (Online Analytical Processing) workloads such as cloud data warehouses, data marts, and other analytical data stores. This service is the core of this reference architecture on AWS and can address most analytical needs out of the box. You can use simple SQL to analyze structured and semi-structured data across data warehouses, data marts, operational databases, and data lakes to deliver the best price performance at any scale. The Amazon Redshift data sharing feature provides instant, granular, and high-performance access without data copies and data movement across multiple Amazon Redshift data warehouses in the same or different AWS accounts, and across Regions.

For ease of use, Amazon Redshift offers a serverless option. Amazon Redshift Serverless automatically provisions and intelligently scales data warehouse capacity to deliver fast performance for even the most demanding and unpredictable workloads, and you pay only for what you use. Just load your data and start querying right away in Amazon Redshift Query Editor or in your favorite BI tool and continue to enjoy the best price performance and familiar SQL features in an easy-to-use, zero administration environment.

Amazon Relational Database Service (Amazon RDS) is a fully managed service for building transactional and operational data stores. You can choose from many popular engines such as MySQL, PostgreSQL, MariaDB, Oracle, and SQL Server. With the Amazon Redshift federated query feature, you can query transactional and operational data in place without moving the data. The federated query feature currently supports Amazon RDS for PostgreSQL, Amazon Aurora PostgreSQL-Compatible Edition, Amazon RDS for MySQL, and Amazon Aurora MySQL-Compatible Edition.

Amazon Simple Storage Service (Amazon S3) is the recommended service for multi-format storage layers in the architecture. It offers industry-leading scalability, data availability, security, and performance. Organizations typically store data in Amazon S3 using open file formats. Open file formats enable analysis of the same Amazon S3 data using multiple processing and consumption layer components. Data in Amazon S3 can be easily queried in place using SQL with Amazon Redshift Spectrum. It helps you query and retrieve structured and semi-structured data from files in Amazon S3 without having to load the data. Multiple Amazon Redshift data warehouses can concurrently query the same datasets in Amazon S3 without the need to make copies of the data for each data warehouse.

Data outbound services

Amazon Redshift comes with the web-based analytics workbench Query Editor V2.0, which helps you run queries, explore data, create SQL notebooks, and collaborate on data with your teams in SQL through a common interface. AWS Transfer Family helps securely transfer files using SFTP, FTPS, FTP, and AS2 protocols. It supports thousands of concurrent users and is a fully managed, low-code service. Similar to inbound processes, you can utilize Amazon API Gateway and AWS Lambda for data pull using the Amazon Redshift Data API. And AWS Data Exchange helps publish your data to third parties for consumption through AWS Marketplace.
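
As a rough sketch of pulling data through the Redshift Data API from a Lambda function, the following Python snippet submits a query to a Redshift Serverless workgroup and fetches the result; the workgroup, database, query, and polling loop are illustrative placeholders, not part of a specific sample.

import time

import boto3

client = boto3.client('redshift-data')

def run_redshift_query(sql: str) -> list:
    """Submit a query to a Redshift Serverless workgroup and wait for the result set."""
    statement = client.execute_statement(
        WorkgroupName='gaming-analytics',  # placeholder workgroup name
        Database='dev',                    # placeholder database name
        Sql=sql
    )
    # Poll until the statement reaches a terminal state
    while True:
        status = client.describe_statement(Id=statement['Id'])['Status']
        if status in ('FINISHED', 'FAILED', 'ABORTED'):
            break
        time.sleep(1)
    if status != 'FINISHED':
        raise RuntimeError(f'Query ended with status {status}')
    return client.get_statement_result(Id=statement['Id'])['Records']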

Consumption services

Amazon QuickSight is the recommended service for creating reports and dashboards. It enables you to create interactive dashboards, visualizations, and advanced analytics with ML insights. Amazon SageMaker is the ML platform for all your data science workload needs. It helps you build, train, and deploy models consuming the data from repositories in the data hub. You can use Amazon front-end web and mobile services and AWS IoT services to build web, mobile, and IoT endpoint applications to consume data out of the data hub.

Data governance services

The AWS Glue Data Catalog and AWS Lake Formation are the core data governance services AWS currently offers. These services help manage metadata centrally for all the data repositories and manage access controls. They also help with data classification and can automatically handle schema changes. You can use Amazon DataZone to discover and share data at scale across organizational boundaries with built-in governance and access controls. AWS is investing in this space to provide a more unified experience across AWS services. There are many partner products such as Collibra, Alation, Amorphic, Informatica, and more, which you can use as well for data governance functions with AWS services.

Security and monitoring services

AWS Identity and Access Management (AWS IAM) manages identities for AWS services and resources. You can define users, groups, roles, and policies for fine-grained access management of your workforce and workloads. AWS Key Management Service (AWS KMS) manages AWS keys or customer managed keys for your applications. Amazon CloudWatch and AWS CloudTrail help provide monitoring and auditing capabilities. You can collect metrics and events and analyze them for operational efficiency.

In this post, we’ve discussed the most common AWS services for the respective solution components. However, you aren’t limited to only these services. There are many other AWS services for specific use cases that may be more appropriate for your needs than what we discussed here. You can reach out to AWS Analytics Solutions Architects for appropriate guidance.

Example architectures for gaming use cases

In this section, we discuss example architectures for two gaming use cases.

Game event analysis

In-game events (also called timed or live events) encourage player engagement through excitement and anticipation. Events entice players to interact with the game, increasing player satisfaction and revenue with in-game purchases. Events have become more and more important, especially as games shift from static pieces of entertainment played as is to dynamic, changing content driven by services that use information about game play to make decisions as the game is being played. This enables games to change as the players play, lets player behavior influence what works and what doesn’t, and gives any game a potentially infinite lifespan.

This capability of in-game events to offer fresh content and activities within a familiar framework is how you keep players engaged and playing for months to years. Players can enjoy new experiences and challenges within the familiar framework or world that they have grown to love.

The following example shows how such an architecture might appear, including changes to support various sections of the process like breaking the data into separate containers to accommodate scalability, charge-back, and ownership.

Fully understanding how events are viewed by the players and making decisions about future events requires information on how the latest event actually performed. This means gathering a lot of data as the players play to build key performance indicators (KPIs) that measure the effectiveness of, and player satisfaction with, each event. This requires analytics that capture, analyze, and report on the player experience for each event. These KPIs include the following:

  • Initial user flow interactions – What actions users are taking after they first receive or download an event update in a game. Are there any clear drop-off points or bottlenecks that are turning people off the event?
  • Monetization – When, where, and what users are spending money on in the event, whether it’s buying in-game currencies, responding to ads, specials, and so on.
  • Game economy – How users can earn and spend virtual currencies or goods during an event, using in-game money, trades, or barter.
  • In-game activity – Player wins, losses, leveling up, competition wins, or player achievements within the event.
  • User to user interactions – Invitations, gifting, chats (private and group), challenges, and so on during an event.

These are just some of the KPIs and metrics that are key for predictive modeling of events as the game acquires new players while keeping existing users involved, engaged, and playing.

In-game activity analysis

In-game activity analysis essentially looks at any meaningful, purposeful activity the player might show, with the goal of trying to understand what actions are taken, their timing, and outcomes. This includes situational information about the players, including where they are playing (both geographical and cultural), how often, how long, what they undertake on each login, and other activities.

The following example shows how such an architecture might appear, including changes to support various sections of the process like breaking the data into separate warehouses. The multi-cluster warehouse approach helps scale the workload independently, provides flexibility to the implemented charge-back model, and supports decentralized data ownership.

The solution essentially logs information to help understand the behavior of your players, which can lead to insights that increase retention of existing players, and acquisition of new ones. This can provide the ability to do the following:

  • Provide in-game purchase recommendations
  • Measure player trends in the short term and over time
  • Plan events the players will engage in
  • Understand what parts of your game are most successful and which are less so

You can use this understanding to make decisions about future game updates, make in-game purchase recommendations, determine when and how your game economy may need to be balanced, and even allow players to change their character or play as the game progresses by injecting this information and accompanying decisions back into the game.

Conclusion

This reference architecture, while showing examples of only a few analysis types, provides a faster technology path for enabling game analytics applications. The decoupled, hub/spoke approach brings the agility and flexibility to implement different approaches to analytics and understanding the performance of game applications. The purpose-built AWS services described in this architecture provide comprehensive capabilities to easily collect, store, measure, analyze, and report game and event metrics. This helps you efficiently perform in-game analytics and event analysis, measure player satisfaction, provide tailor-made recommendations to game players, organize events, and increase retention rates.

Thanks for reading the post. If you have any feedback or questions, please leave them in the comments.


About the authors

Satesh Sonti is a Sr. Analytics Specialist Solutions Architect based out of Atlanta, specialized in building enterprise data platforms, data warehousing, and analytics solutions. He has over 16 years of experience in building data assets and leading complex data platform programs for banking and insurance clients across the globe.

Tanya Rhodes is a Senior Solutions Architect based out of San Francisco, focused on games customers with emphasis on analytics, scaling, and performance enhancement of games and supporting systems. She has over 25 years of experience in enterprise and solutions architecture specializing in very large business organizations across multiple lines of business including games, banking, healthcare, higher education, and state governments.

AWS Application Composer Now Generally Available – Visually Build Serverless Applications Quickly

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/aws-application-composer-now-generally-available-visually-build-serverless-applications-quickly/

At AWS re:Invent 2022, we previewed AWS Application Composer, a visual builder for you to compose and configure serverless applications from AWS services backed by deployment-ready infrastructure as code (IaC).

In the keynote, Dr. Werner Vogels, CTO of Amazon.com said:

Developers that never used serverless before. How do they know where to start? Which services do they need? How do they work together? We really wanted to make this easier. AWS Application Composer simplifies and accelerates the architecting, configuring, and building of serverless applications.

During the preview, we had lots of interest and great feedback from customers. Today, I am happy to announce the general availability of AWS Application Composer with new improvements based on customer feedback. I want to quickly review its features and introduce some improvements.

Introduction to AWS Application Composer
To get started with AWS Application Composer, choose Open demo in the AWS Management Console. This demo shows a simple cart application with Amazon API Gateway, AWS Lambda, and Amazon DynamoDB resources.

You can easily browse and search for AWS services in the left Resources panel and drag and drop them onto the canvas to expand your architecture.

In the middle Canvas panel, you can connect resources together by clicking and dragging from one resource port to another. Permissions are automatically composed for these resources to interact with each other using policy templates, environment variables, and event subscriptions. Grouping resources is very useful for visual organization. In the above example, the API Compute group is composed of Lambda functions. When you double-click a specific resource, you can name it and configure its properties in the right Resource properties panel.

In addition to the featured resources available in the visual resource palette, hidden and read-only resources populate on the canvas when you load an existing template that includes them.

In this example, the MyHttpApi resource is a hidden resource. It is not available from the resource palette but does appear on the canvas in color. The resource named MyHttpApiRole (in this case, an AWS::IAM::Role resource) is read-only. It is grayed out on the canvas. To learn more about all supported resources, see AWS Application Composer featured resources in the AWS documentation.

When you select the Template menu, you can view, edit or manually download your IaC, such as AWS Serverless Application Model (AWS SAM). Your changes are automatically synced with your canvas.

When you start Connected mode, you can use Application Composer with local tools such as an integrated development environment (IDE). Any changes activate the automatic synchronization of your project template and files between Application Composer and your local project directory.

It is useful to incorporate into your existing team processes, such as local testing with AWS SAM Command Line Interface (CLI), peer review through version control, or deployment through AWS CloudFormation and continuous integration and delivery (CI/CD) pipelines.

This mode is supported on Chrome and Edge browsers and requires you to grant temporary local file system access to your browser.

AWS Application Composer can be used in real-world scenarios such as:

  • Building a prototype of serverless applications
  • Reviewing and collaboratively evolving existing serverless projects
  • Generating diagrams for documentation or Wikis
  • Onboarding new team members to a project
  • Reducing the first steps to deploy something in an AWS account

For more real-world examples, see Visualize and create your serverless workloads with AWS Application Composer in the AWS Compute Blog, How I Used AWS Application Composer to Make Analyzing My Meetup Data Easy in BuildOn.AWS, or watch a breakout session video (SVS211) from AWS re:Invent 2022.

Improvements Since Preview Launch
Here is a new feature to improve how you work with Amazon Simple Queue Service (Amazon SQS) queues.

You can now directly connect Amazon API Gateway resources to Amazon SQS without routing requests through an AWS Lambda function. You can remove the complexity of the Lambda function’s execution and increase reliability while reducing lines of code.

For example, you can drag API Gateway and Amazon SQS onto the canvas and connect the two resources. When you drag the connector from the API route to the SQS queue, Send message appears as the integration target, and you can complete the connection.

The new Change Inspector provides a visual diff of template changes made when you connect two resources on the canvas. This information is available as a notification when you make the connection, which helps you understand how Composer manages integration configuration in your IaC template as you build.

Here are some more improvements to your experience in the user interface!

First, we reduced the size of resource cards. The larger cards made it difficult for the users to read and view their template on the canvas. Now, you can arrange more resource cards easily and save space on the canvas.

Also, we added zoom in and out and zoom to fit buttons so that users can quickly view the entire screen or zoom to the desired level. When you load a large template onto the canvas, you can easily see all the resource cards in any size.

Now Available
AWS Application Composer is now generally available in the US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm) Regions, adding three more Regions to the six Regions available during preview. There is no additional cost, and you can start using it today.

To learn more, see the AWS Application Composer Developer Guide and send feedback to AWS re:Post for AWS Application Composer or through your usual AWS support contacts.

Channy

Access Amazon Athena in your applications using the WebSocket API

Post Syndicated from Abhi Sodhani original https://aws.amazon.com/blogs/big-data/access-amazon-athena-in-your-applications-using-the-websocket-api/

Modern applications are built with modular independent components or microservices that rely on an API framework to communicate with services. Many organizations are building data lakes to store and analyze large volumes of structured, semi-structured, and unstructured data. In addition, many teams are moving towards a data mesh architecture, which requires them to expose their data sets as easily consumable data products. To accomplish this on AWS, organizations use Amazon Simple Storage Service (Amazon S3) to provide cheap and reliable object storage to house their datasets. To enable interactive querying and analyzing their data in place using familiar SQL syntax, many teams are turning to Amazon Athena. Athena is an interactive query service that is used by modern applications to query large volumes of data on an S3 data lake using standard SQL.

When working with SQL databases, application developers and business analysts are most familiar with simple permissions management and synchronous query-response protocols: if a user has permissions to submit a query, they do so and receive the results from the server when the query is complete. Directly accessing Athena APIs, for example when integrating with a custom web application, requires an AWS Identity and Access Management (IAM) role for the application, and requires you to build a custom process to poll for query completion asynchronously. The IAM role needs access to run Athena API calls, as well as S3 permissions to retrieve the Athena output stored on Amazon S3. Polling for Athena query completion at frequent intervals could result in increased latency from the client’s perspective.

In this post, we present a solution that can integrate with your front-end application to query data from Amazon S3 using an Athena synchronous API invocation. With this solution, you can add a layer of abstraction over direct Athena API calls in your application and expose the access through a WebSocket API developed with Amazon API Gateway. The query results are returned to the application as Amazon S3 presigned URLs.

Overview of solution

For illustration purposes, this post builds a COVID-19 data lake with a WebSocket API to handle Athena queries. Your application can invoke the WebSocket API to pull the data from Amazon S3 using an Athena SQL query, and the WebSocket API returns the JSON response with the presigned Amazon S3 URL. The application needs to parse the JSON message to read the presigned URL, download the data locally, and report the data back to the front end.
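
As a rough sketch of that client-side flow, the following Python snippet connects to the WebSocket API, invokes the runquery action, and downloads the results from the returned presigned URL. The payload key names, the response shape, and the third-party websockets and requests packages are assumptions for illustration and are not part of the sample.

import asyncio
import json

import requests
import websockets

async def fetch_query_results(endpoint_url: str, sql: str) -> bytes:
    """Send a query over the WebSocket API and download the results from the presigned URL."""
    async with websockets.connect(endpoint_url) as ws:
        await ws.send(json.dumps({'action': 'runquery', 'query': sql}))
        # Assumes the first message received carries the presigned URL;
        # a real client may need to handle intermediate messages first.
        message = json.loads(await ws.recv())
    return requests.get(message['PreSignedUrl'], timeout=30).content

# Example: asyncio.run(fetch_query_results('wss://{api-id}.execute-api.{region}.amazonaws.com/{stage}', 'SELECT 1'))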

We use AWS Step Functions to poll the Athena query run. When the query is complete, Step Functions invokes an AWS Lambda function to generate the presigned URL and send the request back to the application.

The application doesn’t require direct access to Athena, just access to invoke the API. When using this solution, you should secure the API following AWS guidelines. For more information, refer to Controlling and managing access to a WebSocket API in API Gateway.

The following diagram summarizes the architecture, key components, and interactions in the solution.

Architecture diagram for the Athena WebSocket API. The user connects to the API through API Gateway. API Gateway uses Lambda and DynamoDB to store session data. SQL queries are routed to Amazon Athena and a Step Function polls for query status and returns the results back to the user.

The application is composed of the WebSocket API in API Gateway, which handles the connectivity between the client and Athena. A client application using the framework can submit the Athena SQL query and get back the presigned URL containing the query results data. The workflow includes the following steps:

  1. The application invokes the WebSocket API connection.
  2. A Lambda function is invoked to initiate the connection. The connection ID is stored in an Amazon DynamoDB table (a sketch of this handler follows the list).
  3. When the client application is connected, it can invoke the runquery action, which invokes the RunQuery Lambda function.
  4. The function first runs the Athena query.
  5. When the query is started, the function checks the status and uses Step Functions to track the query progress.
  6. Step Functions invokes the third Lambda function to read the processed Athena results and get the presigned S3 URL. Failed messages are routed to an Amazon Simple Notification Service (Amazon SNS) topic, which you can subscribe to.
  7. The presigned URL is returned to the client application.
  8. The connection is closed using the OnDisconnect function.
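
The connection handler referenced in step 2 is not shown in this post. The following is a minimal sketch of what it might look like, assuming the table name is passed in an environment variable and the table is keyed on connectionId; these names are assumptions, not taken from the sample.

import os

import boto3

table = boto3.resource('dynamodb').Table(os.environ['CONNECTIONS_TABLE_NAME'])

def handler(event, context):
    """Store the WebSocket connection ID that API Gateway passes in the request context."""
    connection_id = event['requestContext']['connectionId']
    table.put_item(Item={'connectionId': connection_id})
    return {'statusCode': 200, 'body': 'Connected.'}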

The RunQuery Lambda function runs the Athena query using the start_query_execution request:

def run_query(client, query):
    """This function executes and sends the query request to Athena."""
    response = client.start_query_execution(
        QueryString=query,
        QueryExecutionContext={
            'Database': params['Database']
        },
        ResultConfiguration={
            'OutputLocation': f's3://{params["BucketName"]}/{params["OutputDir"]}/'
        },
        WorkGroup=params["WorkGroup"]
    )
    return response
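
In the sample, a Step Functions state machine drives the status checks. The polling logic itself reduces to a call to get_query_execution; the following sketch, using the same boto3 Athena client as run_query, shows the equivalent check (the loop and delay are illustrative, not the sample’s exact implementation):

import time

def wait_for_query(client, query_execution_id, delay_seconds=2):
    """Poll Athena until the query reaches a terminal state."""
    while True:
        execution = client.get_query_execution(QueryExecutionId=query_execution_id)
        state = execution['QueryExecution']['Status']['State']
        if state in ('SUCCEEDED', 'FAILED', 'CANCELLED'):
            return state
        time.sleep(delay_seconds)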

The Amazon S3 presigned URL is generated by invoking the generate_presigned_url request with the bucket and key information that hosts the Athena results. The presigned URL expiration defaults to 120 seconds and is configurable through the function input parameter PreSignerExpireSeconds. See the following code:

def signed_get_url(event):
    s3 = boto3.client('s3', region_name=params['Region'], config=Config(signature_version='s3v4'))
    # User provided body with object info
    bodyData = json.loads(event['body'])
    try:
        url = s3.generate_presigned_url(
            ClientMethod='get_object',
            Params={
                'Bucket': params['BucketName'],
                'Key': bodyData["ObjectName"]
            },
            ExpiresIn=int(params['PreSignerExpireSeconds'])
        )
        body = {'PreSignedUrl': url, 'ExpiresIn': params['PreSignerExpireSeconds']}
        response = {
            'statusCode': 200,
            'body': json.dumps(body),
            'headers': cors.global_returns["Allow Origin Header"]
        }
        logger.info(f"[MESSAGE] Response for PreSignedURL: {response}")
    except Exception as e:
        logger.exception(f"[MESSAGE] Unable to generate URL: {str(e)}")
        response = {
            'statusCode': 502,
            'body': 'Unable to generate PreSignedUrl',
            'headers': cors.global_returns["Allow Origin Header"]
        }
    return response

Prerequisites

This post assumes you have the following:

  • Access to an AWS account
  • Permissions to create an AWS CloudFormation stack
  • Permissions to create the following resources:
    • AWS Glue catalog databases and tables
    • API Gateway
    • Lambda function
    • IAM roles
    • Step Functions state machine
    • SNS topic
    • DynamoDB table

Enable the WebSocket API

To enable the WebSocket API of API Gateway, complete the following steps:

  1. Configure the Athena dataset.

To make the data from the AWS COVID-19 data lake available in the Data Catalog in your AWS account, create a CloudFormation stack using the following template. If you’re signed in to your AWS account, the following page fills out most of the stack creation form for you. All you need to do is choose Create stack. For instructions on creating a CloudFormation stack, see Getting started with AWS CloudFormation.

You can also use an existing Athena database to query, in which case you need to update the stack parameters.

  2. Sign in to the Athena console.

If this is the first time you’re using Athena, you must specify a query result location on Amazon S3. For more information about querying and accessing the data from Athena, see A public data lake for analysis of COVID-19 data.

  3. Configure the WebSocket framework using the following page, which deploys the API infrastructure using AWS Serverless Application Model (AWS SAM).
  4. Update the parameter pBucketName with the S3 bucket (in the us-east-2 Region) that stores the Athena results, and update the database parameter if you want to query an existing database.
  5. Select the check box to acknowledge creation of IAM roles and choose Deploy.

At a high level, these are the primary resources deployed by the application template:

  • An API Gateway with routes to the connect, disconnect, and query Lambda functions. Note that the API Gateway deployed with this sample doesn’t implement authentication and authorization. We recommend that you implement authentication and authorization before deploying into a production environment. Refer to Controlling and managing access to a WebSocket API in API Gateway to understand how to implement these security controls.
  • A DynamoDB table for tracking client connections.
  • Lambda functions to manage connection states using DynamoDB.
  • A Lambda function to run the query and start the step function. The function includes an associated IAM role and policies with permissions to Step Functions, the AWS Glue Data Catalog, Athena, AWS Key Management Service (AWS KMS), and Amazon S3. Note that the Lambda execution role gives read access to the Data Catalog and S3 bucket that you specify in the deployment parameters. We recommend that you don’t include a catalog that contains sensitive data without first understanding the impacts and implementing additional security controls.
  • A Lambda function with associated permissions to poll for the query results and return the presigned URL to the client.
  • A Step Functions state machine with associated permissions to run the polling Lambda function and send API notifications using Amazon SNS.

Test the setup

To test the WebSocket API, you can use wscat, an open-source command line tool.

  1. Install NPM.
  2. Install wscat:
$ npm install -g wscat
  3. On the console, connect to your published API endpoint by running the following command. The full URI to use can be found on the AWS CloudFormation console by finding the WebSocketURI output in the serverlessrepo-aws-app-athena-websocket-integration stack that was deployed by the AWS SAM application you deployed previously.
$ wscat -c wss://{YOUR-API-ID}.execute-api.{YOUR-REGION}.amazonaws.com/{STAGE}
  4. To test the runquery function, send a JSON message like the following example. This triggers the state machine to run your SQL query using Athena and, using Lambda, return an S3 presigned URL to your client, which you can access to download the query results. Note that the API accepts any valid Athena query. Additional query validation could be added to the internal Lambda function if desired.
$ wscat -c wss://{YOUR-API-ID}.execute-api.{YOUR-REGION}.amazonaws.com/{STAGE}
Connected (press CTRL+C to quit)
> {"action":"runquery", "data":"SELECT * FROM \"covid-19\".country_codes limit 5"}
< {"pre-signed-url": "https://xxx-s3.amazonaws.com/athena_api_access_results/xxxxx.csv?"}
  5. Copy the value for pre-signed-url and enter it into your browser window to access the results.

The presigned URL provides you temporary credentials to download the query results. For more information, refer to Using presigned URLs. This process can be integrated into a front-end web application to automatically download and display the results of the query.
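
As a rough illustration of that integration, a client can parse the message received over the WebSocket and download the results from the presigned URL before it expires. The following sketch uses only the Python standard library and assumes the response format shown in the wscat example above:

import json
import urllib.request

def download_results(websocket_message, output_path="results.csv"):
    # Parse the WebSocket response and extract the presigned URL
    presigned_url = json.loads(websocket_message)["pre-signed-url"]
    # Download the query results before the URL expires
    with urllib.request.urlopen(presigned_url) as response, open(output_path, "wb") as f:
        f.write(response.read())
    return output_path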

Clean up

To avoid incurring ongoing charges, delete the resources you provisioned by deleting the CloudFormation stacks CovidLakeStacks and serverlessrepo-AthenaWebSocketIntegration via the AWS CloudFormation console. For detailed instructions, refer to the cleanup sections in the starter kit README files in the GitHub repo.

Conclusion

In this post, we showed how to integrate your application with Athena using the WebSocket API. We have included a GitHub repo for you to understand the code and modify it per your application requirements, to get the full benefits of the solution. We encourage you to further explore the features of the API Gateway WebSocket API to add security using authorizers, view live invocations using dashboards, and expand the framework to support additional routes and actions.

Let’s stay in touch via the GitHub repo.


About the Authors

Abhi Sodhani is a Sr. AI/ML Solutions Architect at AWS. He helps customers with a wide range of solutions, including machine learning, artificial intelligence, data lakes, data warehousing, and data visualization. Outside of work, he is passionate about books, yoga, and travel.

Robin Zimmerman is a Data and ML Engineer with AWS Professional Services. He works with AWS enterprise customers to develop systems to extract value from large volumes of data using AWS data, analytics, and machine learning services. When he’s not working, you’ll probably find him in the mountains—rock climbing, skiing, mountain biking, or out on whatever other adventure he can dream up.

How we built an open-source SEO tool using Workers, D1, and Queues

Post Syndicated from Kristian Freeman original https://blog.cloudflare.com/how-we-built-an-open-source-seo-tool-using-workers-d1-and-queues/

Building applications on Cloudflare Workers has always been fun. Workers applications have low latency response times by default, and easy developer ergonomics thanks to Wrangler. It’s no surprise that for years now, developers have been going from idea to production with Workers in just a few minutes.

Internally, we’re no different. When a member of our team has a project idea, we often reach for Workers first, and not just for the MVP stage, but in production, too. Workers have been a secret ingredient to Cloudflare’s innovation for some time now, allowing us to build products like Access, Stream and Workers KV. Even better, when we have new ideas and we can use new Cloudflare products to build them, it’s a great way to give feedback on those products.

We’ve discussed this in the past on the Cloudflare blog – in May last year, I wrote about how we rebuilt Cloudflare’s developer documentation using many of the tools that had recently been released in the Workers ecosystem: Cloudflare Pages for hosting, and Bulk Redirects for the redirect rules. In November, we released a new version of our API documentation, which again used Pages for hosting, and Pages Functions for intelligent caching and transformation of our API schema.

In this blog post, I’m excited to show off some of the new tools in Cloudflare’s developer arsenal, D1 and Queues, to prototype and ship an internal tool for our SEO experts at Cloudflare. We’ve made this project, which we’re calling Prospector, open-source too – check it out in our cloudflare/templates repo on GitHub. Whether you’re a developer looking to understand how to use multiple parts of Cloudflare’s developer stack together, or an SEO specialist who may want to deploy the tool in production, we’ve made it incredibly easy to get up and running.

What we’re building

Prospector is a tool that allows Cloudflare’s SEO experts to monitor our blog and marketing site for specific keywords. When a keyword is matched on a page, Prospector will notify an email address. This allows our SEO experts to stay informed of any changes to our website, and take action accordingly.

Using MailChannels’ integration with Workers, we can quickly and easily send emails from our application using a single API call. This allows us to focus on the core functionality of the application, and not worry about the details of sending emails.

Prospector uses Cloudflare Workers as the user-facing API for the application. It uses D1 to store and retrieve data in real-time, and Queues to handle the fetching of all URLs and the notification process. We’ve also included an intuitive user interface for the application, which is built with HTML, CSS, and JavaScript.

Why we built it

It is widely known in SEO that both internal and external links help Google and other search engines understand what a website is about, which impacts keyword rankings. Not only do these links guide readers to additional helpful information, they also allow web crawlers for search engines to discover and index content on the site.

Acquiring external links is often a time-consuming process and at the discretion of third parties, whereas website owners typically have much more control over internal links. As a result, internal linking is one of the most useful levers available in SEO.

In an ideal world, every piece of content would be fully formed upon publication, replete with helpful internal links throughout the piece. However, this is often not the case. Many times, content is edited after the fact or additional pieces of relevant content come along after initial publication. These situations result in missed opportunities for internal linking.

Like other large organizations, Cloudflare has published thousands of blogs and web pages over the years. We share new content every time a product/technology is introduced and improved. Ultimately, that also means it’s become more challenging to identify opportunities for internal linking in a timely, automated fashion. We needed a tool that would allow us to identify internal linking opportunities as they appear, and speed up the time it takes to identify new internal linking opportunities.

Although we tested several tools that might solve this problem, we found that they were limited in several ways. First, some tools only scanned the first 2,000 characters of a web page. Any opportunities found beyond that limit would not be detected. Next, some tools did not allow us to limit searches to certain areas of the site and resulted in many false positives. Finally, other potential solutions required manual operation, leaving the process at the mercy of human memory.

To solve our problem (and ultimately, improve our SEO), we needed an automated tool that could discover and notify us of new instances of targeted phrases on a specified range of pages.

How it works

Data model

First, let’s explore the data model for Prospector. We have two main tables: notifiers and urls. The notifiers table stores the email address and keyword that we want to monitor. The urls table stores the URL and sitemap that we want to scrape. The notifiers table has a one-to-many relationship with the urls table, meaning that each notifier can have many URLs associated with it.

In addition, we have a sitemaps table that stores the sitemap URLs that we’ve scraped. Many larger websites don’t just have a single sitemap: the Cloudflare blog, for instance, has a primary sitemap that contains four sub-sitemaps. When the application is deployed, a primary sitemap is provided as configuration, and Prospector will parse it to find all of the sub-sitemaps.

Finally, notifier_matches is a table that stores the matches between a notifier and a URL. This allows us to keep track of which URLs have already been matched, and which ones still need to be processed. When a match has been found, the notifier_matches table is updated to reflect that, and “matches” for a keyword are no longer processed. This saves our SEO experts from a crowded inbox, and allows them to focus and act on new matches.

Connecting the pieces with Cloudflare Queues

Cloudflare Queues acts as the work queue for Prospector. When a new notifier is added, a new job is created for it and added to the queue. Behind the scenes, Queues will distribute the work across multiple Workers, allowing us to scale the application as needed. When a job is processed, Prospector will scrape the URL and check for matches. If a match is found, Prospector will send an email to the notifier’s email address.

Using the Cron Triggers functionality in Workers, we can schedule the scraping process to run at a regular interval – by default, once a day. This allows us to keep our data up-to-date, and ensures that we’re always notified of any changes to our website. It also allows the end-user to configure when they receive emails in case they want to receive them more or less frequently, or at the beginning of their workday.

The Module Workers syntax for Workers makes accessing the application bindings – the constants available in the application for querying D1, Queues, and other services – incredibly easy. src/index.ts, the entrypoint for the application, looks like this:

import { DBUrl, Env } from './types'

import {
  handleQueuedUrl,
  scheduled,
} from './functions';

import h from './api'

export default {
  async fetch(
    request: Request,
    env: Env,
    ctx: ExecutionContext
  ): Promise<Response> {
    return h.fetch(request, env, ctx)
  },

  async queue(
    batch: MessageBatch<Error>,
    env: Env
  ): Promise<void> {
    for (const message of batch.messages) {
      const url: DBUrl = JSON.parse(message.body)
      await handleQueuedUrl(url, env.DB)
    }
  },

  async scheduled(
    env: Env,
  ): Promise<void> {
    await scheduled({
      authToken: env.AUTH_TOKEN,
      db: env.DB,
      queue: env.QUEUE,
      sitemapUrl: env.SITEMAP_URL,
    })
  }
};

With this syntax, we can see where the various events incoming to the application – the fetch event, the queue event, and the scheduled event – are handled. The fetch event is the main entrypoint for the application, and is where we handle all of the API routes. The queue event is where we handle the work that’s been added to the queue, and the scheduled event is where we handle the scheduled scraping process.

Central to the application, of course, is Workers – acting as the API gateway and coordinator. We’ve elected to use the popular open-source framework Hono, an Express-style API for Workers, in Prospector. With Hono, we can quickly map out a REST API in just a few lines of code. Here’s an example of a few API routes and how they’re defined with Hono:

const app = new Hono()

app.get("/", (context) => {
  return context.html(index)
})

app.post("/notifiers", async context => {
  try {
    const { keyword, email } = await context.req.parseBody()
    await context.env.DB.prepare(
      "insert into notifiers (keyword, email) values (?, ?)"
    ).bind(keyword, email).run()
    return context.redirect('/')
  } catch (err) {
    context.status(500)
    return context.text("Something went wrong")
  }
})

app.get('/sitemaps', async (context) => {
  const query = await context.env.DB.prepare(
    "select * from sitemaps"
  ).all();
  const sitemaps: Array<DBSitemap> = query.results
  return context.json(sitemaps)
})

Crucial to the development of Prospector are the improved TypeScript bindings for Workers. As announced in November of last year, TypeScript bindings for Workers are now automatically generated based on our open source runtime, workerd. This means that whenever we use the types provided from the @cloudflare/workers-types package in our application, we can be sure that the types are always up-to-date.

With these bindings, we can define the types for our environment variables, and use them in our application. Here’s an example of the Env type, which defines the environment variables that we use in the application:

export interface Env {
  AUTH_TOKEN: string
  DB: D1Database
  QUEUE: Queue
  SITEMAP_URL: string
}

Notice the types of the DB and QUEUE bindings – D1Database and Queue, respectively. These types are automatically generated, complete with type signatures for each method inside of the D1 and Queue APIs. This means that we can be sure that we’re using the correct methods, and that we’re passing the correct arguments to them, directly from our text editor – without having to refer to the documentation.

How to use it

One of my favorite things about Workers is that deploying applications is quick and easy. Using `wrangler.toml` and some simple build scripts, we can deploy a fully-functional application in just a few minutes. Prospector is no different. With just a few commands, we can create the necessary D1 database and Queues instance, and deploy the application to our account.

First, you’ll need to clone the repository from our cloudflare/templates repository:

git clone $URL

If you haven’t installed wrangler yet, you can do so by running:

npm install -g wrangler

With Wrangler installed, you can login to your account by running:

wrangler login

After you’ve done that, you’ll need to create a new D1 database, as well as a Queues instance. You can do this by running the following commands:

wrangler d1 create $DATABASE_NAME
wrangler queues create $QUEUE_NAME

Configure your wrangler.toml with the appropriate bindings (see [the README](URL) for an example):

[[d1_databases]]
binding = "DB"
database_name = "keyword-tracker-db"
database_id = "ab4828aa-723b-4a77-a3f2-a2e6a21c4f87"
preview_database_id = "8a77a074-8631-48ca-ba41-a00d0206de32"

[[queues.producers]]
queue = "queue"
binding = "QUEUE"

[[queues.consumers]]
queue = "queue"
max_batch_size = 10
max_batch_timeout = 30
max_retries = 10
dead_letter_queue = "queue-dlq"

Next, you can run the bin/migrate script to create the tables in your database:

bin/migrate

This will create all the needed tables in your database, both in development (locally) and in production. Note that you’ll even see the creation of an honest-to-goodness .sqlite3 file in your project directory – this is the local development database, which you can connect to directly using the same SQLite CLI that you’re used to:

$ sqlite3 .wrangler/state/d1/DB.sqlite3
sqlite> .tables
notifier_matches  notifiers  sitemaps  urls

Finally, you can deploy the application to your account:

npm run deploy

With a deployed application, you can visit your Workers URL to see the user interface. From there, you can add new notifiers and URLs, and see the results of your scraping process. When a new keyword match is found, you’ll receive an email with the details of the match instantly.

Conclusion

For some time, many applications were hard to build on Workers without relational data or background task tooling. Now, with D1 and Queues, we can build applications that seamlessly integrate between real-time user interfaces, geographically distributed data, background processing, and more, all using the same developer ergonomics and low latency that Workers is known for.

D1 has been crucial for building this application. On larger sites, the number of URLs that need to be scraped can be quite large. If we were to use Workers KV, our key-value store, for storing this data, we would quickly struggle with how to model, retrieve, and update the data needed for this use-case. With D1, we can build relational data models and quickly query just the data we need for each queued processing task.

Using these tools, developers can build internal tools and applications for their companies that are more powerful and more scalable than ever before. With the integration of Cloudflare’s Zero Trust suite, developers can make these applications secure by default, and deploy them to Cloudflare’s global network. This allows developers to build applications that are fast, secure, and reliable, all without having to worry about the underlying infrastructure.

Prospector is a great example of how easy it is to build applications on Cloudflare Workers. With the recent addition of D1 and Queues, we’ve been able to build fully-functional applications that require real-time data and background processing in just a few hours. We’re excited to share the open-source code for Prospector, and we’d love to hear your feedback on the project.

If you have any questions, feel free to reach out to us on Twitter at @cloudflaredev, or join us in the Cloudflare Workers Discord community, which recently hit 20k members and is a great place to ask questions and get help from other developers.

Introducing AWS Lambda Powertools for .NET

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/introducing-aws-lambda-powertools-for-net/

This blog post is written by Amir Khairalomoum, Senior Solutions Architect.

Modern applications are built with modular architectural patterns, serverless operational models, and agile developer processes. They allow you to innovate faster, reduce risk, accelerate time to market, and decrease your total cost of ownership (TCO). A microservices architecture comprises many distributed parts that can introduce complexity to application observability. Modern observability must respond to this complexity, the increased frequency of software deployments, and the short-lived nature of AWS Lambda execution environments.

The Serverless Applications Lens for the AWS Well-Architected Framework focuses on how to design, deploy, and architect your serverless application workloads in the AWS Cloud. AWS Lambda Powertools for .NET translates some of the best practices defined in the serverless lens into a suite of utilities. You can use these in your application to apply structured logging, distributed tracing, and monitoring of metrics.

Following the community’s continued adoption of AWS Lambda Powertools for Python, Java, and TypeScript, AWS Lambda Powertools for .NET is now generally available.

This post shows how to use the new open source Powertools library to implement observability best practices with minimal coding. It walks through getting started, with the provided examples available in the Powertools GitHub repository.

About Powertools

Powertools for .NET is a suite of utilities that helps with implementing observability best practices without needing to write additional custom code. It currently supports Lambda functions written in C#, with support for runtime versions .NET 6 and newer. Powertools provides three core utilities:

  • Tracing provides a simpler way to send traces from functions to AWS X-Ray. It provides visibility into function calls, interactions with other AWS services, or external HTTP requests. You can add attributes to traces to allow filtering based on key information. For example, when using the Tracing attribute, it creates a ColdStart annotation. You can easily group and analyze traces to understand the initialization process.
  • Logging provides a custom logger that outputs structured JSON. It allows you to pass in strings or more complex objects, and takes care of serializing the log output. The logger handles common use cases, such as logging the Lambda event payload, and capturing cold start information. This includes appending custom keys to the logger.
  • Metrics simplifies collecting custom metrics from your application, without the need to make synchronous requests to external systems. This functionality allows capturing metrics asynchronously using Amazon CloudWatch Embedded Metric Format (EMF) which reduces latency and cost. This provides convenient functionality for common cases, such as validating metrics against CloudWatch EMF specification and tracking cold starts.

Getting started

The following steps explain how to use Powertools to implement structured logging, add custom metrics, and enable tracing with AWS X-Ray. The example application consists of an Amazon API Gateway endpoint, a Lambda function, and an Amazon DynamoDB table. It uses the AWS Serverless Application Model (AWS SAM) to manage the deployment.

When you send a GET request to the API Gateway endpoint, the Lambda function is invoked. This function calls a location API to find the IP address, stores it in the DynamoDB table, and returns it with a greeting message to the client.

Example application

The AWS Lambda Powertools for .NET utilities are available as NuGet packages. Each core utility has a separate NuGet package. It allows you to add only the packages you need. This helps to make the Lambda package size smaller, which can improve the performance.

To implement each of these core utilities in a separate example, use the Globals sections of the AWS SAM template to configure Powertools environment variables and enable active tracing for all Lambda functions and Amazon API Gateway stages.

Sometimes resources that you declare in an AWS SAM template have common configurations. Instead of duplicating this information in every resource, you can declare them once in the Globals section and let your resources inherit them.

Logging

The following steps explain how to implement structured logging in an application. The logging example shows you how to use the logging feature.

To add the Powertools logging library to your project, install the packages from the NuGet gallery, from the Visual Studio editor, or by using the following .NET CLI command:

dotnet add package AWS.Lambda.Powertools.Logging

Use environment variables in the Globals sections of the AWS SAM template to configure the logging library:

  Globals:
    Function:
      Environment:
        Variables:
          POWERTOOLS_SERVICE_NAME: powertools-dotnet-logging-sample
          POWERTOOLS_LOG_LEVEL: Debug
          POWERTOOLS_LOGGER_CASE: SnakeCase

Decorate the Lambda function handler method with the Logging attribute in the code. This enables the utility and allows you to use the Logger functionality to output structured logs by passing messages as a string. For example:

[Logging]
public async Task<APIGatewayProxyResponse> FunctionHandler
         (APIGatewayProxyRequest apigProxyEvent, ILambdaContext context)
{
  ...
  Logger.LogInformation("Getting ip address from external service");
  var location = await GetCallingIp();
  ...
}

Lambda sends the output to Amazon CloudWatch Logs as a JSON-formatted line.

{
  "cold_start": true,
  "xray_trace_id": "1-621b9125-0a3b544c0244dae940ab3405",
  "function_name": "powertools-dotnet-tracing-sampl-HelloWorldFunction-v0F2GJwy5r1V",
  "function_version": "$LATEST",
  "function_memory_size": 256,
  "function_arn": "arn:aws:lambda:eu-west-2:286043031651:function:powertools-dotnet-tracing-sample-HelloWorldFunction-v0F2GJwy5r1V",
  "function_request_id": "3ad9140b-b156-406e-b314-5ac414fecde1",
  "timestamp": "2022-02-27T14:56:39.2737371Z",
  "level": "Information",
  "service": "powertools-dotnet-sample",
  "name": "AWS.Lambda.Powertools.Logging.Logger",
  "message": "Getting ip address from external service"
}

Another common use case, especially when developing new Lambda functions, is to print a log of the event received by the handler. You can achieve this by enabling LogEvent on the Logging attribute. This is disabled by default to prevent potentially leaking sensitive event data into logs.

[Logging(LogEvent = true)]
public async Task<APIGatewayProxyResponse> FunctionHandler
         (APIGatewayProxyRequest apigProxyEvent, ILambdaContext context)
{
  ...
}

With logs available as structured JSON, you can perform searches on this structured data using CloudWatch Logs Insights. To search for all logs that were output during a Lambda cold start, and display the key fields in the output, run the following query:

filter cold_start = 1
| fields @timestamp, function_name, function_version, xray_trace_id
| sort @timestamp desc
| limit 20

CloudWatch Logs Insights query for cold starts

Tracing

Using the Tracing attribute, you can instruct the library to send traces and metadata from the Lambda function invocation to AWS X-Ray using the AWS X-Ray SDK for .NET. The tracing example shows you how to use the tracing feature.

When your application makes calls to AWS services, the SDK tracks downstream calls in subsegments. AWS services that support tracing, and resources that you access within those services, appear as downstream nodes on the service map in the X-Ray console.

You can instrument all of your AWS SDK for .NET clients by calling RegisterXRayForAllServices before you create them.

public class Function
{
  private static IDynamoDBContext _dynamoDbContext;
  public Function()
  {
    AWSSDKHandler.RegisterXRayForAllServices();
    ...
  }
  ...
}

To add the Powertools tracing library to your project, install the packages from the NuGet gallery, from the Visual Studio editor, or by using the following .NET CLI command:

dotnet add package AWS.Lambda.Powertools.Tracing

Use environment variables in the Globals sections of the AWS SAM template to configure the tracing library.

  Globals:
    Function:
      Tracing: Active
      Environment:
        Variables:
          POWERTOOLS_SERVICE_NAME: powertools-dotnet-tracing-sample
          POWERTOOLS_TRACER_CAPTURE_RESPONSE: true
          POWERTOOLS_TRACER_CAPTURE_ERROR: true

Decorate the Lambda function handler method with the Tracing attribute to enable the utility. To provide more granular details for your traces, you can use the same attribute to capture the invocation of other functions outside of the handler. For example:

[Tracing]
public async Task<APIGatewayProxyResponse> FunctionHandler
         (APIGatewayProxyRequest apigProxyEvent, ILambdaContext context)
{
  ...
  var location = await GetCallingIp().ConfigureAwait(false);
  ...
}

[Tracing(SegmentName = "Location service")]
private static async Task<string?> GetCallingIp()
{
  ...
}

Once traffic is flowing, you see a generated service map in the AWS X-Ray console. Decorating the Lambda function handler method, or any other method in the chain with the Tracing attribute, provides an overview of all the traffic flowing through the application.

AWS X-Ray trace service view

You can also view the individual traces that are generated, along with a waterfall view of the segments and subsegments that comprise your trace. This data can help you pinpoint the root cause of slow operations or errors within your application.

AWS X-Ray waterfall trace view

You can also filter traces by annotation and create custom service maps with AWS X-Ray Trace groups. In this example, use the filter expression annotation.ColdStart = true to filter traces based on the ColdStart annotation. The Tracing attribute adds these automatically when used within the handler method.

View trace attributes

Metrics

CloudWatch offers a number of included metrics to help answer general questions about the application’s throughput, error rate, and resource utilization. However, to understand the behavior of the application better, you should also add custom metrics relevant to your workload.

The metrics utility creates custom metrics asynchronously by logging metrics to standard output using the Amazon CloudWatch Embedded Metric Format (EMF).

In the sample application, you want to understand how often your service is calling the location API to identify the IP addresses. The metrics example shows you how to use the metrics feature.

To add the Powertools metrics library to your project, install the packages from the NuGet gallery, from the Visual Studio editor, or by using the following .NET CLI command:

dotnet add package AWS.Lambda.Powertools.Metrics

Use environment variables in the Globals sections of the AWS SAM template to configure the metrics library:

  Globals:
    Function:
      Environment:
        Variables:
          POWERTOOLS_SERVICE_NAME: powertools-dotnet-metrics-sample
          POWERTOOLS_METRICS_NAMESPACE: AWSLambdaPowertools

To create custom metrics, decorate the Lambda function with the Metrics attribute. This ensures that all metrics are properly serialized and flushed to logs when the function finishes its invocation.

You can then emit custom metrics by calling AddMetric or push a single metric with a custom namespace, service and dimensions by calling PushSingleMetric. You can also enable the CaptureColdStart on the attribute to automatically create a cold start metric.

[Metrics(CaptureColdStart = true)]
public async Task<APIGatewayProxyResponse> FunctionHandler
         (APIGatewayProxyRequest apigProxyEvent, ILambdaContext context)
{
  ...
  // Add Metric to capture the amount of time
  Metrics.PushSingleMetric(
        metricName: "CallingIP",
        value: 1,
        unit: MetricUnit.Count,
        service: "lambda-powertools-metrics-example",
        defaultDimensions: new Dictionary<string, string>
        {
            { "Metric Type", "Single" }
        });
  ...
}

Conclusion

CloudWatch and AWS X-Ray offer functionality that provides comprehensive observability for your applications. AWS Lambda Powertools for .NET is now generally available. The library helps implement observability when running Lambda functions based on .NET 6 while reducing the amount of custom code.

It simplifies implementing the observability best practices defined in the Serverless Applications Lens for the AWS Well-Architected Framework for a serverless application and allows you to focus more time on the business logic.

You can find the full documentation and the source code for Powertools in GitHub. We welcome contributions via pull request, and encourage you to create an issue if you have any feedback for the project. Happy building with AWS Lambda Powertools for .NET.

For more serverless learning resources, visit Serverless Land.

Uploading large objects to Amazon S3 using multipart upload and transfer acceleration

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/uploading-large-objects-to-amazon-s3-using-multipart-upload-and-transfer-acceleration/

This post is written by Tam Baghdassarian, Cloud Application Architect, Rama Krishna Ramaseshu, Sr Cloud App Architect, and Anand Komandooru, Sr Cloud App Architect.

Web and mobile applications must often upload objects to the AWS Cloud. For example, services such as Amazon Photos allow users to upload photos and large video files from their browser or mobile application.

Amazon S3 can be ideal to store large objects due to its 5-TB object size maximum along with its support for reducing upload times via multipart uploads and transfer acceleration.

Overview

Developers who must upload large files from their web and mobile applications to the cloud can face a number of challenges. Due to underlying TCP throughput limits, a single HTTP connection cannot use the full bandwidth available, resulting in slower upload times. Furthermore, network latency and quality can result in poor or inconsistent user experience.

Using S3 features such as presigned URLs and multipart upload, developers can securely increase throughput and minimize upload retries due to network errors. Additionally, developers can use transfer acceleration to reduce network latency and provide a consistent user experience to their web and mobile app users across the globe.

This post references a sample application consisting of a web frontend and a serverless backend application. It demonstrates the benefits of using S3’s multipart upload and transfer acceleration features.

Architecture overview

Solution overview:

  1. Web or mobile application (frontend) communicates with AWS Cloud (backend) through Amazon API Gateway to initiate and complete a multipart upload.
  2. AWS Lambda functions invoke S3 API calls on behalf of the web or mobile application.
  3. Web or mobile application uploads large objects to S3 using S3 transfer acceleration and presigned URLs.
  4. File uploads are received and acknowledged by the closest edge location to reduce latency.

Using S3 multipart upload to upload large objects

A multipart upload allows an application to upload a large object as a set of smaller parts uploaded in parallel. Upon completion, S3 combines the smaller pieces into the original larger object.

Breaking a large object upload into smaller pieces has a number of advantages. It can improve throughput by uploading a number of parts in parallel. It can also recover from a network error more quickly by only restarting the upload for the failed parts.

Multipart upload consists of:

  1. Initiate the multipart upload and obtain an upload id via the CreateMultipartUpload API call.
  2. Divide the large object into multiple parts, get a presigned URL for each part, and upload the parts of a large object in parallel via the UploadPart API call.
  3. Complete the upload by calling the CompleteMultipartUpload API call.

When used with presigned URLs, multipart upload allows an application to upload the large object using a secure, time-limited method without sharing private bucket credentials.

This Lambda function can initiate a multipart upload on behalf of a web or mobile application:

const multipartUpload = await s3.createMultipartUpload(multipartParams).promise()
return {
    statusCode: 200,
    body: JSON.stringify({
        fileId: multipartUpload.UploadId,
        fileKey: multipartUpload.Key,
      }),
    headers: {
      'Access-Control-Allow-Origin': '*'
    }
};

The UploadId is required for subsequent calls to upload each part and complete the upload.
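
Once every part has been uploaded, the backend completes the upload by calling the CompleteMultipartUpload API with the ETag returned for each part. The sample application does this with the AWS SDK for JavaScript; the following is only an equivalent sketch in Python with boto3, with the parameter names as placeholders:

import boto3

s3 = boto3.client('s3')

def complete_upload(bucket, key, upload_id, parts):
    # parts is a list of {'PartNumber': n, 'ETag': etag} dicts collected
    # from the response of each UploadPart call
    return s3.complete_multipart_upload(
        Bucket=bucket,
        Key=key,
        UploadId=upload_id,
        MultipartUpload={'Parts': parts}
    )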

Uploading objects securely using S3 presigned URLs

A web or mobile application requires write permission to upload objects to an S3 bucket. This is usually accomplished by granting access to the bucket and storing credentials within the application.

You can use presigned URLs to access S3 buckets securely without the need to share or store credentials in the calling application. In addition, presigned URLs are time-limited (the default is 15 minutes) to apply security best practices.

A web application calls an API resource that uses the S3 API calls to generate a time-limited presigned URL. The web application then uses the URL to upload an object to S3 within the allotted time, without having explicit write access to the S3 bucket. Once the presigned URL expires, it can no longer be used.

When combined with multipart upload, a presigned URL can be generated for each of the upload parts, allowing the web or mobile application to upload large objects.

This example demonstrates generating a set of presigned URLs, one for each of the parts to be uploaded:

    const multipartParams = {
        Bucket: bucket_name,
        Key: fileKey,
        UploadId: fileId,
    }
    const promises = []

    for (let index = 0; index < parts; index++) {
        promises.push(
            s3.getSignedUrlPromise("uploadPart", {
            ...multipartParams,
            PartNumber: index + 1,
            Expires: parseInt(url_expiration)
            }),
        )
    }
    const signedUrls = await Promise.all(promises)

Prior to calling getSignedUrlPromise, the client must obtain an UploadId via CreateMultipartUpload. Read Generating a presigned URL to share an object for more information.

Reducing latency by using transfer acceleration

By using S3 transfer acceleration, the application can take advantage of the globally distributed edge locations in Amazon CloudFront. When combined with multipart uploads, each part can be uploaded automatically to the edge location closest to the user, reducing the upload time.

Transfer acceleration must be enabled on the S3 bucket. It can be accessed using the endpoint bucketname.s3-accelerate.amazonaws.com or bucketname.s3-accelerate.dualstack.amazonaws.com to connect to the enabled bucket over IPv6.

Use the speed comparison tool to test the benefits of the transfer acceleration from your location.

You can use transfer acceleration with multipart uploads and presigned URLs to allow a web or mobile application to upload large objects securely and efficiently.

Transfer acceleration must be enabled on the S3 bucket. This example creates an S3 bucket with transfer acceleration using CDK and TypeScript:

const s3Bucket = new s3.Bucket(this, "document-upload-bucket", {
      bucketName: "BUCKET-NAME",
      encryption: BucketEncryption.S3_MANAGED,
      enforceSSL: true,
      transferAcceleration: true,      
      removalPolicy: cdk.RemovalPolicy.DESTROY
    });

After activating transfer acceleration on the S3 bucket, the backend application can generate transfer acceleration-enabled presigned URLs by initializing the S3 SDK:

s3 = new AWS.S3({useAccelerateEndpoint: true});
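
If your backend happens to use Python rather than the AWS SDK for JavaScript, the equivalent setting (shown here purely as an illustrative sketch) is enabled on the client configuration:

import boto3
from botocore.config import Config

# Presigned URLs generated by this client use the
# bucketname.s3-accelerate.amazonaws.com endpoint
s3 = boto3.client('s3', config=Config(s3={'use_accelerate_endpoint': True}))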

The web or mobile application then uses the presigned URLs to upload the file parts.

See S3 transfer acceleration for more information.

Deploying the test solution

To set up and run the tests outlined in this blog, you need:

  • An AWS account.
  • Install and configure AWS CLI.
  • Install and bootstrap AWS CDK.
  • Deploy the backend and frontend solution at the following git repository.
  • A sufficiently large test upload file of at least 100 MB.

To deploy the backend:

  1. Clone the repository to your local machine.
  2. From the backendv2 folder, install all dependencies by running:
    npm install
  3. Use CDK to deploy the backend to AWS:
    cdk deploy --context env="randnumber" --context whitelistip="xx.xx.xxx.xxx"

You can use an additional context variable called “urlExpiry” to set a specific expiration time on the S3 presigned URL. The default value is set at 300 seconds. A new S3 bucket with the name “document-upload-bucket-randnumber” is created for storing the uploaded objects, and the whitelistip value allows API Gateway access from this IP address only.

Note the API Gateway endpoint URL for later.

To deploy the frontend:

  1. From the frontend folder, install the dependencies:
    npm install
  2. To launch the frontend application from the browser, run:
    npm run start

Testing the application

To test the application:

  1. Launch the user interface from the frontend folder:
    npm run start
  2. Enter the API Gateway address in the API URL textbox. Select the maximum size of each part of the upload (the minimum is 5 MB) and the number of parallel uploads. Use your available bandwidth, TCP window size, and retry time requirements to determine the optimal part size. Web browsers have a limit on the number of concurrent connections to the same server, so specifying a larger number of concurrent connections can result in blocking on the web browser side.
  3. Decide if transfer acceleration should be used to further reduce latency.
  4. Choose a test upload file.
  5. Use the Monitor section to observe the total time to upload the test file.

Experiment with different values for part size, number of parallel uploads, use of transfer acceleration and the size of the test file to see the effects on total upload time. You can also use the developer tools for your browser to gain more insights.

Test results

The following tests have the following characteristics:

  • The S3 bucket is located in the US East Region.
  • The client’s average upload speed is 79 megabits per second.
  • The Firefox browser uploaded a file of 485 MB.

Test 1 – Single part upload without transfer acceleration

To create a baseline, the test file is uploaded without transfer acceleration and using only a single part. This simulates a large file upload without the benefits of multipart upload. The baseline result is 72 seconds.

Single part upload without transfer acceleration

Test 2 – Single upload with transfer acceleration

The next test measured upload time using transfer acceleration while still maintaining a single upload part with no multipart upload benefits. The result is 43 seconds (40% faster).

Single upload with transfer acceleration

Test 3 – Multipart upload without transfer acceleration

This test uses multipart upload by splitting the test file into 5-MB parts with a maximum of six parallel uploads. Transfer acceleration is disabled. The result is 45 seconds (38% faster).

Multipart upload without transfer acceleration

Test 4 – Multipart upload with transfer acceleration

For this test, the test file is uploaded by splitting the file into 5-MB parts with a maximum of six parallel uploads. Transfer acceleration is enabled for each upload. The result is 28 seconds (61% faster).

Multipart upload with transfer acceleration

The following chart summarizes the test results.

Multipart upload | Transfer acceleration | Upload time
No               | No                    | 72s
Yes              | No                    | 43s
No               | Yes                   | 45s
Yes              | Yes                   | 28s

Conclusion

This blog shows how web and mobile applications can upload large objects to Amazon S3 in a secure and efficient manner when using presigned URLs and multipart upload.

Developers can also use transfer acceleration to reduce latency and speed up object uploads. When combined with multipart upload, you can see upload time reduced by up to 61%.

Use the reference implementation to start incorporating multipart upload and S3 transfer acceleration in your web and mobile applications.

For more serverless learning resources, visit Serverless Land.

Developing portable AWS Lambda functions

Post Syndicated from Pascal Vogel original https://aws.amazon.com/blogs/compute/developing-portable-aws-lambda-functions/

This blog post is written by Uri Segev, Principal Serverless Specialist Solutions Architect

When developing new applications or modernizing existing ones, you might face a dilemma: which compute technology to use? A serverless compute service such as AWS Lambda or maybe containers? Often, serverless can be the better approach thanks to automatic scaling, built-in high availability, and a pay-for-use billing model. However, you may hesitate to choose serverless for reasons such as:

  • Perceived higher cost or difficulty in estimating cost
  • It is a paradigm shift, which requires learning to bridge the knowledge gap
  • Misconceptions about Lambda capabilities and use cases
  • Concern that using Lambda will result in lock-in
  • Existing investments in non-serverless platforms and tooling

This blog post suggests best practices for developing portable Lambda functions that allow you to easily port your code to containers if you later choose to. By doing so, you can avoid lock-in and try out the serverless approach in a risk-free way.

Each section of this blog post describes what you need to consider when writing portable code and the steps needed to migrate this code from Lambda to containers, if you later choose to do so.

Best practices for portable Lambda functions

Separate business logic and Lambda handler

Lambda functions are event-driven in nature. When a specific event happens, it invokes the Lambda function by calling its handler method. The handler method receives an event object which contains information regarding the reason for the function invocation. Once the function execution completes, it returns from the handler method. Whatever is returned from the handler is the function’s return value.

To write portable code, we recommend using the handler method only as an interface between the Lambda runtime (event object) and the business logic. Using Hexagonal architecture terminology, the handler should be a driving adapter making calls into the port, which is the interface exposed by the business logic. The handler should extract all required information from the event object and then call a separate method that implements the business logic.

When that method returns, the handler constructs the result in the format expected by the function invoker and returns it. We also recommend splitting the handler code and the business logic code into separate files. Should you choose to migrate to containers later, you simply migrate your business logic code files with no additional changes.

The following pseudocode shows a Lambda handler that extracts information from the event object and calls the business logic. Once the business logic is done, the handler places the response in the function’s return value:

import business_logic

# The Lambda handler extracts needed information from the event
# object and invokes the business logic
handler(event, context) {
  # Extract needed information from event object
  payload = event['payload']

  # Invoke business logic
  result = do_some_logic(payload)
  
  # Construct result for API Gateway
  return {
    statusCode: 200,
    body: result
  }
}

The following pseudocode shows the business logic. It’s located in a separate file and is unaware that it is being invoked from a Lambda function. It is pure logic.

# This is the business logic. It knows nothing about who invokes it.
do_some_logic(data) {
  result = "This is my result."
  return result
}

This approach also makes it easier to run unit tests on the business logic without the need to construct event objects and to invoke the Lambda handler.
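
For example, a unit test can call the business logic directly with plain data, with no Lambda event object or handler involved. This is a minimal sketch that assumes the business logic lives in a business_logic module and a pytest-style test runner:

from business_logic import do_some_logic

def test_do_some_logic_returns_result():
    # No event object or Lambda context is needed to exercise the logic
    result = do_some_logic("some payload")
    assert result == "This is my result."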

If you migrate to containers later, you include the business logic files in your container with new interface code as described in the following section.

Event source integration

One benefit of Lambda functions is the event source integration. For instance, if you integrate Lambda with Amazon Simple Queue Service (Amazon SQS), the Lambda service will take care of polling the queue, invoking the Lambda function and deleting the messages from the queue when done. By using this integration, you need to write less boilerplate code. You can focus only on implementing business logic and not the integration with the event source.

The following pseudocode shows how the Lambda handler looks like for an SQS event source:

import business_logic

handler(event, context) {
  entries = []
  # Iterate over all the messages in the event object
  for message in event['Records'] {
    # Call the business logic to process a single message
    success = handle_message(message)

    # Start building the response
    if Not success {
      entries.append({
      'itemIdentifier': message['messageId']
      })
    }
  }

  # Notify Lambda about failed items.
  if (len(entries) > 0) {
    return {
      'batchItemFailures': entries
    }
  }
}

As you can see in the previous code, the Lambda function has almost no knowledge that it is being invoked from SQS. There are no SQS API calls. It only knows the structure of the event object, which is specific to SQS.

When moving to a container, the integration responsibility moves from the Lambda service to you, the developer. There are different event sources in AWS, and each of them will require a different approach for consuming events and invoking business logic. For example, if the event source is Amazon API Gateway, your application will need to create an HTTP server that listens on an HTTP port and waits for incoming requests in order to invoke the business logic.

If the event source is Amazon Kinesis Data Streams, your application will need to run a poller that reads records from the shards, keep track of processed records, handle the case of a change in the number of shards in the stream, retry on errors, and more. Regardless of the event source, if you follow the previous recommendations, you will not need to change anything in the business logic code.
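
To illustrate the HTTP case, the interface code in a container can be as small as a standard-library HTTP server that parses each request and hands the payload to the unchanged business logic. The following is only a sketch of the idea, not code from the post:

import json
from http.server import BaseHTTPRequestHandler, HTTPServer

import business_logic

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the request body and extract the payload
        length = int(self.headers.get('Content-Length', 0))
        payload = json.loads(self.rfile.read(length))['payload']

        # Invoke the unchanged business logic
        result = business_logic.do_some_logic(payload)

        # Return the result as the HTTP response
        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.end_headers()
        self.wfile.write(json.dumps({'body': result}).encode())

if __name__ == '__main__':
    HTTPServer(('', 8080), Handler).serve_forever()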

The following pseudocode shows what the integration with SQS looks like in a container. Note that you will lose some features such as batching, filtering, and, of course, automatic scaling.

import aws_sdk
import business_logic

QUEUE_URL = os.environ['QUEUE_URL']
BATCH_SIZE = os.environ.get('BATCH_SIZE', 1)
sqs_client = aws_sdk.client('sqs')

main() {
  # Infinite loop to poll for messages from SQS
  while True {

    # Receive a batch of messages from the queue
    response = sqs_client.receive_message(
      QueueUrl = QUEUE_URL,
      MaxNumberOfMessages = BATCH_SIZE,
      WaitTimeSeconds = 20 )

    # Loop over the messages in the batch
    entries = []
    i = 1
    for message in response.get('Messages',[]) {
      # Process a single message
      success = handle_message(message)

      # Append the message handle to an array that is later
      # used to delete processed messages
      if success {
        entries.append(
          {
            'Id': f'index{i}',
            'ReceiptHandle': message['ReceiptHandle']
          }
        )
        i += 1
      }
    }

    # Delete all the processed messages
    if (len(entries) > 0) {
      sqs_client.delete_message_batch(
        QueueUrl = QUEUE_URL,
        Entries = entries
      )
    }
  }
}

Another point to consider here is Lambda destinations. If your function is invoked asynchronously and you configured a destination for your function, you will need to include that in the interface code. It will need to catch any business logic error and, based on that, invoke the right destination.
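
For example, if the function’s on-success and on-failure destinations are SQS queues, the container’s interface code could replicate them roughly as follows. This is an illustrative sketch only; the queue URLs and environment variable names are assumptions:

import json
import os

import boto3
import business_logic

sqs_client = boto3.client('sqs')
ON_SUCCESS_QUEUE_URL = os.environ['ON_SUCCESS_QUEUE_URL']
ON_FAILURE_QUEUE_URL = os.environ['ON_FAILURE_QUEUE_URL']

def process(payload):
    # Catch business logic errors and route the outcome to the
    # destination that Lambda would otherwise have invoked for you
    try:
        result = business_logic.do_some_logic(payload)
        sqs_client.send_message(
            QueueUrl=ON_SUCCESS_QUEUE_URL,
            MessageBody=json.dumps({'result': result})
        )
    except Exception as error:
        sqs_client.send_message(
            QueueUrl=ON_FAILURE_QUEUE_URL,
            MessageBody=json.dumps({'error': str(error)})
        )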

Package functions as containers

Lambda supports packaging functions as .zip files and container images. To develop portable code, we recommend using container images as your default packaging method. Even though you package the function as a container image, you can’t run it on other container platforms such as Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (EKS). However, by packaging it this way, the migration to containers later will be easier as you are already using the same tools and you already created a Dockerfile that will require minimal changes.

An example Dockerfile for Lambda looks like this:

FROM public.ecr.aws/lambda/python:3.9
COPY *.py requirements.txt ./
RUN python3.9 -m pip install -r requirements.txt -t .
CMD ["app.lambda_handler"]

If you move to containers later, you will need to change the Dockerfile to use a different base image and adapt the CMD line that defines how to start the application. This is in addition to the code changes described in the previous section.

The corresponding Dockerfile for the container will look like this:

FROM python:3.9
COPY *.py requirements.txt ./
RUN python3.9 -m pip install -r requirements.txt -t .
CMD ["python", "./app.py"]

The deployment pipeline also needs to change as we deploy to a different target. However, building the artifacts remains the same.

Single invocation per instance

Lambda functions run in their own isolated runtime environment. Each environment handles a single request at a time which works great for Lambda. However, if you migrate your application to containers, you will likely invoke the business logic from multiple threads in a single process at the same time.

This section discusses aspects of moving from a single invocation to multiple concurrent invocations within the same process.

Static variables

Static variables are those that are instantiated once and then reused across multiple invocations. Examples of such variables are database connections or configuration information.

For function optimization, and specifically for reducing cold starts and the duration of warm function invocations, we recommend initializing all static variables outside the function handler and storing them in global variables so that further invocations will reuse them.

We recommend using an initialization function that you write as part of the business logic module and that you invoke from outside the handler. This function saves information in global variables that the business logic code reuses across invocations.

The following pseudocode shows the Lambda function:

import business_logic

# Call the initialization code
initialize()

handler(event, context) {
  ...
  # Call the business logic
  ...
}

And the business logic code will look like this:

# Global variables used to store static data
var config

initialize() {
  config = read_Config()
}

do_some_logic(data) {
  # Do something with config object
  ...
}

The same also applies to containers. You will usually initialize static variables when the process starts and not for every single request. When moving to containers, all you need to do is call the initialization function before starting the main application loop.

import business_logic

# Call the initialization code once when the process starts
business_logic.initialize()

def main():
    while True:
        # Receive the next request from the event source (queue, HTTP server, and so on)
        ...
        # Call the business logic
        ...

if __name__ == "__main__":
    main()

As you can see, there are no changes in the business logic code.

Database connections

Because Lambda functions share nothing between runtime environments, they can’t rely on connection pools the way containers can when connecting to a relational database. For this reason, we created Amazon RDS Proxy, which acts as a centralized connection pool used by many functions.

To write portable Lambda functions, we recommend using a connection pool object with a single connection. Your business logic code will always ask for a connection from the pool when making a database request. You will still need to use RDS Proxy.

If you later move to containers, you can increase the number of connections in the pool to a larger number with no further changes and the application will scale without overwhelming the database.
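
A minimal sketch of this pattern, assuming SQLAlchemy in front of an RDS Proxy endpoint; the environment variable names are illustrative:

# On Lambda, keep the pool at a single connection; in a container, raise POOL_SIZE
import os
from sqlalchemy import create_engine, text

engine = create_engine(
    os.environ["DATABASE_URL"],                      # for example, the RDS Proxy endpoint URL
    pool_size=int(os.environ.get("POOL_SIZE", "1")),
    max_overflow=0,
)

def do_some_logic(data):
    # The business logic always borrows a connection from the pool
    with engine.connect() as conn:
        return conn.execute(text("SELECT 1")).scalar()

Moving to containers then becomes a configuration change (a larger POOL_SIZE) rather than a code change.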

File system

Lambda functions come with a writable /tmp folder that can be configured from 512 MB up to 10 GB. Because each function instance runs in an isolated runtime environment, developers usually use fixed file names for files stored in that folder. If you run the same business logic code in a container with multiple threads, the different threads will overwrite the files created by others.

We recommend using unique file names in each invocation. Append a UUID or another random value to the file name, and delete the files once you are done with them to avoid running out of space.
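
A minimal sketch of this pattern, using only the Python standard library:

import os
import uuid

def process(data: bytes) -> None:
    # A unique name per invocation avoids collisions between concurrent threads
    path = os.path.join("/tmp", f"payload-{uuid.uuid4()}.bin")
    try:
        with open(path, "wb") as f:
            f.write(data)
        ...  # work with the file
    finally:
        # Delete the file when done to avoid running out of /tmp space
        if os.path.exists(path):
            os.remove(path)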

If you move your code to containers later, there is nothing to do.

Portable web applications

If you develop a web application, there is another way to achieve portability. You can use the AWS Lambda Web Adapter project to host a web app inside a Lambda function. This way you can develop a web application with familiar frameworks (e.g., Express.js, Next.js, Flask, Spring Boot, Laravel, or anything that uses HTTP 1.1/1.0), and run it on Lambda. If you package your web application as a container, the same Docker image can run on Lambda (using the web adapter) and containers.

Porting from containers to Lambda

This blog post demonstrates how to develop portable Lambda functions you can easily port to containers. Taking these recommendations into consideration can also help develop portable code in general, which allows you to port containers to Lambda functions.

Some things to consider:

  • Separate the business logic from the interface code in the container. The interface code should interact with the event sources and invoke the business logic.
  • As Lambda functions only have a /tmp writable folder, replicate this in your containers (even though you could write to different locations).

Conclusion

This blog post suggests best practices for developing Lambda functions that allow you to gain the benefits of a serverless approach without risking lock-in.

By following these best practices for separating business logic from Lambda handlers, packaging functions as containers, handling Lambda’s single invocation per instance, and more, you can develop portable Lambda functions. As a consequence, you will be able to port your code from Lambda to containers with minimal effort if you choose to move to containers later.

Refer to these best practices and code samples to ease the adoption of a serverless approach when developing your next application.

For more serverless learning resources, visit Serverless Land.

Let’s Architect! Architecture tools

Post Syndicated from Luca Mezzalira original https://aws.amazon.com/blogs/architecture/lets-architect-architecture-tools/

Tools, such as diagramming software, low-code applications, and frameworks, make it possible to experiment quickly. They are essential in today’s fast-paced and technology-driven world. From improving efficiency and accuracy, to enhancing collaboration and creativity, a well-defined set of tools can make a significant impact on the quality and success of a project in the area of software architecture.

As an architect, you can take advantage of a wide range of resources to help you build solutions that meet the needs of your organization. For example, with tools such as the Amazon Web Services (AWS) Solutions Library and Serverless Land, you can boost your knowledge and productivity while working on event-driven architectures, microservices, and stateless computing.

In this Let’s Architect! edition, we explore how to incorporate these patterns into your architecture, and which tools to leverage to build solutions that are scalable, secure, and cost-effective.

How AWS Application Composer helps your team build great apps

In this re:Invent 2022 session, Chase Douglas, Principal Engineer at AWS, speaks about AWS Application Composer, a newly launched service.

This service has the potential to change the way architects design solutions—without writing a single line of code! The service is user-friendly, intuitive, and requires no prior coding experience. It allows users to scaffold a serverless architecture, defining a CloudFormation template visually with drag-and-drop. A detailed AWS Compute Blog post takes readers through the process of using AWS Application Composer.

Take me to this re:Invent 2022 video!

How an architecture can be designed with AWS Application Composer

AWS design + build tools

When migrating to the cloud, we suggest referencing these four tried-and-true AWS resources that can be used to design and build projects.

  1. AWS Workshops are created by AWS teams to provide opportunities for hands-on learning to develop practical skills. Workshops are available in multiple categories and for skill levels 100-400.
  2. AWS Architecture Center contains a collection of best practices and architectural patterns for designing and deploying cloud-based solutions using AWS services. Furthermore, it includes detailed architecture diagrams, whitepapers, case studies, and other resources that provide a wealth of information on how to design and implement cloud solutions.
  3. Serverless Land (an Amazon property) brings together various patterns, workflows, code snippets, and blog posts pertaining to AWS serverless architectures.
  4. AWS Solutions Library provides customers with templates, tools, and automated workflows to easily deploy, operate, and manage common use cases on the AWS Cloud.

Inside event-driven architectures designed by David Boyne on Serverless Land

The Well-Architected way

In this session, the AWS Well-Architected team provides guidance on how to implement the architectural models described in the AWS Well-Architected Framework within your organization at scale.

Discover a customer story and understand how to use the features of the AWS Well-Architected Tool and APIs to receive recommendations based on your workload and measure your architectural metrics. In the Framework whitepaper, you can explore the six pillars of Well-Architected (operational excellence, security, reliability, performance efficiency, cost optimization, and sustainability) and best practices to achieve them.

Understanding the key design pillars can help architects make informed design decisions, leading to more robust and efficient solutions. This knowledge also enables architects to identify potential problems early on in the design process and find appropriate patterns to address those issues.

Take me to the Well-Architected video!

Discover how the AWS Well-Architected Framework can help you design scalable, maintainable, and reusable solutions

See you next time!

Thanks for exploring architecture tools and resources with us!

Join us next time when we’ll talk about data mesh architecture!

To find all the posts from this series, check out the Let’s Architect! page of the AWS Architecture Blog.

Detecting solar panel damage with Amazon Rekognition Custom Labels

Post Syndicated from Ramakant Joshi original https://aws.amazon.com/blogs/architecture/detecting-solar-panel-damage-with-amazon-rekognition-custom-labels/

Enterprises perform quality control to ensure products meet production standards and avoid potential brand reputation damage. As the cost of sensors decreases and connectivity increases, industries adopt real-time imagery analysis to detect quality issues.

At the same time, artificial intelligence (AI) advancements enable advanced automation, reduce overall cost and project time, and produce accurate defect detection results in manufacturing plants. As these technologies mature, AI-driven inspections are more common outside of the plant environment.

Overview of solution

This post describes our SOLVED (Solar Roving Eye Detector) project leveraging machine learning (ML) to identify damaged solar panels using Amazon Rekognition Custom Labels and alert operators to take corrective action.

As solar adoption increases, so does the need to detect panel damage. Applying AWS-managed AI services is a simpler, more cost-effective approach than human solar panel inspection or custom-built production applications.

Customers can capture and process videos from the field and build effective computer vision models without creating a dedicated data science team. This approach can be generalized for use cases across industries to detect defects in wind turbines, cell phone towers, automotive parts, and other field components.

Amazon Rekognition Custom Labels builds off of existing service capabilities already trained to identify the objects and scenes in millions of cross-category images. You upload a small set of training images—typically a few hundred or less—into our console. The solution automatically loads and inspects the training data, selects the right ML algorithms, trains a model, and provides model performance metrics. You can then integrate your custom model into your applications through the Amazon Rekognition Custom Labels API.

Walkthrough

This post introduces the SOLVED project featured at the re:Invent 2021 Builders Fair. It will:

  • Review the need for solar panel damage detection
  • Discuss a cloud-based approach to ingest, store, process, analyze, and detect damaged solar panels
  • Present a diagram streaming videos from a Raspberry Pi, storing them on Amazon Simple Storage Service (Amazon S3), processing them using an AWS video-on-demand solution, and inferring damage using Amazon Rekognition
  • Introduce a console to mimic an operation center for appropriate action
  • Demonstrate the integration of AWS IoT Core with a Philips Hue bulb for operator alerts

Prerequisites

Before getting started, review the following prerequisites for this solution:

The SOLVED project

The SOLVED project leverages ML to identify damaged solar panels using Amazon Rekognition Custom Labels. It involves four steps:

  1. Data ingestion: Live solar panel video ingested from moving rover into an Amazon S3 bucket
  2. Pre-processing: Captured video split into thumbnail images
  3. Processing and visualization: ML models making real-time inferences to identify defective panels with a dashboard to review images and prediction scores
  4. Alerting: Defective panels result in notification sent through MQTT messages to light a smart bulb

Figure 1 shows the SOLVED project system architecture.

Figure 1. The SOLVED project system architecture

Installation steps

Let’s review each of the steps in this use case.

Data ingestion

The data ingestion layer of the SOLVED project consists of a continuous video stream captured as a rover moves through a field of solar panels.

We used a Freenove 4WD Smart Car rover with Raspberry Pi. The mounted camera captures video as it moves through the field. We installed an Amazon Kinesis Video Streams Producer on the Pi and streamed the live video to a Kinesis Video Stream named reinventbuilder2021.

Figure 2 shows the Kinesis Video Stream setup window for reinventbuilder2021.

Figure 2. Kinesis Video Stream setup for reinventbuilder2021

To start streaming, use the following steps.

  1. Create a new Kinesis Video Stream using this Amazon Kinesis Video Streams Developer Guide
  2. Make a note of the Amazon Resource Name (ARN)
  3. On the Pi, access the command prompt and use aws sts get-session-token for temporary credentials. The IAM user should have the permissions for Kinesis Video Streams PutMedia.
  4. Set the following environment variables:
    export AWS_DEFAULT_REGION="us-east-1"
    export AWS_ACCESS_KEY_ID="xxxxx"
    export AWS_SECRET_ACCESS_KEY="yyyyy"
    export AWS_SESSION_TOKEN="zzzzz"
  5. Start the streamer using the following command:
    cd ~/amazon-kinesis-video-streams-producer-sdk-cpp/build
    ./kvs_gstreamer_sample reinventbuilder2021
  6. Validate the captured stream by viewing the Media playback on the console.

Figure 3 shows the video stream console, including the Media playback option.

Figure 3. Video stream console with Media playback option

There are two ways to clip video snippets, which we’ll do next.

You can use the Download clip button on the video stream console as shown in Figure 4.

Figure 4. Choose your video streaming clip duration

Alternately, you can use a script from the following command line:

# Clip window: the last 30 seconds of the stream
# (date -v is BSD/macOS syntax; on Linux use: date -u -d "30 seconds ago" "+%FT%T+0000")
CLIP_START=$(date -v -30S -u "+%FT%T+0000")
NOW=$(date -u "+%FT%T+0000")

FILE_NAME=reinventbuilder-solved-$RANDOM.mp4
echo $FILE_NAME
S3_PATH=s3://videoondemandsplitter-source-e6lyof9qjv1j/

# KVS_DATA_ENDPOINT must hold the stream's GET_CLIP data endpoint, for example:
# KVS_DATA_ENDPOINT=$(aws kinesisvideo get-data-endpoint --stream-name reinventbuilder2021 --api-name GET_CLIP --query DataEndpoint --output text)
echo "Running get-clip for stream"
aws kinesis-video-archived-media get-clip --endpoint-url $KVS_DATA_ENDPOINT \
--stream-name reinventbuilder2021 \
--clip-fragment-selector "FragmentSelectorType=SERVER_TIMESTAMP,TimestampRange={StartTimestamp=$CLIP_START,EndTimestamp=$NOW}" \
$FILE_NAME

sleep 45

echo "Copying file $FILE_NAME to $S3_PATH"
aws s3 cp $FILE_NAME $S3_PATH

The clip is available in the Amazon S3 source folder created using AWS CloudFormation, as shown in Figure 5.

Figure 5. Access your clip in the Amazon S3 source folder

Pre-processing

To process the video, we leverage Video on Demand at AWS. This solution encodes video files with AWS Elemental MediaConvert. Out of the box, it:

1. Automatically transcodes videos uploaded to Amazon S3 into formats suitable for playback on a range of devices using MediaConvert
2. Customizes MediaConvert job settings by uploading a custom file and using different settings per input
3. Stores transcoded files in a destination Amazon S3 bucket and uses CloudFront to deliver them to end viewers
4. Provides outputs including input file metadata, job settings, and output details in addition to transcoded video. These outputs are stored in a separate JSON file, available for further processing

For our use case, we used the frame capture feature to create a set of thumbnails from the source videos. The thumbnails are stored in the Amazon S3 bucket with the video output.

To deploy this solution, use the CloudFormation stack.

Processing and visualization

Every trained ML model requires quality training data. We began with publicly available solar panel images that were categorized as “good” or “defective” and uploaded the images to an Amazon S3 bucket into corresponding folders.

Next, we configured Amazon Rekognition Custom Labels with the folders to indicate the labels to use in training and deploying the model. Using the rover images, we tested the model.

We used the rover to record videos of good and damaged solar panels over an extended period and labeled the outcomes accordingly. The video was then split into individual frames using MediaConvert, giving us a well-labeled dataset that we used to train our model with Amazon Rekognition Custom Labels.

We used the model endpoint to infer outcomes on solar panels with varying damage footprints across multiple locations. AWS Elemental MediaConvert expedited the process of curating the training set, and creating the model and endpoint using Amazon Rekognition was straightforward.
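
As a reference, the following is a minimal sketch of calling a deployed Amazon Rekognition Custom Labels model for inference; the project version ARN, bucket, and object key are placeholders:

import boto3

rekognition = boto3.client("rekognition")

response = rekognition.detect_custom_labels(
    ProjectVersionArn="arn:aws:rekognition:us-east-1:111122223333:project/solved/version/1",
    Image={"S3Object": {"Bucket": "my-thumbnail-bucket", "Name": "thumbnails/frame-0001.jpg"}},
    MinConfidence=70,
)

for label in response["CustomLabels"]:
    # For example: "damaged: 98.2%"
    print(f'{label["Name"]}: {label["Confidence"]:.1f}%')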

As shown in Figure 6, we used a training set of 7,000 images with an even mix of good and damaged panels.

Figure 6. A training set of images

Examples of good panel images are depicted in Figure 7.

Figure 7. Good panel images

Examples of damaged panel images are depicted in Figure 8.

Figure 8. Damaged panel images

In this use case, 90 percent model accuracy was achieved.

To visualize the results, we leveraged AWS Amplify to provide an operator interface to identify the damaged panels.

Figure 9 shows screenshots from the operator dashboard with output from the Amazon Rekognition Custom Labels model for good and defective panels.

Figure 9. Operator dashboard in AWS Amplify

Alerting

Maintenance teams must be notified of defective panels to take corrective action. To create alerts, we configured AWS IoT Core to send MQTT messages to a Philips Hue smart bulb, with red bulbs indicating defective panels. To set up the Philips Hue API, use the How to develop for Hue guide.

For example, here’s the API call to change the bulb color:

PUT https://192.xx.xx.xx/api/xxxxxxx/lights/1/state

The following request body turns the bulb green:

{"on":true, "sat":254, "bri":254, "hue":20000}

And this request body turns it red:

{"on":true, "sat":254, "bri":254, "hue":1000}

We set up a client on the Pi that listens on an AWS IoT Core MQTT topic and makes an API request to Philips Hue.

To connect a device to AWS IoT, complete these steps:

  1. Create an IoT thing, a device certificate, and an AWS IoT policy. An AWS IoT thing represents a physical device (in this case, the Raspberry Pi) and contains static device metadata, as shown in Figure 10.

Figure 10. AWS IoT Thing

  2. Create a device certificate, which is required to connect to and authenticate with AWS IoT. An example is shown in Figure 11.

Figure 11. Device certificate

  3. Associate an AWS IoT policy with each device certificate. The policy determines which AWS IoT resources the device can access. In this case, we allowed iot:*, giving the device access to all IoT resources, as shown in Figure 12.

Figure 12. IoT policy

Devices and other clients use an AWS IoT root CA certificate to authenticate the server they’re communicating with. For more on how devices authenticate with AWS IoT Core, see Server authentication in the AWS IoT Core Developer Guide. Copy the certificate chain to the Raspberry Pi.

For communication with the Philips Hue, we used the Qhue wrapper as shown in Figure 13.

Figure 13. Qhue wrapper
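
Figure 13 is an image in the original post. As a rough sketch of what the Pi-side client can look like, the following assumes the paho-mqtt and requests libraries; the IoT endpoint, topic name, certificate paths, bridge address, and message format are illustrative placeholders. The project used the Qhue wrapper, while this sketch calls the Hue REST API directly:

import json
import requests
import paho.mqtt.client as mqtt

HUE_STATE_URL = "https://192.xx.xx.xx/api/xxxxxxx/lights/1/state"
RED = {"on": True, "sat": 254, "bri": 254, "hue": 1000}

def on_message(client, userdata, msg):
    payload = json.loads(msg.payload)
    # Turn the bulb red when a defective panel is reported
    if payload.get("status") == "defective":
        requests.put(HUE_STATE_URL, json=RED, verify=False, timeout=5)

client = mqtt.Client(client_id="solved-pi")
# Authenticate with the device certificate created in the steps above
client.tls_set(ca_certs="AmazonRootCA1.pem",
               certfile="device.pem.crt",
               keyfile="private.pem.key")
client.on_message = on_message
client.connect("xxxxxxxx-ats.iot.us-east-1.amazonaws.com", 8883)
client.subscribe("solved/panel-status")
client.loop_forever()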

The authors presented a demo of this solution at re:Invent 2021 Builder’s Fair.

Figure 14. Author demo at re:Invent 2021 Builder’s Fair

Clean up

If you used the CloudFormation stack, delete it to avoid unexpected future charges. Delete Amazon S3 buckets and terminate Amazon Rekognition jobs to stop accruing charges.

Conclusion

Amazon Rekognition helps customers collect images in the field and apply AI-based analysis to interpret the condition of assets within the images.

In this post, you learned how to configure the Kinesis Video Stream producer on a Raspberry Pi to upload captured videos to Amazon Kinesis Video streams. You also learned how to save video streams to Amazon S3 and leverage the Video on Demand at AWS solution.

Using AWS Elemental MediaConvert, we transcoded the videos and created a set of thumbnails from the source videos. We then used Amazon Rekognition Custom Labels to train and deploy models for solar panel damage detection. Finally, we configured AWS IoT Core to send MQTT messages to a Philips Hue smart bulb for notifications.

In this post, we presented a serverless architecture on AWS to detect defective solar panels. The reference architecture diagram is adaptable to solve inspection and damage detection problems across other industries.

Implementing reactive progress tracking for AWS Step Functions

Post Syndicated from Benjamin Smith original https://aws.amazon.com/blogs/compute/implementing-reactive-progress-tracking-for-aws-step-functions/

This blog post is written by Alexey Paramonov, Solutions Architect, ISV, and Maximilian Schellhorn, Solutions Architect, ISV.

This blog post demonstrates a solution based on AWS Step Functions and Amazon API Gateway WebSockets to track execution progress of a long running workflow. The solution updates the frontend regularly and users are able to track the progress and receive detailed status messages.

Websites with long-running processes often don’t provide feedback to users, leading to a poor customer experience. You might have experienced this when booking tickets, searching for hotels, or buying goods online. These sites often call multiple backend and third-party endpoints and aggregate the results to complete your request, causing the delay. In these long running scenarios, a transparent progress tracking solution can create a better user experience.

Overview

The example provided uses:

  • AWS Serverless Application Model (AWS SAM) for deployment: an open-source framework for building serverless applications.
  • AWS Step Functions for orchestrating the workflow.
  • AWS Lambda for mocking long running processes.
  • API Gateway to provide a WebSocket API for bidirectional communications between clients and the backend.
  • Amazon DynamoDB for storing connection IDs from the clients.

The example provides different options to report the progress back to the WebSocket connection by using Step Functions SDK integration, Lambda integrations, or Amazon EventBridge.

The following diagram outlines the example:

  1. The user opens a connection to WebSocket API. The OnConnect and OnDisconnect Lambda functions in the “WebSocket Connection Management” section persist this connection in DynamoDB (see documentation). The connection is bidirectional, meaning that the user can send requests through the open connection and the backend can respond with a new progress status whenever it is available.
  2. The user sends a new order through the WebSocket API. API Gateway routes the request to the “OnOrder” AWS Lambda function, which starts the state machine execution.
  3. As the request propagates through the state machine, we send progress updates back to the user via the WebSocket API using AWS SDK service integrations.
  4. For more customized status responses, we can use a centralized AWS Lambda function “ReportProgress” that updates the WebSocket API.

How to respond to the client?

To send the status updates back to the client via the WebSocket API, three options are explored:

Option 1: AWS SDK integration with API Gateway invocation

As the diagram shows, the API Gateway workflow tasks starting with the prefix “Report:” send responses directly to the client via the WebSocket API. This is an example of the state machine definition for this step:

          'Report: Workflow started':
            Type: Task
            Resource: arn:aws:states:::apigateway:invoke
            ResultPath: $.Params
            Parameters:
              ApiEndpoint: !Join [ '.',[ !Ref ProgressTrackingWebsocket, execute-api, !Ref 'AWS::Region', amazonaws.com ] ]
              Method: POST
              Stage: !Ref ApiStageName
              Path.$: States.Format('/@connections/{}', $.ConnectionId)
              RequestBody:
                Message: 🥁 Workflow started
                Progress: 10
              AuthType: IAM_ROLE
            Next: 'Mock: Inventory check'

This option reports the progress directly without using any additional Lambda functions. This limits the system complexity, reduces latency between the progress update and the response delivered to the client, and potentially reduces costs by reducing Lambda execution duration. A potential drawback is the limited customization of the response and getting familiar with the definition language.

Option 2: Using a Lambda function for reporting the progress status

To further customize response logic, create a Lambda function for reporting. As shown in point 4 of the diagram, you can also invoke a “ReportProgress” function directly from the state machine. This Python code snippet reports the progress status back to the WebSocket API:

import json
import boto3

# Push the progress update to the client over the open WebSocket connection
apigw_management_api_client = boto3.client('apigatewaymanagementapi', endpoint_url=api_url)
apigw_management_api_client.post_to_connection(
    ConnectionId=connection_id,
    Data=bytes(json.dumps(event), 'utf-8')
)

This option allows for more customizations and integration into the business logic of other Lambda functions to track progress in more detail. For example, execution of loops and reporting back on every iteration. The tradeoff is that you must handle exceptions and retries in your code. It also increases overall system complexity and additional costs associated with Lambda execution.

Option 3: Using EventBridge

You can combine option 2 with EventBridge to provide a centralized solution for reporting the progress status. The solution also handles retries with back-off if the “ReportProgress” function can’t communicate with the WebSocket API.

You can also use AWS SDK integrations from the state machine to EventBridge instead of using API Gateway. This has the additional benefit of a loosely coupled and resilient system, but you could experience increased latency due to the additional services used. The combination of EventBridge and the Lambda function adds minimal latency, but it might not be acceptable for short-lived workflows. However, if the workflow takes tens of seconds to complete and involves numerous steps, option 3 may be more suitable.

This is the architecture:

  1. As before.
  2. As before.
  3. AWS SDK integration sends the status message to EventBridge.
  4. The message propagates to the “ReportProgress” Lambda function.
  5. The Lambda function sends the processed message through the WebSocket API back to the client.
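
A minimal sketch of what the “ReportProgress” function could look like in this option, assuming the EventBridge rule passes the original status message in the event’s detail field; the environment variable name and message fields are illustrative:

import json
import os
import boto3

apigw = boto3.client("apigatewaymanagementapi",
                     endpoint_url=os.environ["WEBSOCKET_API_URL"])

def handler(event, context):
    # EventBridge wraps the original status message in the "detail" field
    detail = event["detail"]
    # Push the processed message to the client over the open connection
    apigw.post_to_connection(
        ConnectionId=detail["ConnectionId"],
        Data=json.dumps({"Message": detail["Message"],
                         "Progress": detail["Progress"]}).encode("utf-8"),
    )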

Deploying the example

Prerequisites

Make sure you can manage AWS resources from your terminal.

  • AWS CLI and AWS SAM CLI installed.
  • You have an AWS account. If not, visit this page.
  • Your user has sufficient permissions to manage AWS resources.
  • Git is installed.
  • NPM is installed (only for local frontend deployment).

To view the source code and documentation, visit the GitHub repo. This contains both the frontend and backend code.

To deploy:

  1. Clone the repository:
    git clone "https://github.com/aws-samples/aws-step-functions-progress-tracking.git"
  2. Navigate to the root of the repository.
  3. Build and deploy the AWS SAM template:
    sam build && sam deploy --guided
  4. Copy the value of WebSocketURL in the output for later.
  5. The backend is now running. To test it, use a hosted frontend.

Alternatively, you can deploy the React-based frontend on your local machine:

  1. Navigate to “progress-tracker-frontend/”:
    cd progress-tracker-frontend
  2. Launch the react app:
    npm start
  3. The command opens the React app in your default browser. If it does not happen automatically, navigate to http://localhost:3000/ manually.

Now the application is ready to test.

Testing the example application

  1. The webpage requests a WebSocket URL – this is the value from the AWS SAM template deployment. Paste it into Enter WebSocket URL in the browser and choose Connect.
  2. On the next page, choose Send Order and watch how the progress changes.

    This sends the new order request to the state machine. As it progresses, you receive status messages back through the WebSocket API.
  3. Optionally, you can inspect the raw messages arriving to the client. Open the Developer tools in your browser and navigate to the Network tab. Filter for WS (stands for WebSocket) and refresh the page. Specify the WebSocket URL, choose Connect and then choose Send Order.

Cleaning up

The services used in this solution are eligible for AWS Free Tier. To clean up the resources, in the root directory of the repository run:

sam delete

This removes all resources provisioned by the template.yml file.

Conclusion

In this post, you learn how to augment your Step Functions workflows with low latency progress tracking via API Gateway WebSockets. Consider adding the progress tracking to your long running workflows to improve the customer experience and provide a reactive look and feel for your application.

Navigate to the GitHub repository and review the implementation to see how your solution could become more user friendly and responsive. Start with examining the template.yml and the state machine’s definition and see how the frontend handles WebSocket communication and message visualization.

For more serverless learning resources, visit Serverless Land.

Achieve up to 27% better price-performance for Spark workloads with AWS Graviton2 on Amazon EMR Serverless

Post Syndicated from Karthik Prabhakar original https://aws.amazon.com/blogs/big-data/achieve-up-to-27-better-price-performance-for-spark-workloads-with-aws-graviton2-on-amazon-emr-serverless/

Amazon EMR Serverless is a serverless option in Amazon EMR that makes it simple to run applications using open-source analytics frameworks such as Apache Spark and Hive without configuring, managing, or scaling clusters.

At AWS re:Invent 2022, we announced support for running serverless Spark and Hive workloads with AWS Graviton2 (Arm64) on Amazon EMR Serverless. AWS Graviton2 processors are custom-built by AWS using 64-bit Arm Neoverse cores, delivering a significant leap in price-performance for your cloud workloads.

This post discusses the performance improvements observed while running Apache Spark jobs using AWS Graviton2 on EMR Serverless. We found that Graviton2 on EMR Serverless achieved 10% performance improvement for Spark workloads based on runtime. AWS Graviton2 is offered at a 20% lower cost than the x86 architecture option (see the Amazon EMR pricing page for details), resulting in a 27% overall better price-performance for workloads.

Spark performance test results

The following charts compare the benchmark runtime with and without Graviton2 for an EMR Serverless Spark application (note that the charts are not drawn to scale). We observed up to 10% improvement in total runtime and 8% improvement in geometric mean for the queries compared to x86.

The following table summarizes our results.

Metric | Graviton2 | x86 | % Gain
Total Execution Time (in seconds) | 2,670 | 2,959 | 10%
Geometric Mean (in seconds) | 22.06 | 24.07 | 8%

Testing configuration

To evaluate the performance improvements, we use benchmark tests derived from TPC-DS 3 TB scale performance benchmarks. The benchmark consists of 104 queries, and each query is submitted sequentially to an EMR Serverless application. EMR Serverless has automatic and fine-grained scaling enabled by default. Spark provides Dynamic Resource Allocation (DRA) to dynamically adjust the application resources based on the workload, and EMR Serverless uses the signals from DRA to elastically scale workers as needed. For our tests, we chose a predefined pre-initialized capacity that allows the application to scale to default limits. Each application has 1 driver and 100 workers configured as pre-initialized capacity, allowing it to scale to a maximum of 8000 vCPU/60000 GB capacity. When launching the applications, we used the default x86_64 architecture to get the baseline numbers and Arm64 for AWS Graviton2, and both applications had VPC networking enabled.

The following table summarizes the Spark application configuration.

Number of Drivers | Driver Size | Number of Executors | Executor Size | Ephemeral Storage | Amazon EMR Release Label
1 | 4 vCPUs, 16 GB memory | 100 | 4 vCPUs, 16 GB memory | 200 GB | 6.9

Performance test results and cost comparison

Let’s do a cost comparison of the benchmark tests. Because we used 1 driver [4 vCPUs, 16 GB memory] and 100 executors [4 vCPUs, 16 GB memory] for each run, the total capacity used is 4*101=404 vCPUs, 16*101=1,616 GB memory, and 200*100=20,000 GB storage. The following table summarizes the cost.

Test | Total Time (seconds) | vCPUs | Memory (GB) | Ephemeral Storage (GB) | Cost
x86_64 | 2,958.82 | 404 | 1,616 | 18,000 | $26.73
Graviton2 | 2,670.38 | 404 | 1,616 | 18,000 | $19.59

The calculations are as follows:

  • Total vCPU cost = (number of vCPU * per vCPU rate * job runtime in hour)
  • Total GB = (Total GB of memory configured * per GB-hours rate * job runtime in hour)
  • Storage = 20 GB of ephemeral storage is available for all workers by default—you pay only for any additional storage that you configure per worker

Cost breakdown

Let’s look at the cost breakdown for x86:

  • Job runtime – 49.3 minutes = 0.82 hours
  • Total vCPU cost – 404 vCPUs x 0.82 hours job runtime x 0.052624 USD per vCPU = 17.4333 USD
  • Total GB cost – 1,616 memory-GBs x 0.82 hours job runtime x 0.0057785 USD per memory GB = 7.6572 USD
  • Storage cost – 18,000 storage-GBs x 0.82 hours job runtime x 0.000111 USD per storage GB = 1.6386 USD
  • Additional storage – 20,000 GB – 20 GB free tier * 100 workers = 18,000 additional storage GB
  • EMR Serverless total cost (x86): 17.4333 USD + 7.6572 USD + 1.6386 USD = 26.7291 USD

Let’s compare to the cost breakdown for Graviton 2:

  • Job runtime – 44.5 minutes = 0.74 hours
  • Total vCPU cost – 404 vCPUs x 0.74 hours job runtime x 0.042094 USD per vCPU = 12.5844 USD
  • Total GB cost – 1,616 memory-GBs x 0.74 hours job runtime x 0.004628 USD per memory GB = 5.5343 USD
  • Storage cost – 18,000 storage-GBs x 0.74 hours job runtime x 0.000111 USD per storage GB = 1.4785 USD
  • Additional storage – 20,000 GB – 20 GB free tier * 100 workers = 18,000 additional storage GB
  • EMR Serverless total cost (Graviton2): 12.5844 USD + 5.5343 USD + 1.4785 USD = 19.5972 USD

The tests indicate that for the benchmark run, AWS Graviton2 led to an overall cost savings of 27%.
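
As a quick check, the following sketch reproduces the arithmetic above; the per-unit rates are the ones quoted in the breakdown and may change over time:

def emr_serverless_cost(hours, vcpus, memory_gb, extra_storage_gb,
                        vcpu_rate, gb_rate, storage_rate=0.000111):
    vcpu_cost = vcpus * hours * vcpu_rate
    memory_cost = memory_gb * hours * gb_rate
    storage_cost = extra_storage_gb * hours * storage_rate
    return round(vcpu_cost + memory_cost + storage_cost, 2)

x86 = emr_serverless_cost(0.82, 404, 1616, 18000, 0.052624, 0.0057785)
graviton2 = emr_serverless_cost(0.74, 404, 1616, 18000, 0.042094, 0.004628)
print(x86, graviton2, f"{(1 - graviton2 / x86):.0%} savings")  # 26.73 19.6 27% savings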

Individual query improvements and observations

The following chart shows the relative speedup of individual queries with Graviton2 compared to x86.

We see some regression in a few shorter queries, which had little impact on the overall benchmark runtime. We observed better performance gains for long running queries, for example:

  • q67 average 86 seconds for x86, 74 seconds for Graviton2 with 24% runtime performance gain
  • q23a and q23b gained 14% and 16%, respectively
  • q32 regressed by 7%; the difference between average runtime is <500 milliseconds (11.09 seconds for Graviton2 vs. 10.39 seconds for x86)

To quantify performance, we use benchmark SQL derived from TPC-DS 3 TB scale performance benchmarks.

If you’re evaluating migrating your workloads to the Graviton2 architecture on EMR Serverless, we recommend testing the Spark workloads based on your real-world use cases. The outcome might vary based on the pre-initialized capacity and number of workers chosen. If you want to run workloads across multiple processor architectures (for example, to test the performance on x86 and Arm vCPUs), follow the walkthrough in the GitHub repo to get started with some concrete ideas.

Conclusion

As demonstrated in this post, Graviton2 on EMR Serverless applications consistently yielded better performance for Spark workloads. Graviton2 is available in all Regions where EMR Serverless is available. To see a list of Regions where EMR Serverless is available, see the EMR Serverless FAQs. To learn more, visit the Amazon EMR Serverless User Guide and sample codes with Apache Spark and Apache Hive.

If you’re wondering how much performance gain you can achieve with your use case, try out the steps outlined in this post and replace the benchmark queries with your own.

To launch your first Spark or Hive application using a Graviton2-based architecture on EMR Serverless, see Getting started with Amazon EMR Serverless.


About the authors

Karthik Prabhakar is a Senior Big Data Solutions Architect for Amazon EMR at AWS. He is an experienced analytics engineer working with AWS customers to provide best practices and technical advice in order to assist their success in their data journey.

Nithish Kumar Murcherla is a Senior Systems Development Engineer on the Amazon EMR Serverless team. He is passionate about distributed computing, containers, and everything and anything about the data.

Amazon EMR Serverless supports larger worker sizes to run more compute and memory-intensive workloads

Post Syndicated from Veena Vasudevan original https://aws.amazon.com/blogs/big-data/amazon-emr-serverless-supports-larger-worker-sizes-to-run-more-compute-and-memory-intensive-workloads/

Amazon EMR Serverless allows you to run open-source big data frameworks such as Apache Spark and Apache Hive without managing clusters and servers. With EMR Serverless, you can run analytics workloads at any scale with automatic scaling that resizes resources in seconds to meet changing data volumes and processing requirements. EMR Serverless automatically scales resources up and down to provide just the right amount of capacity for your application.

We are excited to announce that EMR Serverless now offers worker configurations of 8 vCPUs with up to 60 GB memory and 16 vCPUs with up to 120 GB memory, allowing you to run more compute and memory-intensive workloads on EMR Serverless. An EMR Serverless application internally uses workers to execute workloads, and you can configure different worker configurations based on your workload requirements. Previously, the largest worker configuration available on EMR Serverless was 4 vCPUs with up to 30 GB memory. This capability is especially beneficial for the following common scenarios:

  • Shuffle-heavy workloads
  • Memory-intensive workloads

Let’s look at each of these use cases and the benefits of having larger worker sizes.

Benefits of using large workers for shuffle-intensive workloads

In Spark and Hive, shuffle occurs when data needs to be redistributed across the cluster during a computation. When your application performs wide transformations or reduce operations such as join, groupBy, sortBy, or repartition, Spark and Hive trigger a shuffle. Also, every Spark stage and Tez vertex is bounded by a shuffle operation. Taking Spark as an example, by default, there are 200 partitions for every Spark job defined by spark.sql.shuffle.partitions. However, Spark will compute the number of tasks on the fly based on the data size and the operation being performed. When a wide transformation is performed on top of a large dataset, there could be GBs or even TBs of data that need to be fetched by all the tasks.
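
To make this concrete, the following small PySpark sketch shows wide transformations that trigger a shuffle; the input paths and column names are illustrative:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("shuffle-example").getOrCreate()
spark.conf.set("spark.sql.shuffle.partitions", "200")   # the default mentioned above

orders = spark.read.parquet("s3://my-bucket/orders/")
customers = spark.read.parquet("s3://my-bucket/customers/")

# join and groupBy are wide transformations: rows with the same key must end up
# on the same executor, so Spark redistributes (shuffles) the data
revenue = (orders.join(customers, "customer_id")
                 .groupBy("region")
                 .agg(F.sum("amount").alias("revenue")))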

Shuffles are typically expensive in terms of both time and resources, and can lead to performance bottlenecks. Therefore, optimizing shuffles can have a significant impact on the performance and cost of a Spark job. With large workers, more data can be allocated to each executor’s memory, which minimizes the data shuffled across executors. This in turn leads to increased shuffle read performance because more data will be fetched locally from the same worker and less data will be fetched remotely from other workers.

Experiments

To demonstrate the benefits of using large workers for shuffle-intensive queries, let’s use q78 from TPC-DS, which is a shuffle-heavy Spark query that shuffles 167 GB of data over 12 Spark stages. Let’s perform two iterations of the same query with different configurations.

The configurations for Test 1 are as follows:

  • Size of executor requested while creating EMR Serverless application = 4 vCPUs, 8 GB memory, 200 GB disk
  • Spark job config:
    • spark.executor.cores = 4
    • spark.executor.memory = 8
    • spark.executor.instances = 48
    • Parallelism = 192 (spark.executor.instances * spark.executor.cores)

The configurations for Test 2 are as follows:

  • Size of executor requested while creating EMR Serverless application = 8 vCPUs, 16 GB memory, 200 GB disk
  • Spark job config:
    • spark.executor.cores = 8
    • spark.executor.memory = 16
    • spark.executor.instances = 24
    • Parallelism = 192 (spark.executor.instances * spark.executor.cores)

Let’s also disable dynamic allocation by setting spark.dynamicAllocation.enabled to false for both tests to avoid any potential noise due to variable executor launch times and keep the resource utilization consistent for both tests. We use Spark Measure, which is an open-source tool that simplifies the collection and analysis of Spark performance metrics. Because we’re using a fixed number of executors, the total number of vCPUs and memory requested are the same for both the tests. The following table summarizes the observations from the metrics collected with Spark Measure.

Test | Total Query Time (ms) | shuffleLocalBlocksFetched | shuffleRemoteBlocksFetched | shuffleLocalBytesRead | shuffleRemoteBytesRead | shuffleFetchWaitTime | shuffleWriteTime
Test 1 | 153,244 | 114,175 | 5,291,825 | 3.5 GB | 163.1 GB | 1.9 hr | 4.7 min
Test 2 | 108,136 | 225,448 | 5,185,552 | 6.9 GB | 159.7 GB | 3.2 min | 5.2 min

As seen from the table, there is a significant difference in performance due to shuffle improvements. Test 2, with half the number of executors that are twice as large as Test 1, ran 29.44% faster, with 1.97 times more shuffle data fetched locally compared to Test 1 for the same query, same parallelism, and same aggregate vCPU and memory resources. Therefore, you can benefit from improved performance without compromising on cost or job parallelism with the help of large executors. We have observed similar performance benefits for other shuffle-intensive TPC-DS queries such as q23a and q23b.

Recommendations

To determine if the large workers will benefit your shuffle-intensive Spark applications, consider the following:

  • Check the Stages tab from the Spark History Server UI of your EMR Serverless application. For example, from the following screenshot of Spark History Server, we can determine that this Spark job wrote and read 167 GB of shuffle data aggregated across 12 stages, looking at the Shuffle Read and Shuffle Write columns. If your jobs shuffle over 50 GB of data, you may potentially benefit from using larger workers with 8 or 16 vCPUs or spark.executor.cores.

  • Check the SQL / DataFrame tab from the Spark History Server UI of your EMR Serverless application (only for Dataframe and Dataset APIs). When you choose the Spark action performed, such as collect, take, showString, or save, you will see an aggregated DAG for all stages separated by the exchanges. Every exchange in the DAG corresponds to a shuffle operation, and it will contain the local and remote bytes and blocks shuffled, as seen in the following screenshot. If the local shuffle blocks or bytes fetched is much less compared to the remote blocks or bytes fetched, you can rerun your application with larger workers (with 8 or 16 vCPUs or spark.executor.cores) and review these exchange metrics in a DAG to see if there is any improvement.

  • Use the Spark Measure tool with your Spark query to obtain the shuffle metrics in the Spark driver’s stdout logs, as shown in the following log for a Spark job. Review the time taken for shuffle reads (shuffleFetchWaitTime) and shuffle writes (shuffleWriteTime), and the ratio of the local bytes fetched to the remote bytes fetched. If the shuffle operation takes more than 2 minutes, rerun your application with larger workers (with 8 or 16 vCPUs or spark.executor.cores) with Spark Measure to track the improvement in shuffle performance and the overall job runtime. A sketch showing how to collect these metrics follows the log output.
Time taken: 177647 ms

Scheduling mode = FIFO
Spark Context default degree of parallelism = 192

Aggregated Spark stage metrics:
numStages => 22
numTasks => 10156
elapsedTime => 159894 (2.7 min)
stageDuration => 456893 (7.6 min)
executorRunTime => 28418517 (7.9 h)
executorCpuTime => 20276736 (5.6 h)
executorDeserializeTime => 326486 (5.4 min)
executorDeserializeCpuTime => 124323 (2.1 min)
resultSerializationTime => 534 (0.5 s)
jvmGCTime => 648809 (11 min)
shuffleFetchWaitTime => 340880 (5.7 min)
shuffleWriteTime => 245918 (4.1 min)
resultSize => 23199434 (22.1 MB)
diskBytesSpilled => 0 (0 Bytes)
memoryBytesSpilled => 0 (0 Bytes)
peakExecutionMemory => 1794288453176
recordsRead => 18696929278
bytesRead => 77354154397 (72.0 GB)
recordsWritten => 0
bytesWritten => 0 (0 Bytes)
shuffleRecordsRead => 14124240761
shuffleTotalBlocksFetched => 5571316
shuffleLocalBlocksFetched => 117321
shuffleRemoteBlocksFetched => 5453995
shuffleTotalBytesRead => 158582120627 (147.7 GB)
shuffleLocalBytesRead => 3337930126 (3.1 GB)
shuffleRemoteBytesRead => 155244190501 (144.6 GB)
shuffleRemoteBytesReadToDisk => 0 (0 Bytes)
shuffleBytesWritten => 156913371886 (146.1 GB)
shuffleRecordsWritten => 13867102620
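
A report like the one above can be collected with a pattern similar to the following sketch, assuming the sparkMeasure (Spark Measure) package and its JAR are available to the Spark session; the query variable is a placeholder:

from pyspark.sql import SparkSession
from sparkmeasure import StageMetrics

spark = (SparkSession.builder
         .appName("q78-benchmark")
         .config("spark.dynamicAllocation.enabled", "false")
         .getOrCreate())

stage_metrics = StageMetrics(spark)
stage_metrics.begin()
spark.sql(q78_sql).collect()   # q78_sql holds the text of the benchmark query (placeholder)
stage_metrics.end()
stage_metrics.print_report()   # includes shuffleFetchWaitTime, shuffleLocalBytesRead, and so on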

Benefits of using large workers for memory-intensive workloads

Certain types of workloads are memory-intensive and may benefit from more memory configured per worker. In this section, we discuss common scenarios where large workers could be beneficial for running memory-intensive workloads.

Data skew

Data skews commonly occur in several types of datasets. Some common examples are fraud detection, population analysis, and income distribution. For example, when you want to detect anomalies in your data, it’s expected that less than 1% of the data is abnormal. If you want to perform some aggregation on top of normal vs. abnormal records, 99% of the data will be processed by a single worker, which may lead to that worker running out of memory. Data skews may be observed for memory-intensive transformations like groupBy, orderBy, join, window functions, collect_list, collect_set, and so on. Join types such as BroadcastNestedLoopJoin and Cartesian product are also inherently memory-intensive and susceptible to data skews. Similarly, if your input data is Gzip compressed, a single Gzip file can’t be read by more than one task because the Gzip compression type is unsplittable. When there are a few very large Gzip files in the input, your job may run out of memory because a single task may have to read a huge Gzip file that doesn’t fit in the executor memory.

Failures due to data skew can be mitigated by applying strategies such as salting. However, this often requires extensive changes to the code, which may not be feasible for a production workload that failed due to an unprecedented data skew caused by a sudden surge in incoming data volume. For a simpler workaround, you may just want to increase the worker memory. Using larger workers with more spark.executor.memory allows you to handle data skew without making any changes to your application code.
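
For reference, the salting strategy mentioned above typically looks like the following sketch; the data frame, column names, and bucket count are illustrative. This is the kind of code change that larger workers can help you avoid:

from pyspark.sql import functions as F

SALT_BUCKETS = 32

# First aggregation spreads each hot key across SALT_BUCKETS sub-keys
partials = (df
            .withColumn("salt", (F.rand() * SALT_BUCKETS).cast("int"))
            .groupBy("customer_id", "salt")
            .agg(F.count("*").alias("partial_count")))

# Second, much smaller aggregation merges the partial results per key
result = partials.groupBy("customer_id").agg(F.sum("partial_count").alias("row_count"))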

Caching

In order to improve performance, Spark allows you to cache the data frames, datasets, and RDDs in memory. This enables you to reuse a data frame multiple times in your application without having to recompute it. By default, up to 50% of your executor’s JVM is used to cache the data frames based on the property spark.memory.storageFraction. For example, if your spark.executor.memory is set to 30 GB, then 15 GB is used for cache storage that is immune to eviction.

The default storage level of the cache operation is MEMORY_AND_DISK. If the size of the data frame you are trying to cache doesn’t fit in the executor’s memory, a portion of the cache spills to disk. If there isn’t enough space to write the cached data to disk, the blocks are evicted and you don’t get the benefits of caching. Using larger workers allows you to cache more data in memory, boosting job performance by retrieving cached blocks from memory rather than the underlying storage.

Experiments

For example, consider a PySpark job that leads to a skew, with one executor processing 99.95% of the data using memory-intensive aggregates like collect_list, and that also caches a very large data frame (2.2 TB). Let’s run two iterations of the same job on EMR Serverless with the following vCPU and memory configurations.

Let’s run Test 3 with the previously largest possible worker configurations:

  • Size of executor set while creating EMR Serverless application = 4 vCPUs, 30 GB memory, 200 GB disk
  • Spark job config:
    • spark.executor.cores = 4
    • spark.executor.memory = 27 G

Let’s run Test 4 with the newly released large worker configurations:

  • Size of executor set in while creating EMR Serverless application = 8 vCPUs, 60 GB memory, 200 GB disk
  • Spark job config:
    • spark.executor.cores = 8
    • spark.executor.memory = 54 G

Test 3 failed with FetchFailedException, which was caused by the executor memory not being sufficient for the job.

Also, from the Spark UI of Test 3, we see that the reserved storage memory of the executors was fully utilized for caching the data frames.

The remaining blocks to cache were spilled to disk, as seen in the executor’s stderr logs:

23/02/06 16:06:58 INFO MemoryStore: Will not store rdd_4_1810
23/02/06 16:06:58 WARN MemoryStore: Not enough space to cache rdd_4_1810 in memory! (computed 134.1 MiB so far)
23/02/06 16:06:58 INFO MemoryStore: Memory use = 14.8 GiB (blocks) + 507.5 MiB (scratch space shared across 4 tasks(s)) = 15.3 GiB. Storage limit = 15.3 GiB.
23/02/06 16:06:58 WARN BlockManager: Persisting block rdd_4_1810 to disk instead.

Around 33% of the persisted data frame was cached on disk, as seen on the Storage tab of the Spark UI.

Test 4 with larger executors and vCores ran successfully without throwing any memory-related errors. Also, only about 2.2% of the data frame was cached to disk. Therefore, cached blocks of a data frame will be retrieved from memory rather than from disk, offering better performance.

Recommendations

To determine if the large workers will benefit your memory-intensive Spark applications, consider the following:

  • Determine if your Spark application has any data skews by looking at the Spark UI. The following screenshot of the Spark UI shows an example data skew scenario where one task processes most of the data (145.2 GB), looking at the Shuffle Read size. If one task or a few tasks process significantly more data than the other tasks, rerun your application with larger workers with 60–120 GB of memory (spark.executor.memory set anywhere from 54–109 GB, factoring in 10% for spark.executor.memoryOverhead).

  • Check the Storage tab of the Spark History Server to review the ratio of data cached in memory to disk from the Size in memory and Size in disk columns. If more than 10% of your data is cached to disk, rerun your application with larger workers to increase the amount of data cached in memory.
  • Another way to preemptively determine if your job needs more memory is by monitoring Peak JVM Memory on the Spark UI Executors tab. If the peak JVM memory used is close to the executor or driver memory, you can create an application with a larger worker and configure a higher value for spark.executor.memory or spark.driver.memory. For example, in the following screenshot, the maximum value of peak JVM memory usage is 26 GB and spark.executor.memory is set to 27 G. In this case, it may be beneficial to use larger workers with 60 GB memory and spark.executor.memory set to 54 G.

Considerations

Although workers with more vCPUs help increase the locality of the shuffle blocks, there are other factors involved, such as disk throughput, disk IOPS (input/output operations per second), and network bandwidth. In some cases, a larger number of small workers with more disks could offer higher disk IOPS, throughput, and network bandwidth overall compared to fewer large workers. We encourage you to benchmark your workloads against suitable vCPU configurations to choose the best configuration for your workload.

For shuffle-heavy jobs, it’s recommended to use large disks. You can attach up to 200 GB of disk to each worker when you create your application. Using more vCPUs (spark.executor.cores) per executor may increase the disk utilization on each worker. If your application fails with “No space left on device” because the shuffle data doesn’t fit on the disk, use a larger number of smaller workers, each with a 200 GB disk.

Conclusion

In this post, you learned about the benefits of using large executors for your EMR Serverless jobs. For more information about different worker configurations, refer to Worker configurations. Large worker configurations are available in all Regions where EMR Serverless is available.


About the Author

Veena Vasudevan is a Senior Partner Solutions Architect and an Amazon EMR specialist at AWS focusing on big data and analytics. She helps customers and partners build highly optimized, scalable, and secure solutions; modernize their architectures; and migrate their big data workloads to AWS.