Tag Archives: serverless

Announcing Green Compute on Cloudflare Workers

Post Syndicated from Aly Cabral original https://blog.cloudflare.com/announcing-green-compute/

Announcing Green Compute on Cloudflare Workers

Announcing Green Compute on Cloudflare Workers

All too often we are confronted with the choice to move quickly or act responsibly. Whether the topic is safety, security, or in this case sustainability, we’re asked to make the trade off of halting innovation to protect ourselves, our users, or the planet. But what if that didn’t always need to be the case? At Cloudflare, our goal is to bring sustainable computing to you without the need for any additional time, work, or complexity.

Enter Green Compute on Cloudflare Workers.

Green Compute can be enabled for any Cron triggered Workers. The concept is simple: when turned on, we’ll take your compute workload and run it exclusively on parts of our edge network located in facilities powered by renewable energy. Even though all of Cloudflare’s edge network is powered by renewable energy already, some of our data centers are located in third-party facilities that are not 100% powered by renewable energy. Green Compute takes our commitment to sustainability one step further by ensuring that not only our network equipment but also the building facility as a whole are powered by renewable energy. There are absolutely no code changes needed. Now, whether you need to update a leaderboard every five minutes or do DNA sequencing directly on our edge (yes, that’s a real use case!), you can minimize the impact of any scheduled work, regardless of how complex or energy intensive.

How it works

Cron triggers allow developers to set time-based invocations for their Workers. These Workers happen on a recurring schedule, as opposed to being triggered by application users via HTTP requests. Developers specify a job schedule in familiar cron syntax either through wrangler or within the Workers Dashboard. To set up a scheduled job, first create a Worker that performs a periodic task, then navigate to the ‘Triggers’ tab to define a Cron Trigger.

Announcing Green Compute on Cloudflare Workers

The great thing about cron triggered Workers is that there is no human on the other side waiting for a response in real time. There is no end user we need to run the job close to. Instead, these Workers are scheduled to run as (often computationally expensive) background jobs making them a no-brainer candidate to run exclusively on sustainable hardware, even when that hardware isn’t the closest to your user base.

Cloudflare’s massive global network is logically one distributed system with all the parts connected, secured, and trusted. Because our network works as a single system, as opposed to a system with logically isolated regions, we have the flexibility to seamlessly move workloads around the world keeping your impact goals in mind without any additional management complexity for you.

Announcing Green Compute on Cloudflare Workers

When you set up a Cron Trigger with Green Compute enabled, the Cloudflare network will route all scheduled jobs to green energy hardware automatically, without any application changes needed. To turn on Green Compute today, signup for our beta.

Real world use

If you haven’t ever had the pleasure of writing a cron job yourself, you might be wondering — what do you use scheduled compute for anyway?

There are a wide range of periodic maintenance tasks necessary to power any application. In my working life, I’ve built a scheduled job that ran every minute to monitor the availability of the system I was responsible for, texting me if any service was unavailable. In another instance, a job ran every five mins, keeping the core database and search feature in sync by pulling all new application data, transforming it, then inserting into a search database. In yet another example, a periodic job ran every half hour to iterate over all user sessions and cleanup sessions that were no longer active.

Scheduled jobs are the backbone of real world systems. Now, with Green Compute on Cloudflare Workers all these real world systems and their computationally expensive background maintenance tasks, can take advantage of running compute exclusively on machines powered by renewable energy.

The Green Network

Our mission at Cloudflare is to help you tackle your sustainability goals. Today, with the launch of the Carbon Impact Report we gave you visibility into your environmental impact. The collaboration with the Green Web Foundation gave green hosting certification for Cloudflare Pages. And our launch of Green Compute on Cloudflare Workers allows you to exclusively run on hardware powered by renewable energy. And the best part? No additional system complexity is required for any of the above.

Cloudflare is focused on making it easy to hit your ambitious goals. We are just getting started.

Creating a single-table design with Amazon DynamoDB

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/creating-a-single-table-design-with-amazon-dynamodb/

Amazon DynamoDB is a highly performant NoSQL database that provides data storage for many serverless applications. Unlike traditional SQL databases, it does not use table joins and other relational database constructs. However, you can model many common relational designs in a single DynamoDB table but the process is different using a NoSQL approach.

This blog post uses the Alleycat racing application to explain the benefits of a single-table DynamoDB table. It also shows how to approach modeling data access requirements in a DynamoDB table. Alleycat is a home fitness system that allows users to compete in an intense series of 5-minute virtual bicycle races. Up to 1,000 racers at a time take the saddle and push the limits of cadence and resistance to set personal records and rank on virtual leaderboards.

Alleycat requirements

In the Alleycat example, the application offers a number of exercise classes. Each class has multiple races, and there are multiple racers in each race. The system logs the output for each racer per second of the race. An entity-relationship diagram in a traditional relational database shows how you could use normalized tables and relationships to store this data:

Relational model for Alleycat

In a relational database, often each table has a key that relates to a foreign key in another table. By joining multiple tables, you can query related tables and return the results in a single table view. While this is flexible and convenient, it’s also computationally expensive and difficult to scale horizontally.

Many serverless architectures are built for scale and the relational database paradigm often does not scale as efficiently as a workload demands. DynamoDB scales to almost any level of traffic but one of the tradeoffs is the lack of joins. Fortunately, it offers alternative ways to model the data to meet Alleycat’s requirements.

DynamoDB terminology and concepts

Unlike traditional databases, there is no limit to how much data can be stored in a DynamoDB table. The service is also designed to provide predictable performance at any scale, so you can expect similar query latency regardless of the level of traffic.

The most important operational aspect of running DynamoDB in production is setting and managing throughput. There is a provisioned mode, where you set the throughput, and on-demand, which is managed by the service. In the provisioned mode, you can also use automatic scaling to let the service set the throughput between lower and upper limits you define.

The choice here is determined by the traffic patterns in your workload. For applications with predictable traffic with gradual changes, provisioned mode is the better choice and is more cost effective. If traffic patterns are unknown or you prefer to have capacity managed automatically, choose on-demand. To learn more about the capacity modes, visit the documentation page.

Within each table, you must have a partition key, which is a string, numeric, or binary value. This key is a hash value used to locate items in constant time regardless of table size. It is conceptually different to an ID or primary key field in a SQL-based database and does not relate to data in other tables. When there is only a partition key, these values must be unique across items in a table.

Each table can optionally have a sort key. This allows you to search and sort within items that match a given primary key. While you must search on exact single values in the partition key, you can pattern search on sort keys. It’s common to use a numeric sort key with timestamps to find items within a date range, or use string search operators to find data in hierarchical relationships.

With only partition key and sort keys, this limits the possible types of query without duplicating data in a table. To solve this issue, DynamoDB also offers two types of indexes:

  • Local secondary indexes (LSIs): these must be created at the same time the table is created and effectively enable another sort key using the same partition key.
  • Global secondary indexes (GSIs): create and delete these at any time, and optionally use a different partition key from the existing table.

There are other important differences between the two index types:

LSI

GSI

Create At table creation Anytime
Delete At table deletion Anytime
Size Up to 10 GB per partition Unlimited
Throughput Shared with table Separate throughput
Key type Primary key only or composite key (partition key and sort key) Composite key only
Consistency model Both eventual and strong consistency Eventual consistency only

Determining data access requirements

Relational database design focuses on the normalization process without regard to data access patterns. However, designing NoSQL data schemas starts with the list of questions the application must answer. It’s important to develop a list of data access patterns before building the schema, since NoSQL databases offer less dynamic query flexibility than their SQL equivalents.

To determine data access patterns in new applications, user stories and use-cases can help identify the types of query. If you are migrating an existing application, use the query logs to identify the typical queries used. In the Alleycat example, the frontend application has the following queries:

  1. Get the results for each race by racer ID.
  2. Get a list of races by class ID.
  3. Get the best performance by racer for a class ID.
  4. Get the list of top scores by race ID.
  5. Get the second-by-second performance by racer for all races.

While it’s possible to implement the design with multiple DynamoDB tables, it’s unnecessary and inefficient. A key goal in querying DynamoDB data is to retrieve all the required data in a single query request. This is one of the more difficult conceptual ideas when working with NoSQL databases but the single-table design can help simplify data management and maximize query throughput.

Modeling many-to-many relationships with DynamoDB

In traditional SQL, a many-to-many relationship is classically represented with three tables. In the earlier diagram for the Alleycat application, these tables are racers, raceResults, and races. Populated with sample data, the tables look like this:

Relational tables with data

In DynamoDB, the adjacency list design pattern enables you to combine multiple SQL-type tables into a single NoSQL table. It has multiple uses but in this case can model many-to-many relationships. To do this, the partition key contains both types of item – races and racers. The key value contains the type of data expected in the item (for example, “race-1” or “racer-2”):

Equivalent data structure in DynamoDB

With this table design, you can query by racer ID or by race ID. For a single race, you can query by partition key to return all results for a single race, or use the sort key to limit by a single racer or for the overall results. For per racer results, the second-by-second data is stored in a nested JSON structure.

To allow sorting by output to create leaderboard results, the output value must be a sort key. However, the sort key cannot be updated once it is set. Using the main sort key, the application would only be able to write a final race result per racer to query and sort on this data.

To resolve this problem, use an index. The index can use a separate sort key where the value can be updated. This allows Alleycat to store the latest results in this field, and then for queries to sort by output to create a leaderboard.

The preceding table does not represent the races table in the normalized view, so you cannot query by class ID to retrieve a list of races. Depending on your design, you can solve this by adding a second index to the table to enable querying by class ID and returning a list of partition keys (race IDs). However, you can also overload GSIs to contain multiple types of value.

The AlleyCat application uses both an LSI and GSI to accommodate all the data access patterns. This table shows how this is modeled, although the results attribute names are shorter in the application:

Data modeled with LSI and GSI

  • Main composite key: PK and SK.
  • Local secondary index: Partition key is PK and sort key is Numeric.
  • Global secondary index: Partition key is SK and sort key is Numeric.

Reviewing the data access patterns for Alleycat

Before creating the DynamoDB table, test the proposed schema against the list of data access patterns. In this section, I review Alleycat’s list of queries to ensure that each is supported by the table schema. I use the Item explorer feature to run queries against a test table, after running the Alleycat simulator for multiple races.

1. Get the results for each race by racer ID

Use the table’s partition key, searching for PK = racer ID. This returns a list of all races (PK) for a given racer. See the updateRaceResults function for an example of how this is used:

Results by racer ID

2. Get a list of races by class ID

Use the local secondary index, searching for partition key = class ID. This results in a list of races (PK) for a given class ID. See the getRaces function code for an example of this query:

Results by class ID

3. Get the best performance by racer for a class ID.

Use the table’s partition key, searching for PK = class ID. This returns a list of racers and their best outputs for the given class ID. See the getLeaderboard function code for an example of this query:

Best performance by racer for a class ID

4. Get the list of top scores by race ID.

Use the global secondary index, searching for PK = race ID, sorting by the GSI sort key (descending) to rank the results. This returns a sorted list of results for a race. See the updateRaceResults function for an example of how this is used:

Top scored by race ID

5. Get the second-by-second performance by racer for all races.

Use the main table index, searching for PK = racer ID. Optionally use the sort key to restrict to a single race. This returns items with second-by-second performance stored in a nested JSON attribute. See the loadRealtimeHistory function for an example of how this is used:

Second-by-second performance for all racers

Optimizing items and capacity

In the Alleycat application, races are only 5 minutes long so the results attribute only contains 300 separate data points (once per second). By using a nested JSON structure in the items, the schema flattens data that otherwise would use 300 rows in the earlier SQL-based design.

The maximum item size in DynamoDB is 400 KB, which includes attribute names. If you have many more data points, you may reach this limit. To work around this, split the data across multiple items and provide the item order in the sort key. This way, when your application retrieves the items, it can reassemble the attributes to create the original dataset.

For example, if races in Alleycat were an hour long, there would be 3,600 data points. These may be stored in six rows containing 600 second-by-second results each:

Data set split across multiple items

Additionally, to maximize the storage per row, choose short attribute names. You can also compress data in attributes by storing as GZIP output instead of raw JSON, and using a binary data type for the attribute. This increases processing for the producing and consuming applications, which must compress and decompress the items. However, it can significantly increase the amount of data stored per row.

To learn more, read Best practices for storing large items and attributes.

Conclusion

This post looks at implementing common relational database patterns using DynamoDB. Instead of using multiple tables, the single-table design pattern can use adjacency lists to provide many-to-many relational functionality.

Using the Alleycat example, I show how to list the data access patterns required by an application, and then model the data using composite keys and indexes to return the relevant data using single queries. Finally, I show how to optimize items and capacity for workloads storing large amounts of data.

For more serverless learning resources, visit Serverless Land.

Building well-architected serverless applications: Regulating inbound request rates – part 1

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/building-well-architected-serverless-applications-regulating-inbound-request-rates-part-1/

This series of blog posts uses the AWS Well-Architected Tool with the Serverless Lens to help customers build and operate applications using best practices. In each post, I address the serverless-specific questions identified by the Serverless Lens along with the recommended best practices. See the introduction post for a table of contents and explanation of the example application.

Reliability question REL1: How do you regulate inbound request rates?

Defining, analyzing, and enforcing inbound request rates helps achieve better throughput. Regulation helps you adapt different scaling mechanisms based on customer demand. By regulating inbound request rates, you can achieve better throughput, and adapt client request submissions to a request rate that your workload can support.

Required practice: Control inbound request rates using throttling

Throttle inbound request rates using steady-rate and burst rate requests

Throttling requests limits the number of requests a client can make during a certain period of time. Throttling allows you to control your API traffic. This helps your backend services maintain their performance and availability levels by limiting the number of requests to actual system throughput.

To prevent your API from being overwhelmed by too many requests, Amazon API Gateway throttles requests to your API. These limits are applied across all clients using the token bucket algorithm. API Gateway sets a limit on a steady-state rate and a burst of request submissions. The algorithm is based on an analogy of filling and emptying a bucket of tokens representing the number of available requests that can be processed.

Each API request removes a token from the bucket. The throttle rate then determines how many requests are allowed per second. The throttle burst determines how many concurrent requests are allowed. I explain the token bucket algorithm in more detail in “Building well-architected serverless applications: Controlling serverless API access – part 2

Token bucket algorithm

Token bucket algorithm

API Gateway limits the steady-state rate and burst requests per second. These are shared across all APIs per Region in an account. For further information on account-level throttling per Region, see the documentation. You can request account-level rate limit increases using the AWS Support Center. For more information, see Amazon API Gateway quotas and important notes.

You can configure your own throttling levels, within the account and Region limits to improve overall performance across all APIs in your account. This restricts the overall request submissions so that they don’t exceed the account-level throttling limits.

You can also configure per-client throttling limits. Usage plans restrict client request submissions to within specified request rates and quotas. These are applied to clients using API keys that are associated with your usage policy as a client identifier. You can add throttling levels per API route, stage, or method that are applied in a specific order.

For more information on API Gateway throttling, see the AWS re:Invent presentation “I didn’t know Amazon API Gateway could do that”.

API Gateway throttling

API Gateway throttling

You can also throttle requests by introducing a buffering layer using Amazon Kinesis Data Stream or Amazon SQS. Kinesis can limit the number of requests at the shard level while SQS can limit at the consumer level. For more information on using SQS as a buffer with Amazon Simple Notification Service (SNS), read “How To: Use SNS and SQS to Distribute and Throttle Events”.

Identify steady-rate and burst rate requests that your workload can sustain at any point in time before performance degraded

Load testing your serverless application allows you to monitor the performance of an application before it is deployed to production. Serverless applications can be simpler to load test, thanks to the automatic scaling built into many of the services. During a load test, you can identify quotas that may act as a limiting factor for the traffic you expect and take action.

Perform load testing for a sustained period of time. Gradually increase the traffic to your API to determine your steady-state rate of requests. Also use a burst strategy with no ramp up to determine the burst rates that your workload can serve without errors or performance degradation. There are a number of AWS Marketplace and AWS Partner Network (APN) solutions available for performance testing, Gatling Frontline, BlazeMeter, and Apica.

In the serverless airline example used in this series, you can run a performance test suite using Gatling, an open source tool.

To deploy the test suite, follow the instructions in the GitHub repository perf-tests directory. Uncomment the deploy.perftest line in the repository Makefile.

Perf-test makefile

Perf-test makefile

Once the file is pushed to GitHub, AWS Amplify Console rebuilds the application, and deploys an AWS CloudFormation stack. You can run the load tests locally, or use an AWS Step Functions state machine to run the setup and Gatling load test simulation.

Performance test using Step Functions

Performance test using Step Functions

The Gatling simulation script uses constantUsersPerSec and rampUsersPerSec to add users for a number of test scenarios. You can use the test to simulate load on the application. Once the tests run, it generates a downloadable report.

Gatling performance results

Gatling performance results

Artillery Community Edition is another open-source tool for testing serverless APIs. You configure the number of requests per second and overall test duration, and it uses a headless Chromium browser to run its test flows. For Artillery, the maximum number of concurrent tests is constrained by your local computing resources and network. To achieve higher throughput, you can use Serverless Artillery, which runs the Artillery package on Lambda functions. As a result, this tool can scale up to a significantly higher number of tests.

For more information on how to use Artillery, see “Load testing a web application’s serverless backend”. This runs tests against APIs in a demo application. For example, one of the tests fetches 50,000 questions per hour. This calls an API Gateway endpoint and tests whether the AWS Lambda function, which queries an Amazon DynamoDB table, can handle the load.

Artillery performance test

Artillery performance test

This is a synchronous API so the performance directly impacts the user’s experience of the application. This test shows that the median response time is 165 ms with a p95 time of 201 ms.

Performance test API results

Performance test API results

Another consideration for API load testing is whether the authentication and authorization service can handle the load. For more information on load testing Amazon Cognito and API Gateway using Step Functions, see “Using serverless to load test Amazon API Gateway with authorization”.

API load testing with authentication and authorization

API load testing with authentication and authorization

Conclusion

Regulating inbound requests helps you adapt different scaling mechanisms based on customer demand. You can achieve better throughput for your workloads and make them more reliable by controlling requests to a rate that your workload can support.

In this post, I cover controlling inbound request rates using throttling. I show how to use throttling to control steady-rate and burst rate requests. I show some solutions for performance testing to identify the request rates that your workload can sustain before performance degradation.

This well-architected question will be continued where I look at using, analyzing, and enforcing API quotas. I cover mechanisms to protect non-scalable resources.

For more serverless learning resources, visit Serverless Land.

Introducing Workers Usage Notifications

Post Syndicated from Aly Cabral original https://blog.cloudflare.com/introducing-workers-usage-notifications/

Introducing Workers Usage Notifications

Introducing Workers Usage Notifications

So you’ve built an application on the Workers platform. The first thing you might be wondering after pushing your code out into the world is “what does my production traffic look like?” How many requests is my Worker handling? How long are those requests taking? And as your production traffic evolves overtime it can be a lot to keep up with. The last thing you want is to be surprised by the traffic your serverless application is handling.  But, you have a million things to do in your day job, and having to log in to the Workers dashboard every day to check usage statistics is one extra thing you shouldn’t need to worry about.

Today we’re excited to launch Workers usage notifications that proactively send relevant usage information directly to your inbox. Usage notifications come in two flavors. The first is a weekly summary of your Workers usage with a breakdown of your most popular Workers. The second flavor is an on-demand usage notification, triggered when a worker’s CPU usage is 25% above its average CPU usage over the previous seven days. This on-demand notification helps you proactively catch large changes in Workers usage as soon as those changes occur whether from a huge spike in traffic or a change in your code.

As of today, if you create a new free account with Workers, we’ll enable both the weekly summary and the CPU usage notification by default. All other account types will not have Workers usage notifications turned on by default, but the notifications can be enabled as needed. Once we collect substantial user feedback, our goal is to turn these notifications on by default for all accounts.

The Worker Weekly Summary

The mission of the Worker Weekly Summary is to give you a high-level view of your Workers usage automatically, in your inbox, without needing to sign in to the Worker’s dashboard. The summary includes valuable information such as total request counts, duration and egress data transfer, aggregated across your account, with breakouts for your most popular Workers that include median CPU time. Where duration accounts for the entire time your Worker spends responding to a request, CPU time is the time a script spends in computational work on the Cloudflare edge, discounting any time spent waiting on network requests, including requests to third-party APIs.

Introducing Workers Usage Notifications

Workers Usage Report

Where the Workers Weekly Summary provides a high-level view of your account usage statistics, the Workers Usage Report is targeted, event-driven and potentially actionable. It identifies those Workers with greater than a 25% increase in CPU time compared to the average of the previous 7 days (i.e., those Workers taking significantly more CPU resources now than in the recent past).

While sometimes these increases may be no surprise, perhaps because you’ve recently pushed a new deployment that you expected to do more CPU heavy work, at other times they may indicate a bug or reveal that a script is unintentionally expensive. The Workers Usage Report gives us an opportunity to let you know when a noticeable change in your compute footprint occurs, so that you can remedy any potential problems right away.

Introducing Workers Usage Notifications

Enabling Notifications

If you’d like to explicitly opt in to notifications, you can start off by clicking Add in the Notifications section of the Cloudflare zone dashboard.

Introducing Workers Usage Notifications
Introducing Workers Usage Notifications

After clicking Add, note the two new entries below “Create Notifications” under the “Workers” Product:

Introducing Workers Usage Notifications

Click on the Select box in line with “ Weekly Summary” for the weekly roll-up, which will then allow you to configure email recipients, webhooks or connect the notification to PagerDuty.

Introducing Workers Usage Notifications

Clicking Select next to “Usage Report” for CPU threshold notifications will send you to a similar configuration experience where you can customize email recipients and other integrations.

Introducing Workers Usage Notifications

What’s next?

As mentioned above, we’re enabling notifications by default for all new free plans on Workers. Before rolling out these notifications by default to all our users, we want to hear from you. Let us know your experience with our Workers usage notifications by joining our Developer Community discord or by sending feedback via the survey in the email notifications.

Our Workers Weekly Summary and on-demand CPU usage notification are just the beginning of our journey to support a wide range of useful, relevant notifications that help you get visibility into your deployments. We want to surface the right usage information, exactly when you need it.

Introducing AWS SAM Pipelines: Automatically generate deployment pipelines for serverless applications

Post Syndicated from Benjamin Smith original https://aws.amazon.com/blogs/compute/introducing-aws-sam-pipelines-automatically-generate-deployment-pipelines-for-serverless-applications/

Today, AWS announces the public preview of AWS SAM Pipelines, a new capability of AWS Serverless Application Model (AWS SAM) CLI. AWS SAM Pipelines makes it easier to create secure continuous integration and deployment (CI/CD) pipelines for your organizations preferred continuous integration and continuous deployment (CI/CD) system.

This blog post shows how to use AWS SAM Pipelines to create a CI/CD deployment pipeline configuration file that integrates with GitLab CI/CD.

AWS SAM Pipelines

A deployment pipeline is an automated sequence of steps that are performed to release a new version of an application. They are defined by a pipeline template file. AWS SAM Pipelines provides templates for popular CI/CD systems such as AWS CodePipeline, Jenkins, GitHub Actions, and GitLab CI/CD. Pipeline templates include AWS deployment best practices to help with multi-account and multi-Region deployments. AWS environments such as dev and production typically exist in different AWS accounts. This allows development teams to configure safe deployment pipelines, without making unintended changes to infrastructure. You can also supply your own custom pipeline templates to help to standardize pipelines across development teams.

AWS SAM Pipelines is composed of two commands:

  1. sam pipeline bootstrap, a configuration command that creates the AWS resources required to create a pipeline.
  2. sam pipeline init, an initialization command that creates a pipeline file for your preferred CI/CD system. For example, a Jenkinsfile for Jenkins or a .gitlab-ci.yml file for GitLab CI/CD.

Having two separate commands allows you to manage the credentials for operators and developer separately. Operators can use sam pipeline bootstrap to provision AWS pipeline resources. This can reduce the risk of production errors and operational costs. Developers can then focus on building without having to set up the pipeline infrastructure by running the sam pipeline init command.

You can also combine these two commands by running sam pipeline init –bootstrap. This takes you through the entire guided bootstrap and initialization process.

Getting started

The following steps show how to use AWS SAM Pipelines to create a deployment pipeline for GitLab CI/CD. GitLab is an AWS Partner Network (APN) member to build, review, and deploy code. AWS SAM Pipelines creates two deployment pipelines, one for a feature branch, and one for a main branch. Each pipeline runs as a separate environment in separate AWS accounts. Each time you make a commit to the repository’s feature branch, the pipeline builds, tests, and deploys a serverless application in the development account. For each commit to the Main branch, the pipeline builds, tests, and deploys to a production account.

Prerequisites

  • An AWS account with permissions to create the necessary resources.
  • Install AWS Command Line Interface (CLI) and AWS SAM CLI.
  • A verified GitLab account: This post assumes you have the required permissions to configure GitLab projects, create pipelines, and configure GitLab variables.
  • Create a new GitLab project and clone it to your local environment

Create a serverless application

Use the AWS SAM CLI to create a new serverless application from a Quick Start Template.

Run the following AWS SAM CLI command in the root directory of the repository and follow the prompts. For this example, can select any of the application templates:

sam init

Creating pipeline deployment resources

The sam pipeline bootstrap command creates the AWS resources and permissions required to deploy application artifacts from your code repository into your AWS environments.

For this reason, AWS SAM Pipelines creates IAM users and roles to allow you to deploy applications across multiple accounts. AWS SAM Pipelines creates these deployment resources following the principal of least privilege:

Run the following command in a terminal window:

sam pipeline init --bootstrap

This guides you through a series of questions to help create a .gitlab-ci.yml file. The --bootstrap option enables you to set up AWS pipeline stage resources before the template file is initialized:

  1. Enter 1, to choose AWS Quick Start Pipeline Templates
  2. Enter 2 to choose to create a GitLab CI/CD template file, which includes a two stage pipeline.
  3. Next AWS SAM reports “No bootstrapped resources were detected.” and asks if you want to set up a new CI/CD stage. Enter Y to set up a new stage:

Set up the dev stage by answering the following questions:

  1. Enter “dev” for the Stage name.
  2. AWS SAM CLI detects your AWS CLI credentials file. It uses a named profile to create the required resources for this stage. If you have a development profile, select that, otherwise select the default profile.
  3. Enter a Region for the stage name (for example, “eu-west-2”).
  4. Keep the pipeline IAM user ARN and pipeline and CloudFormation execution role ARNs blank to generate these resources automatically.
  5. An Amazon S3 bucket is required to store application build artifacts during the deployment process. Keep this option blank for AWS SAM Pipelines to generate a new S3 bucket.If your serverless application uses AWS Lambda functions packaged as a container image, you must create or specify an Amazon ECR Image repository. The bootstrap command configures the required permissions to access this ECR repository.
  6. Enter N to specify you are not using Lambda functions packages as container images.
  7. Press “Enter” to confirm the resources to be created.

AWS SAM Pipelines creates a PipelineUser with an associated ACCESS_KEY_ID and SECRET_ACCESS_KEY which GitLab uses to deploy artifacts to your AWS accounts. An S3 bucket is created along with two roles PipelineExecutionRole and CloudFormationExecutionRole.

Make a note of these values. You use these in the following steps to configure the production deployment environment and CI/CD provider.

Creating the production stage

The AWS SAM Pipeline command automatically detects that a second stage is required to complete the GitLab template, and prompts you to go through the set-up process for this:

  1. Enter “Y” to continue to build the next pipeline stage resources.
  2. When prompted for Stage Name, enter “prod”.
  3. When asked to Select a credential source, choose a suitable named profile from your AWS config file. The following example shows that a named profile called “prod” is selected.
  4. Enter a Region to deploy the resources to. The example uses the eu-west-1 Region.
  5. Press enter to use the same Pipeline IAM user ARN created in the previous step.
  6. When prompted for the pipeline execution role ARN and the CloudFormation execution role ARN, leave blank to allow the bootstrap process to create them.
  7. Provide the same answers as in the previous steps 5-7.

The AWS resources and permissions are now created to run the deployment pipeline for a dev and prod stage. The definition of these two stages is saved in .aws-sam/pipeline/pipelineconfig.toml.

AWS SAM Pipelines now automatically continues the walkthrough to create a GitLab deployment pipeline file.

Creating a deployment pipeline file

The following questions help create a .gitlab-ci.yml file. GitLab uses this file to run the CI/CD pipeline to build and deploy the application. When prompted, enter the name for both the dev and Prod stages. Use the following example to help answer the questions:

Deployment pipeline file

A .gitlab-ci.yml pipeline file is generated. The file contains a number of environment variables, which reference the details from AWS SAM pipeline bootstrap command. This includes using the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY securely stored in the GitLab CI/CD repository.

The pipeline file contains a build and deploy stage for a branch that follows the naming pattern ‘feature-*’. The build process assumes the TESTING_PIPELINE_EXECUTION_ROLE in the testing account to deploy the application. sam build uses the AWS SAM template file previously created. It builds the application artifacts using the default AWS SAM build images. You can further customize the sam build –use-container command if necessary.

By default the Docker image used to create the build artifact is pulled from Amazon ECR Public. The default Node.js 14 image in this example is based on the language specified during sam init. To pull a different container image, use the --build-image option as specified in the documentation.

sam deploy deploys the application to a new stack in the dev stage using the TESTING_CLOUDFORMATION_EXECUTION_ROLE.The following code shows how this configured in the .gitlab-ci.yml file.

build-and-deploy-feature:
stage: build
only:
- /^feature-.*$/
script:
- . assume-role.sh ${TESTING_PIPELINE_EXECUTION_ROLE} feature-deployment
- sam build --template ${SAM_TEMPLATE} --use-container
- sam deploy --stack-name features-${CI_COMMIT_REF_NAME}-cfn-stack
--capabilities CAPABILITY_IAM
--region ${TESTING_REGION}
--s3-bucket ${TESTING_ARTIFACTS_BUCKET}
--no-fail-on-empty-changeset
--role-arn ${TESTING_CLOUDFORMATION_EXECUTION_ROLE}

The file also contains separate build and deployments stages for the main branch. sam package prepares the application artifacts. The build process then assumes the role in the production stage and prepares the application artifacts for production. You can customize the file to include testing phases, and manual approval steps, if necessary.

Configure GitLab CI/CD credentials

GitLab CI/CD uses the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to authenticate to your AWS account. The values are associated with a new user generated in the previous sam pipeline init --bootstrap step. Save these values securely in GitLab’s CI/CD variables section:

  1. Navigate to Settings > CI/CD > Variables and choose expand.
  2. Choose Add variable, and enter in the key name and value for the AWS_SECRET_ACCESS_KEY noted down in the previous steps:
  3. Repeat this process for the AWS_ACCESS_KEY_ID:

Creating a feature branch

Create a new branch in your GitLab CI/CD project named feature-1:

  1. In the GitLab CI/CD menu, choose Branches from the Repository section. Choose New branch.
  2. For Branch name, enter branch “feature-1” and in the Create from field choose main.
  3. Choose Create branch.

Configure the new feature-1 branch to be protected so it can access the protected GitLab CI/D variables.

  1. From the GitLab CI/CD main menu, choose to Settings then choose Repository.
  2. Choose Expand in the Protected Branches section
  3. Select the feature-1 branch in the Branch field and set Allowed to merge and Allowed to push to Maintainers.
  4. Choose Protect

Trigger a deployment pipeline run

1. Add the AWS SAM application files to the repository and push the branch changes to GitLab CI/CD:

git checkout -b feature-1
git add .
git commit -am “added sam application”
git push --set-upstream origin feature-1 

This triggers a new pipeline run that deploys the application to the dev environment. The following screenshot shows GitLab’s CI/CD page.

AWS CloudFormation shows that a new stack has been created in the dev stage account. It is named after the feature branch:

To deploy the feature to production, make a pull request to merge the feature-1 branch into the main branch:

  1. In GitLab CI/CD, navigate to Merge requests and choose New merge request.
  2. On the following screen, choose feature-1 as the Source branch and main as the Target branch.
  3. Choose Compare branches and continue, and then choose Create merge request.
  4. Choose Merge

This merges the feature-1 branch to the main branch, triggering the pipeline to run the production build, testing, and deployment steps:

Conclusion

AWS SAM Pipelines is a new feature of the AWS SAM CLI that helps organizations quickly create pipeline files for their preferred CI/CD system. AWS provides a default set of pipeline templates that follow best practices for popular CI/CD systems such as AWS CodePipeline, Jenkins, GitHub Actions, and GitLab CI/CD. Organizations can also supply their custom pipeline templates via Git repositories to standardize custom pipelines across hundreds of application development teams. This post shows how to use AWS SAM Pipelines to create a CI/CD deployment pipeline for GitLab.

Watch guided video tutorials to learn how to create deployment pipelines for GitHub Actions, GitLab CI/CD, and Jenkins.

For more learning resources, visit https://serverlessland.com/explore/sam-pipelines.

Data Caching Across Microservices in a Serverless Architecture

Post Syndicated from Irfan Saleem original https://aws.amazon.com/blogs/architecture/data-caching-across-microservices-in-a-serverless-architecture/

Organizations are re-architecting their traditional monolithic applications to incorporate microservices. This helps them gain agility and scalability and accelerate time-to-market for new features.

Each microservice performs a single function. However, a microservice might need to retrieve and process data from multiple disparate sources. These can include data stores, legacy systems, or other shared services deployed on premises in data centers or in the cloud. These scenarios add latency to the microservice response time because multiple real-time calls are required to the backend systems. The latency often ranges from milliseconds to a few seconds depending on size of the data, network bandwidth, and processing logic. In certain scenarios, it makes sense to maintain a cache close to the microservices layer to improve performance by reducing or eliminating the need for the real-time backend calls.

Caches reduce latency and service-to-service communication of microservice architectures. A cache is a high-speed data storage layer that stores a subset of data. When data is requested from a cache, it is delivered faster than if you accessed the data’s primary storage location.

While working with our customers, we have observed use cases where data caching helps reduce latency in the microservices layer. Caching can be implemented in several ways. In this blog post, we discuss a couple of these use cases that customers have built. In both use cases, the microservices layer is created using Serverless on AWS offerings. It requires data from multiple data sources deployed locally in the cloud or on premises. The compute layer is built using AWS Lambda. Though Lambda functions are short-lived, the cached data can be used by subsequent instances of the same microservice to avoid backend calls.

Use case 1: On-demand cache to reduce real-time calls

In this use case, the Cache-Aside design pattern is used for lazy loading of frequently accessed data. This means that an object is only cached when it is requested by a consumer, and the respective microservice decides if the object is worth saving.

This use case is typically useful when the microservices layer makes multiple real-time calls to fetch and process data. These calls can be greatly reduced by caching frequently accessed data for a short period of time.

Let’s discuss a real-world scenario. Figure 1 shows a customer portal that provides a list of car loans, their status, and the net outstanding amount for a customer:

  • The Billing microservice gets a request. It then tries to get required objects (for example, the list of car loans, their status, and the net outstanding balance) from the cache using an object_key. If the information is available in the cache, a response is sent back to the requester using cached data.
  • If requested objects are not available in the cache (a cache miss), the Billing microservice makes multiple calls to local services, applications, and data sources to retrieve data. The result is compiled and sent back to the requester. It also resides in the cache for a short period of time.
  • Meanwhile, if a customer makes a payment using the Payment microservice, the balance amount in the cache must be invalidated/deleted. The Payment microservice processes the payment and invokes an asynchronous event (payment_processed) with the respective object key for the downstream processes that will remove respective objects from the cache.
  • The events are stored in the event store.
  • The CacheManager microservice gets the event (payment_processed) and makes a delete request to the cache for the respective object_key. If necessary, the CacheManager can also refresh cached data. It can call a resource within the Billing service or it can refresh data directly from the source system depending on the data refresh logic.
Reducing latency by caching frequently accessed data on demand

Figure 1. Reducing latency by caching frequently accessed data on demand

Figure 2 shows AWS services for use case 1. The microservices layer (Billing, Payments, and Profile) is created using Lambda. The Amazon API Gateway is exposing Lambda functions as API operations to the internal or external consumers.

Suggested AWS services for implementing use case 1

Figure 2. Suggested AWS services for implementing use case 1

All three microservices are connected with the data cache and can save and retrieve objects from the cache. The cache is maintained in-memory using Amazon ElastiCache. The data objects are kept in cache for a short period of time. Every object has an associated TTL (time to live) value assigned to it. After that time period, the object expires. The custom events (such as payment_processed) are published to Amazon EventBridge for downstream processing.

Use case 2: Proactive caching of massive volumes of data

During large modernization and migration initiatives, not all data sources are colocated for a certain period of time. Some legacy systems, such as mainframe, require a longer decommissioning period. Many legacy backend systems process data through periodic batch jobs. In such scenarios, front-end applications can use cached data for a certain period of time (ranging from a few minutes to few hours) depending on nature of data and its usage. The real-time calls to the backend systems cannot deal with the extensive call volume on the front-end application.

In such scenarios, required data/objects can be identified up front and loaded directly into the cache through an automated process as shown in Figure 3:

  • An automated process loads data/objects in the cache during the initial load. Subsequent changes to the data sources (either in a mainframe database or another system of record) are captured and applied to the cache through an automated CDC (change data capture) pipeline.
  • Unlike use case 1, the microservices layer does not make real-time calls to load data into the cache. In this use case, microservices use data already cached for their processing.
  • However, the microservices layer may create an event if data in the cache is stale or specific objects have been changed by another service (for example, by the Payment service when a payment is made).
  • The events are stored in Event Manager. Upon receiving an event, the CacheManager initiates a backend process to refresh stale data on demand.
  • All data changes are sent directly to the system of record.
Eliminating real-time calls by caching massive data volumes proactively

Figure 3. Eliminating real-time calls by caching massive data volumes proactively

As shown in Figure 4, the data objects are maintained in Amazon DynamoDB, which provides low-latency data access at any scale. The data retrieval is managed through DynamoDB Accelerator (DAX), a fully managed, highly available, in-memory cache. It delivers up to a 10 times performance improvement, even at millions of requests per second.

Suggested AWS services for implementing use case 2

Figure 4. Suggested AWS services for implementing use case 2

The data in DynamoDB can be loaded through different methods depending on the customer use case and technology landscape. API Gateway, Lambda, and EventBridge are providing similar functionality as described in use case 1.

Use case 2 is also beneficial in scenarios where front-end applications must cache data for an extended period of time, such as a customer’s shopping cart.

In addition to caching, the following best practices can also be used to reduce latency and to improve performance within the Lambda compute layer:

Conclusion

The microservices architecture allows you to build several caching layers depending on your use case. In this blog, we discussed data caching within the compute layer to reduce latency when data is retrieved from disparate sources. The information from use case 1 can help you reduce real-time calls to your back-end system by saving frequently used data to the cache. Use case 2 helps you maintain large volumes of data in caches for extended periods of time when real-time calls to the backend system are not possible.

Architecting a Highly Available Serverless, Microservices-Based Ecommerce Site

Post Syndicated from Senthil Kumar original https://aws.amazon.com/blogs/architecture/architecting-a-highly-available-serverless-microservices-based-ecommerce-site/

The number of ecommerce vendors is growing globally—they often handle large traffic at different times of the day and different days of the year. This, in addition to building, managing, and maintaining IT infrastructure on-premises data centers can present challenges to ecommerce businesses’ scalability and growth.

This blog provides you a Serverless on AWS solution that offloads the undifferentiated heavy lifting of managing resources and ensures your businesses’ architecture can handle peak traffic.

Common architecture set up versus serverless solution

The following sections describe a common monolithic architecture and our suggested alternative approach: setting up microservices-based order submission and product search modules. These modules are independently deployable and scalable.

Typical monolithic architecture

Figure 1 shows how a typical on-premises ecommerce infrastructure with different tiers is set up:

  • Web servers serve static assets and proxy requests to application servers
  • Application servers process ecommerce business logic and authentication logic
  • Databases store user and other dynamic data
  • Firewall and load balancers provide network components for load balancing and network security
Monolithic on-premises ecommerce infrastructure with different tiers

Figure 1. Monolithic on-premises ecommerce infrastructure with different tiers

Monolithic architecture tightly couples different layers of the application. This prevents them from being independently deployed and scaled.

Microservices-based modules

Order submission workflow module

This three-layer architecture can be set up in the AWS Cloud using serverless components:

  • Static content layer (Amazon CloudFront and Amazon Simple Storage Service (Amazon S3)). This layer stores static assets on Amazon S3. By using CloudFront in front of the S3 storage cache, you can deliver assets to customers globally with low latency and high transfer speeds.
  • Authentication layer (Amazon Cognito or customer proprietary layer). Ecommerce sites deliver authenticated and unauthenticated content to the user. With Amazon Cognito, you can manage users’ sign-up, sign-in, and access controls, so this authentication layer ensures that only authenticated users have access to secure data.
  • Dynamic content layer (AWS Lambda and Amazon DynamoDB). All business logic required for the ecommerce site is handled by the dynamic content layer. Using Lambda and DynamoDB ensures that these components are scalable and can handle peak traffic.

As shown in Figure 2, the order submission workflow is split into two sections: synchronous and asynchronous.

By splitting the order submission workflow, you allow users to submit their order details and get an orderId. This makes sure that they don’t have to wait for backend processing to complete. This helps unburden your architecture during peak shopping periods when the backend process can get busy.

Microservices-based order submission workflow

Figure 2. Microservices-based order submission workflow

The details of the order, such as credit card information in encrypted form, shipping information, etc., are stored in DynamoDB. This action invokes an asynchronous workflow managed by AWS Step Functions.

Figure 3 shows sample step functions from the asynchronous process. In this scenario, you are using external payment processing and shipping systems. When both systems get busy, step functions can manage long-running transactions and also the required retry logic. It uses a decision-based business workflow, so if a payment transaction fails, the order can be canceled. Or, once payment is successful, the order can proceed.

Amazon Simple Notification Service (Amazon SNS) notifies users whenever their order status changes. You can even extend Step Functions to have it react based on status of shipping.

Sample AWS Step Functions asynchronous workflow that uses external payment processing service and shipping system

Figure 3. Sample AWS Step Functions asynchronous workflow that uses external payment processing service and shipping system

Product search module

Our product search module is set up using the following serverless components:

  • Amazon Elasticsearch Service (Amazon ES) stores product data, which is updated whenever product-related data changes.
  • Lambda formats the data.
  • Amazon API Gateway allows users to search without authentication. As shown in Figure 4, searching for products on the ecommerce portal does not require users to log in. All traffic via API Gateway is unauthenticated.
Microservices-based product search workflow module with dynamic traffic through API Gateway

Figure 4. Microservices-based product search workflow module with dynamic traffic through API Gateway

Replicating data across Regions

If your ecommerce application runs on multiple Regions, it may require the content and data to be replicated. This allows the application to handle local traffic from that Region and also act as a failover option if the application fails in another Region. The content and data are replicated using the multi-Region replication features of Amazon S3 and DynamoDB global tables.

Figure 5 shows a multi-Region ecommerce site built on AWS with serverless services. It uses the following features to make sure that data between all Regions are in sync for data/assets that do not need data residency compliance:

  • Amazon S3 multi-Region replication keeps static assets in sync for assets.
  • DynamoDB global tables keeps dynamic data in sync across Regions.

Assets that are specific to their Region are stored in Regional specific buckets.

Data replication for a multi-Region ecommerce website built using serverless components

Figure 5. Data replication for a multi-Region ecommerce website built using serverless components

Amazon Route 53 DNS web service manages traffic failover from one Region to another. Route 53 provides different routing policies, and depending on your business requirement, you can choose the failover routing policy.

Best practices

Now that we’ve shown you how to build these applications, make sure you follow these best practices to effectively build, deploy, and monitor the solution stack:

  • Infrastructure as Code (IaC). A well-defined, repeatable infrastructure is important for managing any solution stack. AWS CloudFormation allows you to treat your infrastructure as code and provides a relatively easy way to model a collection of related AWS and third-party resources.
  • AWS Serverless Application Model (AWS SAM). An open-source framework. Use it to build serverless applications on AWS.
  • Deployment automation. AWS CodePipeline is a fully managed continuous delivery service that automates your release pipelines for fast and reliable application and infrastructure updates.
  • AWS CodeStar. Allows you to quickly develop, build, and deploy applications on AWS. It provides a unified user interface, enabling you to manage all of your software development activities in one place.
  • AWS Well-Architected Framework. Provides a mechanism for regularly evaluating your workloads, identifying high risk issues, and recording your improvements.
  • Serverless Applications Lens. Documents how to design, deploy, and architect serverless application workloads.
  • Monitoring. AWS provides many services that help you monitor and understand your applications, including Amazon CloudWatch, AWS CloudTrail, and AWS X-Ray.

Conclusion

In this blog post, we showed you how to architect a highly available, serverless, and microservices-based ecommerce website that operates in multiple Regions.

We also showed you how to replicate data between different Regions for scaling and if your workload fails. These serverless services reduce the burden of building and managing physical IT infrastructure to help you focus more on building solutions.

Related information

Building well-architected serverless applications: Implementing application workload security – part 2

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/building-well-architected-serverless-applications-implementing-application-workload-security-part-2/

This series of blog posts uses the AWS Well-Architected Tool with the Serverless Lens to help customers build and operate applications using best practices. In each post, I address the serverless-specific questions identified by the Serverless Lens along with the recommended best practices. See the introduction post for a table of contents and explanation of the example application.

Security question SEC3: How do you implement application security in your workload?

This post continues part 1 of this security question. Previously, I cover reviewing security awareness documentation such as the Common Vulnerabilities and Exposures (CVE) database. I show how to use GitHub security features to inspect and manage code dependencies. I then show how to validate inbound events using Amazon API Gateway request validation.

Required practice: Store secrets that are used in your code securely

Store secrets such as database passwords or API keys in a secrets manager. Using a secrets manager allows for auditing access, easier rotation, and prevents exposing secrets in application source code. There are a number of AWS and third-party solutions to store and manage secrets.

AWS Partner Network (APN) member Hashicorp provides Vault to keep secrets and application data secure. Vault has a centralized workflow for tightly controlling access to secrets across applications, systems, and infrastructure. You can store secrets in Vault and access them from an AWS Lambda function to, for example, access a database. You can use the Vault Agent for AWS to authenticate with Vault, receive the database credentials, and then perform the necessary queries. You can also use the Vault AWS Lambda extension to manage the connectivity to Vault.

AWS Systems Manager Parameter Store allows you to store configuration data securely, including secrets, as parameter values.

AWS Secrets Manager enables you to replace hardcoded credentials in your code with an API call to Secrets Manager to retrieve the secret programmatically. You can protect, rotate, manage, and retrieve database credentials, API keys, and other secrets throughout their lifecycle. You can also generate secure secrets. By default, Secrets Manager does not write or cache the secret to persistent storage.

Parameter Store integrates with Secrets Manager. For more information, see “Referencing AWS Secrets Manager secrets from Parameter Store parameters.”

To show how Secrets Manager works, deploy the solution detailed in “How to securely provide database credentials to Lambda functions by using AWS Secrets Manager”.

The AWS Cloud​Formation stack deploys an Amazon RDS MySQL database with a randomly generated password. This is stored in Secrets Manager using a secret resource. A Lambda function behind an API Gateway endpoint returns the record count in a table from the database, using the required credentials. Lambda function environment variables store the database connection details and which secret to return for the database password. The password is not stored as an environment variable, nor in the Lambda function application code.

Lambda environment variables for Secrets Manager

Lambda environment variables for Secrets Manager

The application flow is as follows:

  1. Clients call the API Gateway endpoint
  2. API Gateway invokes the Lambda function
  3. The Lambda function retrieves the database secrets using the Secrets Manager API
  4. The Lambda function connects to the RDS database using the credentials from Secrets Manager and returns the query results

View the password secret value in the Secrets Manager console, which is randomly generated as part of the stack deployment.

Example password stored in Secrets Manager

Example password stored in Secrets Manager

The Lambda function includes the following code to retrieve the secret from Secrets Manager. The function then uses it to connect to the database securely.

secret_name = os.environ['SECRET_NAME']
rds_host = os.environ['RDS_HOST']
name = os.environ['RDS_USERNAME']
db_name = os.environ['RDS_DB_NAME']

session = boto3.session.Session()
client = session.client(
	service_name='secretsmanager',
	region_name=region_name
)
get_secret_value_response = client.get_secret_value(
	SecretId=secret_name
)
...
secret = get_secret_value_response['SecretString']
j = json.loads(secret)
password = j['password']
...
conn = pymysql.connect(
	rds_host, user=name, passwd=password, db=db_name, connect_timeout=5)

Browsing to the endpoint URL specified in the Cloud​Formation output displays the number of records. This confirms that the Lambda function has successfully retrieved the secure database credentials and queried the table for the record count.

Lambda function retrieving database credentials

Lambda function retrieving database credentials

Audit secrets access through a secrets manager

Monitor how your secrets are used to confirm that the usage is expected, and log any changes to them. This helps to ensure that any unexpected usage or change can be investigated, and unwanted changes can be rolled back.

Hashicorp Vault uses Audit devices that keep a detailed log of all requests and responses to Vault. Audit devices can append logs to a file, write to syslog, or write to a socket.

Secrets Manager supports logging API calls with AWS CloudTrail. CloudTrail captures all API calls for Secrets Manager as events. This includes calls from the Secrets Manager console and from code calling the Secrets Manager APIs.

Viewing the CloudTrail event history shows the requests to secretsmanager.amazonaws.com. This shows the requests from the console in addition to the Lambda function.

CloudTrail showing access to Secrets Manager

CloudTrail showing access to Secrets Manager

Secrets Manager also works with Amazon EventBridge so you can trigger alerts when administrator-specified operations occur. You can configure EventBridge rules to alert on deleted secrets or secret rotation. You can also create an alert if anyone tries to use a secret version while it is pending deletion. This can identify and alert when there is an attempt to use an out-of-date secret.

Enforce least privilege access to secrets

Access to secrets must be tightly controlled because the secrets contain sensitive information. Create AWS Identity and Access Management (IAM) policies that enable minimal access to secrets to prevent credentials being accidentally used or compromised. Secrets that have policies that are too permissive could be misused by other environments or developers. This can lead to accidental data loss or compromised systems. For more information, see “Authentication and access control for AWS Secrets Manager”.

Rotate secrets frequently.

Rotating your workload secrets is important. This prevents misuse of your secrets since they become invalid within a configured time period.

Secrets Manager allows you to rotate secrets on a schedule or on demand. This enables you to replace long-term secrets with short-term ones, significantly reducing the risk of compromise. Secrets Manager creates a CloudFormation stack with a Lambda function to manage the rotation process for you. Secrets Manager has native integrations with Amazon RDS, Amazon Redshift, and Amazon DocumentDB. It populates the function with the Amazon Resource Name (ARN) of the secret. You specify the permissions to rotate the credentials, and how often you want to rotate the secret.

The CloudFormation stack creates a MySecretRotationSchedule resource with a MyRotationLambda function to rotate the secret every 30 days.

MySecretRotationSchedule:
    Type: AWS::SecretsManager::RotationSchedule
    DependsOn: SecretRDSInstanceAttachment
    Properties:
    SecretId: !Ref MyRDSInstanceRotationSecret
    RotationLambdaARN: !GetAtt MyRotationLambda.Arn
    RotationRules:
        AutomaticallyAfterDays: 30
MyRotationLambda:
    Type: AWS::Serverless::Function
    Properties:
    Runtime: python3.7
    Role: !GetAtt MyLambdaExecutionRole.Arn
    Handler: mysql_secret_rotation.lambda_handler
    Description: 'This is a lambda to rotate MySql user passwd'
    FunctionName: 'cfn-rotation-lambda'
    CodeUri: 's3://devsecopsblog/code.zip'      
    Environment:
        Variables:
        SECRETS_MANAGER_ENDPOINT: !Sub 'https://secretsmanager.${AWS::Region}.amazonaws.com'

View and edit the rotation settings in the Secrets Manager console.

Secrets Manager rotation settings

Secrets Manager rotation settings

Manually rotate the secret by selecting Rotate secret immediately. This invokes the Lambda function, which updates the database password and updates the secret in Secrets Manager.

View the updated secret in Secrets Manager, where the password has changed.

Secrets Manager password change

Secrets Manager password change

Browse to the endpoint URL to confirm you can still access the database with the updated credentials.

Access endpoint with updated Secret Manager password

Access endpoint with updated Secret Manager password

You can provide your own code to customize a Lambda rotation function for other databases or services. The code includes the commands required to interact with your secured service to update or add credentials.

Conclusion

Implementing application security in your workload involves reviewing and automating security practices at the application code level. By implementing code security, you can protect against emerging security threats. You can improve the security posture by checking for malicious code, including third-party dependencies.

In this post, I continue from part 1, looking at securely storing, auditing, and rotating secrets that are used in your application code.

In the next post in the series, I start to cover the reliability pillar from the Well-Architected Serverless Lens with regulating inbound request rates.

For more serverless learning resources, visit Serverless Land.

Coming soon: Expansion of AWS Lambda states to all functions

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/coming-soon-expansion-of-aws-lambda-states-to-all-functions/

In November of 2019, we announced AWS Lambda function state attributes, a capability to track the current “state” of a function throughout its lifecycle.

Since launch, states have been used in two primary use-cases. First, to move the blocking setup of VPC resources out of the path of function invocation. Second, to allow the Lambda service to optimize new or updated container images for container-image based functions, also before invocation. By moving this additional work out of the path of the invocation, customers see lower latency and better consistency in their function performance. Soon, we will be expanding states to apply to all Lambda functions.

This post outlines the upcoming change, any impact, and actions to take during the roll out of function states to all Lambda functions. Most customers experience no impact from this change.

As functions are created or updated, or potentially fall idle due to low usage, they can transition to a state associated with that lifecycle event. Previously any function that was zip-file based and not attached to a VPC would only show an Active state. Updates to the application code and modifications of the function configuration would always show the Successful value for the LastUpdateStatus attribute. Now all functions will follow the same function state lifecycles described in the initial announcement post and in the documentation for Monitoring the state of a function with the Lambda API.

All AWS CLIs and SDKs have supported monitoring Lambda function states transitions since the original announcement in 2019. Infrastructure as code tools such as AWS CloudFormation, AWS SAM, Serverless Framework, and Hashicorp Terraform also already support states. Customers using these tools do not need to take any action as part of this, except for one recommended service role policy change for AWS CloudFormation customers (see Updating CloudFormation’s service role below).

However, there are some customers using SDK-based automation workflows, or calling Lambda’s service APIs directly, that must update those workflows for this change. To allow time for testing this change, we are rolling it out in a phased model, much like the initial rollout for VPC attached functions. We encourage all customers to take this opportunity to move to the latest SDKs and tools available.

Change details

Nothing is changing about how functions are created, updated, or operate as part of this. However, this change may impact certain workflows that attempt to invoke or modify a function shortly after a create or an update action. Before making API calls to a function that was recently created or modified, confirm it is first in the Active state, and that the LastUpdateStatus is Successful.

For a full explanation of both the create and update lifecycles, see Tracking the state of AWS Lambda functions.

Create function state lifecycle

Create function state lifecycle

Update function state lifecycle

Update function state lifecycle

Change timeframe

We are rolling out this change over a multiple phase period, starting with the Begin Testing phase today, July 12, 2021. The phases allow you to update tooling for deploying and managing Lambda functions to account for this change. By the end of the update timeline, all accounts transition to using the create/update Lambda lifecycle.

July 12 2021– Begin Testing: You can now begin testing and updating any deployment or management tools you have to account for the upcoming lifecycle change. You can also use this time to update your function configuration to delay the change until the End of Delayed Update.

September 6 2021 – General Update (with optional delayed update configuration): All customers without the delayed update configuration begin seeing functions transition through the lifecycles for create and update. Customers that have used the delay update configuration as described below will not see any change.

October 01 2021 – End of Delayed Update: The delay mechanism expires and customers now see the Lambda states lifecycle applied during function create or update.

Opt-in and delayed update configurations

Starting today, we are providing a mechanism for an opt-in. This allows you to update and test your tools and developer workflow processes for this change. We are also providing a mechanism to delay this change until the End of Delayed Update date. After the End of Delayed Update date, all functions will begin using the Lambda states lifecycle.

This mechanism operates on a function-by-function basis, so you can test and experiment individually without impacting your whole account. Once the General Update phase begins, all functions in an account that do not have the delayed update mechanism in place see the new lifecycle for their functions.

Both mechanisms work by adding a special string in the “Description” parameter of Lambda functions. You can add this string anywhere in this parameter. You can opt to add it to the prefix or suffix, or set the entire contents of the field. This parameter is processed at create or update in accordance with the requested action.

To opt in:

aws:states:opt-in

To delay the update:

aws:states:opt-out

NOTE: Delay configuration mechanism has no impact after the end of the Delayed Update.

Here is how this looks in the console:

I add the opt-in configuration to my function’s Description. You can find this under Configuration -> General Configuration in the Lambda console. Choose Edit to change the value.

Edit basic settings

Edit basic settings

After choosing Save, you can see the value in the console:

Opt-in flag set

Opt-in flag set

Once the opt-in is set for a function, then updates on that function go through the preceding update flow.

Checking a function’s state

With this in place, you can now test your development workflow ahead of the General Update phase. Download the latest AWS CLI (version 2.2.18 or greater) or SDKs to see function state and related attribute information.

You can confirm the current state of a function using the AWS APIs or AWS CLI to perform the GetFunction or GetFunctionConfiguration API or command for a specified function:

$ aws lambda get-function --function-name MY_FUNCTION_NAME --query 'Configuration.[State, LastUpdateStatus]'
[
    "Active",
    "Successful"
]

This returns the State and LastUpdateStatus in order for a function.

Updating CloudFormation’s service role

CloudFormation allows customers to create an AWS Identity and Access Management (IAM) service role to make calls to resources in a stack on your behalf. Customers can use service roles to allow or deny the ability to create, update, or delete resources in a stack.

As part of the rollout of function states for all functions, we recommend that customers configure CloudFormation service roles with an Allow for the “lambda:GetFunction” API. This API allows CloudFormation to get the current state of a function, which is required to assist in the creation and deployment of functions.

Conclusion

With function states, you can have better clarity on how the resources required by your Lambda function are being created. This change does not impact the way that functions are invoked or how your code is run. While this is a minor change to when resources are created for your Lambda function, the result is even better consistency of working with the service.

For more serverless learning resources, visit Serverless Land.

Understanding data streaming concepts for serverless applications

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/understanding-data-streaming-concepts-for-serverless-applications/

Amazon Kinesis is a suite of managed services that can help you collect, process, and analyze streaming data in near-real time. It consists of four separate services that are designed for common tasks with streaming data: This blog post focuses on Kinesis Data Streams.

One of the main benefits of processing streaming data is that an application can react as new data is generated, instead of waiting for batches. This real-time capability enables new functionality for applications. For example, payment processors can analyze payments in real time to detect fraudulent transactions. Ecommerce websites can use streams of clickstream activity to determine site engagement metrics in near-real time.

Kinesis can be used with Amazon EC2-based and container-based workloads. However, its integration with AWS Lambda can make it a useful data source for serverless applications. Using Lambda as a stream consumer can also help minimize the amount of operational overhead for managing streaming applications.

In this post, I explain important streaming concepts and how they affect the design of serverless applications. This post references the Alleycat racing application. Alleycat is a home fitness system that allows users to compete in an intense series of 5-minute virtual bicycle races. Up to 1,000 racers at a time take the saddle and push the limits of cadence and resistance to set personal records and rank on leaderboards. The Alleycat software connects the stationary exercise bike with a backend application that processes the data from thousands of remote devices.

The Alleycat frontend allows users to configure their races and view real-time leaderboard and historical rankings. The frontend could wait until the end of each race and collect the total output from each racer. Once the batch is ready, it could rank the results and provide a leaderboard once the race is completed. However, this is not very engaging for competitors. By using streaming data instead of a batch, the application show the racers a view of who is winning during the race. This makes the virtual environment more like a real-life cycling race.

Producers and consumers

In streaming data workloads, producers are the applications that produce data and consumers are those that process it. In a serverless streaming application, a consumer is usually a Lambda function, Amazon Kinesis Data Firehose, or Amazon Kinesis Data Analytics.

Kinesis producers and consumers

There are a number of ways to put data into a Kinesis stream in serverless applications, including direct service integrations, client libraries, and the AWS SDK.

Producer

Kinesis Data Streams

Kinesis Data Firehose

Amazon CloudWatch Logs Yes, using subscription filters Yes, using subscription filters
AWS IoT Core Yes, using IoT rule actions Yes, using IoT rule actions
AWS Database Migration Service Yes – set stream as target Not directly.
Amazon API Gateway Yes, via REST API direct service integration Yes, via REST API direct service integration
AWS Amplify Yes – via JavaScript library Not directly
AWS SDK Yes Yes

A single stream may have tens of thousands of producers, which could be web or mobile applications or IoT devices. The Alleycat application uses the AWS IoT SDK for JavaScript to publish messages to an IoT topic. An IoT rule action then uses a direct integration with Kinesis Data Streams to push the data to the stream. This configuration is ideal at the device level, especially since the device may already use AWS IoT Core to receive messages.

The Alleycat simulator uses the AWS SDK to send a large number of messages to the stream. The SDK provides two methods: PutRecord and PutRecords. The first allows you to send a single record, while the second supports up to 500 records per request (or up to 5 MB in total). The simulator uses the putRecords JavaScript API to batch messages to the stream.

A producer can put records directly on a stream, for example via the AWS SDK, or indirectly via other services such as Amazon API Gateway or AWS IoT Core. If direct, the producer must have appropriate permission to write data to the stream. If indirect, the producer must have permission to invoke the proxy service, and then the service must have permission to put data onto the stream.

While there may be many producers, there are comparatively fewer consumers. You can register up to 20 consumers per data stream, which share the outgoing throughout limit per shard. Consumers receive batches of records sequentially, which means processing latency increases as you add more consumers to a stream. For latency-sensitive applications, Kinesis offers enhanced fan-out which gives consumers 2 MB per second dedicated throughput and uses a push model to reduce latency.

Shards, streams, and partition keys

A shard is a sequence of data records in a stream with a fixed capacity. Part of Kinesis billing is based upon the number of shards. A single shard can process up to 1 MB per second or 1,000 records of incoming data. One shard can also send up to 2 MB per second of outgoing data to downstream consumers. These are hard limits on the throughputs of a shard and as your application approaches these limits, you must add more shards to avoid exceeding these limits.

A stream is a collection of these shards and is often a grouping at the workload or project level. Adding another shard to a stream effectively doubles the throughput, though it also doubles the cost. When there is only one shard in a stream, all records sent to that sent are routed to the same shard. With multiple shards, the routing of incoming messages to shards is determined by a partition key.

The data producer adds the partition key before sending the record to Kinesis. The service calculates an MD5 hash of the key, which maps to one of the shards in the stream. Each shard is assigned a range of non-overlapping hash values, so each partition key maps to only one shard.

MD5 hash function

The partition key exists as an alternative to specifying a shard ID directly, since it’s common in production applications to add and remove shards depending upon traffic. How you use the partition key determines the shard-mapping behavior. For example:

  • Same value: If you specify the same string as the partition key, every message is routed to a single shard, regardless of the number of shards in the stream. This is called overheating a shard.
  • Random value: Using a pseudo-random value, such as a UUID, evenly distributes messages between all the shards available.
  • Time-based: Using a timestamp as a partition key may result in a preference for a single shard if multiple messages arrive at the same time.
  • Applicationspecific: The Alleycat application uses the raceId as a partition key to ensure that all messages from a single race are processed by the same shard consumer.

A Lambda function is a consumer application for a data stream and processes one batch of records for each shard. Since Alleycat uses a tumbling window to calculate aggregates between batches, this use of the partition key ensures that all messages for each raceId are processed by the same function. The downside to this architecture is that it is limited to 1,000 incoming messages per second with the same raceId since it is bound to a single shard.

Deciding on a partition key strategy depends upon the specific needs of your workload. In most cases, a random value partition key is often the best approach.

Streaming payloads in serverless applications

When using the SDK to put messages to a stream, the Data attribute can be a buffer, typed array, blob, or string. Combined with the partition key value, the maximum record size is 1 MB. The Data value is base64 encoded when serialized by Kinesis and delivered as an encoded value to downstream consumers. When using a Lambda function consumer, Kinesis delivers batches in a Records array. Each record contains the encoded data attribute, partition key, and additional metadata in a JSON envelope:

JSON transformation from producer to consumer

Ordering and idempotency

Records in a Kinesis stream are delivered to consuming applications in the same order that they arrive at the Kinesis service. The service assigns a sequence number to the record when it is received and this is delivered as part of the payload to a Kinesis consumer:

Sequence number in payload

When using Lambda as a consuming application for Kinesis, by default each shard has a single instance of the function processing records. In this case, ordering is guaranteed as Kinesis invokes the function serially, one batch of records at a time.

Parallelization factor of 1

You can increase the number of concurrent function invocations by setting the ParallelizationFactor on the event source mapping. This allows you to set a concurrency of between 1 and 10, which provides a way to increase Lambda throughout if the IteratorAge metric is increasing. However, one side effect is that ordering per shard is no longer guaranteed, since the shard’s messages are split into multiple subgroups based upon an internal hash.

Parallelization factor is 2

Kinesis guarantees that every record is delivered “at least once”, but occasionally messages are delivered more than once. This is caused by producers that retry messages, network-related timeouts, and consumer retries, which can occur when worker processes restart. In both cases, these are normal activities and you should design your application to handle infrequent duplicate records.

To prevent the duplicate messages causing unintentional side effects, such as charging a payment twice, it’s important to design your application with idempotency in mind. By using transaction IDs appropriately, your code can determine if a given message has been processed previously, and ignore any duplicates. In the Alleycat application, the aggregation and processes of messages is idempotent. If two identical messages are received, processing completes with the same end result.

To learn more about implementing idempotency in serverless applications, read the “Serverless Application Lens: AWS Well-Architected Framework”.

Conclusion

In this post, I introduce some of the core streaming concepts for serverless applications. I explain some of the benefits of streaming architectures and how Kinesis works with producers and consumers. I compare different ways to ingest data, how streams are composed of shards, and how partition keys determine which shard is used. Finally, I explain the payload formats at the different stages of a streaming workload, how message ordering works with shards, and why idempotency is important to handle.

To learn more about building serverless web applications, visit Serverless Land.

Integrating Amazon API Gateway private endpoints with on-premises networks

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/integrating-amazon-api-gateway-private-endpoints-with-on-premises-networks/

This post was written by Ahmed ElHaw, Sr. Solutions Architect

Using AWS Direct Connect or AWS Site-to-Site VPN, customers can establish a private virtual interface from their on-premises network directly to their Amazon Virtual Private Cloud (VPC). Hybrid networking enables customers to benefit from the scalability, elasticity, and ease of use of AWS services while using their corporate network.

Amazon API Gateway can make it easier for developers to interface with and expose other services in a uniform and secure manner. You can use it to interface with other AWS services such as Amazon SageMaker endpoints for real-time machine learning predictions or serverless compute with AWS Lambda. API Gateway can also integrate with HTTP endpoints and VPC links in the backend.

This post shows how to set up a private API Gateway endpoint with a Lambda integration. It uses a Route 53 resolver, which enables on-premises clients to resolve AWS private DNS names.

Overview

API Gateway private endpoints allow you to use private API endpoints inside your VPC. When used with Route 53 resolver endpoints and hybrid connectivity, you can access APIs and their integrated backend services privately from on-premises clients.

You can deploy the example application using the AWS Serverless Application Model (AWS SAM). The deployment creates a private API Gateway endpoint with a Lambda integration and a Route 53 inbound endpoint. I explain the security configuration of the AWS resources used. This is the solution architecture:

Private API Gateway with a Hello World Lambda integration.

Private API Gateway with a Hello World Lambda integration.

 

  1. The client calls the private API endpoint (for example, GET https://abc123xyz0.execute-api.eu-west-1.amazonaws.com/demostage).
  2. The client asks the on-premises DNS server to resolve (abc123xyz0.execute-api.eu-west-1.amazonaws.com). You must configure the on-premises DNS server to forward DNS queries for the AWS-hosted domains to the IP addresses of the inbound resolver endpoint. Refer to the documentation for your on-premises DNS server to configure DNS forwarders.
  3. After the client successfully resolves the API Gateway private DNS name, it receives the private IP address of the VPC Endpoint of the API Gateway.
    Note: Call the DNS endpoint of the API Gateway for the HTTPS certificate to work. You cannot call the IP address of the endpoint directly.
  4. Amazon API Gateway passes the payload to Lambda through an integration request.
  5. If Route 53 Resolver query logging is configured, queries from on-premises resources that use the endpoint are logged.

Prerequisites

To deploy the example application in this blog post, you need:

  • AWS credentials that provide the necessary permissions to create the resources. This example uses admin credentials.
  • Amazon VPN or AWS Direct Connect with routing rules that allow DNS traffic to pass through to the Amazon VPC.
  • The AWS SAM CLI installed.
  • Clone the GitHub repository.

Deploying with AWS SAM

  1. Navigate to the cloned repo directory. Alternatively, use the sam init command and paste the repo URL:

    SAM init example

    SAM init example

  2. Build the AWS SAM application:
    sam build
  3. Deploy the AWS SAM application:
    sam deploy –guided

This stack creates and configures a virtual private cloud (VPC) configured with two private subnets (for resiliency) and DNS resolution enabled. It also creates a VPC endpoint with (service name = “com.amazonaws.{region}.execute-api”), Private DNS Name = enabled, and a security group set to allow TCP Port 443 inbound from a managed prefix list. You can edit the created prefix list with one or more CIDR block(s).

It also deploys an API Gateway private endpoint and an API Gateway resource policy that restricts access to the API, except from the VPC endpoint. There is also a “Hello world” Lambda function and a Route 53 inbound resolver with a security group that allows TCP/UDP DNS port inbound from the on-premises prefix list.

A VPC endpoint is a logical construct consisting of elastic network interfaces deployed in subnets. The elastic network interface is assigned a private IP address from your subnet space. For high availability, deploy in at least two Availability Zones.

Private API Gateway VPC endpoint

Private API Gateway VPC endpoint

Route 53 inbound resolver endpoint

Route 53 resolver is the Amazon DNS server. It is sometimes referred to as “AmazonProvidedDNS” or the “.2 resolver” that is available by default in all VPCs. Route 53 resolver responds to DNS queries from AWS resources within a VPC for public DNS records, VPC-specific DNS names, and Route 53 private hosted zones.

Integrating your on-premises DNS server with AWS DNS server requires a Route 53 resolver inbound endpoint (for DNS queries that you’re forwarding to your VPCs). When creating an API Gateway private endpoint, a private DNS name is created by API Gateway. This endpoint is resolved automatically from within your VPC.

However, the on-premises servers learn about this hostname from AWS. For this, create a Route 53 inbound resolver endpoint and point your on-premises DNS server to it. This allows your corporate network resources to resolve AWS private DNS hostnames.

To improve reliability, the resolver requires that you specify two IP addresses for DNS queries. AWS recommends configuring IP addresses in two different Availability Zones. After you add the first two IP addresses, you can optionally add more in the same or different Availability Zone.

The inbound resolver is a logical resource consisting of two elastic network interfaces. These are deployed in two different Availability Zones for resiliency.

Route 53 inbound resolver

Route 53 inbound resolver

Configuring the security groups and resource policy

In the security pillar of the AWS Well-Architected Framework, one of the seven design principles is applying security at all layers: Apply a defense in depth approach with multiple security controls. Apply to all layers (edge of network, VPC, load balancing, every instance and compute service, operating system, application, and code).

A few security configurations are required for the solution to function:

  • The resolver security group (referred to as ‘ResolverSG’ in solution diagram) inbound rules must allow TCP and UDP on port 53 (DNS) from your on-premises network-managed prefix list (source). Note: configure the created managed prefix list with your on-premises network CIDR blocks.
  • The security group of the VPC endpoint of the API Gateway “VPCEndpointSG” must allow HTTPS access from your on-premises network-managed prefix list (source). Note: configure the crated managed prefix list with your on-premises network CIDR blocks.
  • For a private API Gateway to work, a resource policy must be configured. The AWS SAM deployment sets up an API Gateway resource policy that allows access to your API from the VPC endpoint. We are telling API Gateway to deny any request explicitly unless it is originating from a defined source VPC endpoint.
    Note: AWS SAM template creates the following policy:

    {
      "Version": "2012-10-17",
      "Statement": [
          {
              "Effect": "Allow",
              "Principal": "*",
              "Action": "execute-api:Invoke",
              "Resource": "arn:aws:execute-api:eu-west-1:12345678901:dligg9dxuk/DemoStage/GET/hello"
          },
          {
              "Effect": "Deny",
              "Principal": "*",
              "Action": "execute-api:Invoke",
              "Resource": "arn:aws:execute-api:eu-west-1: 12345678901:dligg9dxuk/DemoStage/GET/hello",
              "Condition": {
                  "StringNotEquals": {
                      "aws:SourceVpce": "vpce-0ac4147ba9386c9z7"
                  }
              }
          }
      ]
    }

     

The AWS SAM deployment creates a Hello World Lambda. For demonstration purposes, the Lambda function always returns a successful response, conforming with API Gateway integration response.

Testing the solution

To test, invoke the API using a curl command from an on-premises client. To get the API URL, copy it from the on-screen AWS SAM deployment outputs. Alternatively, from the console go to AWS CloudFormation outputs section.

CloudFormation outputs

CloudFormation outputs

Next, go to Route 53 resolvers, select the created inbound endpoint and note of the endpoint IP addresses. Configure your on-premises DNS forwarder with the IP addresses. Refer to the documentation for your on-premises DNS server to configure DNS forwarders.

Route 53 resolver IP addresses

Route 53 resolver IP addresses

Finally, log on to your on-premises client and call the API Gateway endpoint. You should get a success response from the API Gateway as shown.

curl https://dligg9dxuk.execute-api.eu-west-1.amazonaws.com/DemoStage/hello

{"response": {"resultStatus": "SUCCESS"}}

Monitoring and troubleshooting

Route 53 resolver query logging allows you to log the DNS queries that originate in your VPCs. It shows which domain names are queried, the originating AWS resources (including source IP and instance ID) and the responses.

You can log the DNS queries that originate in VPCs that you specify, in addition to the responses to those DNS queries. You can also log DNS queries from on-premises resources that use an inbound resolver endpoint, and DNS queries that use an outbound resolver endpoint for recursive DNS resolution.

After configuring query logging from the console, you can use Amazon CloudWatch as the destination for the query logs. You can use this feature to view and troubleshoot the resolver.

{
    "version": "1.100000",
    "account_id": "1234567890123",
    "region": "eu-west-1",
    "vpc_id": "vpc-0c00ca6aa29c8472f",
    "query_timestamp": "2021-04-25T12:37:34Z",
    "query_name": "dligg9dxuk.execute-api.eu-west-1.amazonaws.com.",
    "query_type": "A",
    "query_class": "IN",
    "rcode": "NOERROR",
    "answers": [
        {
            "Rdata": "10.0.140.226”, API Gateway VPC Endpoint IP#1
            "Type": "A",
            "Class": "IN"
        },
        {
            "Rdata": "10.0.12.179", API Gateway VPC Endpoint IP#2
            "Type": "A",
            "Class": "IN"
        }
    ],
    "srcaddr": "172.31.6.137", ONPREMISES CLIENT
    "srcport": "32843",
    "transport": "UDP",
    "srcids": {
        "resolver_endpoint": "rslvr-in-a7dd746257784e148",
        "resolver_network_interface": "rni-3a4a0caca1d0412ab"
    }
}

Cleaning up

To remove the example application, navigate to CloudFormation and delete the stack.

Conclusion

API Gateway private endpoints allow use cases for building private API–based services inside your VPCs. You can keep both the frontend to your application (API Gateway) and the backend service private inside your VPC.

I discuss how to access your private APIs from your corporate network through Direct Connect or Site-to-Site VPN without exposing your endpoints to the internet. You deploy the demo using AWS Serverless Application Model (AWS SAM). You can also change the template for your own needs.

To learn more, visit the API Gateway tutorials and workshops page in the API Gateway developer guide.

For more serverless learning resources, visit Serverless Land.

Using serverless to load test Amazon API Gateway with authorization

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/using-serverless-to-load-test-amazon-api-gateway-with-authorization/

This post was written by Ashish Mehra, Sr. Solutions Architect and Ramesh Chidirala, Solutions Architect

Many customers design their applications to use Amazon API Gateway as the front door and load test their API endpoints before deploying to production. Customers want to simulate the actual usage scenario, including authentication and authorization. The load test ensures that the application works as expected under high traffic and spiky load patterns.

This post demonstrates using AWS Step Functions for orchestration, AWS Lambda to simulate the load and Amazon Cognito for authentication and authorization. There is no need to use any third-party software or containers to implement this solution.

The serverless load test solution shown here can scale from 1,000 to 1,000,000 calls in a few minutes. It invokes API Gateway endpoints but you can reuse the solution for other custom API endpoints.

Overall architecture

Overall architecture diagram

Overall architecture diagram

Solution design 

The serverless API load test framework is built using Step Functions that invoke Lambda functions using a fan-out design pattern. The Lambda function obtains the user specific JWT access token from Amazon Cognito user pool and invokes the API Gateway authenticated route..

The solution contains two workflows.

1. Load test workflow

The load test workflow comprises a multi-step process that includes a combination of sequential and parallel steps. The sequential steps include user pool configuration, user creation, and access token generation followed by API invocation in a fan-out design pattern. Step Functions provides a reliable way to build and run such multi-step workflows with support for logging, retries, and dynamic parallelism.

Step Functions workflow diagram for load test

Step Functions workflow diagram for load test

The Step Functions state machine orchestrates the following workflow:

  1. Validate input parameters.
  2. Invoke Lambda function to create a user ID array in the series loadtestuser0, loadtestuser1, and so on. This array is passed as an input to subsequent Lambda functions.
  3. Invoke Lambda to create:
    1. Amazon Cognito user pool
    2. Test users
    3. App client configured for admin authentication flow.
  4. Invoke Lambda functions in a fan-out pattern using dynamic parallelism support in Step Functions. Each function does the following:
    1. Retrieves an access token (one token per user) from Amazon Cognito
    2. Sends an HTTPS request to the specified API Gateway endpoint by passing an access token in the header.

For testing purposes, users can configure mock integration or use Lambda integration for the backend.

2. Cleanup workflow

Step Functions workflow diagram for cleanup

Step Functions workflow diagram for cleanup

As part of the cleanup workflow, the Step Functions state machine invokes a Lambda function to delete the specified number of users from the Amazon Cognito user pool.

Prerequisites to implement the solution

The following prerequisites are required for this walk-through:

  1. AWS account
  2. AWS SAM CLI
  3. Python 3.7
  4. Pre-existing non-production API Gateway HTTP API deployed with a JWT authorizer that uses Amazon Cognito as an identity provider. Refer to this video from the Twitch series #SessionsWithSAM which provides a walkthough for building and deploying a simple HTTP API with JWT authorizer.

Since this solution involves modifying API Gateway endpoint’s authorizer settings, it is recommended to load test non-production environments or production comparable APIs. Revert these settings after the load test is complete. Also, first check Lambda and Amazon Cognito Service Quotas in the AWS account you plan to use.

Step-by-step instructions

Use the AWS CloudShell to deploy the AWS Serverless Application Model (AWS SAM) template. AWS CloudShell is a browser-based shell pre-installed with common development tools. It includes 1 GB of free persistent storage per Region pre-authenticated with your console credentials. You can also use AWS Cloud9 or your preferred IDE. You can check for AWS CloudShell supported Regions here. Depending on your load test requirements, you can specify the total number of unique users to be created. You can also specify the number of API Gateway requests to be invoked per user every time you run the load test. These factors influence the overall test duration, concurrency and cost. Refer to the cost optimization section of this post for tips on minimizing the overall cost of the solution. Refer to the cleanup section of this post for instructions to delete the resources to stop incurring any further charges.

  1. Clone the repository by running the following command:
    git clone https://github.com/aws-snippets/sam-apiloadtest.git
  2. Change to the sam-apiloadtest directory and run the following command to build the application source:
    sam build
  3. Run the following command to package and deploy the application to AWS, with a series of prompts. When prompted for apiGatewayUrl, provide the API Gateway URL route you intend to load test.
    sam deploy --guided

    Example of SAM deploy

    Example of SAM deploy

  4. After the stack creation is complete, you should see UserPoolID and AppClientID in the outputs section.

    Example of stack outputs

    Example of stack outputs

  5. Navigate to the API Gateway console and choose the HTTP API you intend to load test.
  6. Choose Authorization and select the authenticated route configured with a JWT authorizer.

    API Gateway console display after stack is deployed

    API Gateway console display after stack is deployed

  7. Choose Edit Authorizer and update the IssuerURL with Amazon Cognito user pool ID and audience app client ID with the corresponding values from the stack output section in step 4.

    Editing the issuer URL

    Editing the issuer URL

  8. Set authorization scope to aws.cognito.signin.user.admin.

    Setting the authorization scopes

    Setting the authorization scopes

  9. Open the Step Functions console and choose the state machine named apiloadtestCreateUsersAndFanOut-xxx.
  10. Choose Start Execution and provide the following JSON input. Configure the number of users for the load test and the number of calls per user:
    {
      "users": {
        "NumberOfUsers": "100",
        "NumberOfCallsPerUser": "100"
      }
    }
  11. After the execution, you see the status updated to Succeeded.

 

Checking the load test results

The load test’s primary goal is to achieve high concurrency. The main metric to check the test’s effectiveness is the count of successful API Gateway invocations. While load testing your application, find other metrics that may identify potential bottlenecks. Refer to the following steps to inspect CloudWatch Logs after the test is complete:

  1. Navigate to API Gateway service within the console, choose Monitor → Logging, select the $default stage, and choose the Select button.
  2. Choose View Logs in CloudWatch to navigate to the CloudWatch Logs service, which loads the log group and displays the most recent log streams.
  3. Choose the “View in Logs Insights” button to navigate to the Log Insights page. Choose Run Query.
  4. The query results appear along with a bar graph showing the log group’s distribution of log events. The number of records indicates the number of API Gateway invocations.

    Histogram of API Gateway invocations

    Histogram of API Gateway invocations

  5. To visualize p95 metrics, navigate to CloudWatch metrics, choose ApiGateway → ApiId → Latency.
  6. Choose the “Graphed metrics (1)” tab.

    Addig latency metric

    Addig latency metric

  7. Select p95 from the Statistic dropdown.

    Setting the p95 value

    Setting the p95 value

  8. The percentile metrics help visualize the distribution of latency metrics. It can help you find critical outliers or unusual behaviors, and discover potential bottlenecks in your application’s backend.

    Example of the p95 data

    Example of the p95 data

Cleanup 

  1. To delete Amazon Cognito users, run the Step Functions workflow apiloadtestDeleteTestUsers. Provide the following input JSON with the same number of users that you created earlier:
    {
    “NumberOfUsers”: “100”
    }
  2. Step Functions invokes the cleanUpTestUsers Lambda function. It is configured with the test Amazon Cognito user pool ID and app client ID environment variables created during the stack deployment. The users are deleted from the test user pool.
  3. The Lambda function also schedules the corresponding KMS keys for deletion after seven days, the minimum waiting period.
  4. After the state machine is finished, navigate to Cognito → Manage User Pools → apiloadtest-loadtestidp → Users and Groups. Refresh the page to confirm that all users are deleted.
  5. To delete all the resources permanently and stop incurring cost, navigate to the CloudFormation console, select aws-apiloadtest-framework stack, and choose Delete → Delete stack.

Cost optimization

The load test workflow is repeatable and can be reused multiple times for the same or different API Gateway routes. You can reuse Amazon Cognito users for multiple tests since Amazon Cognito pricing is based on the monthly active users (MAUs). Repeatedly deleting and recreating users may exceed the AWS Free Tier or incur additional charges.

Customizations

You can change the number of users and number of calls per user to adjust the API Gateway load. The apiloadtestCreateUsersAndFanOut state machine validation step allows a maximum value of 1,000 for input parameters NumberOfUsers and NumberOfCallsPerUser.

You can customize and increase these values within the Step Functions input validation logic based on your account limits. To load test a different API Gateway route, configure the authorizer as per the step-by-step instructions provided earlier. Next, modify the api_url environment variable within aws-apiloadtest-framework-triggerLoadTestPerUser Lambda function. You can then run the load test using the apiloadtestCreateUsersAndFanOut state machine.

Conclusion

The blog post shows how to use Step Functions and its features to orchestrate a multi-step load test solution. I show how changing input parameters could increase the number of calls made to the API Gateway endpoint without worrying about scalability. I also demonstrate how to achieve cost optimization and perform clean-up to avoid any additional charges. You can modify this example to load test different API endpoints, identify bottlenecks, and check if your application is production-ready.

For more serverless learning resources, visit Serverless Land.

Developing evolutionary architecture with AWS Lambda

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/developing-evolutionary-architecture-with-aws-lambda/

This post was written by Luca Mezzalira, Principal Solutions Architect, Media and Entertainment.

Agility enables you to evolve a workload quickly, adding new features, or introducing new infrastructure as required. The key characteristics for achieving agility in a code base are loosely coupled components and strong encapsulation.

Loose coupling can help improve test coverage and create atomic refactoring. With encapsulation, you expose only what is needed to interact with a service without revealing the implementation logic.

Evolutionary architectures can help achieve agility in your design. In the book “Building Evolutionary Architectures”, this architecture is defined as one that “supports guided, incremental change across multiple dimensions”.

This blog post focuses on how to structure code for AWS Lambda functions in a modular fashion. It shows how to embrace the evolutionary aspect provided by the hexagonal architecture pattern and apply it to different use cases.

Introducing ports and adapters

Hexagonal architecture is also known as the ports and adapters architecture. It is an architectural pattern used for encapsulating domain logic and decoupling it from other implementation details, such as infrastructure or client requests.

Ports and adapters

  1. Domain logic: Represents the task that the application should perform, abstracting any interaction with the external world.
  2. Ports: Provide a way for the primary actors (on the left) to interact with the application, via the domain logic. The domain logic also uses ports for interacting with secondary actors (on the right) when needed.
  3. Adapters: A design pattern for transforming one interface into another interface. They wrap the logic for interacting with a primary or secondary actor.
  4. Primary actors: Users of the system such as a webhook, a UI request, or a test script.
  5. Secondary actors: used by the application, these services are either a Repository (for example, a database) or a Recipient (such as a message queue).

Hexagonal architecture with Lambda functions

Lambda functions are units of compute logic that accomplish a specific task. For example, a function could manipulate data in a Amazon Kinesis stream, or process messages from an Amazon SQS queue.

In Lambda functions, hexagonal architecture can help you implement new business requirements and improve the agility of a workload. This approach can help create separation of concerns and separate the domain logic from the infrastructure. For development teams, it can also simplify the implementation of new features and parallelize the work across different developers.

The following example introduces a service for returning a stock value. The service supports different currencies for a frontend application that displays the information in a dashboard. The translation of a stock value between currencies happens in real time. The service must retrieve the exchange rates with every request made by the client.

The architecture for this service uses an Amazon API Gateway endpoint that exposes a REST API. When the client calls the API, it triggers a Lambda function. This gets the stock value from a DynamoDB table and the currency information from a third-party endpoint. The domain logic uses the exchange rate to convert the stock value to other currencies before responding to the client request.

The full example is available in the AWS GitHub samples repository. Here is the architecture for this service:

Hexagonal architecture example

  1. A client makes a request to the API Gateway endpoint, which invokes the Lambda function.
  2. The primary adapter receives the request. It captures the stock ID and pass it to the port:
    exports.lambdaHandler = async (event) => {
        try{
    	// retrieve the stockID from the request
            const stockID = event.pathParameters.StockID;
    	// pass the stockID to the port
            const response = await getStocksRequest(stockID);
            return response
        } 
    };
  3. The port is an interface for communicating with the domain logic. It enforces the separation between an adapter and the domain logic. With this approach, you can change and test the infrastructure and domain logic in isolation without impacting another part of the code base:
    const retrieveStock = async (stockID) => {
        try{
    	//use the port “stock” to access the domain logic
            const stockWithCurrencies = await stock.retrieveStockValues(stockID)
            return stockWithCurrencies;
        }
    }
    
  4. The port passing the stock ID invokes the domain logic entry point. The domain logic fetches the stock value from a DynamoDB table, then it requests the exchange rates. It returns the computed values to the primary adapter via the port. The domain logic always uses a port to interact with an adapter because the ports are the interfaces with the external world:
    const CURRENCIES = [“USD”, “CAD”, “AUD”]
    const retrieveStockValues = async (stockID) => {
    try {
    //retrieve the stock value from DynamoDB using a port
            const stockValue = await Repository.getStockData(stockID);
    //fetch the currencies value using a port
            const currencyList = await Currency.getCurrenciesData(CURRENCIES);
    //calculate the stock value in different currencies
            const stockWithCurrencies = {
                stock: stockValue.STOCK_ID,
                values: {
                    "EUR": stockValue.VALUE
                }
            };
            for(const currency in currencyList.rates){
                stockWithCurrencies.values[currency] =  (stockValue.VALUE * currencyList.rates[currency]).toFixed(2)
            }
    // return the final computation to the port
            return stockWithCurrencies;
        }
    }
    

 

This is how the domain logic interacts with the DynamoDB table:

DynamoDB interaction

  1. The domain logic uses the Repository port for interacting with the database. There is not a direct connection between the domain and the adapter:
    const getStockData = async (stockID) => {
        try{
    //the domain logic pass the request to fetch the stock ID value to this port
            const data = await getStockValue(stockID);
            return data.Item;
        } 
    }
    
  2. The secondary adapter encapsulates the logic for reading an item from a DynamoDB table. All the logic for interacting with DynamoDB is encapsulated in this module:
    const getStockValue = async (stockID) => {
        let params = {
            TableName : DB_TABLE,
            Key:{
                'STOCK_ID': stockID
            }
        }
        try {
            const stockData = await documentClient.get(params).promise()
            return stockData
        }
    }
    

 

The domain logic uses an adapter for fetching the exchange rates from the third-party service. It then processes the data and responds to the client request:

 

  1. Currencies API interactionThe second operation in the business logic is retrieving the currency exchange rates. The domain logic requests the operation via a port that proxies the request to the adapter:
    const getCurrenciesData = async (currencies) => {
        try{
            const data = await getCurrencies(currencies);
            return data
        } 
    }
    
  2. The currencies service adapter fetches the data from a third-party endpoint and returns the result to the domain logic.
    const getCurrencies = async (currencies) => {
        try{        
            const res = await axios.get(`http://api.mycurrency.io?symbols=${currencies.toString()}`)
            return res.data
        } 
    }
    

These eight steps show how to structure the Lambda function code using a hexagonal architecture.

Adding a cache layer

In this scenario, the production stock service experiences traffic spikes during the day. The external endpoint for the exchange rates cannot support the level of traffic. To address this, you can implement a caching strategy with Amazon ElastiCache using a Redis cluster. This approach uses a cache-aside pattern for offloading traffic to the external service.

Typically, it can be challenging to evolve code to implement this change without the separation of concerns in the code base. However, in this example, there is an adapter that interacts with the external service. Therefore, you can change the implementation to add the cache-aside pattern and maintain the same API contract with the rest of the application:

const getCurrencies = async (currencies) => {
    try{        
// Check the exchange rates are available in the Redis cluster
        let res = await asyncClient.get("CURRENCIES");
        if(res){
// If present, return the value retrieved from Redis
            return JSON.parse(res);
        }
// Otherwise, fetch the data from the external service
        const getCurr = await axios.get(`http://api.mycurrency.io?symbols=${currencies.toString()}`)
// Store the new values in the Redis cluster with an expired time of 20 seconds
        await asyncClient.set("CURRENCIES", JSON.stringify(getCurr.data), "ex", 20);
// Return the data to the port
        return getCurr.data
    } 
}

This is a low-effort change only affecting the adapter. The domain logic and port interacting with the adapter are untouched and maintain the same API contract. The encapsulation provided by this architecture helps to evolve the code base. It also preserves many of the tests in place, considering only an adapter is modified.

Moving domain logic from a container to a Lambda function

In this example, the team working on this workload originally wrap all the functionality inside a container using AWS Fargate with Amazon ECS. In this case, the developers define a route for the GET method for retrieving the stock value:

// This web application uses the Fastify framework 
  fastify.get('/stock/:StockID', async (request, reply) => {
    try{
        const stockID = request.params.StockID;
        const response = await getStocksRequest(stockID);
        return response
    } 
})

In this case, the route’s entry point is exactly the same for the Lambda function. The team does not need to change anything else in the code base, thanks to the characteristics provided by the hexagonal architecture.

This pattern can help you more easily refactor code from containers or virtual machines to multiple Lambda functions. It introduces a level of code portability that can be more challenging with other solutions.

Benefits and drawbacks

As with any pattern, there are benefits and drawbacks to using hexagonal architecture.

The main benefits are:

  • The domain logic is agnostic and independent from the outside world.
  • The separation of concerns increases code testability.
  • It may help reduce technical debt in workloads.

The drawbacks are:

  • The pattern requires an upfront investment of time.
  • The domain logic implementation is not opinionated.

Whether you should use this architecture for developing Lambda functions depends upon the needs of your application. With an evolving workload, the extra implementation effort may be worthwhile.

The pattern can help improve code testability because of the encapsulation and separation of concerns provided. This approach can also be used with compute solutions other than Lambda, which may be useful in code migration projects.

Conclusion

This post shows how you can evolve a workload using hexagonal architecture. It explains how to add new functionality, change underlying infrastructure, or port the code base between different compute solutions. The main characteristics enabling this are loose coupling and strong encapsulation.

To learn more about hexagonal architecture and similar patterns, read:

For more serverless learning resources, visit Serverless Land.

Using Amazon MQ for RabbitMQ as an event source for Lambda

Post Syndicated from Talia Nassi original https://aws.amazon.com/blogs/compute/using-amazon-mq-for-rabbitmq-as-an-event-source-for-lambda/

Amazon MQ for RabbitMQ (ARMQ) is an AWS managed version of RabbitMQ. The service manages the provisioning, setup, and maintenance of RabbitMQ, reducing operational overhead for companies.

Now, with ARMQ as an event source for AWS Lambda, you can process messages from the service. This allows you to integrate ARMQ with downstream serverless workflows without having to manage the resources used to run the cluster itself.

RabbitMQ is one of the two engines for Amazon MQ. The other one is ActiveMQ. Both of these message brokers can now be used as an event source for AWS Lambda.

In this blog post, I explain how to set up a RabbitMQ broker and networking configuration. I also show how to create a Lambda function that is invoked by messages from Amazon MQ queues.

RabbitMQ overview

RabbitMQ is an open-source message broker to which applications connect in order to transfer a message or messages between services. An example application might send a message from one service to another. If there is no queue in between the two services, and the message is not received from the consumer, data could be lost. Queues wait for a successful confirmation from the consumer before deleting messages.

The Lambda event source uses a poller for the RabbitMQ broker that constantly polls for new messages. The poller exists between the queue and Lambda and it is managed by the Lambda service. By default, it polls RabbitMQ every 100 ms to check for messages. Once a defined batch size is reached, the poller invokes the function with the entire set of messages.

Configuring Amazon MQ for RabbitMQ as an event source for Lambda

To use Amazon MQ for RabbitMQ as a highly available service, it must be configured to run in a minimum of two Availability Zones in your preferred Region. You can also run a single broker in one Availability Zone for development and test purposes.

There are three main tasks to configure ARMQ as an event source for Lambda:

  • Set up AWS Secrets Manager.
  • Deploy an AWS SAM template that creates the necessary resources.
  • Create a queue on the broker.

Setting up AWS Secrets Manager

The Lambda service needs access to your Amazon MQ broker. In this step, you create a user name and password that is used to log in to RabbitMQ. To avoid exposing secrets in plaintext in the Lambda function, it’s best practice to use a service like Secrets Manager.

  1. Navigate to the Secrets Manager console and choose Store a new secret.
    secrets manager
  2. For Secret type, choose Other type of secrets. In the Secret key/value, enter username for the first key and password for the second key.
  3. Enter the corresponding values to use to access your RabbitMQ account. Choose Next.
    secrets manager
  4. For Secret name, enter ‘MQAccess’ and choose Next.
    secret name
  5. Keep the default setting for the automatic rotation: Disable automatic rotation. Choose Next.
  6. Review your secret information and choose Store.

Build the Lambda function and associated permissions with AWS SAM

In this step, you use AWS SAM to create the necessary resources that are deployed for your application. You can use AWS Cloud9 or your own text editor. (If you use your own text editor, be sure you have the AWS SAM CLI installed.

  1. In the terminal, enter:
    sam init
  2. Choose option 1: AWS Quick Start Templates. Then choose option 1: Zip (artifact is a zip uploaded to S3). Choose your preferred runtime. In this tutorial, use python3.7.
    sam init
  3. In the Secrets Manager console, copy the Secret ARN.
    secret arn
  4. In the template.yaml file that is created, paste the following code. This AWS SAM template deploys an Amazon MQ for RabbitMQ broker, and a Lambda function with the corresponding event source mapping and permissions. Replace the last line of the template with the secret ARN from step 3.
    AWSTemplateFormatVersion: '2010-09-09'
    Transform: AWS::Serverless-2016-10-31
    Description: ARMQ Example
    
    Resources:
      MQBroker:
        Type: AWS::AmazonMQ::Broker
        Properties: 
          AutoMinorVersionUpgrade: false
          BrokerName: myQueue
          DeploymentMode: SINGLE_INSTANCE
          EngineType: RABBITMQ
          EngineVersion: "3.8.11"
          HostInstanceType: mq.m5.large
          PubliclyAccessible: true
          Users:
            - Password: '{{resolve:secretsmanager:MQAccess:SecretString:password}}'
              Username: '{{resolve:secretsmanager:MQAccess:SecretString:username}}'
              
      MQConsumer:
        Type: AWS::Serverless::Function 
        Properties:
          CodeUri: hello_world/
          Timeout: 3
          Handler: app.lambda_handler
          Runtime: python3.7
          Policies:
            - Version: '2012-10-17'
              Statement:
                - Effect: Allow
                  Resource: '*'
                  Action:
                  - mq:DescribeBroker
                  - secretsmanager:GetSecretValue
                  - ec2:CreateNetworkInterface
                  - ec2:DescribeNetworkInterfaces
                  - ec2:DescribeVpcs
                  - ec2:DeleteNetworkInterface
                  - ec2:DescribeSubnets
                  - ec2:DescribeSecurityGroups
          Events:
            MQEvent:
              Type: MQ
              Properties:
                Broker: !GetAtt MQBroker.Arn
                Queues:
                  - myQueue
                SourceAccessConfigurations:
                  - Type: BASIC_AUTH
                    URI: <your_secret_arn>
  5. In the app.py file, replace the existing code with the following. This Lambda function decrypts the messages sent to the queue from the RabbitMQ broker:
    import json
    import logging as log
    import base64
    def lambda_handler(event, context):
        print("Target Lambda function invoked")
        print(event)
        if 'rmqMessagesByQueue' not in event:
            print("Invalid event data")
            return {
                'statusCode': 404
            }
        print(f'Div Data received from event source: ')
        for queue in event["rmqMessagesByQueue"]:
            messageCnt = len(event['rmqMessagesByQueue'][queue])
            print(f'Total messages received from event source: {messageCnt}' )
            for message in event['rmqMessagesByQueue'][queue]:
                data = base64.b64decode(message['data'])
                print(data)
        return {
            'statusCode': 200,
            'body': json.dumps('Hello from Lambda!')
        }
    
  6. Run sam deploy --guided and wait for the confirmation message. This deploys all of the resources.
    sam deploy

Creating a queue on the broker

The poller created by the Lambda service subscribes to a queue on the broker. In this step, you create a new queue:

  1. Navigate to the Amazon MQ console and choose the newly created broker.
  2. In the Connections panel, locate the URL for the RabbitMQ web console.
  3. Sign in with the credentials you created and stored in the Secrets Manager earlier.
  4. Select Queues from the top panel and then choose Add a new queue.
    new queue
  5. Enter ‘myQueue’ as the name for the queue and choose Add queue. (This must match exactly as that is the name you hardcoded in the AWS SAM template). Keep the other configuration options as default.myqueue

Testing the event source mapping

  1. In the RabbitMQ web console, choose Queues to confirm that the Lambda service is configured to consume events.
    rabbitmq console
  2. Choose the name of the queue. Under the Publish message tab, enter a message, and choose Publish message to send.
    publish message
  3. You see a confirmation message.
    confirmation message
  4. In the MQconsumer Lambda function, select the Monitoring tab and then choose View logs in CloudWatch. The log streams show that the Lambda function is invoked by Amazon MQ and you see the message Hello World in the logs.
    Cloudwatch Logs

A single Lambda function consumes messages from a single queue in an Amazon MQ broker. You control the rate of message processing by using the Batch size property in the event source mapping. The Lambda service limits the concurrency to five execution environments per queue.

Conclusion

Amazon MQ provides a fully managed, highly available message broker service for RabbitMQ. Now, Lambda supports Amazon MQ as an event source, and you can invoke Lambda functions from messages in Amazon MQ queues to integrate into your downstream serverless workflows.

In this post, I give an overview of how to set up an Amazon MQ for RabbitMQ broker. I explain how to create credentials for the RabbitMQ broker and store them in AWS Secrets Manager. I also show how to use AWS SAM templates to deploy the necessary resources to use ARMQ as an event source for AWS Lambda. Then I show how to send a test message to the queue and view the output in the Amazon CloudWatch Logs for the Lambda function.

For more serverless learning resources, visit https://serverlessland.com, and to view the RabbitMQ to Lambda pattern, visit the Serverlessland Patterns Website.

Automate Amazon ES synonym file updates

Post Syndicated from Ashwini Rudra original https://aws.amazon.com/blogs/big-data/automate-amazon-es-synonym-file-updates/

Search engines provide the means to retrieve relevant content from a collection of content. However, this can be challenging if certain exact words aren’t entered. You need to find the right item from a catalog of products, or the correct provider from a list of service providers, for example. The most common method of specifying your query is through a text box. If you enter the wrong terms, you won’t match the right items, and won’t get the best results.

Synonyms enable better search results by matching words that all match to a single indexable term. In Amazon Elasticsearch Service (Amazon ES), you can provide synonyms for keywords that your application’s users may look for. For example, your website may provide medical practitioner searches, and your users may search for “child’s doctor” instead of “pediatrician.” Mapping the two words together enables either search term to match documents that contain the term “pediatrician.” You can achieve similar search results by using synonym files. Amazon ES custom packages allow you to upload synonym files that define the synonyms in your catalog. One best practice is to manage the synonyms in Amazon Relational Database Service (Amazon RDS). You then need to deploy the synonyms to your Amazon ES domain. You can do this with AWS Lambda and Amazon Simple Storage Service (Amazon S3).

In this post, we discuss an approach using Amazon Aurora and Lambda functions to automate updating synonym files for improved search results.

Overview of solution

Amazon ES is a fully managed service that makes it easy to deploy, secure, and run Elasticsearch cost-effectively and at scale. You can build, monitor, and troubleshoot your applications using the tools you love, at the scale you need. The service supports open-source Elasticsearch API operations, managed Kibana, integration with Logstash, and many AWS services with built-in alerting and SQL querying.

The following diagram shows the solution architecture. One Lambda function pushes files to Amazon S3, and another function distributes the updates to Amazon ES.

Walkthrough overview

For search engineers, the synonym file’s content is usually stored within a database or in a data lake. You may have data in tabular format in Amazon RDS (in this case, we use Amazon Aurora MySQL). When updates to the synonym data table occur, the change triggers a Lambda function that pushes data to Amazon S3. The S3 event triggers a second function, which pushes the synonym file from Amazon S3 to Amazon ES. This architecture automates the entire synonym file update process.

To achieve this architecture, we complete the following high-level steps:

  1. Create a stored procedure to trigger the Lambda function.
  2. Write a Lambda function to verify data changes and push them to Amazon S3.
  3. Write a Lambda function to update the synonym file in Amazon ES.
  4. Test the data flow.

We discuss each step in detail in the next sections.

Prerequisites

Make sure you complete the following prerequisites:

  1. Configure an Amazon ES domain. We use a domain running Elasticsearch version 7.9 for this architecture.
  2. Set up an Aurora MySQL database. For more information, see Configuring your Amazon Aurora DB cluster.

Create a stored procedure to trigger a Lambda function

You can invoke a Lambda function from an Aurora MySQL database cluster using a native function or a stored procedure.

The following script creates an example synonym data table:

CREATE TABLE SynonymsTable (
SynID int NOT NULL AUTO_INCREMENT,
Base_Term varchar(255),
Synonym_1 varchar(255),
Synonym_2 varchar(255),
PRIMARY KEY (SynID)
)

You can now populate the table with sample data. To generate sample data in your table, run the following script:

INSERT INTO SynonymsTable(Base_Term, Synonym_1, Synonym_2)
VALUES ( 'danish', 'croissant', 'pastry')

Create a Lambda function

You can use two different methods to send data from Aurora to Amazon S3: a Lambda function or SELECT INTO OUTFILE S3.

To demonstrate the ease of setting up integration between multiple AWS services, we use a Lambda function that is called every time a change occurs that must be tracked in the database table. This function passes the data to Amazon S3. First create an S3 bucket where you store the synonym file using the Lambda function.

When you create your function, make sure you give the right permissions using an AWS Identity and Access Management (IAM) role for the S3 bucket. These permissions are for the Lambda execution role and S3 bucket where you store the synonyms.txt file. By default, Lambda creates an execution role with minimal permissions when you create a function on the Lambda console. The following is the Python code to create the synonyms.txt file in S3:

import boto3
import json
import botocore
from botocore.exceptions import ClientError

s3_resource = boto3.resource('s3')

filename = 'synonyms.txt'
BucketName = '<<provide your bucket name>>
local_file = '/tmp/test.txt'

def lambda_handler(event, context):
    S3_data = (("%s,%s,%s \n") %(event['Base_Term'], event['Synonym_1'], event['Synonym_2']))
    # open  a file and append new line
    try:
        obj=s3_resource.Bucket(BucketName).download_file(local_file,filename)
    except ClientError as e:
        if e.response['Error']['Code'] == "404":
            # create a new file if file does not exits 
            s3_resource.meta.client.put_object(Body=S3_data, Bucket= BucketName,Key=filename)
        else:
            # append file
            raise
    with open('/tmp/test.txt', 'a') as fd:
        fd.write(S3_data)
        
    s3_resource.meta.client.upload_file('/tmp/test.txt', BucketName, filename)

Note the Amazon Resource Name (ARN) of this Lambda function to use in a later step.

Give Aurora permissions to invoke a Lambda function

To give Aurora permissions to invoke your function, you must attach an IAM role with the appropriate permissions to the cluster. For more information, see Invoking a Lambda function from an Amazon Aurora DB cluster.

When you’re finished, the Aurora database has access to invoke a Lambda function.

Create a stored procedure and a trigger in Aurora

To create a new stored procedure, return to MySQL Workbench. Change the ARN in the following code to your Lambda function’s ARN before running the procedure:

DROP PROCEDURE IF EXISTS Syn_TO_S3;
DELIMITER ;;
CREATE PROCEDURE Syn_TO_S3 (IN SysID INT,IN Base_Term varchar(255),IN Synonym_1 varchar(255),IN Synonym_2 varchar(255)) LANGUAGE SQL
BEGIN
   CALL mysql.lambda_async('<<Lambda-Funtion-ARN>>,
    CONCAT('{ "SysID ": "', SysID,
    '", "Base_Term" : "', Base_Term,
    '", "Synonym_1" : "', Synonym_1,
    '", "Synonym_2" : "', Synonym_2,'"}')
    );
END
;;
DELIMITER

When this stored procedure is called, it invokes the Lambda function you created.

Create a trigger TR_SynonymTable_CDC on the table SynonymTable. When a new record is inserted, this trigger calls the Syn_TO_S3 stored procedure. See the following code:

DROP TRIGGER IF EXISTS TR_Synonym_CDC;
 
DELIMITER ;;
CREATE TRIGGER TR_Synonym_CDC
  AFTER INSERT ON SynonymsTable
  FOR EACH ROW
BEGIN
  SELECT  NEW.SynID, NEW.Base_Term, New.Synonym_1, New.Synonym_2 
  INTO @SynID, @Base_Term, @Synonym_1, @Synonym_2;
  CALL  Syn_TO_S3(@SynID, @Base_Term, @Synonym_1, @Synonym_2);
END
;;
DELIMITER ;

If a new row is inserted in SynonymsTable, the Lambda function that is mentioned in the stored procedure is invoked.

Verify that data is being sent from the function to Amazon S3 successfully. You may have to insert a few records, depending on the size of your data, before new records appear in Amazon S3.

Update synonyms in Amazon ES when a new synonym file becomes available

Amazon ES lets you upload custom dictionary files (for example, stopwords and synonyms) for use with your cluster. The generic term for these types of files is packages. Before you can associate a package with your domain, you must upload it to an S3 bucket. For instructions on uploading a synonym file for the first time and associating it to an Amazon ES domain, see Uploading packages to Amazon S3 and Importing and Associating packages.

To update the synonyms (package) when a new version of the synonym file becomes available, we complete the following steps:

  1. Create a Lambda function to update the existing package.
  2. Set up an S3 event notification to trigger the function.

Create a Lambda function to update the existing package

We use a Python-based Lambda function that uses the Boto3 AWS SDK for updating the Elasticsearch package. For more information about how to create a Python-based Lambda function, see Building Lambda functions with Python. You need the following information before we start coding for the function:

  • The S3 bucket ARN where the new synonym file is written
  • The Amazon ES domain name (available on the Amazon ES console)

  • The package ID of the Elasticsearch package we’re updating (available on the Amazon ES console)

You can use the following code for the Lambda function:

import logging
import boto3
import os

# Elasticsearch client
client = boto3.client('es')
# set up logging
logger = logging.getLogger('boto3')
logger.setLevel(logging.INFO)
# fetch from Environment Variable
package_id = os.environ['PACKAGE_ID']
es_domain_nm = os.environ['ES_DOMAIN_NAME']

def lambda_handler(event, context):
    s3_bucket = event["Records"][0]["s3"]["bucket"]["name"]
    s3_key = event["Records"][0]["s3"]["object"]["key"]
    logger.info("bucket: {}, key: {}".format(s3_bucket, s3_key))
    # update package with the new Synonym file.
    up_response = client.update_package(
        PackageID=package_id,
        PackageSource={
            'S3BucketName': s3_bucket,
            'S3Key': s3_key
        },
        CommitMessage='New Version: ' + s3_key
    )
    logger.info('Response from Update_Package: {}'.format(up_response))
    # check if the package update is completed
    finished = False
    while finished == False:
        # describe the package by ID
        desc_response = client.describe_packages(
            Filters=[{
                    'Name': 'PackageID',
                    'Value': [package_id]
                }],
            MaxResults=1
        )
        status = desc_response['PackageDetailsList'][0]['PackageStatus']
        logger.info('Package Status: {}'.format(status))
        # check if the Package status is back to available or not.
        if status == 'AVAILABLE':
            finished = True
            logger.info('Package status is now Available. Exiting loop.')
        else:
            finished = False
    logger.info('Package: {} update is now Complete. Proceed to Associating to ES Domain'.format(package_id))
    # once the package update is completed, re-associate with the ES domain
    # so that the new version is applied to the nodes.
    ap_response = client.associate_package(
        PackageID=package_id,
        DomainName=es_domain_nm
    )
    logger.info('Response from Associate_Package: {}'.format(ap_response))
    return {
        'statusCode': 200,
        'body': 'Custom Package Updated.'
    }

The preceding code requires environment variables to be set to the appropriate values and the IAM execution role assigned to the Lambda function.

Set up an S3 event notification to trigger the Lambda function

Now we set up event notification (all object create events) for the S3 bucket in which the updated synonym file is uploaded. For more information about how to set up S3 event notifications with Lambda, see Using AWS Lambda with Amazon S3.

Test the solution

To test our solution, let’s consider an Elasticsearch index (es-blog-index-01) that consists of the following documents:

  • tennis shoe
  • hightop
  • croissant
  • ice cream

A synonym file is already associated with the Amazon ES domain via Amazon ES custom packages and the index (es-blog-index-01) has the synonym file in the settings (analyzer, filter, mappings). For more information about how to associate a file to an Amazon ES domain and use it with the index settings, see Importing and associating packages and Using custom packages with Elasticsearch. The synonym file contains the following data:

danish, croissant, pastry

Test 1: Search with a word present in the synonym file

For our first search, we use a word that is present in the synonym file. The following screenshot shows that searching for “danish” brings up the document croissant based on a synonym match.

Test 2: Search with a synonym not present in the synonym file

Next, we search using a synonym that’s not in the synonym file. In the following screenshot, our search for “gelato” yields no result. The word “gelato” doesn’t match with the document ice cream because no synonym mapping is present for it.

In the next test, we add synonyms for “ice cream” and perform the search again.

Test 3: Add synonyms for “ice cream” and redo the search

To add the synonyms, let’s insert a new record into our database. We can use the following SQL statement:

INSERT INTO SynonymsTable(Base_Term, Synonym_1, Synonym_2)
VALUES ('frozen custard', 'gelato', 'ice cream')

When we search with the word “gelato” again, we get the ice cream document.

This confirms that the synonym addition is applied to the Amazon ES index.

Clean up resources

To avoid ongoing charges to your AWS account, remove the resources you created:

  1. Delete the Amazon ES domain.
  2. Delete the RDS DB instance.
  3. Delete the S3 bucket.
  4. Delete the Lambda functions.

Conclusion

In this post, we implemented a solution using Aurora, Lambda, Amazon S3, and Amazon ES that enables you to update synonyms automatically in Amazon ES. This provides central management for synonyms and ensures your users can obtain accurate search results when synonyms are changed in your source database.


About the Authors

Ashwini Rudra is a Solutions Architect at AWS. He has more than 10 years of experience architecting Windows workloads in on-premises and cloud environments. He is also an AI/ML enthusiast. He helps AWS customers, namely major sports leagues, define their cloud-first digital innovation strategy.

 

 

Arnab Ghosh is a Solutions Architect for AWS in North America helping enterprise customers build resilient and cost-efficient architectures. He has over 13 years of experience in architecting, designing, and developing enterprise applications solving complex business problems.

 

 

Jennifer Ng is an AWS Solutions Architect working with enterprise customers to understand their business requirements and provide solutions that align with their objectives. Her background is in enterprise architecture and web infrastructure, where she has held various implementation and architect roles in the financial services industry.

Translating content dynamically by using Amazon S3 Object Lambda

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/translating-content-dynamically-by-using-amazon-s3-object-lambda/

This post is written by Sandeep Mohanty, Senior Solutions Architect.

The recent launch of Amazon S3 Object Lambda creates many possibilities to transform data in S3 buckets dynamically. S3 Object Lambda can be used with other AWS serverless services to transform content stored in S3 in many creative ways. One example is using S3 Object Lambda with Amazon Translate to translate and serve content from S3 buckets on demand.

Amazon Translate is a serverless machine translation service that delivers fast and customizable language translation. With Amazon Translate, you can localize content such as websites and applications to serve a diverse set of users.

Using S3 Object Lambda with Amazon Translate, you do not need to translate content in advance for all possible permutations of source to target languages. Instead, you can transform content in near-real time using a data driven model. This can serve multiple language-specific applications simultaneously.

S3 Object Lambda enables you to process and transform data using Lambda functions as objects are being retrieved from S3 by a client application. S3 GET object requests invoke the Lambda function and you can customize it to transform the content to meet specific requirements.

For example, if you run a website or mobile application with global visitors, you must provide translations in multiple languages. Artifacts such as forms, disclaimers, or product descriptions can be translated to serve a diverse global audience using this approach.

Solution architecture

This is the high-level architecture diagram for the example application that translates dynamic content on demand:

Solution architecture

In this example, you create an S3 Object Lambda that intercepts S3 GET requests for an object. It then translates the file to a target language, passed as an argument appended to the S3 object key. At a high level, the steps can be summarized as follows:

  1. Create a Lambda function to translate data from a source language to a target language using Amazon Translate.
  2. Create an S3 Object Lambda Access Point from the S3 console.
  3. Select the Lambda function created in step 1.
  4. Provide a supporting S3 Access Point to give S3 Object Lambda access to the original object.
  5. Retrieve a file from S3 by invoking the S3 GetObject API, and pass the Object Lambda Access Point ARN as the bucket name instead of the actual S3 bucket name.

Creating the Lambda function

In the first step, you create the Lambda function, DynamicFileTranslation. This Lambda function is invoked by an S3 GET Object API call and it translates the requested object. The target language is passed as an argument appended to the S3 object key, corresponding to the object being retrieved.

For example, for the object key passed in the S3 GetObject API call is customized to look something like “ContactUs/contact-us.txt#fr”, the characters after the pound sign represent the code for the target language. In this case, ‘fr’ is French. The full list of supported languages and language codes can be found here.

This Lambda function dynamically translates the content of an object in S3 to a target language:

import json
import boto3
from urllib.parse import urlparse, unquote
from pathlib import Path
def lambda_handler(event, context):
    print(event)

   # Extract the outputRoute and outputToken from the object context
    object_context = event["getObjectContext"]
    request_route = object_context["outputRoute"]
    request_token = object_context["outputToken"]

   # Extract the user requested URL and the supporting access point arn
    user_request_url = event["userRequest"]["url"]
    supporting_access_point_arn = event["configuration"]["supportingAccessPointArn"]

    print("USER REQUEST URL: ", user_request_url)
   
   # The S3 object key is after the Host name in the user request URL.
   # The user request URL looks something like this, 
   # https://<User Request Host>/ContactUs/contact-us.txt#fr.
   # The target language code in the S3 GET request is after the "#"
    
    user_request_url = unquote(user_request_url)
    result = user_request_url.split("#")
    user_request_url = result[0]
    targetLang = result[1]
    
   # Extract the S3 Object Key from the user requested URL
    s3Key = str(Path(urlparse(user_request_url).path).relative_to('/'))
       
   # Get the original object from S3
    s3 = boto3.resource('s3')
    
   # To get the original object from S3,use the supporting_access_point_arn 
    s3Obj = s3.Object(supporting_access_point_arn, s3Key).get()
    srcText = s3Obj['Body'].read()
    srcText = srcText.decode('utf-8')

   # Translate original text
    translateClient = boto3.client('translate')
    response = translateClient.translate_text(
                                                Text = srcText,
                                                SourceLanguageCode='en',
                                                TargetLanguageCode=targetLang)
    
  # Write object back to S3 Object Lambda
    s3client = boto3.client('s3')
    s3client.write_get_object_response(
                                        Body=response['TranslatedText'],
                                        RequestRoute=request_route,
                                        RequestToken=request_token )
    
    return { 'statusCode': 200 }

The code in the Lambda function:

  • Extracts the outputRoute and outputToken from the object context. This defines where the WriteGetObjectResponse request is delivered.
  • Extracts the user-requested url from the event object.
  • Parses the S3 object key and the target language that is appended to the S3 object key.
  • Calls S3 GetObject to fetch the raw text of the source object.
  • Invokes Amazon Translate with the raw text extracted.
  • Puts the translated output back to S3 using the WriteGetObjectResponse API.

Configuring the Lambda IAM role

The Lambda function needs permissions to call back to the S3 Object Lambda access point with the WriteGetObjectResponse. It also needs permissions to call S3 GetObject and Amazon Translate. Add the following permissions to the Lambda execution role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "S3ObjectLambdaAccess",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3-object-lambda:WriteGetObjectResponse"
            ],
            "Resource": [
                "<arn of your S3 access point/*>”,
                "<arn of your Object Lambda accesspoint>"
            ]
        },
        {
            "Sid": "AmazonTranslateAccess",
            "Effect": "Allow",
            "Action": "translate:TranslateText",
            "Resource": "*"
        }
    ]
}

Deploying the Lambda using an AWS SAM template

Alternatively, deploy the Lambda function with the IAM role by using an AWS SAM template. The code for the Lambda function and the AWS SAM template is available for download from GitHub.

Creating the S3 Access Point

  1. Navigate to the S3 console and create a bucket with a unique name.
  2. In the S3 console, select “Access Points” and choose “Create access point”. Enter a name for the access point.
  3. For Bucket name, enter the S3 bucket name you entered in step 1.Create access point

This access point is the supporting access point for the Object Lambda Access Point you create in the next step. Keep all other settings on this page as default.

After creating the S3 access point, create the S3 Object Lambda Access Point using the supporting Access Point. The Lambda function you created earlier uses the supporting Access Point to download the original untransformed objects from S3.

Create Object Lambda Access Point

In the S3 console, go to the Object Lambda Access Point configuration and create an Object Lambda Access Point. Enter a name.

Create Object Lambda Access Point

For the Lambda function configurations, associate this with the Lambda function created earlier. Select the latest version of the Lambda function and keep all other settings as default.

Select Lambda function

To understand how to use the other settings of the S3 Object Lambda configuration, refer to the product documentation.

Testing dynamic content translation

In this section, you create a Python script and invoke the S3 GetObject API twice. First, against the S3 bucket and then against the Object Lambda Access Point. You can then compare the output to see how content is transformed using Object Lambda:

  1. Upload a text file to the S3 bucket using the Object Lambda Access Point you configured. For example, upload a sample “Contact Us” file in English to S3.
  2. To use the object Lambda Access Point, locate its ARN from the Properties tab of the Object Lambda Access Point.Object Lambda Access Point
  3. Create a local file called s3ol_client.py that contains the following Python script:
    import json
    import boto3
    import sys, getopt
      
    def main(argv):
    	
      try:	
        targetLang = sys.argv[1]
        print("TargetLang = ", targetLang)
        
        s3 = boto3.client('s3')
        s3Bucket = "my-s3ol-bucket"
        s3Key = "ContactUs/contact-us.txt"
    
        # Call get_object using the S3 bucket name
        response = s3.get_object(Bucket=s3Bucket,Key=s3Key)
        print("Original Content......\n")
        print(response['Body'].read().decode('utf-8'))
    
        print("\n")
    
        # Call get_object using the S3 Object Lambda access point ARN
        s3Bucket = "arn:aws:s3-object-lambda:us-west-2:123456789012:accesspoint/my-s3ol-access-point"
        s3Key = "ContactUs/contact-us.txt#" + targetLang
        response = s3.get_object(Bucket=s3Bucket,Key=s3Key)
        print("Transformed Content......\n")
        print(response['Body'].read().decode('utf-8'))
    
        return {'Success':200}             
      except:
        print("\n\nUsage: s3ol_client.py <Target Language Code>")
    
    #********** Program Entry Point ***********
    if __name__ == '__main__':
        main(sys.argv[1: ])
    
  4. Run the client program from the command line, passing the target language code as an argument. The full list of supported languages and codes in Amazon Translate can be found here.python s3ol_client.py "fr"

The output looks like this:

Example output

The first output is the original content that was retrieved when calling GetObject with the S3 bucket name. The second output is the transformed content when calling GetObject against the Object Lambda access point. The content is transformed by the Lambda function as it is being retrieved, and translated to French in near-real time.

Conclusion

This blog post shows how you can use S3 Object Lambda with Amazon Translate to simplify dynamic content translation by using a data driven approach. With user-provided data as arguments, you can dynamically transform content in S3 and generate a new object.

It is not necessary to create a copy of this new object in S3 before returning it to the client. We also saw that it is not necessary for the object with the same name to exist in the S3 bucket when using S3 Object Lambda. This pattern can be used to address several real world use cases that can benefit from the ability to transform and generate S3 objects on the fly.

For more serverless learning resources, visit Serverless Land.

Building well-architected serverless applications: Implementing application workload security – part 1

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/building-well-architected-serverless-applications-implementing-application-workload-security-part-1/

This series of blog posts uses the AWS Well-Architected Tool with the Serverless Lens to help customers build and operate applications using best practices. In each post, I address the serverless-specific questions identified by the Serverless Lens along with the recommended best practices. See the introduction post for a table of contents and explanation of the example application.

Security question SEC3: How do you implement application security in your workload?

Review and automate security practices at the application code level, and enforce security code review as part of development workflow. By implementing security at the application code level, you can protect against emerging security threats and reduce the attack surface from malicious code, including third-party dependencies.

Required practice: Review security awareness documents frequently

Stay up to date with both AWS and industry security best practices to understand and evolve protection of your workloads. Having a clear understanding of common threats helps you to mitigate them when developing your workloads.

The AWS Security Blog provides security-specific AWS content. The Open Web Application Security Project (OWASP) Top 10 is a guide for security practitioners to understand the most common application attacks and risks. The OWASP Top 10 Serverless Interpretation provides information specific to serverless applications.

Review and subscribe to vulnerability and security bulletins

Regularly review news feeds from multiple sources that are relevant to the technologies used in your workload. Subscribe to notification services to be informed of critical threats in near-real time.

The Common Vulnerabilities and Exposures (CVE) program identifies, defines, and catalogs publicly disclosed cybersecurity vulnerabilities. You can search the CVE list directly, for example “Python”.

CVE Python search

CVE Python search

The US National Vulnerability Database (NVD) allows you to search by vulnerability type, severity, and impact. You can also perform advanced searching by vendor name, product name, and version numbers. GitHub also integrates with CVE, which allows for advanced searching within the CVEproject/cvelist repository.

AWS Security Bulletins are a notification system for security and privacy events related to AWS services. Subscribe to the security bulletin RSS feed to keep up to date with AWS security announcements.

The US Cybersecurity and Infrastructure Security Agency (CISA) provides alerts about current security issues, vulnerabilities, and exploits. You can receive email alerts or subscribe to the RSS feed.

AWS Partner Network (APN) member Palo Alto Networks provides the “Serverless architectures Security Top 10” list. This is a security awareness and education guide to use while designing, developing, and testing serverless applications to help minimize security risks.

Good practice: Automatically review a workload’s code dependencies/libraries

Regularly reviewing application and code dependencies is a good industry security practice. This helps detect and prevent non-certified application code, and ensure that third-party application dependencies operate as intended.

Implement security mechanisms to verify application code and dependencies before using them

Combine automated and manual security code reviews to examine application code and its dependencies to ensure they operate as intended. Automated tools can help identify overly complex application code, and common security vulnerability exposures that are already cataloged.

Manual security code reviews, in addition to automated tools, help ensure that application code works as intended. Manual reviews can include business contextual information and integrations that automated tools may not capture.

Before adding any code dependencies to your workload, take time to review and certify each dependency to ensure that you are adding secure code. Use third-party services to review your code dependencies on every commit automatically.

OWASP has a code review guide and dependency check tool that attempt to detect publicly disclosed vulnerabilities within a project’s dependencies. The tool has a command line interface, a Maven plugin, an Ant task, and a Jenkins plugin.

GitHub has a number of security features for hosted repositories to inspect and manage code dependencies.

The dependency graph allows you to explore the packages that your repository depends on. Dependabot alerts show information about dependencies that are known to contain security vulnerabilities. You can choose whether to have pull requests generated automatically to update these dependencies. Code scanning alerts automatically scan code files to detect security vulnerabilities and coding errors.

You can enable these features by navigating to the Settings tab, and selecting Security & analysis.

GitHub configure security and analysis features

GitHub configure security and analysis features

Once Dependabot analyzes the repository, you can view the dependencies graph from the Insights tab. In the serverless airline example used in this series, you can view the Loyalty service package.json dependencies.

Serverless airline loyalty dependencies

Serverless airline loyalty dependencies

Dependabot alerts for security vulnerabilities are visible in the Security tab. You can review alerts and see information about how to resolve them.

Dependabot alert

Dependabot alert

Once Dependabot alerts are enabled for a repository, you can also view the alerts when pushing code to the repository from the terminal.

Dependabot terminal alert

Dependabot terminal alert

If you enable security updates, Dependabot can automatically create pull requests to update dependencies.

Dependabot pull requests

Dependabot pull requests

AWS Partner Network (APN) member Snyk has an integration with AWS Lambda to manage the security of your function code. Snyk determines what code and dependencies are currently deployed for Node.js, Ruby, and Java projects. It tests dependencies against their vulnerability database.

If you build your functions using container images, you can use Amazon Elastic Container Registry’s (ECR) image scanning feature. You can manually scan your images, or scan them on each push to your repository.

Elastic Container Registry image scanning example results

Elastic Container Registry image scanning example results

Best practice: Validate inbound events

Sanitize inbound events and validate them against a predefined schema. This helps prevent errors and increases your workload’s security posture by catching malformed events or events intentionally crafted to be malicious. The OWASP Input validation cheat sheet includes guidance for providing input validation security functionality in your applications.

Validate incoming HTTP requests against a schema

Implicitly trusting data from clients could lead to malformed data being processed. Use data type validators or web application frameworks to ensure data correctness. These should include regular expressions, value range, data structure, and data normalization.

You can configure Amazon API Gateway to perform basic validation of an API request before proceeding with the integration request to add another layer of security. This ensures that the HTTP request matches the desired format. Any HTTP request that does not pass validation is rejected, returning a 400 error response to the caller.

The Serverless Security Workshop has a module on API Gateway input validation based on the fictional Wild Rydes unicorn raid hailing service. The example shows a REST API endpoint where partner companies of Wild Rydes can submit unicorn customizations, such as branded capes, to advertise their company. The API endpoint should ensure that the request body follows specific patterns. These include checking the ImageURL is a valid URL, and the ID for Cape is a numeric value.

In API Gateway, a model defines the data structure of a payload, using the JSON schema draft 4. The model ensures that you receive the parameters in the format you expect. You can check them against regular expressions. The CustomizationPost model specifies that the ImageURL and Cape schemas should contain the following valid patterns:

    "imageUrl": {
      "type": "string",
      "title": "The Imageurl Schema",
      "pattern": "^https?:\/\/[-a-zA-Z0-9@:%_+.~#?&//=]+$"
    },
    "sock": {
      "type": "string",
      "title": " The Cape Schema ",
      "pattern": "^[0-9]*$"
    },
    …

The model is applied to the /customizations/post method as part of the Method Request. The Request Validator is set to Validate body and the CustomizationPost model is set for the Request Body.

API Gateway request validator

API Gateway request validator

When testing the POST /customizations API with valid parameters using the following input:

{  
   "name":"Cherry-themed unicorn",
   "imageUrl":"https://en.wikipedia.org/wiki/Cherry#/media/File:Cherry_Stella444.jpg",
   "sock": "1",
   "horn": "2",
   "glasses": "3",
   "cape": "4"
}

The result is a valid response:

{"customUnicornId":<the-id-of-the-customization>}

Testing validation to the POST /customizations API using invalid parameters shows the input validation process.

The ImageUrl is not a valid URL:

 {  
    "name":"Cherry-themed unicorn",
    "imageUrl":"htt://en.wikipedia.org/wiki/Cherry#/media/File:Cherry_Stella444.jpg",
    "sock": "1" ,
    "horn": "2" ,
    "glasses": "3",
    "cape": "4"
 }

The Cape parameter is not a number, which shows a SQL injection attempt.

 {  
    "name":"Orange-themed unicorn",
    "imageUrl":"https://en.wikipedia.org/wiki/Orange_(fruit)#/media/File:Orange-Whole-%26-Split.jpg",
    "sock": "1",
    "horn": "2",
    "glasses": "3",
    "cape":"2); INSERT INTO Cape (NAME,PRICE) VALUES ('Bad color', 10000.00"
 }

These return a 400 Bad Request response from API Gateway before invoking the Lambda function:

{"message": "Invalid request body"}

To gain further protection, consider adding an AWS Web Application Firewall (AWS WAF) access control list to your API endpoint. The workshop includes an AWS WAF module to explore three AWS WAF rules:

  • Restrict the maximum size of request body
  • SQL injection condition as part of the request URI
  • Rate-based rule to prevent an overwhelming number of requests
AWS WAF ACL

AWS WAF ACL

AWS WAF also includes support for custom responses and request header insertion to improve the user experience and security posture of your applications.

For more API Gateway security information, see the security overview whitepaper.

Also add further input validation logic to your Lambda function code itself. For examples, see “Input Validation for Serverless”.

Conclusion

Implementing application security in your workload involves reviewing and automating security practices at the application code level. By implementing code security, you can protect against emerging security threats. You can improve the security posture by checking for malicious code, including third-party dependencies.

In this post, I cover reviewing security awareness documentation such as the CVE database. I show how to use GitHub security features to inspect and manage code dependencies. I then show how to validate inbound events using API Gateway request validation.

This well-architected question will be continued where I look at securely storing, auditing, and rotating secrets that are used in your application code.

For more serverless learning resources, visit Serverless Land.

ICYMI: Serverless Q2 2021

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/icymi-serverless-q2-2021/

Welcome to the 14th edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share all of the most recent product launches, feature enhancements, blog posts, webinars, Twitch live streams, and other interesting things that you might have missed!

Q2 calendar

In case you missed our last ICYMI, check out what happened last quarter here.

AWS Step Functions

Step Functions launched Workflow Studio, a new visual tool that provides a drag-and-drop user interface to build Step Functions workflows. This exposes all the capabilities of Step Functions that are available in Amazon States Language (ASL). This makes it easier to build and change workflows and build definitions in near-real time.

For more:

Workflow Studio

The new data flow simulator in the Step Functions console helps you evaluate the inputs and outputs passed through your state machine. It allows you to simulate each of the fields used to process data and updates in real time. It can help accelerate development with workflows and help visualize JSONPath processing.

For more:

Data flow simulator

Also, Amazon API Gateway can now invoke synchronous Express Workflows using REST APIs.

Amazon EventBridge

EventBridge now supports cross-Region event routing from any commercial AWS Region to a list of supported Regions. This feature allows you to centralize global events for auditing and monitoring or replicate events across Regions.

EventBridge cross-Region routing

The service now also supports bus-to-bus event routing in the same Region and in the same AWS account. This can be useful for centralizing events related to a single project, application, or team within your organization.

EventBridge bus-to-bus

You can now use EventBridge as a resource within Step Functions workflows. This provides a direct service integration for both standard and Express Workflows. You can publish events directly to a specified event bus using either a request-response or wait-for-callback pattern.

EventBridge added a new target for rules – Amazon SageMaker Pipelines. This allows you to use a rule to trigger a continuous integration and continuous deployment (CI/CD) service for your machine learning workloads.

AWS Lambda

Lambda Extensions

AWS Lambda extensions are now generally available including some performance and functionality improvements. Lambda extensions provide a new way to integrate your chosen monitoring, observability, security, and governance tools with AWS Lambda. These use the Lambda Runtime Extensions API to integrate with the execution environment and provide hooks into the Lambda lifecycle.

To help build your own extensions, there is an updated GitHub repository with example code.

To learn more:

  • Watch a Tech Talk with Julian Wood.
  • Watch the 8-episode Learning Path series covering all aspects of extensions.

Extensions available today

Amazon CloudWatch Lambda Insights support for Lambda container images is now generally available.

Amazon SNS

Amazon SNS has expanded the set of filter operators available to include IP address matching, existence of an attribute key, and “anything-but” matching.

The service has also introduced an SMS sandbox to help developers testing workloads that send text messages.

To learn more:

Amazon DynamoDB

DynamoDB announced CloudFormation support for several features. First, it now supports configuring Kinesis Data Streams using CloudFormation. This allows you to use infrastructure as code to set up Kinesis Data Streams instead of DynamoDB streams.

The service also announced that NoSQL Workbench now supports CloudFormation, so you can build data models and configure table capacity settings directly from the tool. Finally, you can now create and manage global tables with CloudFormation.

Learn how to use the recently launched Serverless Patterns Collection to configure DynamoDB as an event source for Lambda.

AWS Amplify

Amplify Hosting announced support for server-side rendered (SSR) apps built with the Next.js framework. This provides a zero configuration option for developers to deploy and host their Next.js-based applications.

The Amplify GLI now allows developers to make multiple DynamoDB GSI updates in a single deployment. This can help accelerate data model iterations. Additionally, the data management experience in the Amplify Admin UI launched at AWS re:Invent 2020 is now generally available.

AWS Serverless Application Model (AWS SAM)

AWS SAM has a public preview of support for local development and testing of AWS Cloud Development Kit (AWS CDK) projects.

To learn more:

Serverless blog posts

Operating Lambda

The “Operating Lambda” blog series includes the following posts in this quarter:

Streaming data

The “Building serverless applications with streaming data” blog series shows how to use Lambda with Kinesis.

Getting started with serverless for developers

Learn how to build serverless applications from your local integrated development environment (IDE).

April

May

June

Tech Talks & Events

We hold AWS Online Tech Talks covering serverless topics throughout the year. These are listed in the Serverless section of the AWS Online Tech Talks page. We also regularly deliver talks at conferences and events around the world, speak on podcasts, and record videos you can find to learn in bite-sized chunks.

Here are some from Q2:

Serverless Live was a day of talks held on May 19, featuring the serverless developer advocacy team, along with Adrian Cockroft and Jeff Barr. You can watch a replay of all the talks on the AWS Twitch channel.

Videos

YouTube ServerlessLand channel

Serverless Office Hours – Tues 10 AM PT / 1PM EST

Weekly live virtual office hours. In each session we talk about a specific topic or technology related to serverless and open it up to helping you with your real serverless challenges and issues. Ask us anything you want about serverless technologies and applications.

YouTube: youtube.com/serverlessland
Twitch: twitch.tv/aws

April

May

June

DynamoDB Office Hours

Are you an Amazon DynamoDB customer with a technical question you need answered? If so, join us for weekly Office Hours on the AWS Twitch channel led by Rick Houlihan, AWS principal technologist and Amazon DynamoDB expert. See upcoming and previous shows

Learning Path – AWS Lambda Extensions: The deep dive

Are you looking for a way to more easily integrate AWS Lambda with your favorite monitoring, observability, security, governance, and other tools? Welcome to AWS Lambda extensions: The deep dive, a learning path video series that shows you everything about augmenting Lambda functions using Lambda extensions.

There are also other helpful videos covering serverless available on the Serverless Land YouTube channel.

Still looking for more?

The Serverless landing page has more information. The Lambda resources page contains case studies, webinars, whitepapers, customer stories, reference architectures, and even more Getting Started tutorials.

You can also follow the Serverless Developer Advocacy team on Twitter to see the latest news, follow conversations, and interact with the team.

Implementing a LIFO task queue using AWS Lambda and Amazon DynamoDB

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/implementing-a-lifo-task-queue-using-aws-lambda-and-amazon-dynamodb/

This post was written by Diggory Briercliffe, Senior IoT Architect.

When implementing a task queue, you can use Amazon SQS standard or FIFO (First-In-First-Out) queue types. Both queue types give priority to tasks created earlier over tasks that are created later. However, there are use cases where you need a LIFO (Last-In-First-Out) queue.

This post shows how to implement a serverless LIFO task queue. This uses AWS Lambda, Amazon DynamoDB, AWS Serverless Application Model (AWS SAM), and other AWS Serverless technologies.

The LIFO task queue gives priority to newer queue tasks over earlier tasks. Under heavy load, earlier tasks are deprioritized and eventually removed. This is useful when your workload must communicate with a system that is throughput-constrained and newer tasks should have priority.

To help understand the approach, consider the following use case. As part of optimizing the responsiveness of a mobile application, an IoT application validates device IP addresses after connecting to AWS IoT Core. Users open the application soon after the device connects so the most recent connection events should take priority for the validation work.

If the validation work is not done at connection time, it can be done later. A legacy system validates the IP addresses, but its throughput capacity cannot match the peak connection rate of the IoT devices. A LIFO queue can manage this load, by prioritizing validation of newer connection events. It can buffer or load shed earlier connection event validation.

For a more detailed discussion around insurmountable queue backlogs and queuing theory, read “Avoiding insurmountable queue backlogs” in the Amazon Builders’ Library.

Example application

An example application implementing the LIFO queue approach is available at https://github.com/aws-samples/serverless-lifo-queue-demonstration.

The application uses AWS SAM and the Lambda functions are written in Node.js. The AWS SAM template describes AWS resources required by the application. These include a DynamoDB table, Lambda functions, and Amazon SNS topics.

The README file contains instructions on deploying and testing the application, with detailed information on how it works.

Overview

The example application has the following queue characteristics:

  1. Newer queue tasks are prioritized over earlier tasks.
  2. Queue tasks are buffered if they cannot be processed.
  3. Queue tasks are eventually deleted if they are never processed, such as when the queue is under insurmountable load.
  4. Correct queue task state transition is maintained (such as PENDING to TAKEN, but not PENDING to SUCCESS).

A DynamoDB table stores queue task items. It uses the following DynamoDB features:

  • A global secondary index (GSI) sorts queue task items by a created timestamp, in reverse chronological (LIFO) order.
  • Update expressions and condition expressions provide atomic and exclusive queue task item updates. This prevents duplicate processing of queue tasks and ensures that the queue task state transitions are valid.
  • Time to live (TTL) deletes queue task items once they expire. Under insurmountable load, this ensures that tasks are deleted if they are never processed from the queue. It also deletes queue task items once they have been processed.
  • DynamoDB Streams invoke a Lambda function when new queue task items are inserted into the table and must be processed.

The application consists of the following resources defined in the AWS SAM template:

  • QueueTable: A DynamoDB table containing queue task items, which is configured for DynamoDB Streams to invoke a TriggerFunction.
  • TriggerFunction: A Lambda function, which governs triggering of queue task processing. Source code: app/trigger.js
  • ProcessTasksFunction: A Lambda function, which processes queue tasks and ensures consistent queue task state flow. Source code: app/process_tasks.js
  • CreateTasksFunction: A Lambda function, which inserts queue task items into the QueueTable. Source code: app/create_tasks.js
  • TriggerTopic: An SNS topic which TriggerFunction subscribes to.
  • ProcessTasksTopic: An SNS topic which ProcessTasksFunction subscribes to.

The following diagram illustrates how those resources interact to implement the LIFO queue.

LIFO Architecture diagram

LIFO Architecture diagram

  1. CreateTasksFunction inserts queue task items into QueueTable with PENDING state.
  2. A DynamoDB stream invokes TriggerFunction for all queue task item activity in QueueTable.
  3. TriggerFunction publishes a notification on ProcessTasksTopic if queue tasks should be processed.
  4. ProcessTasksFunction subscribes to ProcessTasksTopic.
  5. ProcessTasksFunction queries for PENDING queue task items in QueueTable for up to 1 minute, or until no PENDING queue task items remain.
  6. ProcessTasksFunction processes each PENDING queue task by calling the throughput constrained legacy system.
  7. ProcessTasksFunction updates each queue task item during processing to reflect state (first to TAKEN, and then to SUCCESS, FAILURE, or PENDING).
  8. ProcessTasksFunction publishes an SNS notification on TriggerTopic if PENDING tasks remain in the queue.
  9. TriggerFunction subscribes to TriggerTasksTopic.

Application activity continues while DynamoDB Streams receives QueueTable events (2) or TriggerTasksTopic receives notifications (9).

LIFO queue DynamoDB table

A DynamoDB table stores the LIFO queue task items. The AWS SAM template defines this resource (named QueueTable):

  • Each item in the table represents a queue task. It has the item attributes taskId (hash key), taskStatus, taskCreated, and taskUpdated.
  • The table has a single global secondary index (GSI) with taskStatus as the hash key and taskCreated as the range key. This GSI is fundamental to LIFO queue characteristics. It allows you to query for PENDING queue tasks, in reverse chronological order, so that the newest tasks can be processed first.
  • The DynamoDB TTL attribute causes earlier queue tasks to expire and be deleted. This prevents the queue from growing indefinitely if there is insurmountable load.
  • DynamoDB Streams invokes the TriggerFunction Lambda function for all changes in QueueTable.

Triggering queue task processing

The application continuously processes all PENDING queue tasks until there is none remaining. With no PENDING queue tasks, the application will be idle.

As the application is serverless, task processing is triggered by events. If a single Lambda function cannot process the volume of PENDING tasks, the application notifies itself so that processing can continue in another invocation. This is a tail call, which is an SNS notification sent by ProcessTasksFunction to TriggerTopic.

The Lambda functions, which collaborate on managing the LIFO queue are:

  • TriggerFunction is a proxy to ProcessTasksFunction and decides if task processing should be triggered. This function is invoked by DynamoDB Streams events on item changes in QueueTable or by a tail call SNS notification received from TriggerTopic.
  • ProcessTasksFunction performs the processing of queue tasks and implements the LIFO queue behavior. An SNS notification published on ProcessTasksTopic invokes this function.

Processing queue task items

The ProcessTasksFunction function processes queue tasks:

  1. The function is invoked by an SNS notification on ProcessTasksTopic.
  2. While the function runs, it polls QueueTable for PENDING queue tasks.
  3. The function processes each queue task and then updates the item.
  4. The function stops polling after 1 minute or if there are no PENDING queue tasks remaining.
  5. If there are more PENDING tasks in the queue, the function triggers another task. It sends a tail call SNS notification to TriggerTopic.

This uses DynamoDB expressions to ensure that tasks are not processed more than once during periods of concurrent function invocations. To prevent higher concurrency, the reserved concurrent executions attribute is set to 1.

Before processing a queue task, the taskStatus item attribute is transitioned from PENDING to TAKEN. Following queue task processing, the taskStatus item attribute is transitioned from TAKEN to SUCCESS or FAILURE.

If a queue task cannot be processed (for example, an external system has reached capacity), the item taskStatus attribute is set to PENDING again. Any aging PENDING queue tasks that cannot be processed are buffered. They are eventually deleted once they expire, due to the TTL configuration.

Querying for queue task items

To get the most recently created PENDING queue tasks, query the task-status-created-index GSI. The following shows the DynamoDB query action request parameters for the task-status-created-index. By using a Limit of 10 and setting ScanIndexForward to false, it retrieves the 10 most recently created queue task items:

{
  "TableName": "QueueTable",
  "IndexName": "task-status-created-index",
  "ExpressionAttributeValues": {
    ":taskStatus": {
      "S": "PENDING"
    }
  },
  "KeyConditionExpression": "taskStatus = :taskStatus",
  "Limit": 10,
  "ScanIndexForward": false
}

Updating queue tasks items

The following code shows request parameters for the DynamoDB UpdateItem action. This sets the taskStatus attribute of a queue task item (to TAKEN from PENDING). The update expression and condition expression ensure that the taskStatus is set (to TAKEN) only if the current value is as expected (from PENDING). It also ensures that the update is atomic. This prevents more-than-once processing of a queue task.

{
  "TableName": "QueueTable",
  "Key": {
    "taskId": {
      "S": "task-123"
    }
  },
  "UpdateExpression": "set taskStatus = :toTaskStatus, taskUpdated = :taskUpdated",
  "ConditionExpression": "taskStatus = :fromTaskStatus",
  "ExpressionAttributeValues": {
    ":fromTaskStatus": {
      "S": "PENDING"
    },
    ":toTaskStatus": {
      "S": "TAKEN"
    },
    ":taskUpdated": {
      "N": "1623241938151"
    }
  }
}

Conclusion

This post describes how to implement a LIFO queue with AWS Serverless technologies, using an example application as an example. Newer tasks in the queue are prioritized over earlier tasks. Tasks that cannot be processed are buffered and eventually load shed. This helps for use cases with heavy load and where newer queue tasks must take priority.

For more serverless learning resources, visit Serverless Land.

Hosting Hugging Face models on AWS Lambda for serverless inference

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/hosting-hugging-face-models-on-aws-lambda/

This post written by Eddie Pick, AWS Senior Solutions Architect – Startups and Scott Perry, AWS Senior Specialist Solutions Architect – AI/ML

Hugging Face Transformers is a popular open-source project that provides pre-trained, natural language processing (NLP) models for a wide variety of use cases. Customers with minimal machine learning experience can use pre-trained models to enhance their applications quickly using NLP. This includes tasks such as text classification, language translation, summarization, and question answering – to name a few.

First introduced in 2017, the Transformer is a modern neural network architecture that has quickly become the most popular type of machine learning model applied to NLP tasks. It outperforms previous techniques based on convolutional neural networks (CNNs) or recurrent neural networks (RNNs). The Transformer also offers significant improvements in computational efficiency. Notably, Transformers are more conducive to parallel computation. This means that Transformer-based models can be trained more quickly, and on larger datasets than their predecessors.

The computational efficiency of Transformers provides the opportunity to experiment and improve on the original architecture. Over the past few years, the industry has seen the introduction of larger and more powerful Transformer models. For example, BERT was first published in 2018 and was able to get better benchmark scores on 11 natural language processing tasks using between 110M-340M neural network parameters. In 2019, the T5 model using 11B parameters achieved better results on benchmarks such as summarization, question answering, and text classification. More recently, the GPT-3 model was introduced in 2020 with 175B parameters and in 2021 the Switch Transformers are scaling to over 1T parameters.

One consequence of this trend toward larger and more powerful models is an increased barrier to entry. As the number of model parameters increases, as does the computational infrastructure that is necessary to train such a model. This is where the open-source Hugging Face Transformers project helps.

Hugging Face Transformers provides over 30 pretrained Transformer-based models available via a straightforward Python package. Additionally, there are over 10,000 community-developed models available for download from Hugging Face. This allows users to use modern Transformer models within their applications without requiring model training from scratch.

The Hugging Face Transformers project directly addresses challenges associated with training modern Transformer-based models. Many customers want a zero administration ML inference solution that allows Hugging Face Transformers models to be hosted in AWS easily. This post introduces a low touch, cost effective, and scalable mechanism for hosting Hugging Face models for real-time inference using AWS Lambda.

Overview

Our solution consists of an AWS Cloud Development Kit (AWS CDK) script that automatically provisions container image-based Lambda functions that perform ML inference using pre-trained Hugging Face models. This solution also includes Amazon Elastic File System (EFS) storage that is attached to the Lambda functions to cache the pre-trained models and reduce inference latency.Solution architecture

In this architectural diagram:

  1. Serverless inference is achieved by using Lambda functions that are based on container image
  2. The container image is stored in an Amazon Elastic Container Registry (ECR) repository within your account
  3. Pre-trained models are automatically downloaded from Hugging Face the first time the function is invoked
  4. Pre-trained models are cached within Amazon Elastic File System storage in order to improve inference latency

The solution includes Python scripts for two common NLP use cases:

  • Sentiment analysis: Identifying if a sentence indicates positive or negative sentiment. It uses a fine-tuned model on sst2, which is a GLUE task.
  • Summarization: Summarizing a body of text into a shorter, representative text. It uses a Bart model that was fine-tuned on the CNN / Daily Mail dataset.

For simplicity, both of these use cases are implemented using Hugging Face pipelines.

Prerequisites

The following is required to run this example:

Deploying the example application

  1. Clone the project to your development environment:
    git clone https://github.com/aws-samples/zero-administration-inference-with-aws-lambda-for-hugging-face.git
  2. Install the required dependencies:
    pip install -r requirements.txt
  3. Bootstrap the CDK. This command provisions the initial resources needed by the CDK to perform deployments:
    cdk bootstrap
  4. This command deploys the CDK application to its environment. During the deployment, the toolkit outputs progress indications:
    $ cdk deploy

Testing the application

After deployment, navigate to the AWS Management Console to find and test the Lambda functions. There is one for sentiment analysis and one for summarization.

To test:

  1. Enter “Lambda” in the search bar of the AWS Management Console:Console Search
  2. Filter the functions by entering “ServerlessHuggingFace”:Filtering functions
  3. Select the ServerlessHuggingFaceStack-sentimentXXXXX function:Select function
  4. In the Test event, enter the following snippet and then choose Test:Test function
{
   "text": "I'm so happy I could cry!"
}

The first invocation takes approximately one minute to complete. The initial Lambda function environment must be allocated and the pre-trained model must be downloaded from Hugging Face. Subsequent invocations are faster, as the Lambda function is already prepared and the pre-trained model is cached in EFS.Function test results

The JSON response shows the result of the sentiment analysis:

{
  "statusCode": 200,
  "body": {
    "label": "POSITIVE",
    "score": 0.9997532367706299
  }
}

Understanding the code structure

The code is organized using the following structure:

├── inference
│ ├── Dockerfile
│ ├── sentiment.py
│ └── summarization.py
├── app.py
└── ...

The inference directory contains:

  • The Dockerfile used to build a custom image to be able to run PyTorch Hugging Face inference using Lambda functions
  • The Python scripts that perform the actual ML inference

The sentiment.py script shows how to use a Hugging Face Transformers model:

import json
from transformers import pipeline

nlp = pipeline("sentiment-analysis")

def handler(event, context):
    response = {
        "statusCode": 200,
        "body": nlp(event['text'])[0]
    }
    return response

For each Python script in the inference directory, the CDK generates a Lambda function backed by a container image and a Python inference script.

CDK script

The CDK script is named app.py in the solution’s repository. The beginning of the script creates a virtual private cloud (VPC).

vpc = ec2.Vpc(self, 'Vpc', max_azs=2)

Next, it creates the EFS file system and an access point in EFS for the cached models:

        fs = efs.FileSystem(self, 'FileSystem',
                            vpc=vpc,
                            removal_policy=cdk.RemovalPolicy.DESTROY)
        access_point = fs.add_access_point('MLAccessPoint',
                                           create_acl=efs.Acl(
                                               owner_gid='1001', owner_uid='1001', permissions='750'),
                                           path="/export/models",
                                           posix_user=efs.PosixUser(gid="1001", uid="1001"))>

It iterates through the Python files in the inference directory:

docker_folder = os.path.dirname(os.path.realpath(__file__)) + "/inference"
pathlist = Path(docker_folder).rglob('*.py')
for path in pathlist:

And then creates the Lambda function that serves the inference requests:

            base = os.path.basename(path)
            filename = os.path.splitext(base)[0]
            # Lambda Function from docker image
            function = lambda_.DockerImageFunction(
                self, filename,
                code=lambda_.DockerImageCode.from_image_asset(docker_folder,
                                                              cmd=[
                                                                  filename+".handler"]
                                                              ),
                memory_size=8096,
                timeout=cdk.Duration.seconds(600),
                vpc=vpc,
                filesystem=lambda_.FileSystem.from_efs_access_point(
                    access_point, '/mnt/hf_models_cache'),
                environment={
                    "TRANSFORMERS_CACHE": "/mnt/hf_models_cache"},
            )

Adding a translator

Optionally, you can add more models by adding Python scripts in the inference directory. For example, add the following code in a file called translate-en2fr.py:

import json
from transformers 
import pipeline

en_fr_translator = pipeline('translation_en_to_fr')

def handler(event, context):
    response = {
        "statusCode": 200,
        "body": en_fr_translator(event['text'])[0]
    }
    return response

Then run:

$ cdk synth
$ cdk deploy

This creates a new endpoint to perform English to French translation.

Cleaning up

After you are finished experimenting with this project, run “cdk destroy” to remove all of the associated infrastructure.

Conclusion

This post shows how to perform ML inference for pre-trained Hugging Face models by using Lambda functions. To avoid repeatedly downloading the pre-trained models, this solution uses an EFS-based approach to model caching. This helps to achieve low-latency, near real-time inference. The solution is provided as infrastructure as code using Python and the AWS CDK.

We hope this blog post allows you to prototype quickly and include modern NLP techniques in your own products.