Tag Archives: Amazon EventBridge

Optimizing the cost of serverless web applications

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/optimizing-the-cost-of-serverless-web-applications/

Web application backends are one of the most common serverless use-cases for customers. The pay-for-value model can make it cost-efficient to build web applications using serverless tools.

While serverless cost is generally correlated with level of usage, there are architectural decisions that impact cost efficiency. The impact of these choices is more significant as your traffic grows, so it’s important to consider the cost-effectiveness of different designs and patterns.

This blog post reviews some common areas in web applications where you may be able to optimize cost. It uses the Happy Path web application as a reference example, which you can read about in the introductory blog post.

Serverless web applications generally use a combination of the services in the following diagram. I cover each of these layers to highlight common opportunities for cost optimization.

Serverless architecture by AWS service

The API management layer: Selecting the right API type

Most serverless web applications use an API between the frontend client and the backend architecture. Amazon API Gateway is a common choice since it is a fully managed service that scales automatically. There are three types of API offered by the service – REST APIs, WebSocket APIs, and the more recent HTTP APIs.

HTTP APIs offer many of the features in the REST APIs service, but the cost is often around 70% less. It supports Lambda service integration, JWT authorization, CORS, and custom domain names. It also has a simpler deployment model than REST APIs. This feature set tends to work well for web applications, many of which mainly use these capabilities. Additionally, HTTP APIs will gain feature parity with REST APIs over time.

The Happy Path application is designed for 100,000 monthly active users. It uses HTTP APIs, and you can inspect the backend/template.yaml to see how to define these in the AWS Serverless Application Model (AWS SAM). If you have existing AWS SAM templates that are using REST APIs, in many cases you can change these easily:

REST to HTTP API
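
For a simple function-backed endpoint, the change is often just the event type on the function definition. The following minimal sketch uses illustrative resource names and paths, not the actual Happy Path template:

  MyApiFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: app.handler
      Runtime: nodejs12.x
      Events:
        ApiEvent:
          Type: HttpApi   # previously: Type: Api (a REST API endpoint)
          Properties:
            Path: /items
            Method: GET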

Content distribution layer: Optimizing assets

Amazon CloudFront is a content delivery network (CDN). It enables you to distribute content globally across 216 Points of Presence without deploying or managing any infrastructure. It reduces latency for users who are geographically dispersed and can also reduce load on other parts of your service.

A typical web application uses CDNs in a couple of different ways. First, there is the distribution of the application itself. For single-page application frameworks like React or Vue.js, the build processes create static assets that are ideal for serving over a CDN.

However, these builds may not be optimized and can be larger than necessary. Many frameworks offer optimization plugins, and the JavaScript community frequently uses Webpack to bundle modules and shrink deployment packages. Similarly, any media assets used in the application build should be optimized. You can use tools like Lighthouse to analyze your web apps to find images that can be resized or compressed.

Optimizing images

The second common CDN use-case for web apps is for user-generated content (UGC). Many apps allow users to upload images, which are then shared with other users. A typical photo from a 12-megapixel smartphone is 3–9 MB in size. This high resolution is not necessary when photos are rendered within web apps. Displaying the high-resolution asset results in slower download performance and higher data transfer costs.

The Happy Path application uses a Resizer Lambda function to optimize these uploaded assets. This process creates two different optimized images depending upon which component loads the asset.

Image sizes in front-end applications

The upload S3 bucket shows the original size of the upload from the smartphone:

The distribution S3 bucket contains the two optimized images at different sizes:

Optimized images in the distribution S3 bucket

The distribution file sizes are 98–99% smaller. For a busy web application, using optimized image assets can make a significant difference to data transfer and CloudFront costs.
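
As a minimal sketch of the idea (this assumes the sharp image library and illustrative bucket and environment variable names; the Happy Path implementation may differ), a resizer function might look like this:

const AWS = require('aws-sdk')
const sharp = require('sharp') // assumed image library for this sketch
const s3 = new AWS.S3()

// Triggered by an S3 upload event: fetch the original image,
// create a smaller rendition, and write it to the distribution bucket
exports.handler = async (event) => {
  const record = event.Records[0].s3
  const original = await s3.getObject({
    Bucket: record.bucket.name,
    Key: record.object.key
  }).promise()

  // Create a 480px-wide JPEG suitable for in-app rendering
  const resized = await sharp(original.Body)
    .resize({ width: 480 })
    .jpeg()
    .toBuffer()

  await s3.putObject({
    Bucket: process.env.DISTRIBUTION_BUCKET, // illustrative env var
    Key: record.object.key,
    Body: resized,
    ContentType: 'image/jpeg'
  }).promise()
}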

Additionally, you can convert images to highly optimized file formats such as WebP to reduce file size even further. Not all browsers support this format, but you can use the onerror attribute in the image tag to fall back to other formats if needed:

<img src="myImage.webp" onerror="this.onerror=null; this.src='myImage.jpg'">

The data layer

AWS offers many different database and storage options that can be useful for web applications. Billing models vary by service and Region. By understanding the data access and storage requirements of your app, you can make informed decisions about the right service to use.

Generally, it’s more cost-effective to store binary data in S3 than a database. First, when the data is uploaded, you can upload directly to S3 with presigned URLs instead of proxying data via API Gateway or another service.
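
As a sketch of this approach (the bucket name, key, and expiry are illustrative), a Lambda function can generate the presigned upload URL with the AWS SDK:

const AWS = require('aws-sdk')
const s3 = new AWS.S3()

// Return a time-limited URL that the browser can use to PUT the file
// straight to S3, so the upload never passes through API Gateway
exports.handler = async (event) => {
  const url = await s3.getSignedUrlPromise('putObject', {
    Bucket: process.env.UPLOAD_BUCKET, // illustrative env var
    Key: `uploads/${Date.now()}.jpg`,
    Expires: 300 // URL is valid for 5 minutes
  })
  return { statusCode: 200, body: JSON.stringify({ uploadURL: url }) }
}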

If you are using Amazon DynamoDB, it’s best practice to store larger items in S3 and include a reference token in a table item. Part of DynamoDB pricing is based on read capacity units (RCUs). For binary items such as images, it is usually more cost-efficient to use S3 for storage.
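
For example, instead of writing image bytes to DynamoDB, you can store only a reference to the S3 object. The table and attribute names in this sketch are illustrative:

const AWS = require('aws-sdk')
const docClient = new AWS.DynamoDB.DocumentClient()

// Store a pointer to the S3 object rather than the binary payload,
// keeping the DynamoDB item small and cheap to read
const saveReference = async () => {
  await docClient.put({
    TableName: 'photos', // illustrative table name
    Item: {
      photoId: 'photo-12345',
      s3Bucket: 'my-distribution-bucket',
      s3Key: 'images/photo-12345.jpg', // the image itself lives in S3
      uploadedAt: Date.now()
    }
  }).promise()
}

saveReference().then(() => console.log('Reference saved')).catch(console.error)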

Many web developers who are new to serverless are familiar with relational databases, so they choose Amazon RDS for their database needs. Depending upon your use-case and data access patterns, it may be more cost-effective to use DynamoDB instead. RDS is not a serverless service, so there are monthly charges for the underlying compute instance. DynamoDB pricing is based upon usage and storage, so it may be a lower-cost choice for many web apps.

Integration layer

This layer includes services like Amazon SQS, Amazon SNS, and Amazon EventBridge, which are essential for decoupling serverless applications. Each of these has a request-based pricing component, where every 64 KB chunk of a payload is billed as one request. For example, a single SQS message with a 256 KB payload is billed as four requests. There are three common optimization methods for web applications.

1. Combine messages

Many messages sent to these services are much smaller than 64 KB. In some applications, the publishing service can combine multiple messages to reduce the total number of publish actions to SNS. Additionally, by either eliminating unused attributes in the message or compressing the message, you can store more data in a single request.

For example, a publishing service may be able to combine multiple messages together in a single publish action to an SNS topic:

  • Before optimization, a publishing service sends 100,000,000 1KB-messages to an SNS topic. This is charged as 100 million messages for a total cost of $50.00.
  • After optimization, the publishing service combines messages to send 1,562,500 64KB-messages to an SNS topic. This is charged as 1,562,500 messages for a total cost of $0.78.

2. Filter messages

In many applications, not every message is useful for a consuming service. For example, an SNS topic may publish to a Lambda function, which checks the content and discards the message based on some criteria. In this case, it’s more cost-effective to use the native filtering capabilities of SNS. The service can filter messages and only invoke the Lambda function if the criteria are met. This lowers the compute cost by invoking Lambda only when necessary.

For example, an SNS topic receives messages about customer orders and forwards these to a Lambda function subscriber. The function is only interested in canceled orders and discards all other messages:

  • Before optimization, the SNS topic sends all messages to a Lambda function. It evaluates the message for the presence of an order canceled attribute. On average, only 25% of the messages are processed further. While SNS does not charge for delivery to Lambda functions, you are charged each time the Lambda service is invoked, for 100% of the messages.
  • After optimization, using an SNS subscription filter policy, the SNS subscription filters for canceled orders and only forwards matching messages. Since the Lambda function is only invoked for 25% of the messages, this may reduce the total compute cost by up to 75%.
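
For the canceled-orders example above, the subscription filter policy might look like the following. The message attribute name is an assumption for illustration:

{
  "order-status": ["canceled"]
}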

3. Choose a different messaging service

For complex filtering options based upon matching patterns, you can use EventBridge. The service can filter messages based upon prefix matching, numeric matching, and other patterns, combining several rules into a single filter. You can create branching logic within the EventBridge rule to invoke downstream targets.

EventBridge offers a broader range of targets than SNS destinations. In cases where you publish from an SNS topic to a Lambda function to invoke an EventBridge target, you could use EventBridge instead and eliminate the Lambda invocation. For example, instead of routing from SNS to Lambda to AWS Step Functions, create an EventBridge rule that routes events directly to a state machine.
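
As an illustration, a single EventBridge event pattern can combine prefix and numeric matching. The source and field names here are hypothetical:

{
  "source": ["com.example.orders"],
  "detail": {
    "state": [{ "prefix": "cancel" }],
    "order-total": [{ "numeric": [">", 100] }]
  }
}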

Business logic layer

Step Functions allows you to orchestrate complex workflows in serverless applications while eliminating common boilerplate code. The Standard Workflow service charges per state transition. Express Workflows were introduced in December 2019, with pricing based on requests and duration, instead of transitions.

For workloads that process large numbers of events in shorter durations, Express Workflows can be more cost-effective. They are designed for high-volume event workloads, such as streaming data processing or IoT data ingestion. For these cases, compare the cost of the two workflow types to see if you can reduce cost by switching.

Lambda is the on-demand compute layer in serverless applications, and it is billed by requests and GB-seconds. GB-seconds are calculated by multiplying the memory allocated to the function, in GB, by its duration in seconds. For a function with a 1-second duration, invoked 1 million times, here is how memory allocation affects the total cost in the US East (N. Virginia) Region:

Memory (MB) | GB-s      | Compute cost | Total cost
128         | 125,000   | $2.08        | $2.28
512         | 500,000   | $8.34        | $8.54
1024        | 1,000,000 | $16.67       | $16.87
1536        | 1,500,000 | $25.01       | $25.21
2048        | 2,000,000 | $33.34       | $33.54
3008        | 2,937,500 | $48.97       | $49.17

There are many ways to optimize Lambda functions, but one of the most important choices is memory allocation. You can choose between 128 MB and 3008 MB, and the amount of virtual CPU available to the function scales with the memory allocated. Since total cost is a combination of memory and duration, choosing more memory can often reduce duration and lower overall cost.

Instead of manually setting the memory for a Lambda function and running executions to compare duration, you can use the AWS Lambda Power Tuning tool. This uses Step Functions to run your function against varying memory configurations. It can produce a visualization to find the optimal memory setting, based upon cost or execution time.

Optimizing costs with the AWS Lambda Power Tuning tool

Conclusion

Web application backends are one of the most popular workload types for serverless applications. The pay-for-value model works well for this type of workload. As traffic grows, it’s important to consider the design choices and service configurations used to optimize your cost.

Serverless web applications generally use a common range of services, which you can logically split into different layers. This post examines each layer and suggests common cost optimizations helpful for web app developers.

To learn more about building web apps with serverless, see the Happy Path series. For more serverless learning resources, visit https://serverlessland.com.

Improved failure recovery for Amazon EventBridge

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/improved-failure-recovery-for-amazon-eventbridge/

Today we’re announcing two new capabilities for Amazon EventBridge: dead-letter queues and custom retry policies. Both of these give you greater flexibility in how to handle failures in the processing of events with EventBridge. You can easily enable them on a per-target basis and configure them individually for each target.

Dead letter queues (DLQs) are a common capability in queuing and messaging systems that allow you to handle failures in event or message receiving systems. They provide a way for failed events or messages to be captured and sent to another system, which can store them for future processing. With DLQs, you can have greater resiliency and improved recovery from any failure that happens.

You can also now configure a custom retry policy that can be set on your event bus targets. Today, there are two attributes that can control how events are retried: maximum number of retries and maximum event age. With these two settings, you could send events to a DLQ sooner and reduce the retries attempted.

For example, this could allow you to recover more quickly if an event bus target is overwhelmed by the number of events received, causing throttling to occur. The events are placed in a DLQ and then processed later.

Failures in event processing

Currently, EventBridge can fail to deliver an event to a target in certain scenarios. Events that fail to be delivered to a target due to client-side errors are dropped immediately. Examples of this are when EventBridge does not have permission to invoke a target AWS service or when the target no longer exists. This can happen if the target resource is misconfigured or is deleted by the resource owner.

For service-side issues, EventBridge retries delivery of events for up to 24 hours. This can happen if the target service is unavailable or the target resource is not provisioned to handle the incoming event traffic and the target service is throttling the requests.

EventBridge failures

Previously, when all attempts to deliver an event to the target were exhausted, EventBridge published a CloudWatch metric indicating a failed target invocation. However, this provided no visibility into which events failed to be delivered, and there was no way to recover them.

Dead letter queues

EventBridge’s DLQs are made possible today with Amazon Simple Queue Service (SQS) standard queues. With SQS, you get all of the benefits of a fully serverless queuing service: no servers to manage, automatic scalability, pay for what you consume, and high availability and security built in. You can configure the DLQs for your EventBridge bus and pay nothing until it is used, if and when a target experiences an issue. This makes it a great practice to follow and standardize on, and provides you with a safety net that’s active only when needed.

Optionally, you could later configure an AWS Lambda function to consume from that DLQ. The function is only invoked when messages exist in the queue, allowing you to maintain a serverless stack to recover from a potential failure.
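
As a sketch in AWS SAM (resource and handler names are illustrative), the DLQ and a draining function could be declared like this:

  EventDlq:
    Type: AWS::SQS::Queue

  DlqProcessorFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: dlqProcessor.handler
      Runtime: nodejs12.x
      Policies:
        - SQSPollerPolicy:
            QueueName: !GetAtt EventDlq.QueueName
      Events:
        RetryEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt EventDlq.Arn
            BatchSize: 10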

DLQ configured per target

With a DLQ configured, the queue receives the failed event as a message, along with important metadata that you can use to troubleshoot the issue. This can include: Error Code, Error Message, Exhausted Retry Condition, Retry Attempts, Rule ARN, and the Target ARN.

You can use this data to more easily troubleshoot what went wrong with the original delivery attempt and take action to resolve or prevent such failures in the future. You could also use the information such as Exhausted Retry Condition and Retry Attempts to further tweak your custom retry policy.

You can configure a DLQ when creating or updating rules via the AWS Management Console and AWS Command Line Interface (AWS CLI). You can also use infrastructure as code (IaC) tools such as AWS CloudFormation.

In the console, select the queue to be used for your DLQ configuration from the drop-down as shown here:

DLQ configuration

When configured via API, AWS CLI, or IaC tools, you must specify the ARN of the queue:

arn:aws:sqs:us-east-1:123456789012:orders-bus-shipping-service-dlq

When you configure a DLQ, the target SQS queue requires a resource-based policy that grants EventBridge access. One is created and applied automatically via the console when you create or update an EventBridge rule with a DLQ that exists in your own account.

For any queues created in other accounts, or via the API, AWS CLI, or IaC tools, you must add a policy that grants the sqs:SendMessage permission to the events.amazonaws.com service principal, scoped to the EventBridge rule ARN, as shown below:

{
  "Sid": "Dead-letter queue permissions",
  "Effect": "Allow",
  "Principal": {
     "Service": "events.amazonaws.com"
  },
  "Action": "sqs:SendMessage",
  "Resource": "arn:aws:sqs:us-east-1:123456789012:orders-bus-shipping-service-dlq",
  "Condition": {
    "ArnEquals": {
      "aws:SourceArn": "arn:aws:events:us-east-1:123456789012:rule/MyTestRule"
    }
  }
}

You can read more about setting permissions for DLQs in the documentation under “Granting permissions to the dead-letter queue”.

Once configured, you can monitor CloudWatch metrics for the DLQ. These show the successful delivery of messages via the InvocationsSentToDLQ metric, in addition to any failures via the InvocationsFailedToBeSentToDLQ metric. Note that these metrics do not exist if your queue is not considered “active”.

Retry policies

By default, EventBridge retries delivery of an event to a target so long as it does not receive a client-side error as described earlier. Retries occur with a back-off, for up to 185 attempts or for up to 24 hours, after which the event is dropped or sent to a DLQ, if configured. Due to the jitter of the back-off and retry process you may reach the 24-hour limit before reaching 185 retries.

For many workloads, this provides an acceptable way to handle momentary service issues or throttling that might occur. For some, however, this model of back-off and retry can cause increased and ongoing traffic to an already overloaded target system.

For example, consider an Amazon API Gateway target that has a resource constrained backend service behind it.

Constrained target service

Under a consistently high load, the bus could end up generating too many API requests, tripping the API Gateway’s throttling configuration. This would cause API Gateway to respond with throttling errors back to EventBridge.

Throttled API reply

You may decide that allowing the failed events to retry for 24 hours puts too much load into this system and it may not properly recover from the load. This could lead to potential data loss unless a DLQ was configured.

Added DLQ

With a DLQ, you could choose to process these events later, once the overwhelmed target service has recovered.

DLQ drained back to API

Or the events in question may no longer have the same value as they did previously. This can occur in systems where data loss is tolerated but the timeliness of data processing matters. In these situations, the DLQ would have less value and dropping the message is acceptable.

For either of these situations, configuring the maximum number of retries or the maximum age of the event could be useful.

Now with retry policies, you can configure per target the following two attributes:

  • MaximumEventAgeInSeconds: between 60 and 86,400 seconds (86,400 seconds, or 24 hours, is the default)
  • MaximumRetryAttempts: between 0 and 185 (185 is the default)

When either condition is met, the event fails. It’s then either dropped, which generates an increase to the FailedInvocations CloudWatch metric, or sent to a configured DLQ.

You can configure retry policy attributes when creating or updating rules via the AWS Management Console and AWS Command Line Interface (AWS CLI). You can also use infrastructure as code (IaC) tools such as AWS CloudFormation.
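
For example, with the AWS CLI you can set both the retry policy and the DLQ on a target in a single put-targets call. The ARNs and values here are illustrative:

aws events put-targets --rule MyTestRule --event-bus-name default \
  --targets '[{
    "Id": "shipping-service",
    "Arn": "arn:aws:lambda:us-east-1:123456789012:function:shipping-handler",
    "RetryPolicy": {
      "MaximumRetryAttempts": 20,
      "MaximumEventAgeInSeconds": 3600
    },
    "DeadLetterConfig": {
      "Arn": "arn:aws:sqs:us-east-1:123456789012:orders-bus-shipping-service-dlq"
    }
  }]'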

Retry policy

There is no additional cost for configuring either of these new capabilities. You only pay for the usage of the SQS standard queue configured as the dead letter queue during a failure and any application that handles the failed events. SQS pricing can be found here.

Conclusion

With dead letter queues and custom retry policies, you have improved handling and control over failure in distributed systems built with EventBridge. With DLQs you can capture failed events and then process them later, potentially saving yourself from data loss. With custom retry policies, you gain the improved ability to control the number of retries and for how long they can be retried.

I encourage you to explore how both of these new capabilities can help make your applications more resilient to failures, and to standardize on using them both in your infrastructure.

For more serverless learning resources, visit https://serverlessland.com.

Building resilient serverless patterns by combining messaging services

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-resilient-no-code-serverless-patterns-by-combining-messaging-services/

In “Choosing between messaging services for serverless applications”, I explain the features and differences between the core AWS messaging services. Amazon SQS, Amazon SNS, and Amazon EventBridge provide queues, publish/subscribe, and event bus functionality for your applications. Individually, these are robust, scalable services that are fundamental building blocks of serverless architectures.

However, you can also combine these services to solve specific challenges in distributed architectures. By doing this, you can use specific features of each service to build sophisticated patterns with little code. These combinations can make your applications more resilient and scalable, and reduce the amount of custom logic and architecture in your workload.

In this blog post, I highlight several important patterns for serverless developers. I also show how you use and deploy these integrations with the AWS Serverless Application Model (AWS SAM).

Examples in this post refer to code that can be downloaded from this GitHub repo. The README.md file explains how to deploy and run each example.

SNS to SQS: Adding resilience and throttling to message throughput

SNS has a robust retry policy that results in up to 100,010 delivery attempts over 23 days. If a downstream service is unavailable, it may be overwhelmed by retries when it comes back online. You can solve this issue by adding an SQS queue.

Adding an SQS queue between the SNS topic and its subscriber has two benefits. First, it adds resilience to message delivery, since the messages are durably stored in a queue. Second, it throttles the rate of messages to the consumer, helping smooth out traffic bursts caused by the service catching up with missed messages.

To build this in an AWS SAM template, you first define the two resources, and the SNS subscription:

  MySqsQueue:
    Type: AWS::SQS::Queue

  MySnsTopic:
    Type: AWS::SNS::Topic
    Properties:
      Subscription:
        - Protocol: sqs
          Endpoint: !GetAtt MySqsQueue.Arn

Finally, you provide permission to the SNS topic to publish to the queue, using the AWS::SQS::QueuePolicy resource:

  SnsToSqsPolicy:
    Type: AWS::SQS::QueuePolicy
    Properties:
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Sid: "Allow SNS publish to SQS"
            Effect: Allow
            Principal: "*"
            Resource: !GetAtt MySqsQueue.Arn
            Action: SQS:SendMessage
            Condition:
              ArnEquals:
                aws:SourceArn: !Ref MySnsTopic
      Queues:
        - Ref: MySqsQueue

To test this, you can publish a message to the SNS topic and then inspect the SQS queue length using the AWS CLI:

aws sns publish --topic-arn "arn:aws:sns:us-east-1:123456789012:sns-sqs-MySnsTopic-ABC123ABC" --message "Test message"
aws sqs get-queue-attributes --queue-url "https://sqs.us-east-1.amazonaws.com/123456789012/sns-sqs-MySqsQueue-ABC123ABC" --attribute-names ApproximateNumberOfMessages

This results in the following output:

CLI output

Another usage of this pattern is when you want to filter messages in architectures using an SQS queue. By placing the SNS topic in front of the queue, you can use the message filtering capabilities of SNS. This ensures that only the messages you need are published to the queue. To use message filtering in AWS SAM, use the AWS::SNS::Subscription resource:

  QueueSubscription:
    Type: 'AWS::SNS::Subscription'
    Properties:
      TopicArn: !Ref MySnsTopic
      Endpoint: !GetAtt MySqsQueue.Arn
      Protocol: sqs
      FilterPolicy:
        type:
        - orders
        - payments 
      RawMessageDelivery: 'true'

EventBridge to SNS: combining features of both services

SNS and EventBridge have different characteristics in terms of targets and integration with broader features. This table compares the major differences between the two services:

Feature | Amazon SNS | Amazon EventBridge
Number of targets | 10 million (soft) | 5 per rule
Limits | 100,000 topics; 12,500,000 subscriptions per topic | 100 event buses; 300 rules per event bus
Input transformation | No | Yes – see details
Message filtering | Yes – see details | Yes, including IP address matching – see details
Format | Raw or JSON | JSON
Receive events from AWS CloudTrail | No | Yes
Targets | HTTP(S), SMS, SNS Mobile Push, Email/Email-JSON, SQS, Lambda functions | 15 targets including AWS Lambda, Amazon SQS, Amazon SNS, AWS Step Functions, Amazon Kinesis Data Streams, Amazon Kinesis Data Firehose
SaaS integration | No | Yes – see integration partners
Schema Registry integration | No | Yes – see details
Dead-letter queues supported | Yes | No
Public visibility | Can create public topics | Cannot create public buses
Cross-Region | You can subscribe your AWS Lambda functions to an SNS topic in any Region | Targets must be in the same Region; you can publish across Regions to another event bus

In this pattern, you configure an SNS topic as a target of an EventBridge rule:

SNS topic as a target for an EventBridge rule

In the AWS SAM template, you declare the resources in the preceding diagram as follows:

Resources:
  MySnsTopic:
    Type: AWS::SNS::Topic

  EventRule: 
    Type: AWS::Events::Rule
    Properties: 
      Description: "EventRule"
      EventPattern: 
        account: 
          - !Sub '${AWS::AccountId}'
        source:
          - "demo.cli"
      Targets: 
        - Arn: !Ref MySnsTopic
          Id: "SNStopic"

The default bus already exists in every AWS account, so there is no need to declare it. For the event bus to publish matching events to the SNS topic, you define permissions using the AWS::SNS::TopicPolicy resource:

  EventBridgeToSnsPolicy:
    Type: AWS::SNS::TopicPolicy
    Properties: 
      PolicyDocument:
        Statement:
        - Effect: Allow
          Principal:
            Service: events.amazonaws.com
          Action: sns:Publish
          Resource: !Ref MySnsTopic
      Topics: 
        - !Ref MySnsTopic       

EventBridge has a limit of five targets per rule. In cases where you must send events to hundreds or thousands of targets, publishing to SNS first and then subscribing those targets to the topic works around this limit. The two services also support different targets, and this pattern allows you to deliver EventBridge events to SMS, HTTP(S), email, and SNS mobile push.

You can transform and filter the message using these services, often without needing an AWS Lambda function. SNS does not support input transformation but you can do this in an EventBridge rule. Message filtering is possible in both services but EventBridge provides richer content filtering capabilities.

AWS CloudTrail can log and monitor activity across services in your AWS account. It can be a useful source for events, allowing you to respond dynamically to objects in Amazon S3 or react to changes in your environment, for example. This natively integrates with EventBridge, allowing you to ingest events at scale from dozens of services.

Using EventBridge enables you to source events from outside your AWS account, offering integrations with a list of software as a service (SaaS) providers. This capability allows you to receive events from your accounts with SaaS providers like Zendesk, PagerDuty, and Auth0. These events are delivered to a partner event bus in your account, and can then be filtered and routed to an SNS topic.

Additionally, this pattern allows you to deliver events to Lambda functions in other AWS accounts and in other AWS Regions. You can invoke Lambda from SNS topics in other Regions and accounts. It’s also possible to make SNS topics publicly read-only, making them extensible endpoints that other third parties can consume from. SNS has comprehensive access control, which you can incorporate into this pattern.

Cross-account publishing

EventBridge to SQS: Building fault-tolerant microservices

EventBridge can route events to targets such as microservices. In the case of downstream failures, the service retries events for up to 24 hours. For workloads where you need a longer period of time to store and retry messages, you can deliver the events to an SQS queue in each microservice. This durably stores those events until the downstream service recovers. Additionally, this pattern protects the microservice from large bursts of traffic by throttling the delivery of messages.

Fault-tolerant microservices architecture

The resources declared in the AWS SAM template are similar to the previous examples, but it uses the AWS::SQS::QueuePolicy resource to grant the appropriate permission to EventBridge:

  EventBridgeToSqsPolicy:
    Type: AWS::SQS::QueuePolicy
    Properties:
      PolicyDocument:
        Statement:
        - Effect: Allow
          Principal:
            Service: events.amazonaws.com
          Action: SQS:SendMessage
          Resource:  !GetAtt MySqsQueue.Arn
      Queues:
        - Ref: MySqsQueue

Conclusion

You can combine these services in your architectures to implement patterns that solve complex challenges, often with little code required. This blog post shows three examples that implement message throttling and queueing, integrate SNS and EventBridge, and build fault-tolerant microservices.

To learn more about building decoupled architectures, see this Learning Path series on EventBridge. For more serverless learning resources, visit https://serverlessland.com.

Choosing between messaging services for serverless applications

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/choosing-between-messaging-services-for-serverless-applications/

Most serverless application architectures use a combination of different AWS services, microservices, and AWS Lambda functions. Messaging services are important in allowing distributed applications to communicate with each other, and are fundamental to most production serverless workloads.

Messaging services can improve the resilience, availability, and scalability of applications, when used appropriately. They can also enable your applications to communicate beyond your workload or even the AWS Cloud, and provide extensibility for future service features and versions.

In this blog post, I compare the primary messaging services offered by AWS and how you can use these in your serverless application architectures. I also show how you use and deploy these integrations with the AWS Serverless Application Model (AWS SAM).

Examples in this post refer to code that can be downloaded from this GitHub repository. The README.md file explains how to deploy and run each example.

Overview

Three of the most useful messaging patterns for serverless developers are queues, publish/subscribe, and event buses. In AWS, these are provided by Amazon SQS, Amazon SNS, and Amazon EventBridge respectively. All of these services are fully managed and highly available, so there is no infrastructure to manage. All three integrate with Lambda, allowing you to publish messages via the AWS SDK and invoke functions as targets. Each of these services has an important role to play in serverless architectures.

SNS enables you to send messages reliably between parts of your infrastructure. It uses a robust retry mechanism for when downstream targets are unavailable. When the delivery policy is exhausted, it can optionally send those messages to a dead-letter queue for further processing. SNS uses topics to logically separate messages into channels, and your Lambda functions interact with these topics.

SQS provides queues for your serverless applications. You can use a queue to send, store, and receive messages between different services in your workload. Queues are an important mechanism for providing fault tolerance in distributed systems, and help decouple different parts of your application. SQS scales elastically, and there is no limit to the number of messages per queue. The service durably persists messages until they are processed by a downstream consumer.

EventBridge is a serverless event bus service, simplifying routing events between AWS services, software as a service (SaaS) providers, and your own applications. It logically separates routing using event buses, and you implement the routing logic using rules. You can filter and transform incoming messages at the service level, and route events to multiple targets, including Lambda functions.

Integrating an SQS queue with AWS SAM

The first example shows an AWS SAM template defining a serverless application with two Lambda functions and an SQS queue:

Producer-consumer example

You can declare an SQS queue in an AWS SAM template with the AWS::SQS::Queue resource:

  MySqsQueue:
    Type: AWS::SQS::Queue

To publish to the queue, the publisher function must have permission to send messages. Using an AWS SAM policy template, you can apply a policy that enables sending messages to one specific queue:

      Policies:
        - SQSSendMessagePolicy:
            QueueName: !GetAtt MySqsQueue.QueueName

The AWS SAM template passes the queue URL into the Lambda function as an environment variable. The function uses the sendMessage method of the AWS.SQS class to publish the message:

const AWS = require('aws-sdk')
AWS.config.region = process.env.AWS_REGION 
const sqs = new AWS.SQS({apiVersion: '2012-11-05'})

// The Lambda handler
exports.handler = async (event) => {
  // Params object for SQS
  const params = {
    MessageBody: `Message at ${Date()}`,
    QueueUrl: process.env.SQSqueueName
  }
  
  // Send to SQS
  const result = await sqs.sendMessage(params).promise()
  console.log(result)
}

When messages arrive in the SQS queue, the Lambda service polls the queue and invokes the consuming Lambda function. To configure this integration in AWS SAM, the consumer function is granted the SQSPollerPolicy policy. The function’s event source is set to receive messages from the queue in batches of up to 10:

  QueueConsumerFunction:
    Type: AWS::Serverless::Function 
    Properties:
      CodeUri: code/
      Handler: consumer.handler
      Runtime: nodejs12.x
      Timeout: 3
      MemorySize: 128
      Policies:  
        - SQSPollerPolicy:
            QueueName: !GetAtt MySqsQueue.QueueName
      Events:
        MySQSEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt MySqsQueue.Arn
            BatchSize: 10

The payload for the consumer function is the message from SQS. This is an array of messages up to the batch size, containing a body attribute with the publishing function’s MessageBody. You can see this in the CloudWatch log for the function:

CloudWatch log result

Integrating an SNS topic with AWS SAM

The second example shows an AWS SAM template defining a serverless application with three Lambda functions and an SNS topic:

SNS fanout to Lambda functions

You declare an SNS topic and the subscribing Lambda functions with the AWS::SNS::Topic resource:

  MySnsTopic:
    Type: AWS::SNS::Topic
    Properties:
      Subscription:
        - Protocol: lambda
          Endpoint: !GetAtt TopicConsumerFunction1.Arn    
        - Protocol: lambda
          Endpoint: !GetAtt TopicConsumerFunction2.Arn

You provide the SNS service with permission to invoke the Lambda functions by defining an AWS::Lambda::Permission for each:

  TopicConsumerFunction1Permission:
    Type: 'AWS::Lambda::Permission'
    Properties:
      Action: 'lambda:InvokeFunction'
      FunctionName: !Ref TopicConsumerFunction1
      Principal: sns.amazonaws.com

The SNSPublishMessagePolicy policy template grants permission to the publishing function to send messages to the topic. In the function, the publish method of the AWS.SNS class handles publishing:

const AWS = require('aws-sdk')
AWS.config.region = process.env.AWS_REGION 
const sns = new AWS.SNS({apiVersion: '2010-03-31'})

// The Lambda handler
exports.handler = async (event) => {
  // Params object for SNS
  const params = {
    Message: `Message at ${Date()}`,
    Subject: 'New message from publisher',
    TopicArn: process.env.SNStopic
  }
  
  // Send to SNS
  const result = await sns.publish(params).promise()
  console.log(result)
}

The payload for the consumer functions is the message from SNS. This is an array of messages, containing subject and message attributes from the publishing function. You can see this in the CloudWatch log for the function:

CloudWatch log result

Differences between SQS and SNS configurations

SQS queues and SNS topics offer different functionality, though both can publish to downstream Lambda functions.

An SQS message is stored on the queue for up to 14 days until it is successfully processed by a subscriber. SNS does not retain messages so if there are no subscribers for a topic, the message is discarded.

SNS topics may broadcast to multiple targets. This behavior is called fan-out. It can be used to parallelize work across Lambda functions or send messages to multiple environments (such as test or development). An SNS topic can have up to 12,500,000 subscribers, providing highly scalable fan-out capabilities. The targets may include HTTP/S endpoints, SMS text messaging, SNS mobile push, email, SQS, and Lambda functions.

In AWS SAM templates, you can retrieve properties such as ARNs and names of queues and topics, using the following intrinsic functions:

Property | Amazon SQS | Amazon SNS
Channel type | Queue | Topic
Get ARN | !GetAtt MySqsQueue.Arn | !Ref MySnsTopic
Get name | !GetAtt MySqsQueue.QueueName | !GetAtt MySnsTopic.TopicName

Integrating with EventBridge in AWS SAM

The third example shows the AWS SAM template defining a serverless application with two Lambda functions and an EventBridge rule:

EventBridge integration with AWS SAM

The default event bus already exists in every AWS account. You declare a rule that filters events in the event bus using the AWS::Events::Rule resource:

  EventRule: 
    Type: AWS::Events::Rule
    Properties: 
      Description: "EventRule"
      EventPattern: 
        source: 
          - "demo.event"
        detail: 
          state: 
            - "new"
      State: "ENABLED"
      Targets: 
        - Arn: !GetAtt EventConsumerFunction.Arn
          Id: "ConsumerTarget"

The rule describes an event pattern specifying matching JSON attributes. Events that match this pattern are routed to the list of targets. You provide the EventBridge service with permission to invoke the Lambda functions in the target list:

  PermissionForEventsToInvokeLambda: 
    Type: AWS::Lambda::Permission
    Properties: 
      FunctionName: 
        Ref: "EventConsumerFunction"
      Action: "lambda:InvokeFunction"
      Principal: "events.amazonaws.com"
      SourceArn: !GetAtt EventRule.Arn

The AWS SAM template uses an IAM policy statement to grant permission to the publishing function to put events on the event bus:

  EventPublisherFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: code/
      Handler: publisher.handler
      Timeout: 3
      Runtime: nodejs12.x
      Policies:
        - Statement:
          - Effect: Allow
            Resource: '*'
            Action:
              - events:PutEvents      

The publishing function then uses the putEvents method of the AWS.EventBridge class, which returns after the events have been durably stored in EventBridge:

const AWS = require('aws-sdk')
AWS.config.update({region: 'us-east-1'})
const eventbridge = new AWS.EventBridge()

exports.handler = async (event) => {
  const params = {
    Entries: [ 
      {
        Detail: JSON.stringify({
          "message": "Hello from publisher",
          "state": "new"
        }),
        DetailType: 'Message',
        EventBusName: 'default',
        Source: 'demo.event',
        Time: new Date 
      }
    ]
  }
  const result = await eventbridge.putEvents(params).promise()
  console.log(result)
}

The payload for the consumer function is the event from EventBridge. This is a single event object, containing the source, detail-type, and detail attributes set by the publishing function. You can see this in the CloudWatch log for the function:

CloudWatch log result
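
For reference, the delivered event envelope looks roughly like the following. The id and time values are illustrative:

{
  "version": "0",
  "id": "12345678-90ab-cdef-1234-567890abcdef",
  "detail-type": "Message",
  "source": "demo.event",
  "account": "123456789012",
  "time": "2020-09-01T12:00:00Z",
  "region": "us-east-1",
  "resources": [],
  "detail": {
    "message": "Hello from publisher",
    "state": "new"
  }
}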

Comparing SNS with EventBridge

SNS and EventBridge have many similarities. Both can be used to decouple publishers and subscribers, filter messages or events, and provide fan-in or fan-out capabilities. However, there are differences in the list of targets and features for each service, and your choice of service depends on the needs of your use-case.

EventBridge offers two newer capabilities that are not available in SNS. The first is software as a service (SaaS) integration. This enables you to authorize supported SaaS providers to send events directly from their EventBridge event bus to partner event buses in your account. This replaces the need for polling or webhook configuration, and creates a highly scalable way to ingest SaaS events directly into your AWS account.

The second feature is the Schema Registry, which makes it easier to discover and manage OpenAPI schemas for events. EventBridge can infer schemas based on events routed through an event bus by using schema discovery. This can be used to generate code bindings directly to your IDE for type-safe languages like Python, Java, and TypeScript. This can help accelerate development by automating the generation of classes and code directly from events.

This table compares the major features of both services:

Feature | Amazon SNS | Amazon EventBridge
Number of targets | 10 million (soft) | 5 per rule
Availability SLA | 99.9% | 99.99%
Limits | 100,000 topics; 12,500,000 subscriptions per topic | 100 event buses; 300 rules per event bus
Publish throughput | Varies by Region; soft limits | Varies by Region; soft limits
Input transformation | No | Yes – see details
Message filtering | Yes – see details | Yes, including IP address matching – see details
Message size maximum | 256 KB | 256 KB
Billing | Per 64 KB | Per 64 KB
Format | Raw or JSON | JSON
Receive events from AWS CloudTrail | No | Yes
Targets | HTTP(S), SMS, SNS Mobile Push, Email/Email-JSON, SQS, Lambda functions | 15 targets including AWS Lambda, Amazon SQS, Amazon SNS, AWS Step Functions, Amazon Kinesis Data Streams, Amazon Kinesis Data Firehose
SaaS integration | No | Yes – see integrations
Schema Registry integration | No | Yes – see details
Dead-letter queues supported | Yes | No
FIFO ordering available | No | No
Public visibility | Can create public topics | Cannot create public buses
Pricing | $0.50/million requests + variable delivery cost + data transfer out cost; SMS varies | $1.00/million events; free for AWS events; no charge for delivery
Billable request size | 1 request = 64 KB | 1 event = 64 KB
AWS Free Tier eligible | Yes | No
Cross-Region | You can subscribe your AWS Lambda functions to an SNS topic in any Region | Targets must be in the same Region; you can publish across Regions to another event bus
Retry policy | For SQS/Lambda, exponential backoff over 23 days; for SMTP, SMS, and mobile push, exponential backoff over 6 hours | At-least-once event delivery to targets, including retry with exponential backoff for up to 24 hours

Conclusion

Messaging is an important part of serverless applications and AWS services provide queues, publish/subscribe, and event routing capabilities. This post reviews the main features of SNS, SQS, and EventBridge and how they provide different capabilities for your workloads.

I show three example applications that publish and consume events from the three services. I walk through AWS SAM syntax for deploying these resources in your applications. Finally, I compare differences between the services.

To learn more about building decoupled architectures, see this Learning Path series on EventBridge. For more serverless learning resources, visit https://serverlessland.com.

Automatically updating AWS WAF Rule in real time using Amazon EventBridge

Post Syndicated from Adam Cerini original https://aws.amazon.com/blogs/security/automatically-updating-aws-waf-rule-in-real-time-using-amazon-eventbridge/

In this post, I demonstrate a method for collecting and sharing threat intelligence between Amazon Web Services (AWS) accounts by using AWS WAF, Amazon Kinesis Data Analytics, and Amazon EventBridge. AWS WAF helps protect against common web exploits and gives you control over which traffic can reach your application.

Attempted exploitation blocked by AWS WAF provides a data source on potential attackers that can be shared proactively across AWS accounts. This solution can be an effective way to block traffic known to be malicious across accounts and public endpoints. AWS WAF managed rules provide an easy way to mitigate and record the details of common web exploit attempts. This solution will use the Admin protection managed rule for demonstration purposes.

In this post you will see references to the Sender account and the Receiver account. There is only one receiver in this example, but the receiving process can be duplicated multiple times across multiple accounts. This post walks through how to set up the solution. You’ll notice there is also an AWS CloudFormation template that makes it easy to test the solution in your own AWS account. The diagram in figure 1 illustrates how this architecture fits together at a high level.
 

Figure 1: Architecture diagram showing the activity flow of traffic blocked on the Sender AWS WAF

Prerequisites

You should know how to do the following tasks:

Extracting threat intelligence

AWS WAF logs via a Kinesis Data Firehose delivery stream. This allows you to not only log to a destination S3 bucket, but also act on the stream in real time using a Kinesis Data Analytics application. The following SQL code demonstrates how to extract any unique IP addresses that have been blocked by AWS WAF. While this example returns all blocked IPs, more complex SQL could be used for a more granular result. The full list of log fields is included in the documentation.


CREATE OR REPLACE STREAM "wafstream" (
  "clientIp" VARCHAR(16),
  "action" VARCHAR(8),
  "time_stamp" TIMESTAMP
);

CREATE OR REPLACE PUMP "WAFPUMP" AS
INSERT INTO "wafstream" (
  "clientIp",
  "action",
  "time_stamp"
)
SELECT STREAM DISTINCT "clientIp", "action", FLOOR("WAF_001".ROWTIME TO MINUTE) AS "time_stamp"
FROM "WAF_001"
WHERE "action" = 'BLOCK';

Proactively blocking unwanted traffic

After extracting the IP addresses involved in the abnormal traffic, you will want to proactively block those IPs on your other web-facing resources. You can accomplish this in a scalable way using Amazon EventBridge. After the Kinesis application extracts the IP address, it uses an AWS Lambda function to call PutEvents on an EventBridge event bus. An event bus rule then matches incoming events against its event pattern to determine when to trigger. This example uses a simple pattern, which acts on any event with a source of “custom.waflogs” as shown in Figure 2. A more complex pattern could be used for finer-grained control of when a rule triggers.
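
A sketch of that Lambda function is shown below. It follows the request/response contract for Kinesis Data Analytics Lambda output destinations; the detail type is an assumption, and batching above the 10-entry PutEvents limit is omitted for brevity:

const AWS = require('aws-sdk')
const eventbridge = new AWS.EventBridge()

// Receives output records from the Kinesis Data Analytics application,
// decodes each row, and publishes it as a custom event on the default bus
exports.handler = async (event) => {
  const entries = event.records.map((record) => ({
    Source: 'custom.waflogs',
    DetailType: 'WAF Block', // assumed detail type
    EventBusName: 'default',
    Detail: Buffer.from(record.data, 'base64').toString(), // JSON row from the SQL stream
    Time: new Date()
  }))

  await eventbridge.putEvents({ Entries: entries }).promise()

  // Acknowledge every record back to the Kinesis Data Analytics application
  return { records: event.records.map((r) => ({ recordId: r.recordId, result: 'Ok' })) }
}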
 

Figure 2: EventBridge Rule creation

Once the event reaches the event bus, the rule forwards the event to an event bus in the Receiver account, where a second rule triggers a Lambda function to update an AWS WAF IPSet. A web ACL rule blocks all traffic originating from any IP address contained in the IPSet.

Test the solution by using AWS CloudFormation

Now that you’ve walked through the design of this solution, you can follow these instructions to test it in your own AWS account by using CloudFormation stacks.

To deploy using CloudFormation

  1. Launch the stack to provision resources in the Receiver account.
  2. Provide the account ID of the Sender account. This will correctly configure the permissions for the EventBridge event bus.
  3. Wait for the stack to complete, and then capture the event bus ARN from the Outputs tab.

    This stack creates the following resources:

    • An AWS WAF v2 web ACL
    • An IPSet which will be used to contain the IP addresses to block
    • An AWS WAF rule that will block IP addresses contained in the IPSet
    • A Lambda function to update the IPSet
    • An IAM policy and execution role for the Lambda function
    • An event bus
    • An event bus rule that will trigger the Lambda function
  4. Switch to the Sender account. This should be the account you used in step 2 of this procedure.
  5. Provide the ARN of the event bus that was captured in step 3. This stack will provision the following resources in your account:
    • A virtual private cloud (VPC) with public and private subnets
    • Route tables for the VPC resources
    • An Application Load Balancer (ALB) with a fixed response rule
    • A security group that allows ingress traffic on port 80 to the ALB
    • A web ACL with the AWS Managed Rule for Admin Protection enabled
    • An S3 bucket for AWS WAF logs
    • A Kinesis Data Firehose delivery stream
    • A Kinesis Data Analytics application
    • An EventBridge event bus
    • An event bus rule
    • A Lambda function to send information to the Receiver account event bus
    • A custom CloudFormation resource which enables WAF logging and starts the Kinesis Application
    • An IAM policy and execution role that allows a Lambda function to put events into the event bus
    • An IAM policy and role to allow the custom CloudFormation resource to enable WAF logging and start the Kinesis Application
    • An IAM policy and role that allows the Kinesis Firehose to put logs into S3
    • An IAM policy that allows the WAF Web ACL to put records into the Firehose
    • An IAM policy and role that allows the Kinesis Application to invoke a Lambda function and log to CloudWatch
    • An IAM policy and role that allows the “Sender” account to put events in the “Receiver” event bus

After the CloudFormation stack completes, test the solution. Check the Outputs tab for the DNS name of the Application Load Balancer, and run the following command:

curl ALBDNSname/admin

You should be able to check the Receiver account’s AWS WAF IPSet named WAFBlockIPset and find your IP.

Conclusion

This example is intentionally simple to clearly demonstrate how each component works. You can take these principles and apply them to your own environment. Layering the Amazon managed rules with your own custom rules is the best way to get started with AWS WAF. This example shows how you can use automation to update your WAF rules without needing to rely on humans. A more complete solution would source log data from each web ACL and update an active IPSet in each account to protect all resources. As seen in Figure 3, a more complete implementation would send all logs in a Region to a centralized Kinesis Data Firehose for processing by the Kinesis Data Analytics application. EventBridge would then update a local IPSet, as well as forward the event to other accounts’ event buses for processing.
 

Figure 3: Updating across accounts

You can also add additional targets to the event bus to do things such as send to a Simple Notification Service topic for notifications, or run additional automation. To learn more about AWS WAF web ACLs, visit the AWS WAF Developer Guide. Using Amazon EventBridge opens up the possibility to send events to partner integrations. Customers or APN Partners like PagerDuty or Zendesk can enrich this solution by taking actions such as automatically opening a ticket or starting an incident. To learn more about the power of Amazon EventBridge, see the EventBridge User Guide.


Author

Adam Cerini

Adam is a Senior Solutions Architect with Amazon Web Services. He focuses on helping Public Sector customers architect scalable, secure, and cost-effective systems. Adam holds 5 AWS certifications, including AWS Certified Solutions Architect – Professional and AWS Certified Security – Specialty.

Building Salesforce integrations with Amazon EventBridge and Amazon AppFlow

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-salesforce-integrations-with-amazon-eventbridge/

This post is courtesy of Den Delimarsky, Senior Product Manager, and Vinay Kondapi, Senior Product Manager.

The integration between Amazon EventBridge and Amazon AppFlow enables customers to receive and react to events from Salesforce in their event-driven applications. In this blog post, I show you how to set up the integration, and route Salesforce events to an AWS Lambda function for processing.

Amazon AppFlow is a fully managed integration service that enables you to securely transfer data between software as a service (SaaS) applications like Salesforce, Marketo, Slack, and ServiceNow, and AWS services like Amazon S3 and Amazon Redshift.

EventBridge SaaS integrations make it easier for customers to receive events from over 30 different SaaS providers. Salesforce is a popular SaaS provider among AWS customers, so it has been one of the most anticipated event sources for EventBridge. Customers want to build rich applications that can react to events that track campaigns, contracts, opportunities, and order changes.

The ability to receive these events allows you to build workflows where you can start a variety of processes. For example, you could notify a broad range of subscribers about the changes, or enrich the data with information from another service. Or you could route the event to an order delivery system.

Previously, to connect Salesforce to your application, you had to write custom API polling code that routed events either directly to an application or to an event bus. With the Salesforce integration with EventBridge and Amazon AppFlow, the integration is built in minutes directly through the AWS Management Console, with no code required.

The solution outlined in this blog post is structured as follows:

Architecture overview

Setting up the event source

To set up the event source:

  1. Open the Amazon AppFlow console, and create a new flow. Choose Create flow on the service landing page. Give your flow a unique name, and choose Next.
  2. In the Source name list, select Salesforce, and then choose Connect. Select the Salesforce environment you are using, and provide a unique connection name.
  3. Choose Continue. When prompted, provide your Salesforce credentials. These are the credentials associated with the specific Salesforce environment selected in the previous step.
  4. Select Salesforce events from the list of available options for the flow, and choose the event that you want to route to EventBridge. This ensures that Amazon AppFlow can route specific events coming from Salesforce to an EventBridge event bus.
  5. With the source set up, you can now specify the destination. In the Destination name list, select EventBridge.

To send Salesforce events to EventBridge, Amazon AppFlow creates a new partner event source that is associated with a partner event bus.

To create a partner event source:

  1. Select an existing partner event source, or create a new one by choosing the list of partner event sources.
  2. When creating a new event source, you can optionally customize the name to make it easier to identify later.
  3. Choose an Amazon S3 bucket for large events. For events larger than 256 KB, Amazon AppFlow sends a URL for the S3 object to the event bus instead of the event payload.
  4. Define a flow trigger, which determines when the flow is started. Because we are tracking events, we want to react to them as they come in. The default Run flow on event setting enables this scenario as changes occur in Salesforce.

With Amazon AppFlow, you can also configure data field mapping, validation rules, and filters. These enable you to enrich and modify event data before it is sent to the event bus.

Once you create the flow, you must activate the event source that you created. To complete this step:

  1. Open the EventBridge console.
  2. Associate a partner event source with an event bus by following the link in the Amazon AppFlow integration dialog box, or by navigating directly to the partner event sources view. You can see a partner event source with a Pending state.
  3. Select the event source and choose Associate with event bus.
  4. Confirm the settings and choose Associate.
  5. Return to the Amazon AppFlow console, and open the flow you were creating. Choose Activate flow.

Your integration is now complete, and EventBridge can start receiving Salesforce events from the configured flow.

Routing Salesforce events to a Lambda function

The associated partner event bus receives all events of the configured type from the connected Salesforce accounts. Now your application can react to these events with the help of rules in EventBridge. Rules allow you to set conditions for event routing that determine what targets receive event payloads. You can learn more about this functionality in the EventBridge documentation.

To create a new rule:

  1. Go to the rules view in the EventBridge console, and choose Create rule.
  2. Provide a unique name and an optional description for your rule.
  3. Select the Event pattern option in the Define pattern section. With event pattern configuration, you can define parts of the event payload that EventBridge must look at to determine where to route the event.
    For this exercise, start by capturing every Salesforce event that goes through the partner event bus. The only events routed through this bus are from the partner event source. In this case, it is Amazon AppFlow connected to Salesforce.
  4. Set the event matching pattern to Pre-defined pattern by service, with the service provider being All Events. The default setting allows you to receive all events that are coming through the partner event bus. A sample pattern is shown after these steps.
  5. Select the event bus that the rule should be associated with. Choose Custom or partner event bus and select the event bus that you associated with the Amazon AppFlow event source. Every rule in EventBridge is associated with an event bus.
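
For reference, an event pattern that captures all events from the partner event source looks similar to the following sketch. The exact source value is generated for your account when the partner event source is created, so treat this prefix as an assumption to verify in the EventBridge console:

{
  "source": [{
    "prefix": "aws.partner/appflow"
  }]
}

Prefix matching like this uses EventBridge content-based filtering, which is useful when the generated source name contains account-specific identifiers.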

When rules are triggered, the event can be routed to other AWS services. Additionally, every rule can have up to five different AWS targets. You can read more about available targets in the EventBridge documentation. For this blog post, we use an AWS Lambda function as a target for Salesforce events received from Amazon AppFlow.

To configure targets for your rule:

  1. From the list of targets, select Lambda function, and select an existing function. If you do not yet have a function available, you can create one in the AWS Lambda console.
  2. Choose Create. You have now completed the rule setup.

Now, Salesforce events that match the configured type are routed directly to a Lambda function in your account.

Testing the integration

To test the integration:

  1. Open the Lambda view in the AWS Management Console.
  2. Choose the function that is handling the events from EventBridge.
  3. In the Function code section, update the code to:
    // A basic handler that logs the incoming EventBridge event
    exports.handler = async (event) => {
        // The Salesforce event delivered by EventBridge arrives as the event argument
        console.log(event);
        const response = {
            statusCode: 200,
            body: JSON.stringify('Hello from Lambda!'),
        };
        return response;
    };
    

  4. Choose Save.
  5. Open your Salesforce instance, and take an action that is associated with the event you configured earlier. For example, you could update a contract or create an order.
  6. Go back to your function in the AWS Management Console, and choose the Monitoring tab.
  7. Scroll to the CloudWatch Logs Insights section.
  8. Choose the latest log stream. Make sure that the timestamp approximately matches the time when you triggered the action in Salesforce.
  9. Observe log events that contain Salesforce event data.

You have completed your first Salesforce integration with EventBridge and Amazon AppFlow. You are now able to build decoupled and highly scalable applications that integrate with Salesforce.

Conclusion

Building decoupled and scalable cross-service applications is more relevant than ever with requirements for high availability, consistency, and reliability. This blog post demonstrates a solution that connects Salesforce to an event-driven application that uses EventBridge and Amazon AppFlow to route events. The application uses events from Salesforce as a starting point for a custom processing workflow in a Lambda function.

To learn more about EventBridge, visit the EventBridge documentation or EventBridge Learning Path.

To learn more about Amazon AppFlow, visit the Amazon AppFlow documentation.

Building storage-first serverless applications with HTTP APIs service integrations

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/building-storage-first-applications-with-http-apis-service-integrations/

Over the last year, I have been talking about “storage first” serverless patterns. With these patterns, data is stored persistently before any business logic is applied. The advantage of this pattern is increased application resiliency. By persisting the data before processing, the original data is still available, if or when errors occur.

Common pattern for serverless API backend

Using Amazon API Gateway as a proxy to an AWS Lambda function is a common pattern in serverless applications. The Lambda function handles the business logic and communicates with other AWS or third-party services to route, modify, or store the processed data. One option is to place the data in an Amazon Simple Queue Service (SQS) queue for processing downstream. In this pattern, the developer is responsible for handling errors and retry logic within the Lambda function code.

The storage first pattern flips this around. It uses native error handling with retry logic or dead-letter queues (DLQs) at the SQS layer before any code runs. By directly integrating API Gateway with SQS, developers can increase application reliability while reducing lines of code.

Storage first pattern for serverless API backend

Previously, direct integrations required REST APIs with transformation templates written in Velocity Template Language (VTL). However, developers tell us they would like to integrate directly with services in a simpler way, without using VTL. As a result, HTTP APIs now offer the ability to directly integrate with five AWS services without needing a transformation template or code layer.

The first five service integrations

This release of HTTP APIs direct integrations includes Amazon EventBridge, Amazon Kinesis Data Streams, Amazon Simple Queue Service (SQS), AWS Systems Manager's AppConfig, and AWS Step Functions. With these new integrations, customers can create APIs and webhooks for their business logic hosted in these AWS services. They can also take advantage of HTTP APIs features like authorizers, throttling, and enhanced observability for securing and monitoring these applications.

Amazon EventBridge

HTTP APIs service integration with Amazon EventBridge

The HTTP APIs direct integration for EventBridge uses the PutEvents API to enable client applications to place events on an EventBridge bus. Once the events are on the bus, EventBridge routes the event to specific targets based upon EventBridge filtering rules.

This integration is a storage first pattern because data is written to the bus before any routing or logic is applied. If the downstream target service has issues, then EventBridge implements a retry strategy with incremental back-off for up to 24 hours. Additionally, the integration helps developers reduce code by filtering events at the bus. It routes to downstream targets without the need for a Lambda function as a transport layer. A sample OpenAPI definition for this integration is shown after the following list.

Use this direct integration when:

  • Different tasks are required based upon incoming event details
  • Only data ingestion is required
  • Payload size is less than 256 KB
  • Expected requests per second are less than the Region quotas
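
As a reference, a minimal OpenAPI definition for this integration could look like the following sketch. It assumes an IAM role (MyHttpApiRole) and event bus (MyEventBus) defined elsewhere in the template, and the Source and DetailType values are illustrative; the full SQS example later in this post shows the same structure in context.

openapi: "3.0.1"
info:
  title: "my-eventbridge-api"
paths:
  /:
    post:
      responses:
        default:
          description: "Default response for POST /"
      x-amazon-apigateway-integration:
        integrationSubtype: "EventBridge-PutEvents"
        credentials:
          Fn::GetAtt: [MyHttpApiRole, Arn]
        requestParameters:
          Detail: "$request.body"
          DetailType: "OrderCreated"
          Source: "demo.httpapi"
          EventBusName:
            Ref: MyEventBus
        payloadFormatVersion: "1.0"
        type: "aws_proxy"
        connectionType: "INTERNET"
x-amazon-apigateway-importexport-version: "1.0"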

Amazon Kinesis Data Streams

HTTP APIs service integration with Amazon Kinesis Data Streams

The HTTP APIs direct integration for Kinesis Data Streams offers the PutRecord integration action, enabling client applications to place events on a Kinesis data stream. Kinesis Data Streams is designed to handle up to 1,000 writes per second per shard, with payloads up to 1 MB in size. Developers can increase throughput by increasing the number of shards in the data stream. You can route the incoming data to targets like an Amazon S3 bucket as part of a data lake or a Kinesis data analytics application for real-time analytics.

This integration is a storage first option because data is stored on the stream for up to seven days until it is processed and routed elsewhere. When processing stream events with a Lambda function, errors are handled at the Lambda layer through a configurable error handling strategy.

Use this direct integration when:

  • Ingesting large amounts of data
  • Ingesting large payload sizes
  • Order is important
  • Routing the same data to multiple targets

Amazon SQS

HTTP APIs service integration with Amazon SQS

The HTTP APIs direct integration for Amazon SQS offers the SendMessage, ReceiveMessage, DeleteMessage, and PurgeQueue integration actions. This integration differs from the EventBridge and Kinesis integrations in that data flows both ways. Events can be created, read, and deleted from the SQS queue via REST calls through the HTTP API endpoint. Additionally, a full purge of the queue can be managed using the PurgeQueue action.

This pattern is a storage first pattern because the data remains on the queue for four days by default (configurable up to 14 days), unless it is processed and removed. When the Lambda service polls the queue, the messages that are returned are hidden in the queue for a set amount of time, known as the visibility timeout. Once the calling service has processed these messages, it uses the DeleteMessage API to remove the messages permanently.

When triggering a Lambda function with an SQS queue, the Lambda service manages this process internally. However, HTTP APIs direct integration with SQS enables developers to move this process to client applications without the need for a Lambda function as a transport layer.

Use this direct integration when:

  • Data must be received as well as sent to the service
  • Downstream services need reduced concurrency
  • The queue requires custom management
  • Order is important (FIFO queues)

AWS AppConfig

HTTP APIs service integration with AWS Systems Manager AppConfig

The HTTP APIs direct integration for AWS AppConfig offers the GetConfiguration integration action and allows applications to check for application configuration updates. By exposing the configuration API through an HTTP APIs endpoint, developers can automate configuration changes for their applications. While this integration is not considered a storage first integration, it does enable direct communication from external services to AppConfig without the need for a Lambda function as a transport layer.

Use this direct integration when:

  • Access to AWS AppConfig is required.
  • Managing application configurations.

AWS Step Functions

HTTP APIs service integration with AWS Step Functions

The HTTP APIs direct integration for Step Functions offers the StartExecution and StopExecution integration actions. These actions allow for programmatic control of a Step Functions state machine via an API. When starting a Step Functions workflow, JSON data is passed in the request and mapped to the state machine. Error messages are also mapped to the state machine when stopping the execution. A sketch of the StartExecution integration is shown after the following list.

This pattern provides a storage first integration because Step Functions maintains a persistent state during the life of the orchestrated workflow. Step Functions also supports service integrations that allow the workflows to send and receive data without needing a Lambda function as a transport layer.

Use this direct integration when:

  • Orchestrating multiple actions.
  • Order of action is required.
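
As a sketch, the integration fragment for starting an execution follows the same OpenAPI structure as the other integrations. The role and state machine references here are placeholders for resources defined elsewhere in the template:

      x-amazon-apigateway-integration:
        integrationSubtype: "StepFunctions-StartExecution"
        credentials:
          Fn::GetAtt: [MyHttpApiRole, Arn]
        requestParameters:
          StateMachineArn:
            Ref: MyStateMachine
          Input: "$request.body"
        payloadFormatVersion: "1.0"
        type: "aws_proxy"
        connectionType: "INTERNET"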

Building HTTP APIs direct integrations

HTTP APIs service integrations can be built using the AWS CLI, AWS SAM, or through the API Gateway console. The console walks through contextual choices to help you understand what is required for each integration. Each of the integrations also includes an Advanced section to provide additional information for the integration.

Creating an HTTP APIs service integration

Once you build an integration, you can export it as an OpenAPI template that can be used with infrastructure as code (IaC) tools like AWS SAM. The exported template can also include the API Gateway extensions that define the specific integration information.

Exporting the HTTP APIs configuration to OpenAPI

OpenAPI template

An example of a direct integration from HTTP APIs to SQS is located in the Sessions With SAM repository. This example includes the following architecture:

AWS SAM template resource architecture

The AWS SAM template creates the HTTP API, SQS queue, Lambda function, and the AWS Identity and Access Management (IAM) roles required. This is all generated in 58 lines of code and looks like this:

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: HTTP API direct integrations

Resources:
  MyQueue:
    Type: AWS::SQS::Queue
    
  MyHttpApi:
    Type: AWS::Serverless::HttpApi
    Properties:
      DefinitionBody:
        'Fn::Transform':
          Name: 'AWS::Include'
          Parameters:
            Location: './api.yaml'
          
  MyHttpApiRole:
    Type: "AWS::IAM::Role"
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: "Allow"
            Principal:
              Service: "apigateway.amazonaws.com"
            Action: 
              - "sts:AssumeRole"
      Policies:
        - PolicyName: ApiDirectWriteToSQS
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              Action:
              - sqs:SendMessage
              Effect: Allow
              Resource:
                - !GetAtt MyQueue.Arn
                
  MyTriggeredLambda:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/
      Handler: app.lambdaHandler
      Runtime: nodejs12.x
      Policies:
        - SQSPollerPolicy:
            QueueName: !GetAtt MyQueue.QueueName
      Events:
        SQSTrigger:
          Type: SQS
          Properties:
            Queue: !GetAtt MyQueue.Arn

Outputs:
  ApiEndpoint:
    Description: "HTTP API endpoint URL"
    Value: !Sub "https://${MyHttpApi}.execute-api.${AWS::Region}.amazonaws.com"

The OpenAPI template handles the route definitions for the HTTP API configuration and configures the service integration. The template looks like this:

openapi: "3.0.1"
info:
  title: "my-sqs-api"
paths:
  /:
    post:
      responses:
        default:
          description: "Default response for POST /"
      x-amazon-apigateway-integration:
        integrationSubtype: "SQS-SendMessage"
        credentials:
          Fn::GetAtt: [MyHttpApiRole, Arn]
        requestParameters:
          MessageBody: "$request.body.MessageBody"
          QueueUrl:
            Ref: MyQueue
        payloadFormatVersion: "1.0"
        type: "aws_proxy”
        connectionType: "INTERNET"
x-amazon-apigateway-importexport-version: "1.0"

Because the OpenAPI template is included in the AWS SAM template via a transform, the API Gateway integration can reference the roles and services created within the AWS SAM template.

Conclusion

This post covers the concept of storage first integration patterns and how the new HTTP APIs direct integrations can help. I cover the five current integrations and possible use cases for each. Additionally, I demonstrate how to use AWS SAM to build and manage the integrated applications using infrastructure as code.

Using the storage first pattern with direct integrations can help developers build serverless applications that are more durable with fewer lines of code. A Lambda function is no longer required to transport data from the API endpoint to the desired service. Instead, use Lambda function invocations for differentiating business logic.

To learn more, join us for the HTTP API service integrations session of Sessions With SAM!

#ServerlessForEveryone

Using serverless backends to iterate quickly on web apps – part 2

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/using-serverless-backends-to-iterate-quickly-on-web-apps-part-2/

This series is about building flexible solutions that can adapt as user requirements change. One of the challenges of building modern web applications is that requirements can change quickly. This is especially true for new applications that are finding their product-market fit. Many development teams start building a product with one set of requirements, and quickly find they must build a product with different features.

For both start-ups and enterprises, it’s often important to find a development methodology and architecture that allows flexibility. This is the surest way to keep up with feature requests in evolving products and innovate to delight your end-users. In this post, I show how to build sophisticated workflows using minimal custom code.

Part 1 introduces the Happy Path application that allows park visitors to share maps and photos with other users. In that post, I explain the functionality, how to deploy the application, and walk through the backend architecture.

The Happy Path application accepts photo uploads from users’ smartphones. The application architecture must support 100,000 monthly active users. These binary uploads are typically 3–9 MB in size and must be resized and optimized for efficient distribution.

Using a serverless approach, you can develop a robust low-code solution that can scale to handle millions of images. Additionally, the solution shown here is designed to handle complex changes that are introduced in subsequent versions of the software. The code and instructions for this application are available in the GitHub repo.

Architecture overview

After installing the backend in the previous post, the architecture looks like this:

In this design, the API, storage, and notification layers exist as one application, and the business logic layer is a separate application. These two applications are deployed using AWS Serverless Application Model (AWS SAM) templates. This architecture uses Amazon EventBridge to pass events between the two applications.

In the business logic layer:

  1. The workflow starts when events are received from EventBridge. Each time a new object is uploaded by an end-user, the PUT event in the Amazon S3 Upload bucket triggers this process.
  2. After the workflow is completed successfully, processed images are stored in the Distribution bucket. Related metadata for the object is also stored in the application’s Amazon DynamoDB table.

By separating the architecture into two independent applications, you can replace the business logic layer as needed. Provided that the workflow accepts incoming events and then stores processed images in the S3 bucket and DynamoDB table, the workflow logic is interchangeable. Using this pattern, the workflow can be upgraded to handle new functionality.

Introducing AWS Step Functions for workflow management

One of the challenges in building distributed applications is coordinating components. These systems are composed of separate services, which makes orchestrating workflows more difficult than working with a single monolithic application. As business logic grows more complex, if you attempt to manage this in custom code, it can quickly become convoluted. This is especially true if it handles retries and error handling logic, and it can be hard to test and maintain.

AWS Step Functions is designed to coordinate and manage these workflows in distributed serverless applications. To do this, you create state machine diagrams using Amazon States Language (ASL). Step Functions renders a visualization of your state machine, which makes it simpler to see the flow of data from one service to another.

Each state machine consists of a series of steps. Each step takes an input and produces an output. Using ASL, you define how this data progresses through the state machine. The flow from step to step is called a transition. All state machines transition from a Start state towards an End state.

The Step Functions service manages the state of individual executions. The service also supports versioning, which makes it easier to modify state machines in production systems. Executions continue to use the version of a state machine that was in use when they were started, so it's possible to have active executions on multiple versions.

For developers using VS Code, the AWS Toolkit extension provides support for writing state machines using ASL. It also renders visualizations of those workflows. Combined with AWS Serverless Application Model (AWS SAM) templates, this provides a powerful way to deploy and maintain applications based on Step Functions. I refer to this IDE and AWS SAM in this walkthrough.

Version 1: Image resizing

The Happy Path application uses Step Functions to manage the image-processing part of the backend. The first version of this workflow resizes the uploaded image.

To see this workflow:

  1. In VS Code, open the workflows/statemachines folder in the Explorer panel.
  2. Choose the v1.asl.json file.
  3. Choose the Render graph option in the CodeLens. This opens the workflow visualization.

In this basic workflow, the state machine starts at the Resizer step, then progresses to the Publish step before ending:

  • In the top-level attributes in the definition, StartAt sets the Resizer step as the first action.
  • The Resizer step is defined as a task with an ARN of a Lambda function. The Next attribute determines that the Publish step is next.
  • In the Publish step, this task defines a Lambda function using an ARN reference. It sets the input payload as the entire JSON payload. This step is set as the End of the workflow, as sketched below.
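
Putting these attributes together, a minimal sketch of the v1 definition looks like the following. The Lambda ARNs are shown as the substitution tokens used by the AWS SAM template, described later in this post; the Publish token name is illustrative:

{
  "StartAt": "Resizer",
  "States": {
    "Resizer": {
      "Type": "Task",
      "Resource": "${ResizerFunctionArn}",
      "Next": "Publish"
    },
    "Publish": {
      "Type": "Task",
      "Resource": "${PublishFunctionArn}",
      "End": true
    }
  }
}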

Deploying the Step Functions workflow

To deploy the state machine:

  1. In the terminal window, change directory to the workflows/templates/v1 folder in the repo.
  2. Execute these commands to build and deploy the AWS SAM template:
    sam build
    sam deploy --guided
  3. The deploy process prompts you for several parameters. Enter happy-path-workflow-v1 as the Stack Name. The other values are the outputs from the backend deployment process, detailed in the repo’s README. Enter these to complete the deployment.

Testing and inspecting the deployed workflow

Now that the workflow is deployed, you can perform an integration test directly from the frontend application.

To test the deployed v1 workflow:

  1. Open the frontend application at https://localhost:8080 in your browser.
  2. Select a park location, choose Show Details, and then choose Upload images.
  3. Select an image from the sample photo dataset.
  4. After a few seconds, you see a pop-up message confirming that the image has been added.
  5. Select the same park location again, and the information window now shows the uploaded image.

To see how the workflow processed this image:

  1. Navigate to the Step Functions console.
  2. Here you see the v1StateMachine with one execution in the Succeeded column.
  3. Choose the state machine to display more information about the start and end time.
  4. Select the execution ID in the Executions panel to open details of this single instance of the workflow.

This view shows important information that’s useful for understanding and debugging an execution. Under Input, you see the event passed into Step Functions by EventBridge:

Event detail from EventBridge

This contains detail about the S3 object event, such as the bucket name and key, together with the placeId, which identifies the location on the map. Under Output, you see the final result from the state machine, which shows a successful StatusCode (200) and other metadata:

Event output from the state machine

Using AWS SAM to define and deploy Step Functions state machines

The AWS SAM template defines the state machine, the trigger for executions, and the permissions needed for Step Functions to execute. The AWS SAM resource for a Step Functions definition is AWS::Serverless::StateMachine; a sketch follows the list below.

Definition permissions in state machines

In this example:

  • DefinitionUri refers to an external ASL definition, instead of embedding the JSON in the AWS SAM template directly.
  • DefinitionSubstitutions allow you to use tokens in the ASL definition that refer to resources created in the AWS SAM template. For example, the token ${ResizerFunctionArn} refers to the ARN of the resizer Lambda function.
  • Events define how the state machine is invoked. Here it defines an EventBridge rule. If an event matches this source and detail-type, it triggers an execution.
  • Policies: the Step Functions service must have permission to invoke the services that perform tasks in the state machine. AWS SAM policy templates provide a convenient shorthand for common execution policies, such as invoking a Lambda function.
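
As a sketch, a resource combining these elements might look like the following. The resource, token, and event names here are illustrative rather than the exact ones used in the repo:

  WorkflowStateMachine:
    Type: AWS::Serverless::StateMachine
    Properties:
      DefinitionUri: statemachines/v1.asl.json
      DefinitionSubstitutions:
        ResizerFunctionArn: !GetAtt ResizerFunction.Arn
        PublishFunctionArn: !GetAtt PublishFunction.Arn
      Policies:
        - LambdaInvokePolicy:
            FunctionName: !Ref ResizerFunction
        - LambdaInvokePolicy:
            FunctionName: !Ref PublishFunction
      Events:
        UploadEvent:
          Type: EventBridgeRule
          Properties:
            Pattern:
              source:
                - custom.happyPath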

This workflow application is separate from the main backend template. As more functionality is added to the workflow, you deploy the subsequent AWS SAM templates in the same way.

Conclusion

Using AWS SAM, you can specify serverless resources, configure permissions, and define substitutions for the ASL template. You can deploy a standalone Step Functions-based application using the AWS SAM CLI, separately from other parts of your application. This makes it easier to decouple and maintain larger applications. You can visualize these workflows directly in the VS Code IDE in addition to the AWS Management Console.

In part 3, I show how to build progressively more complex workflows and how to deploy these in-place without affecting the other parts of the application.

To learn more about building serverless web applications, see the Ask Around Me series.

ICYMI: Season one of Sessions with SAM

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/icymi-season-one-of-sessions-with-sam/

Developers tell us they want to know how to easily build and manage their serverless applications. In 2017, AWS announced the AWS Serverless Application Model (SAM) to help with just that. To help developers learn more about SAM, I created a weekly Twitch series called Sessions with SAM. Each session focuses on a specific serverless task or service. It demonstrates deploying and managing that task using infrastructure as code (IaC) with SAM templates. This post recaps each session of the first season to prepare you for Sessions with SAM season two, starting August 13.

Sessions with SAM

What is SAM

AWS SAM is an open source framework designed for building serverless applications. The framework provides shorthand syntax to quickly declare AWS Lambda functions, Amazon DynamoDB tables, and more. Additionally, SAM is not limited to serverless resources and can also declare any standard AWS CloudFormation resource. With around 20 lines of code, a developer can create an application with an API, logic, and database layer with the proper permissions in place, as sketched below.

Example of using SAM templates to generate infrastructure
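
As an illustration (not one of the session templates), a sketch along these lines declares an HTTP API route, a Lambda function, and a DynamoDB table with scoped permissions in roughly 20 lines:

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Resources:
  ItemsTable:
    Type: AWS::Serverless::SimpleTable

  ItemsFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/
      Handler: app.handler
      Runtime: nodejs12.x
      Environment:
        Variables:
          TABLE_NAME: !Ref ItemsTable
      Policies:
        - DynamoDBCrudPolicy:
            TableName: !Ref ItemsTable
      Events:
        PostItem:
          Type: HttpApi
          Properties:
            Path: /items
            Method: POST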

By using infrastructure as code to manage and deploy serverless applications, developers gain several advantages. You can version the templates and roll back when necessary. Templates can be parameterized for flexibility across multiple environments, and they can be shared with development teams for consistency across developer environments.

Sessions

The code and linked videos are listed with the session. See the YouTube playlist and GitHub repository for the entire season.

Session one: JWT authorizers on Amazon API Gateway

In this session, I cover building an application backend using JWT authorizers with the new Amazon API Gateway HTTP API. We also discuss building an application with multiple routes and the ability to change the authorization requirements per route.

Code: https://github.com/aws-samples/sessions-with-aws-sam/tree/master/http-api

Video: https://youtu.be/klOScYEojzY

Session two: Amazon Cognito authentication

In this session, I cover building an Amazon Cognito template for authentication. This includes the user management component with user pools and user groups in addition to a hosted authentication workflow with an app client.

Building an Amazon Cognito authentication provider

We also discussed using custom pre-token Lambda functions to modify the JWT token issued by Amazon Cognito. This custom token allows you to insert custom scopes based on the Amazon Cognito user groups. These custom scopes are then used to customize the authorization requirements for the individual routes.

Code: https://github.com/aws-samples/sessions-with-aws-sam/tree/master/cognito

Video: https://youtu.be/nBtWCjKd72M

Session three: Building a translation app with Amazon EventBridge

I covered using AWS SAM to build a basic translation and sentiment app centered around Amazon EventBridge. The SAM template created three Lambda functions, a custom EventBridge bus, and an HTTP API endpoint.

Architecture for serverless translation application

Requests from the HTTP API endpoint are put onto the custom EventBridge bus via the endpoint Lambda function. Based on the type of request, either the translate function or the sentiment function is invoked. The AWS SAM template manages all the infrastructure, in addition to the permissions to invoke the Lambda functions and access Amazon Translate and Amazon Comprehend.
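
For example, routing to the translate function can be expressed with a rule pattern that matches on the event's detail-type, with a second rule doing the same for sentiment events. The field values in this sketch are illustrative, not the exact ones used in the session code:

{
  "source": ["custom.translationApp"],
  "detail-type": ["translation-requested"]
}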

Code: https://github.com/aws-samples/sessions-with-aws-sam/tree/master/eventbridge

Video: https://youtu.be/73R02KufLac

Session four: Building an Amazon Kinesis Data Firehose for ingesting website access logs

In this session, I covered building an Amazon Kinesis Data Firehose for ingesting large amounts of data. This particular application is designed for access logs generated from API Gateway. The logs are first stored in an Amazon DynamoDB table for immediate processing. Next, the logs are sent through a Kinesis Data Firehose and stored in an Amazon S3 bucket for later processing.

Code: https://github.com/aws-samples/sessions-with-aws-sam/tree/master/kinesis-firehose

Video: https://youtu.be/jdTBtaxs0hA

Session five: Analyzing API Gateway logs with Amazon Kinesis Data Analytics

Continuing from session four, I discussed configuring API Gateway access logs to use the Kinesis Data Firehose built in the previous session. I also demonstrated an Amazon Kinesis Data Analytics application for near-real-time analytics of your access logs.

Example of Kinesis Data Analytics in SAM

Code: https://github.com/aws-samples/sessions-with-aws-sam/tree/master/kinesis-firehose

Video: https://youtu.be/ce0v-q9EVTQ

Session six: Managing Amazon SQS with AWS SAM templates

I demonstrated configuring an Amazon Simple Queue Service (SQS) queue and the queue policy that controls access to the queue. We also discussed allowing cross-account and external resources to access the queue. I showed how to identify the proper principal resources when building AWS IAM policy templates.

Code: https://github.com/aws-samples/sessions-with-aws-sam/tree/master/SQS

Video: https://youtu.be/q2rbHMyJBDY

Session seven: Creating canary deploys for Lambda functions

In this session, I cover canary and linear deployments for serverless applications. We discuss how canary releases compare to linear releases and how they can be customized. We also spend time discussing pre-traffic and post-traffic tests and how rollbacks are handled when one of these tests fails.

Code: https://github.com/aws-samples/sessions-with-aws-sam/tree/master/safe-deploy

Video: https://youtu.be/RE4r_6edaXc

Session eight: Configuring custom domains for Amazon API Gateway endpoints

In session eight, I configured custom domains for API Gateway REST and HTTP APIs. The demonstration included the option to pass in an Amazon Route 53 zone ID or AWS Certificate Manager (ACM) certificate ARN. If either of these is missing, the template builds a zone or SSL certificate, respectively.

Working with Amazon Route 53 zones

We discussed how to use declarative and imperative methods in our templates. We also discussed how to use a single domain across multiple APIs, regardless of whether they are REST or HTTP APIs.

Code: https://github.com/aws-samples/sessions-with-aws-sam/tree/master/custom-domains

Video: https://youtu.be/4uXEGNKU5NI

Session nine: Managing AWS Step Functions with AWS SAM

In this session I was joined by fellow Senior Developer Advocate, Rob Sutter. Rob and I demonstrated managing and deploying AWS Step Functions using the new Step Functions support built into SAM. We discussed how SAM offers definition substitutions to pass data from the template into the state machine configuration.

Code: https://github.com/aws-samples/sessions-with-aws-sam/tree/master/step-functions

Video: https://youtu.be/BguUgdZwymQ

Session ten: Using Amazon EFS with Lambda functions in SAM

Joined by Senior Developer Advocate, James Beswick, we covered configuring Amazon Elastic File System (EFS) as a storage option for Lambda functions using AWS SAM. We discussed the Amazon VPC requirements in configuring for EFS. James also walked through using the AWS Command Line Interface (CLI) to aid in configuration of the VPC.

Code: https://github.com/aws-samples/aws-lambda-efs-samples

Video: https://youtu.be/up1op216trk

Session eleven: Ask the experts

This session introduced you to some of our SAM experts. Jeff Griffiths, Senior Product Manager, and Alex Woods, Software Development Engineer, joined me in answering live audience questions. We discussed best practices for local development and debugging, Docker networking, CORS configurations, roadmap features, and more.

SAM experts panel

Video: https://youtu.be/2JRa8MugPCY

Session twelve: Managing .NET Lambda functions with AWS SAM and Stackery

In this final session of the season, I was joined by Stackery CTO and serverless hero, Chase Douglas. Chase demonstrated using Stackery and AWS SAM to build and deploy .NET Core Lambda functions. We discussed how Stackery's editor allows developers to visually design a serverless application and how it uses SAM templates under the hood.

Stackery visual editor

Code only examples

In addition to the code examples that accompany each video session, the repo includes developer-requested code examples. These demonstrate how to build an access log pipeline for HTTP API and how to use the SAM build command to compile Swift for Lambda functions.

Conclusion

Sessions with SAM helps developers bootstrap their serverless applications with instructional video and ready-made IaC templates. From JWT authorizers to EFS storage solutions, over 15 AWS services are represented in SAM templates. The first season of live videos supplements these templates with best practices explained and real developer questions answered.

Season two of Sessions with SAM starts August 13. The series will continue the pattern of explaining best practices, providing usable starter templates, and having some fun along the way.

#ServerlessForEveryone


BBVA: Architecture for Large-Scale Macie Implementation

Post Syndicated from Neel Sendas original https://aws.amazon.com/blogs/architecture/bbva-architecture-for-large-scale-macie-implementation/

This post was co-written by Andrew Alaniz, Technical Information Security Officer, and Brady Pratt, Cloud Security Engineer, both at BBVA USA.

Introduction

Data Loss Prevention (DLP) is a common topic among companies that work with any type of sensitive data. One of the challenges is that many people either don't fully understand what DLP is, or rather, have their own definition of what it is. Regardless of one's interpretation of DLP, one thing is certain: before you can control data loss, you need to locate the data sources.

If an organization can’t identify its data, it can’t protect it. BBVA USA, a bank holding company, turned to AWS for advice, and decided to use Amazon Macie to accomplish this in Amazon Simple Storage Service (Amazon S3). Macie is a fully managed data security and data privacy service that uses machine learning and pattern matching to discover and protect your sensitive data in AWS. This blog post will share some of the design and architecture we used to deploy Macie using services, such as AWS Lambda and Amazon CloudWatch.

Data challenges in Amazon S3

Although all S3 buckets are private by default, everyone is aware of the challenges of unsecured S3 buckets exposing data publicly. Amazon has provided a way to prevent that by removing the ability to make buckets public. As with other data storage mechanisms, this doesn't stop anyone from storing sensitive data within AWS and exposing it in another way. With Macie, you can classify the data stored in S3 centrally and across accounts through AWS Organizations.

Recommended architecture

We can break the Macie architecture into two main parts: S3 discovery and evaluation, and S3 sensitive data discovery:

Macie architecture

The setup of discovery and evaluation is simple and straightforward, and should be enabled through AWS Organizations and across all accounts. The cost of this piece is minimal, and it provides valuable insights into the compliance state of S3 buckets.

Once setup of discovery and evaluation is completed, we are ready to move to the next step and configure discovery jobs for our S3 buckets. The architecture includes the use of S3, Amazon CloudWatch Events, Amazon EventBridge, and Lambda. All of the execution should happen in a centralized account, but the event triggers should come from each individual account.

Architectural considerations

When determining the architectural design of the solution, consider a few main components:

Centralization

Utilize AWS Organizations: Macie integrates natively with AWS Organizations, which is a significant advantage. Within AWS Organizations, you can also delegate the Macie master account to a subordinate account. This enables centralized management while compartmentalizing roles.

Ease of management

One of the most challenging things to manage is non-conforming configurations. It's much easier to standardize how jobs are created, named, and configured. Once the classification jobs were ready to be created, we took the following into consideration when deploying Macie for our use case:

  1. Macie classifies content in a single job across one account.
  2. If you submit multiple jobs that contain the same bucket, Macie will scan the objects multiple times.
  3. Macie jobs are immutable.

Due to these considerations, we decided to create one job per S3 bucket. This allows administrators to search more easily for jobs related to findings.

Cost considerations

Macie plays an essential role, not only in identifying data and improving data collection, but also in compliance. We needed to make a decision about how to determine if an S3 bucket would be included in a classification job. Initially, we considered including all buckets no matter what. The logic here was that even if we make an assumption that a bucket would never have sensitive data in it, an entity with the right role could always add something at a later date.

Finally, we implemented a solution to tag specific buckets that were known to have immutable properties and which would never allow sensitive data to be added. We could do this because we knew exactly what data was in the bucket, who or what created the bucket, and exactly who or what had access to the bucket.

An example of this type of bucket is the S3 bucket used to store VPC Flow Logs. We know that this bucket is only created by provisioning scripts and is only going to store VPC flow logs that contain no sensitive data based on data classification standards. Also, only VPC services and specific security services can access this bucket for anything other than READ. This is controlled organizationally and can be tagged with a simple ignore key/value pair upon creation.

Deploying Macie at BBVA USA

BBVA USA developed an approach to working within AWS that allows guardrails to be applied as accounts are created. One of those guardrails identifies if developers have stored sensitive data in an account. BBVA needed to be able to do this, and do it at scale. If there is a roadblock or a challenge with AWS services, the first place BBVA looks is to support, but the second place is the Technical Account Manager.

After initiating conversations with its account team, BBVA determined that AWS Macie was the tool to help them with this challenge.

With the help of its technical account manager (TAM), BBVA was able to meet with the Macie Product team and discuss the best options for deploying at scale. Through these conversations, they were even able to influence the Macie product roadmap.

Getting Macie ready to deploy at scale was actually quite simple once the architectural pattern was designed.

Initial job creation

In order to set up jobs for each existing bucket in the organization, it’s a matter of scripting the job creation and adding each bucket from each account into its own job, which is pretty straightforward.

Job creation for new buckets

The recommended architecture and implementation for new buckets (a sample event pattern is shown after this list):

  1. Whenever a new S3 bucket is added to Organization accounts, trigger a CloudWatch Event in the target account.
  2. Set up a cross-account EventBridge rule to consume the event. Using EventBridge allows for a simpler configuration and centralized management of both events and Lambda.
  3. Trigger a Lambda function in a delegated Macie admin account, which creates classification jobs to apply Macie to all the newly created S3 buckets.
  4. Repeat the same process when a bucket is deleted by triggering a cancel job.
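
A sketch of the event pattern for step 1, matching bucket-creation calls recorded by AWS CloudTrail (this assumes CloudTrail is capturing S3 management events):

{
  "source": ["aws.s3"],
  "detail-type": ["AWS API Call via CloudTrail"],
  "detail": {
    "eventSource": ["s3.amazonaws.com"],
    "eventName": ["CreateBucket"]
  }
}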

Evaluate the state of S3 buckets

To evaluate the S3 buckets across accounts, turn on Macie in the organization master account and delegate administration to a subordinate account used for Macie. This enables consolidation of security features into a centralized security account and further restricts who needs access to the master billing account. Finally, enable Macie by default on all organization accounts.


Conclusion

BBVA USA worked directly with the Macie product team by leveraging its relationship with the AWS account team and Enterprise Support. This allowed the company to deploy Macie quickly and at scale. Through Macie, the company can track any configuration changes that allow a bucket to be public, shared, or replicated with external accounts, and detect when encryption policies are disabled. Using Macie, BBVA was able to identify buckets that contained sensitive information and put another control in place to bolster its AWS governance profile.

Integrating Amazon EventBridge and Amazon ECS

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/integrating-amazon-eventbridge-and-amazon-ecs/

This post is courtesy of Jakub Narloch, Senior Software Development Engineer.

Today, AWS announced support for Amazon API Gateway as an event target in Amazon EventBridge. This feature enables new integration scenarios for web applications and services. It allows customers to seamlessly connect their infrastructure, SaaS services, and APIs hosted in AWS.

Using API Gateway as a target creates new integration capabilities for new or existing web applications. This post explains how developers can now deliver events directly to their applications hosted on Amazon ECS, Amazon Elastic Kubernetes Service (EKS), or Amazon EC2 using EventBridge and API Gateway. I show how to build an event-driven application running on ECS Fargate that processes events from EventBridge using the API Gateway integration.

EventBridge is a serverless event bus that makes it easy to connect applications together. It uses data from your own applications, integrated software as a service (SaaS) applications, and AWS services. This simplifies the process of building event-driven architectures by decoupling event producers from event consumers. This allows producers and consumers to be scaled, updated, and deployed independently. Loose coupling improves developer agility in addition to application resiliency.

API Gateway helps developers to create, publish, and maintain secure APIs at any scale. When used with EventBridge, API Gateway authenticates and authorizes API calls. It also acts as an HTTP proxy layer for integrating other AWS services or third-party web applications.

Previously, EventBridge customers could consume events from EventBridge in ECS via Amazon SNS or Amazon SQS, or by triggering an ECS task directly. API Gateway as a target replaces this approach and brings additional API Gateway features like authentication and rate limiting. This can help you build more resilient and feature-rich integrations. API Gateway throttling limits the maximum number of events delivered at the same time, while EventBridge retries event delivery for up to 24 hours.

This blog post uses an ecommerce application as an example of a custom integration. The application is responsible for processing customer orders. The following diagram illustrates the interaction of the components of the system. The application itself is hosted as an ECS service on top of AWS Fargate.

Architecture diagram

To achieve high availability, the application cluster is distributed across subnets in different Availability Zones. The Application Load Balancer ensures that the incoming traffic is distributed across the nodes in the cluster. API Gateway is responsible for authenticating requests and routing to the backend. The application logic is responsible for receiving the event and persisting it in Amazon DocumentDB.

The order event is modeled as follows:

{
  "version": "0",
  "region": "us-east-1",
  "account": "123456789012",
  "id": "4236e18e-f858-4e2b-a8e8-9b772525e0b2",
  "source": "ecommerce",
  "detail-type": "CreateOrder",
  "resources": [],
  "detail": {
    "order_id": "ce4fe8b7-9911-4377-a795-29ecca9d7a3d",
    "create_date": "2020-06-02T13:01:00Z",
    "items": [
      {
        "product_id": "b8575571-5e91-4521-8a29-4af4a8aaa6f6",
        "quantity": 1,
        "price": "9.99",
        "currency": "CAD"
      }
    ],
    "customer": {
      "customer_id": "5d22899e-3ff5-4ce0-a2a3-480cfce39a56"
    },
    "payment": {
      "payment_id": "fb563473-bef4-4965-ad78-e37c6c9c0c2a",
    },
    "delivery_address": {
      "street": "510 W Georgia St",
      "city": "Vancouver",
      "state": "BC",
      "country": "Canada",
      "zip_code": "V6B0M7"
    }
  }
}
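
An EventBridge rule routing these orders to the API Gateway target can match on the source and detail-type fields of this event. A minimal sketch of that pattern (the actual rule is created by the CDK stack deployed later in this post):

{
  "source": ["ecommerce"],
  "detail-type": ["CreateOrder"]
}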

Application layer

The application that processes the orders is implemented using a reactive stack through Spring Boot. A reactive application design can help build a scalable application capable of handling thousands of transactions per second from a single instance. This is important for applications with high throughput and can help in achieving economies of scale.

The resource handler

The application defines an EventResource, which acts as the entry handler for receiving the events from EventBridge and processing them. The handler logic is responsible for unmarshalling the event and retrieving the order details out of the event detail. The order is then persisted in DocumentDB using a dedicated repository instance.

@Slf4j
@RequestMapping("/orders")
@RestController
public class EventResource {
 
    private final OrderRepository orderRepository;
 
    public EventResource(OrderRepository orderRepository) {
        this.orderRepository = Objects.requireNonNull(orderRepository);
    }
 
    @RequestMapping(method = RequestMethod.PUT)
    public Mono<ResponseEntity<Object>> onEvent(@Valid @RequestBody Event<Order> event) {
 
        log.info("Processing order {}", event.getDetail().getOrderId());
 
        return orderRepository.save(event.getDetail())
                .map(order -> ResponseEntity.created(UriComponentsBuilder.fromPath("/events/{id}")
                        .build(order.getOrderId())).build())
                .onErrorResume(error -> {
                    log.error("Unhandled error occurred", error);
                    return Mono.just(ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR).build());
                });
    }
}

The handler is mapped to process requests at the '/orders' path. The implementation unmarshals an event payload and stores it in DocumentDB. Upon successful execution, the service responds with a 201 Created HTTP status code.

You can store EventBridge events in a document database like Amazon DocumentDB. This is a non-relational database that allows you to store JSON content directly. This example uses DocumentDB for storing the documents, making it easy to write the event payload directly. It also supports general querying of event content.

Prerequisites

To build and deploy the application, you must have the AWS CDK and JDK 11 installed. Start by cloning the GitHub repository. The repository contains the example code and supporting infrastructure for deploying to AWS.

Step 1: Create an Amazon ECR repository

Start by creating a dedicated Amazon ECR repository, where Docker images are uploaded. There is an AWS CDK template in the application code repo for this purpose.

First, install Node.js dependencies needed to execute the CDK command:

cd ../eventbridge-integration-solution-aws-api-cdk
npm install

Next, compile the CDK TypeScript template.

npm run build

Then, synthesize the CloudFormation stack.

cdk synth "*"

Next, bootstrap the CloudFormation resources needed to deploy the remaining templates.

cdk bootstrap

Finally, deploy the stack that creates the Amazon ECR registry.

cdk deploy EventsRegistry

Step 2: Build the application

Before the application is deployed, it must be built and uploaded to Amazon ECR.
To get started, compile the source code and build the application distribution.

cd ../eventbridge-integration-solution-aws-api
./gradlew clean build

Step 3: Containerize the application

The build system is configured to include the task for containerizing the artifacts and creating the Docker image. To create a new Docker image from the build artifact, run the following command:

./gradlew dockerBuildImage

The build task generates the Dockerfile using the provided settings. It then executes the docker build command to create a new Docker image named eventbridge-integration-solution-aws-api.

Step 4: Upload the image to Amazon ECR

You can now upload the image directly to Amazon ECR. First, log in to the Amazon ECR registry through Docker. Replace AWS_ACCOUNT_ID with your AWS account ID.

aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin "${AWS_ACCOUNT_ID}.dkr.ecr.us-west-2.amazonaws.com"

Before uploading the image to ECR, tag it with the expected remote repository name. To do that, first list all of the Docker images.

docker images

Copy the image ID attribute of the eventbridge-integration-solution-aws-api image and use it as the DOCKER_IMAGE value in the tag command, also replacing AWS_ACCOUNT_ID.

docker tag $DOCKER_IMAGE "${AWS_ACCOUNT_ID}.dkr.ecr.us-west-2.amazonaws.com/eventbridge-integration-solution-aws-api"

Finally, push the Docker image to ECR, replacing AWS_ACCOUNT_ID with your AWS account ID.

docker push "${AWS_ACCOUNT_ID}.dkr.ecr.us-west-2.amazonaws.com/eventbridge-integration-solution-aws-api"

Step 5: Deploy the application stack

Once the application image is uploaded to Amazon ECR, you can deploy the entire application stack using CDK. The stack creates multiple resources, including a VPC, DocumentDB cluster, ECS task definition and service, Application Load Balancer, API Gateway, and EventBridge rule. You can inspect the resources created in the CDK definition by opening the TypeScript files in the eventbridge-integration-solution-aws-api-cdk/lib directory.

At this point, you can proceed with deploying the CloudFormation stack.

cd ../eventbridge-integration-solution-aws-api-cdk
cdk deploy "*"

Step 6: Test the running application

Now, test the end-to-end event delivery by publishing a sample event to the EventBridge PutEvents API. Create a file named event.json and paste the following code:

[
  {
    "Source": "ecommerce",
    "DetailType": "CreateOrder",
    "Detail": "{\"order_id\": \"ce4fe8b7-9911-4377-a795-29ecca9d7a3d\",\"create_date\": \"2020-06-02T13:01:00Z\",\"items\": [{\"product_id\": \"b8575571-5e91-4521-8a29-4af4a8aaa6f6\",\"quantity\": 1,\"price\": \"9.99\",\"currency\": \"CAD\"}],\"customer\": {\"customer_id\": \"5d22899e-3ff5-4ce0-a2a3-480cfce39a56\"},\"payment\": {\"payment_id\": \"fb563473-bef4-4965-ad78-e37c6c9c0c2a\"},\"delivery_address\": {\"street\": \"510 W Georgia St\",\"city\": \"Vancouver\",\"state\": \"BC\",\"country\": \"Canada\",\"zip_code\": \"V6B0M7\"}}"
  }
]

Publish this event with the following AWS CLI command.

aws events put-events --entries file://event.json

EventBridge delivers the event to API Gateway and the application persists it in DocumentDB.

Step 7: Clean up

Delete all the resources created in this tutorial by running this CDK command:

cdk destroy "*"

Additional considerations

The demo application is simplified for the purpose of showcasing the EventBridge integration with API Gateway. In production, it's recommended that you isolate the DocumentDB cluster in a private subnet. Additionally, the Application Load Balancer can be hidden from public access and connected to API Gateway through a VPC link.

Conclusion

This post demonstrates how to set up a sample application for consuming events directly from EventBridge into a custom application hosted in ECS. This integration uses EventBridge's native support for API Gateway as a target, which allows you to integrate any HTTP-based web application.

Learn more from the EventBridge documentation.

Upgrading to Amazon EventBridge from Amazon CloudWatch Events

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/upgrading-to-amazon-eventbridge-from-amazon-cloudwatch-events/

Amazon EventBridge was announced at the AWS New York Summit in 2019. It’s a serverless event bus service that uses the Amazon CloudWatch Events API, but also includes more functionality, like the ability to ingest events from SaaS apps. Event-based architectures can make it easier to decouple your application services and make your systems more extensible.

Events are emitted from services throughout AWS, and you can create custom events from your own applications. The popularity of event-based compute is why CloudWatch Events grew to process trillions of events every month. See this video for an overview of the features of the EventBridge service.

EventBridge is designed to extend the event model beyond AWS, bringing data from software as a service (SaaS) providers into your AWS environment. This means you can consume events from popular providers such as Zendesk, PagerDuty, and Auth0. You can use these in your applications with the same ease as any AWS-generated event.

This blog post explains the differences between the CloudWatch Events and EventBridge, and the benefits of upgrading. I also provide additional resources to help you gain the benefits of using the newer EventBridge service.

Integrated SaaS providers in EventBridge

Fetching data from third-party providers has typically relied on processes such as polling, or building custom webhooks where these are supported. Polling is compute-intensive and often wasteful, since many requests return no new data. Additionally, there is a lag between new data becoming available and your system receiving the information.

While webhooks offer improvements over polling and can approximate a real-time connection, these requests travel over the public internet. This means you must secure the webhook endpoint, and use a security mechanism between the two services. You must also scale up if the SaaS provider sends large numbers of messages.

EventBridge enables SaaS providers to publish data on an event bus within their AWS environment. These events are then routed to your AWS account, and appear on your partner event bus. All of this happens on the private AWS network, away from the public internet. The service manages scaling, security, and integration for you.

Configuring EventBridge in your SaaS provider account is easy. To learn how to set up the integration, see these videos for a full walkthrough.

A full list of EventBridge SaaS integrations is available on the EventBridge website. Additionally, if you want to integrate your own SaaS software with EventBridge, read more in the onboarding guide.

Custom event buses and enhanced rules

CloudWatch Events provides a default event bus that exists in every AWS account. All AWS events are routed via the default bus. You can also choose to publish your custom events to the default bus.

EventBridge introduces custom event buses you can use exclusively for your own workloads. These can be useful for controlling access to events to a limited set of AWS accounts or custom applications. Custom event buses are free to set up. Watch this video to see how to set up a custom event bus.

You can also use powerful advanced rules for routing events. With content-based filtering, you can use comparison operators to identify and filter values in events. This allows you to use EventBridge to handle more processing on behalf of your application, reducing the load on downstream services. See this blog post to learn more about using content-filtering to build advanced rules.
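
As an illustration, here is a minimal Python (boto3) sketch that creates a rule with a numeric content filter; the bus name, rule name, and matching threshold are hypothetical and not part of the original example:

import json
import boto3

events = boto3.client('events')

# Hypothetical rule: match custom order events where the total exceeds 100,
# using the numeric comparison operators available in EventBridge patterns
events.put_rule(
    Name='high-value-orders',          # hypothetical rule name
    EventBusName='my-custom-bus',      # hypothetical custom event bus
    EventPattern=json.dumps({
        'source': ['ecommerce'],
        'detail': {
            'total': [{'numeric': ['>', 100]}]
        }
    }),
    State='ENABLED'
)

Because the filtering happens inside EventBridge, the downstream target is only invoked for matching events, which reduces unnecessary invocations.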

Schema registry

One of the traditional challenges of working with event-based architectures is managing event structures. With different applications, services, and microservices publishing events, it can be hard to standardize event formats. These formats, or schemas, may also change when developers introduce new versions of their services.

The EventBridge Schema Registry allows you to automate the discovery and creation of schemas. You can find schemas for AWS services, integrated SaaS providers, and your own custom events. EventBridge infers the schemas and stores these in a registry. You can then download custom code bindings for popular type-safe programming languages. This accelerates the development process, making it easy to construct objects based on events.

Watch this video to see how to use the EventBridge Schema Registry, and get started with your own applications.

Migration and compatibility

Compared with the CloudWatch Events console, the EventBridge console has a new design that makes it easier to build and manage rules and event buses. If you are new to using events within AWS, it’s recommended that you start with the EventBridge console.

The EventBridge service uses the CloudWatch Events API and it is fully backward compatible. Any rules you have configured in CloudWatch Events continue to work in EventBridge. You can access the default event bus and any configurations you created in CloudWatch Events from EventBridge immediately. If you’re a current CloudWatch Events customer, you can upgrade to EventBridge by simply opening the EventBridge console.
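
As a small sketch of this compatibility, the same underlying API serves both services, so code written against CloudWatch Events continues to work unchanged:

import boto3

# The same 'events' API is shared by CloudWatch Events and EventBridge,
# so rules created in either console are visible through one client
events = boto3.client('events')

for rule in events.list_rules()['Rules']:
    print(rule['Name'], rule.get('State'))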

AWS continues to build new functionality to enhance the capabilities of event-based architectures. These features are only released via EventBridge, so it’s recommended that you upgrade to ensure you can take advantage of these new capabilities.

Additional EventBridge resources for serverless developers

Event-based architectures can help serverless developers create dynamic, decoupled applications. Here are additional resources with sample code repos to help you get started.

Conclusion

EventBridge is the evolution of the CloudWatch Events service. It brings new features, including the ability to integrate data from popular SaaS providers as events within AWS.

In this post, I discuss the new ability to create custom event buses, and how you can develop advanced rules for sophisticated event routing. I also discuss the new EventBridge Schema Registry, which automates event schema discovery and lets you download code bindings directly into your IDE.

New event-based features are now released in EventBridge. By migrating from CloudWatch Events, you can take advantage of new capabilities as they are released.

To learn more about using EventBridge for your AWS workloads, visit the EventBridge Learning Path, which includes a range of learning resources.

Building scalable serverless applications with Amazon S3 and AWS Lambda

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-scalable-serverless-applications-with-amazon-s3-and-aws-lambda/

Well-designed serverless applications are typically a combination of managed services connected by custom business logic. One of the most powerful combinations for enterprise application development is Amazon S3 and AWS Lambda. S3 is a highly durable, highly available object store that scales to meet your storage needs. Lambda runs custom code in response to events, automatically scaling with the size of the workload. When you use the two services together, they can provide a scalable core for serverless solutions.

This blog post shows how to design and deploy serverless applications designed around S3 events. The solutions presented use AWS services to create scalable serverless architectures, using minimal custom code. This is the conclusion of a series showing how the S3-to-Lambda pattern can implement a range of business solutions.

Bringing the compute layer to the data

Much traditional software operates by bringing data to the compute layer. This means that processes run on batches of data in files, databases, and other sources. This is inherently harder to scale as data volumes grow, often needing a fleet of servers to scale out at peak times. For the developer, this creates operational overhead to ensure that the compute capacity is keeping pace with the data volume.

The S3-to-Lambda serverless pattern instead brings the compute layer to the data. As data arrives, the compute process scales up and down automatically to meet the demand. This allows developers to focus on building business logic for a single item of data, and the execution at scale is handled by the Lambda service.

The image optimization application is a good example for comparing the traditional and serverless approaches. For a busy media site capturing hundreds of images per minute in an S3 bucket, the operational overhead becomes clear. A script running on a server must scale up across multiple instances to keep pace with this level of traffic. Compare this to the Lambda-based approach, which scales on demand. The code itself does not change, whether it is used for a single image or thousands.

Receiving and processing events from S3 in custom code

S3 raises events when objects are put, copied, or deleted in a bucket. It also raises a broad range of other notifications, such as when lifecycle events occur. You can configure S3 to invoke Lambda from these events by using the S3 console, Lambda console, AWS CLI, or AWS Serverless Application Model (SAM) templates.

S3 passes details of the event, not the object itself, to the Lambda function in a JSON object. This object contains an array of records, so it’s possible to receive more than one S3 event per invocation:

S3 passes event details to Lambda

As the Lambda handler may receive more than one record, it should iterate through the records collection. It’s best practice to keep the handler small and generic, calling out to the business logic in a separate function or file:

const processEvent = require('my-custom-logic')

// A Node.js Lambda handler
exports.handler = async (event) => {

  // Capture event – can be used to create mock events
  console.log(JSON.stringify(event, null, 2))

  // Handle each incoming S3 record in the event
  await Promise.all(
    event.Records.map(async (record) => {
      try {
        // Pass each record to the business logic handler
        await processEvent(record)
      } catch (err) {
        console.error('Handler error: ', err)
      }
    })
  )
}

This code example takes advantage of concurrent asynchronous executions available in Node.js but similar constructs are available in many other languages. This means that multiple objects are processed in parallel to minimize the overall function execution time.

Instead of handling and logging any errors within the function’s code, it’s also possible to use destinations for asynchronous invocations. You use an On failure condition to route the error to various potential targets, including another Lambda function or other AWS services. For complex applications or those handling large volumes, this provides greater control for managing events that fail processing.
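
For example, here is a hedged Python (boto3) sketch that routes failed asynchronous invocations to an SQS queue; the function name and queue ARN are hypothetical:

import boto3

lambda_client = boto3.client('lambda')

# Send events that fail processing to an SQS queue for later inspection
# (function name and queue ARN are hypothetical)
lambda_client.put_function_event_invoke_config(
    FunctionName='my-s3-processor',
    MaximumRetryAttempts=2,
    DestinationConfig={
        'OnFailure': {
            'Destination': 'arn:aws:sqs:us-east-1:123456789012:failed-events'
        }
    }
)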

During the development process, you can debug and test the S3-to-Lambda integration locally. First, capture a sample event during development to create a mock event for local testing. The sample applications in this series each use a test harness so the developer can test the handler on a local machine. The test harness invokes the handler locally, providing mock environment variables:

// Mock event
const event = require('./localTestEvent')

// Mock environment variables
process.env.AWS_REGION = 'us-east-1'
process.env.localTest = true
process.env.language = 'en'

// Lambda handler
const { handler } = require('./app')

const main = async () => {
  console.time('localTest')
  await handler(event)
  console.timeEnd('localTest')
}

main().catch(error => console.error(error))

Scaling up when more data arrives

The Lambda service scales up if S3 sends multiple events simultaneously. How this works depends on several factors. If the target Lambda function has sufficient concurrency available, and if any active instances of the function are already processing events, the Lambda service scales up.

Lambda scaling up as events queue grows

The function does not scale up if the reserved concurrency is set to 1 or the scaling capacity is fully consumed for a Region in your account. In this case, the events from S3 are queued internally until a Lambda instance is available for processing. You can request to increase the regional concurrency limit by submitting a request in the Support Center console. You may also intend to perform one-at-a-time processing by setting the reserved concurrency to 1.
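
If you do want one-at-a-time processing, a minimal sketch of setting the reserved concurrency with boto3, assuming a hypothetical function name, looks like this:

import boto3

lambda_client = boto3.client('lambda')

# Limit the function to a single concurrent execution so S3 events
# are processed one at a time (function name is hypothetical)
lambda_client.put_function_concurrency(
    FunctionName='my-s3-processor',
    ReservedConcurrentExecutions=1
)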

One-at-a-time processing with Lambda

Generally, multiple instances of a function are invoked simultaneously when S3 receives multiple objects, to process the events as quickly as possible. It’s this rapid scaling and parallelization in both S3 and Lambda that make this pattern such a powerful core architecture for many applications.

Amazon SNS and Amazon SQS integrations

The native S3 to Lambda integration provides a reliable way to invoke one function per prefix or suffix pattern per bucket. For example, invoking a function when object keys end in .pdf in a single bucket. This works well for the vast majority of use-cases, but you may want to invoke multiple Lambda functions per S3 event.

In this case, S3 can publish notifications to SNS, where events are delivered to a range of targets. These include Lambda functions, SQS queues, HTTP endpoints, email, text messages, and push notifications. SNS provides fan-out capability, enabling one event to be delivered to multiple destinations, such as Lambda functions or webhooks.

In busy applications, the volume of S3 events may be too large for a downstream system, such as a non-serverless service. In this case, you can also use an SQS queue as a notification target. After events are published to a queue, they can be consumed by Lambda functions and other services. The queue acts as a buffer and can help smooth out traffic for systems consuming these events. See the DynamoDB importer repository for an example.
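
As a sketch of this configuration, the following boto3 call sends object-created notifications to a queue; it assumes the bucket and queue already exist and that the queue policy allows S3 to send messages (the bucket name and queue ARN are hypothetical):

import boto3

s3 = boto3.client('s3')

# Buffer S3 object-created events in an SQS queue for downstream consumers
s3.put_bucket_notification_configuration(
    Bucket='my-upload-bucket',    # hypothetical bucket name
    NotificationConfiguration={
        'QueueConfigurations': [{
            'QueueArn': 'arn:aws:sqs:us-east-1:123456789012:s3-events',
            'Events': ['s3:ObjectCreated:*']
        }]
    }
)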

Uploading data to S3 in upstream applications

You may have upstream services in your architecture that generate the data stored in S3. Some upstream workloads have spiky usage patterns and large numbers of users, like web or mobile applications. You may increase the performance and throughput of these workloads by uploading directly to S3. This avoids proxying binary data through an API Gateway endpoint or web server.

For example, for a mobile application uploading user photos, S3 and Lambda can handle the upload process for large numbers of users:

  1. The upstream process, in this case a mobile client, requests a presigned URL from an API Gateway endpoint.
  2. This invokes a Lambda function that requests a presigned URL for the S3 bucket, and returns this back via the API call.
  3. The mobile client sends the data directly to the presigned S3 URL using HTTPS POST. The upload is managed directly by S3.

This simple pattern can be a scalable and cost-effective way to upload large binary data into your applications. After the object successfully uploads, the S3 put event can then asynchronously invoke downstream workflows.
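
As a hedged sketch of step 2, the Lambda function can generate a presigned POST with boto3; the bucket name and object key here are hypothetical:

import json
import boto3

s3 = boto3.client('s3')

def handler(event, context):
    # Generate a presigned POST so the client can upload directly to S3
    # (bucket name and object key are hypothetical)
    presigned = s3.generate_presigned_post(
        Bucket='user-uploads-bucket',
        Key='photos/example.jpg',
        ExpiresIn=300    # the URL is valid for five minutes
    )
    return {
        'statusCode': 200,
        'body': json.dumps(presigned)
    }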

Visit this repository to see an example of a serverless S3 uploader application. You can also see a walkthrough of this process in this YouTube video.

Developing larger applications

As you develop larger serverless applications, it often becomes more practical to split applications into multiple services and repositories for separate teams. Often, individual services must integrate with existing S3 buckets, not create these in the application templates. You may also have to integrate a single service with multiple S3 buckets.

In decoupling larger applications with Amazon EventBridge, I show how you can decouple services within an application using an event bus. This pattern helps separate the producers and consumers of events in your workload. This can make each service more independent and more resilient to changes in the overall application.

This example demonstrates how the document repository solution can be refactored into several smaller applications that communicate using events. This uses Amazon EventBridge as the event router coordinating the flow. Each application contains a SAM template that defines the EventBridge rule to filter for events, and publishes data back to the event bus after processing is complete.

One of the major benefits to using an event-based architecture is that development teams retain flexibility even as the application grows. It allows developers to separate AWS resources like S3 buckets and DynamoDB tables, from the compute resources, like Lambda functions. This decoupling can simplify the deployment process, help avoid building monoliths, and reduce the cognitive load of developing in large applications.

Conclusion

S3 and Lambda are two highly scalable AWS services that can be powerful when combined in serverless applications. In this post, I summarize many of the patterns shown across this series. I explain the integration pattern and the scaling behavior, and how you can use mock events for local testing and development. You can also use SNS and SQS in some applications for fan-out and buffering of events.

Upstream applications can upload data directly to S3 to achieve greater scalability by avoiding proxies. For larger applications, I show how using an event-based architecture modeled around EventBridge can help decouple application services. This can promote service independence, and help maintain flexibility as applications grow.

To learn more about the S3-to-Lambda architecture pattern, watch the YouTube video series, or explore the articles listed at the top of this post.

Running Web Applications on Amazon EC2 Spot Instances

Post Syndicated from Ben Peven original https://aws.amazon.com/blogs/compute/running-web-applications-on-amazon-ec2-spot-instances/

Author: Isaac Vallhonrat, Sr. EC2 Spot Specialist SA

Amazon EC2 Spot Instances allow customers to save up to 90% compared to On-Demand pricing by leveraging spare EC2 capacity. Spot Instances are a perfect fit for fault tolerant workloads that are flexible to run on multiple instance types such as batch jobs, code builds, load tests, containerized workloads, web applications, big data clusters, and High Performance Computing clusters. In this blog post, I walk you through how-to and best practices to run web applications on Spot Instances, so you can benefit from the scale and savings that they provide.

As Spot Instances are interruptible, your web application should be stateless, fault tolerant, loosely coupled, and rely on external data stores for persistent data, like Amazon ElastiCache, Amazon RDS, Amazon DynamoDB, etc.

Spot Instances recap

While Spot Instances have been around since 2009, recent updates and service integrations have made them easier than ever to integrate with your workloads. Before I get into the details of how to use them to host a web application, here is a brief recap of how Spot Instances work:

First, Spot Instances are a purchasing option within EC2. There is no difference in hardware when compared to On-Demand or Reserved Instances. The only difference is that these instances can be reclaimed with a two-minute notice if EC2 needs the capacity back.

A set of unused EC2 instances with the same instance type, operating system, and Availability Zone is a Spot capacity pool. Spot Instance pricing is set by Amazon EC2 and adjusts gradually based on long-term trends in supply and demand of EC2 instances in each pool. You can expect Spot pricing to be stable over time, meaning no sudden spikes or swings. You can view historical pricing data for the last three months in both the EC2 console and via the API. The following image is an example of the pricing history for m5.xlarge instances in the N. Virginia (us-east-1) Region.

spot instance pricing history

Figure 1. Spot Instance pricing history for an m5.xlarge Linux/UNIX instance in us-east-1. Price changes independently for each Availability Zone and in small increments over time based on long term supply and demand.

As capacity pools are independent of each other, I recommend you spread the fleet of instances that power your workload across multiple Availability Zones (AZs) and stay flexible about which instance types you use. This way, you effectively increase the amount of spare capacity you can use to launch Spot Instances and are able to replace interrupted instances with instances from other pools.

The best way to adhere to Spot Instance flexibility best practices and manage your fleet of instances is to use an Amazon EC2 Auto Scaling group. This allows you to combine multiple instance types and purchase options. Once you identify a list of suitable instance types for your workload, you can use Auto Scaling features to launch a combination of On-Demand and Spot Instances from optimal Spot Instance pools in each Availability Zone.

Stateless web applications are a natural fit for Spot Instances: they are fault tolerant, already handle interruptions through scale-in events in an Auto Scaling group, and commonly run across multiple Availability Zones. It’s also normally easy to implement instance type flexibility for web applications, so they can tap into Spot Instance capacity from multiple pools.

Identifying suitable instance types for your web application

Being instance type flexible is key to being successful with Spot Instances. At the time of this writing, AWS provides more than 270 instance types, so it’s likely that your web application can run on multiple of them. For example, if your application runs on the m5.xlarge instance type, it can also run on the m5d.xlarge instance type, which is the same size and family but with local SSDs. Also, running your application on the r5.xlarge or r5d.xlarge performs similarly since they use the same processor family. In this case, your application simply runs on an instance with more memory, and you still benefit from the savings Spot Instances provide.

You may be able to qualify additional similarly-sized instance types from other families or generations, like the AMD-based variants m5a.xlarge and r5a.xlarge, or the higher bandwidth variants m5n.xlarge and r5n.xlarge. These may present some performance differences compared to other types due to their hardware and/or the virtualization system. However, Application Load Balancer has mechanisms in place to spread the load across your instances according to the number of outstanding requests they are processing. This means that instances handling longer requests, or with lower processing power, are not overloaded. This allows you to acquire capacity from even more Spot capacity pools regardless of hardware differences. Later in the blog post, I dive into the details of the load balancing mechanisms.

Configure your Auto Scaling group

Now that you have qualified multiple instance types to run your web application, you need to create an EC2 Auto Scaling group.

The first step is to create a Launch Template where you specify the configuration for your fleet of instances, including the AMI, Security Groups, Tags, user data scripts, and so on. You configure a single instance type on the template; choose the instance type you commonly use to run your web application. Also, do not mark the Request Spot Instances checkbox on the template. You configure multiple instance types and the Spot Instance settings when configuring the Auto Scaling group. Here’s my Launch Template:

launch templates console

Figure 2. Launch templates console

Now, go to the EC2 Auto Scaling console to create an Auto Scaling group. In the wizard, select the launch template you created and then click next. Then, on the second step of the wizard, select Combine purchase options and instance types. You can see the configuration options in the following image.

combine EC2 purchase options

Figure 3. Combine purchase options and instance types configuration.

Here is a detailed look at the configuration details:

  • Optional On-Demand Base: This parameter specifies a baseline of On-Demand Instances within your Auto Scaling group. This is also handy if you have Reserved Instances or Savings Plans to cover your compute baseline, since On-Demand Instances apply towards your Savings Plans or the instances matching your reservation are billed as Reserved Instances.
  • On-Demand percentage above base: Once the size of the Auto Scaling group grows over the (optional) On-Demand base capacity, this parameter defines the percentage of instances that are On-Demand and Spot Instances. The default settings are a good starting point, set at 70% On-Demand and 30% Spot. You can adjust these percentages as suitable for your workload.
  • Spot Allocation Strategy per Availability Zone: This defines the logic that Auto Scaling uses to select which instance type to launch in each Availability Zone when launching Spot Instances. The possible Spot allocation strategies are:
    • Capacity Optimized: This strategy allocates your instances from the Spot Instance pool with optimal capacity for the number of instances that are launching. This is the default selection and the recommended one, as launching from capacity pools with optimal capacity can lower the chance of interruption. Every time EC2 Auto Scaling scales out, the strategy is evaluated, so as your Auto Scaling group scales out and scales in, your instances are recycled and you make use of Spot Instances in the optimal pools.
    • Lowest price: This strategy allocates your instances from the number (N) of Spot Instance pools that you specify and from the pools with the lowest price per unit at the time of fulfillment.

Next, you choose your instance types from the Instance Types section of the wizard. Here, there is a primary instance type inherited from your Launch Template. The console then automatically populates a list of same size instance types across other families and generations that could also fit your web application requirements. You can add or remove instance types as you see fit.

Selection of instance types

Figure 4. Selection of instance types

Note: There is an option to combine instance types of different sizes which is very helpful for queue worker nodes and similar workloads. We won’t cover this in this blog post as it is not relevant for web applications. You can read more about this feature here.

Once you select your instance types, your Auto Scaling group uses the configured Spot Instance allocation strategy to decide which Spot Instance types to launch on each Availability Zone. By specifying multiple instance types, EC2 Auto Scaling replenishes capacity from other Spot Instance pools applying your configured allocation strategy if some of your instances get interrupted. This maintains the size of your fleet and keeps your service available.

For the On-Demand part of your Auto Scaling group, the allocation strategy is prioritized. This means that EC2 Auto Scaling launches the primary instance type, and moves to the next types in the order you selected, in the unlikely event it cannot launch the capacity. If you have Reserved Instances to cover your compute baseline, set the instance type you reserved as the primary instance type in your Auto Scaling group, so the On-Demand Instances get the reservation discounts. You can learn more about how Reserved Instances are applied here.

After completing this step, complete the Auto Scaling group creation wizard, and you can move on to set up your Application Load Balancer.
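
If you prefer scripting over the console wizard, the following Python (boto3) sketch shows an equivalent configuration; the group name, launch template, and subnets are hypothetical, and the launch template must already exist:

import boto3

autoscaling = boto3.client('autoscaling')

# Combine On-Demand and Spot capacity across several qualified instance types
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName='web-app-asg',                    # hypothetical name
    MinSize=2,
    MaxSize=20,
    VPCZoneIdentifier='subnet-11111111,subnet-22222222',   # hypothetical subnets
    MixedInstancesPolicy={
        'LaunchTemplate': {
            'LaunchTemplateSpecification': {
                'LaunchTemplateName': 'web-app-template',  # hypothetical template
                'Version': '$Latest'
            },
            'Overrides': [
                {'InstanceType': 'm5.xlarge'},
                {'InstanceType': 'm5d.xlarge'},
                {'InstanceType': 'm5a.xlarge'},
                {'InstanceType': 'r5.xlarge'}
            ]
        },
        'InstancesDistribution': {
            'OnDemandBaseCapacity': 0,
            'OnDemandPercentageAboveBaseCapacity': 70,     # 70% On-Demand, 30% Spot
            'SpotAllocationStrategy': 'capacity-optimized'
        }
    }
)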

Load balancing with Application Load Balancer

To balance the load across the fleet of instances running your web application you use an Application Load Balancer (ALB). EC2 Auto Scaling groups are fully integrated with ALB and Target Groups that manage the fleet of instances fronted by ALB.

Target groups are set up by default with a deregistration delay of 300 seconds. This means that when EC2 Auto Scaling needs to remove an instance out of the fleet, it first puts the instance in draining state, informs ALB to stop sending new requests to it, and allows the configured time for in-flight requests to complete before the instance is finally deregistered. This way, removing an instance from the fleet is not perceptible to your end users.

As Spot Instances are interrupted with a two-minute warning, you need to adjust this setting to a slightly lower time. Recommended values are 90 seconds or less. This allows time for in-flight requests to complete and gracefully close existing connections before the instance is interrupted.

At the time of this writing, EC2 Auto Scaling is not aware of Spot Instance interruptions and does not trigger draining automatically when a Spot Instance is going to be interrupted. However, you can put the instance in draining state by catching the interruption notice and calling the EC2 Auto Scaling API. I cover this in more detail in the next section.

Target Groups allow you to configure a routing algorithm, which by default is Round Robin. Because you combine multiple instance types in your fleet with Spot Instances, your web application may see some performance differences if the instances use different hardware and, in some cases, a different virtualization platform (Xen for the older generation instances and AWS Nitro for the newer ones). While the impact on response time may be negligible to the end user, you need to ensure your instances handle load fairly, accounting for the processing power differences of each instance type.

With the Least Outstanding Requests load balancing algorithm, instead of distributing the requests across your fleet in a round-robin fashion, as a new request comes in, the load balancer sends it to the target with the least number of outstanding requests. This way backend instances with lower processing capabilities or processing long-standing requests are not burdened with more requests and the load is spread evenly across targets. This also helps the new targets effectively take load from overloaded targets.

These settings can be configured in the Target Groups section of the EC2 console by editing the attributes of your target group.

application load balancer attributes

Figure 5. ALB Target group attribute configuration.
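
The same two attributes can be set programmatically; here is a minimal boto3 sketch, with a hypothetical target group ARN:

import boto3

elbv2 = boto3.client('elbv2')

# Shorten the deregistration delay below the two-minute Spot notice and
# enable least outstanding requests routing (target group ARN is hypothetical)
elbv2.modify_target_group_attributes(
    TargetGroupArn='arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/web-app/1234567890abcdef',
    Attributes=[
        {'Key': 'deregistration_delay.timeout_seconds', 'Value': '90'},
        {'Key': 'load_balancing.algorithm.type', 'Value': 'least_outstanding_requests'}
    ]
)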

Your architecture will look like the following image.

stateless web application architecture

Figure 6. Stateless web application hosted on a combination of On-Demand and Spot Instances

Leveraging Instance Termination Notices for graceful termination

When a Spot Instance is going to be interrupted, EC2 triggers a Spot Instance interruption notice that is presented via the EC2 metadata endpoint on the instance. It can also be caught by an Amazon EventBridge rule to, for example, trigger an AWS Lambda function. You can find the specific documentation here.

By catching the interruption notice, you can take actions to gracefully take the instance out of the fleet without impacting your end users. For example, you can stop receiving new requests from the load balancer, allow in-flight requests to finish, and launch replacement capacity.
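
As a sketch, you can create the matching EventBridge rule and Lambda target with boto3; the rule name and function ARN are hypothetical, and the function also needs a resource-based permission for events.amazonaws.com, omitted here:

import json
import boto3

events = boto3.client('events')

# Match EC2 Spot Instance interruption warnings on the default event bus
events.put_rule(
    Name='spot-interruption-handler',      # hypothetical rule name
    EventPattern=json.dumps({
        'source': ['aws.ec2'],
        'detail-type': ['EC2 Spot Instance Interruption Warning']
    }),
    State='ENABLED'
)

# Invoke a Lambda function that drains the instance (ARN is hypothetical)
events.put_targets(
    Rule='spot-interruption-handler',
    Targets=[{
        'Id': 'drain-function',
        'Arn': 'arn:aws:lambda:eu-west-1:123456789012:function:spot-drain'
    }]
)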

Below is a sample Spot Instance Interruption Notice caught by Amazon EventBridge:

{
  "version": "0",
  "id": "72d6566c-e6bd-117d-5bbc-2779c679abd5",
  "detail-type": "EC2 Spot Instance Interruption Warning",
  "source": "aws.ec2",
  "account": "12345678910",
  "time": "2020-03-06T08:25:40Z",
  "region": "eu-west-1",
  "resources": [
    "arn:aws:ec2:eu-west-1a:instance/i-0cefe48f36d7c281a"
  ],
  "detail": {
    "instance-id": "i-0cefe48f36d7c281a",
    "instance-action": "terminate"
  }
}

Upon intercepting the interruption event, you can leverage the DetachInstances API call of EC2 Auto Scaling. Invoking this API puts the instance in Draining state, which makes the configured Load Balancer stop sending new requests to the instance that is going to be interrupted. The instance remains in this state during the configured deregistration delay on the Target Group to allow time for in-flight requests to complete.

Target Group console

Figure 7. Target Group console showing instance draining.

Also, if you set the ShouldDecrementDesiredCapacity parameter of the API call to False, EC2 Auto Scaling attempts to launch replacement Spot Instance capacity according to your instance type selection and allocation strategy. Below is an example code snippet calling the Auto Scaling API using Python Boto3.

import logging
import boto3
from botocore.exceptions import ClientError

logger = logging.getLogger()
asgclient = boto3.client('autoscaling')

def detach_instance_from_asg(instance_id, as_group_name):
    try:
        # Detach the instance from the ASG and launch a replacement instance
        response = asgclient.detach_instances(
            InstanceIds=[instance_id],
            AutoScalingGroupName=as_group_name,
            ShouldDecrementDesiredCapacity=False)
        logger.info(response['Activities'][0]['Cause'])
    except ClientError as e:
        error_message = "Unable to detach instance {id} from AutoScaling Group {asg_name}. ".format(
            id=instance_id, asg_name=as_group_name)
        logger.error(error_message + e.response['Error']['Message'])
        raise e

In some cases, you may also want to take actions at the operating system level when a Spot Instance is going to be interrupted. For example, you may want to gracefully stop your application so it closes open connections to a database, deregister agents running on the instance or other clean-up activities. You can leverage AWS Systems Manager Run Command to invoke commands on the instance that is going to be interrupted to perform these actions.

You may also want to delay the execution of commands to capitalize on the two-minute warning. As a first step, you can check the instance termination time in the instance metadata and wait, for example, until 30 seconds before the interruption. Then, gracefully stop your application and issue other termination commands. Note that there are no charges for using Run Command (limits apply) and the API call is asynchronous, so it doesn’t keep your Lambda function running.
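
A hedged sketch of this step, using Run Command from the interruption handler; the instance ID comes from the sample interruption event above, while the service name and script path are hypothetical:

import boto3

ssm = boto3.client('ssm')

# Run graceful-shutdown commands on the instance being interrupted
ssm.send_command(
    InstanceIds=['i-0cefe48f36d7c281a'],
    DocumentName='AWS-RunShellScript',
    Parameters={
        'commands': [
            'systemctl stop my-web-app',         # hypothetical service name
            '/opt/scripts/deregister-agent.sh'   # hypothetical cleanup script
        ]
    }
)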

Note: To use Run Command you must have the AWS Systems Manager agent running on your instance and meet a set of pre-requisites, like configuring an IAM instance profile. You can find more information here.

We provide a sample solution here that handles all the steps outlined above, which you can easily deploy in your AWS account.

The sample solution also allows you to optionally execute commands upon Spot Instance interruption by creating an AWS Systems Manager parameter in Parameter Store specific to each Auto Scaling group with the commands you want to run, so you can easily customize termination commands to the needs of each application. You can find detailed configuration instructions in the README.md file. The high-level architecture of this solution is detailed below:

serverless architecture spot interruptions

Figure 8. Serverless architecture to handle Spot Instance interruptions

Tying it all together

Now that I have covered all the relevant features to run web applications on Spot Instances, let’s put them all together and summarize:

  • Using mixed instance types and purchase options in EC2 Auto Scaling groups you can configure a combination of On-Demand and Spot Instances. Leverage this feature to configure multiple instance types to run your web application on Spot Instances, so you can leverage unused capacity from multiple Spot capacity pools and replace interrupted instances.
  • Using the capacity-optimized Spot Instance allocation strategy you can launch Spot Instances from your instance type selection based on the most available Spot Instance capacity pools. This reduces the chances of interruptions. Auto Scaling then replenishes Spot Instance capacity from the optimal pools if some of your instances are interrupted as Spot Instance capacity fluctuates.
  • With Application Load Balancer you can adjust the deregistration delay of your Target Group to a slightly lower time than the Spot Instance termination notice time (e.g. 90 seconds), so when an instance is going to be interrupted, you stop sending new requests to it and leverage the notice time to let in-flight requests complete and avoid impact to your end users. You can also configure the Least Outstanding Requests load balancing algorithm in your ALB Target Group so the load is distributed evenly across your fleet accounting for differences in processing power for different instance families.
  • You can leverage Amazon EventBridge, AWS Lambda, and AWS Systems Manager Run Command to take interruption handling actions, like putting the instance in draining state, requesting EC2 Auto Scaling to launch a replacement instance, and running commands to gracefully shut down your application or trigger clean-up activities. You can also subscribe an SNS topic or other Lambda functions to the interruption event for alarming, logging, or other purposes.

I hope you found this blog post useful, and if you are not using Spot Instances to run your stateless web applications today, I encourage you to try them to get the cost savings and scale they provide.


Isaac Vallhonrat is a Sr. Specialist Solutions Architect for EC2 Spot at AWS. He helps customers build resilient, fault tolerant and cost-effective architectures with EC2 Spot Instances across multiple workload types like web applications, containers, big data, CI/CD, etc.

Using dynamic Amazon S3 event handling with Amazon EventBridge

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/using-dynamic-amazon-s3-event-handling-with-amazon-eventbridge/

A common pattern in serverless applications is to invoke a Lambda function in response to an event from Amazon S3. For example, you could use this pattern for automating document translation, transcribing audio files, or staging data imports. You can configure this integration in many places, including the AWS Management Console, the AWS CLI, or the AWS Serverless Application Model (SAM).

If you need to fan out notifications, or hold messages in queue, you are also able to route S3 events to Amazon SNS or Amazon SQS. These standard notification mechanisms work well for most applications, and are simple to implement. However, for more complex notification patterns, you can use Amazon EventBridge to route events dynamically. This blog post explores advanced use-cases and how to implement these in your serverless applications.

S3 to EventBridge, using CloudTrail.

To set up the example applications, visit the GitHub repo and follow the instructions in the README.md file. The code uses SAM templates, enabling you to deploy the applications in your own AWS account. This walkthrough creates resources covered in the AWS Free Tier but you may incur cost if you test with large amounts of data.

Integrating S3 events with Lambda via EventBridge

EventBridge consumes S3 events via AWS CloudTrail. A single trail can log events for one or more S3 buckets, and you can configure which data events are recorded. It’s best practice to store CloudTrail log files in a separate S3 bucket. Once this is configured, EventBridge can then receive any event logged in the trail.

The first example in the GitHub repo shows how this can be configured in a SAM template. The application comprises an S3 bucket, a Lambda EventConsumer function, and other required resources. First, the template defines the two buckets:

Resources: 
  SourceBucket: 
    Type: AWS::S3::Bucket
    Properties:
      BucketName: "TheSourceBucket"

  LoggingBucket: 
    Type: AWS::S3::Bucket
    Properties:
      BucketName: "TheLoggingBucket"

Next, an S3 bucket policy grants permissions for CloudTrail to write files to the logging bucket:

  BucketPolicy: 
    Type: AWS::S3::BucketPolicy
    Properties: 
      Bucket: 
        Ref: LoggingBucket
      PolicyDocument: 
        Version: "2012-10-17"
        Statement: 
          - 
            Sid: "AWSCloudTrailAclCheck"
            Effect: "Allow"
            Principal: 
              Service: "cloudtrail.amazonaws.com"
            Action: "s3:GetBucketAcl"
            Resource: 
              !Sub |-
                arn:aws:s3:::${LoggingBucket}
          - 
            Sid: "AWSCloudTrailWrite"
            Effect: "Allow"
            Principal: 
              Service: "cloudtrail.amazonaws.com"
            Action: "s3:PutObject"
            Resource:
              !Sub |-
                arn:aws:s3:::${LoggingBucket}/AWSLogs/${AWS::AccountId}/*
            Condition: 
              StringEquals:
                s3:x-amz-acl: "bucket-owner-full-control"

The template configures the trail and sets the logging bucket. It defines event selectors, which identify the specific events for logging:

  myTrail: 
    Type: AWS::CloudTrail::Trail
    DependsOn: 
      - BucketPolicy
    Properties: 
      TrailName: "MyTrailName"
      S3BucketName: 
        Ref: LoggingBucket
      IsLogging: true
      IsMultiRegionTrail: false
      EventSelectors:
        - DataResources:
          - Type: AWS::S3::Object
            Values:
              - !Sub |-
                arn:aws:s3:::${SourceBucket}/
      IncludeGlobalServiceEvents: false

The SAM template configures a target Lambda function for receiving the events:

  EventConsumerFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: eventConsumer/
      Handler: app.handler
      Runtime: nodejs12.x

Finally, it defines a rule that sets the event pattern and targets. It also grants permission to EventBridge to invoke the Lambda function:

  EventRule: 
    Type: AWS::Events::Rule
    Properties: 
      Description: "EventRule"
      State: "ENABLED"
      EventPattern: 
        source: 
          - "aws.s3"
        detail: 
          eventName: 
            - "PutObject"
          requestParameters:
            bucketName:
              - !Ref SourceBucket

      Targets: 
        - 
          Arn: 
            Fn::GetAtt: 
              - "EventConsumerFunction"
              - "Arn"
          Id: "EventConsumerFunctionTarget"

  PermissionForEventsToInvokeLambda: 
    Type: AWS::Lambda::Permission
    Properties: 
      FunctionName: 
        Ref: "EventConsumerFunction"
      Action: "lambda:InvokeFunction"
      Principal: "events.amazonaws.com"
      SourceArn: 
        Fn::GetAtt: 
          - "EventRule"
          - "Arn"

To deploy this application, follow the instructions in the GitHub repo’s README.md file. To test, upload any file to the Source Bucket. This invokes the Lambda function via the EventBridge event, and logs out the event details. Open the CloudWatch Logs console for the deployed Lambda function to view the output.

The event pattern in this example matches on any PutObject event in the Source Bucket. You can also match on any attribute, or combination of attributes, in an S3 event. This makes it possible to identify events by source IP address, object size, time range, or principalId (the user causing the event). With access to the entire S3 event, this enables more granularity on matching events before invoking the target Lambda function.
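
For instance, this hedged Python (boto3) sketch narrows the rule to PutObject events arriving from a hypothetical corporate IP prefix:

import json
import boto3

events = boto3.client('events')

# Match PutObject calls on the source bucket from a given network range
# (rule name and IP prefix are hypothetical)
events.put_rule(
    Name='s3-putobject-from-office',
    EventPattern=json.dumps({
        'source': ['aws.s3'],
        'detail': {
            'eventName': ['PutObject'],
            'requestParameters': {'bucketName': ['the-source-bucket']},
            'sourceIPAddress': [{'prefix': '203.0.113.'}]
        }
    }),
    State='ENABLED'
)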

Consuming events from existing S3 buckets

When deploying S3 and Lambda integrations in SAM templates, you cannot use existing buckets managed outside of the CloudFormation stack. Frequently, it’s useful to deploy serverless applications that integrate with existing S3 buckets. Using the S3-to-EventBridge integration, you can create new applications that receive events from existing buckets.

Consuming events from existing S3 buckets

The second example in the GitHub repo shows how to configure a new application for an existing bucket. This template takes the existing S3 bucket name as a parameter, and generates the CloudTrail trail, EventBridge rule, and required permissions.

Follow this example’s README.md file to deploy the application. To test, upload any file into the existing S3 bucket you selected. This invokes the eventConsumer logging function deployed in the template.

Invoking a single Lambda function from multiple S3 buckets

Because EventBridge decouples the producers and consumers of events, it is also easier to introduce multiple producers. In the third example, the SAM template creates three buckets that invoke the same EventConsumer Lambda function:

Invoking Lambda from multiple S3 buckets

The MultiBucketName parameter is used to create the three buckets with a number appended to the name. First, the CloudTrail EventSelector includes the three buckets in the trail:

  # The CloudTrail trail 
  myTrail: 
    Type: AWS::CloudTrail::Trail
    DependsOn: 
      - BucketPolicy
    Properties: 
      TrailName: "myTrail"
      S3BucketName: 
        Ref: LoggingBucket
      IsLogging: true
      IsMultiRegionTrail: false
      EventSelectors:
        - DataResources:
          - Type: AWS::S3::Object
            Values:
              - !Sub 'arn:aws:s3:::${MultiBucketName}-1/'
              - !Sub 'arn:aws:s3:::${MultiBucketName}-2/'
              - !Sub 'arn:aws:s3:::${MultiBucketName}-3/'
      IncludeGlobalServiceEvents: false

Next, the EventRule includes the three bucket names in the event pattern, so events from any of these buckets can now trigger the rule:

  # EventBridge rule - invokes EventConsumerFunction 
  EventRule: 
    Type: AWS::Events::Rule
    Properties: 
      Description: "EventRule"
      State: "ENABLED"
      EventPattern: 
        source: 
          - "aws.s3"
        detail: 
          eventName: 
            - "PutObject"
          requestParameters:
            bucketName:
              - !Sub '${MultiBucketName}-1'
              - !Sub '${MultiBucketName}-2'
              - !Sub '${MultiBucketName}-3'

It’s also possible to use content-based filtering in event patterns to match dynamically on bucket names. For example, if you have multiple buckets with the prefix myCompanySales, you can create an event pattern to match all of these buckets:

      EventPattern: 
        source: 
          - "aws.s3"
        detail: 
          eventName: 
            - "PutObject"
          requestParameters:
            bucketName:
              - "prefix": "myCompanySales" 

This enables your application to consume events from new buckets created after the application is deployed. With content-based filtering, you can create search patterns that allow greater flexibility in matching events.

Multiple buckets with multiple Lambda functions

In the standard S3 and Lambda integration, a single Lambda function can only be invoked by distinct prefix and suffix patterns in the S3 trigger. This means that the same Lambda function cannot be set as the trigger for PutObject events for the same filetype or prefix. When you need to invoke multiple functions with the same or overlapping prefixes or suffixes, the EventBridge integration can handle this.

EventBridge allows up to five targets per rule, so you can specify up to five separate Lambda functions to receive the event. All five functions are invoked in parallel when the event pattern matches. To use this, add the targets in the rule – no change to the event pattern is required.

In the fourth example, the SAM template configures three buckets and three Lambda functions, all subscribing to the same event pattern.

Multiple buckets with multiple Lambda subscribers

As before, the template generates the CloudTrail trail, EventBridge rule, and required permissions. The key change to the template is in the EventRule, where now more than one target is defined:

      Targets: 
        - Arn: 
            Fn::GetAtt: 
              - "EventConsumerFunction1"
              - "Arn"
          Id: "EventConsumerFunctionTarget1"
        - Arn: 
            Fn::GetAtt: 
              - "EventConsumerFunction2"
              - "Arn"
          Id: "EventConsumerFunctionTarget2"
        - Arn: 
            Fn::GetAtt: 
              - "EventConsumerFunction3"
              - "Arn"
          Id: "EventConsumerFunctionTarget3"

This approach enables more complex routing of S3 events to Lambda targets. It allows events from multiple S3 buckets with overlapping prefixes and suffixes in object names. It also enables you to route those events to multiple Lambda functions simultaneously.

Conclusion

The standard S3 to Lambda integration enables developers to deploy code that responds to bucket- or object-based events. You can also use SNS or SQS as targets for fanning out or buffering messages from S3. Using Amazon EventBridge, you can employ even more sophisticated routing and filtering of events between S3 and Lambda.

In this blog post, I show how to deploy a basic integration using a SAM template with a single bucket and single Lambda function. I cover how to use existing S3 buckets in your new application deployments, and use EventBridge content filtering in rules to dynamically match bucket events.

Finally, in complex serverless applications, I show how EventBridge completely decouples the producers and consumers. This makes it easy to route events from multiple S3 buckets to multiple Lambda functions. When combined with attribute matching across the entire S3 event object, this allows much more granularity in identifying events before invoking Lambda functions.

To learn more about using decoupled, event-driven architectures in your serverless applications, visit the Amazon EventBridge Learning Path.

New – Amazon EventBridge Schema Registry is Now Generally Available

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-amazon-eventbridge-schema-registry-is-now-generally-available/

Amazon EventBridge is a serverless event bus that makes it easy to connect applications together. It can use data from AWS services, your own applications, and integrations with Software-as-a-Service (SaaS) partners. Last year at re:Invent, we introduced in preview EventBridge schema registry and discovery, a way to store the structure of events (the schema) in a central location, and to simplify using events in your code by generating the code to process them for Java, Python, and TypeScript.

Today, I am happy to announce that the EventBridge schema registry is generally available, and that we added support for resource policies. Resource policies allow you to share a schema registry across different AWS accounts and organizations. In this way, developers on different teams can search for and use any schema that another team has added to the shared registry.

Using EventBridge Schema Registry Resource Policies
It’s common for companies to have different development teams working on different services. To make a more concrete example, let’s take two teams working on services that have to communicate with each other:

  • The CreateAccount development team, working on a frontend API that receives requests from a web/mobile client to create a new customer account for the company.
  • The FraudCheck development team, working on a backend service that checks the data for newly created accounts to estimate the risk that they are fake.

Each team is using their own AWS account to develop their application. Using EventBridge, we can implement the following architecture:

  • The frontend CreateAccount application uses Amazon API Gateway to process requests with an AWS Lambda function written in Python. When a new account is created, the Lambda function publishes the ACCOUNT_CREATED event on a custom event bus.
  • The backend FraudCheck Lambda function is built in Java, and is expecting to receive the ACCOUNT_CREATED event to call Amazon Fraud Detector (a fully managed service we introduced in preview at re:Invent) to estimate the risk of that being a fake account. If the risk is above a certain threshold, the Lambda function takes preemptive actions. For example, it can flag the account as fake on a database, or post a FAKE_ACCOUNT event on the event bus.

How can the two teams coordinate their work so that they both know the syntax of the events, and use EventBridge to generate the code to process those events?

First, a custom event bus is created with permissions that allow access within the company organization.

Then, the CreateAccount team uses EventBridge schema discovery to automatically populate the schema for the ACCOUNT_CREATED event that their service is publishing. This event contains all the information of the account that has just been created.

In an event-driven architecture, services can subscribe to specific types of events that they’re interested in. To receive ACCOUNT_CREATED events, a rule is created on the event bus to send those events to the FraudCheck function.

Using resource policies, the CreateAccount team gives the FraudCheck team’s AWS account read-only access to the discovered schemas. The Principal in this policy is the AWS account getting the permissions. The Resource is the schema registry that is being shared.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "GiveSchemaAccess",
      "Effect": "Allow",
      "Action": [
        "schemas:ListSchemas",
        "schemas:SearchSchemas", 
        "schemas:DescribeSchema",
        "schemas:DescribeCodeBinding",
        "schemas:GetCodeBindingSource",
        "schemas:PutCodeBinding"
      ],
      "Principal": {
        "AWS": "123412341234"
      },
      "Resource": [
        "arn:aws:schemas:us-east-1:432143214321:schema/discovered-schemas",
        "arn:aws:schemas:us-east-1:432143214321:schema/discovered-schemas*"
      ]
    }
  ]
}

Now, the FraudCheck team can search the content of the discovered schema for the ACCOUNT_CREATED event. Resource policies allow you to make a registry available across accounts and organizations, but they will not automatically show up in the console. To access the shared registry, the FraudCheck team needs to use the AWS Command Line Interface (CLI) and specify the full ARN of the registry:

aws schemas search-schemas \
    --registry-name arn:aws:schemas:us-east-1:432143214321:registry/discovered-schemas \
    --keywords ACCOUNT_CREATED

In this way, the FraudCheck team gets the exact name of the schema created by the CreateAccount team.

{
    "Schemas": [
        {
            "RegistryName": "discovered-schemas",
            "SchemaArn": "arn:aws:schemas:us-east-1:432143214321:schema/discovered-schemas/[email protected]_CREATED",
            "SchemaName": “[email protected]_CREATED",
            "SchemaVersions": [
                {
                    "CreatedDate": "2020-04-28T11:10:15+00:00",
                    "SchemaVersion": 1
                }
            ]
        }
    ]
}

With the schema name, the FraudCheck team can describe the content of the schema:

aws schemas describe-schema \
    --registry-name arn:aws:schemas:us-east-1:432143214321:registry/discovered-schemas \
    --schema-name CreateAccount@ACCOUNT_CREATED

The result describes the schema using the OpenAPI specification:

{
    "Content": "{\"openapi\":\"3.0.0\",\"info\":{\"version\":\"1.0.0\",\"title\":\"CREATE_ACCOUNT\"},\"paths\":{},\"components\":{\"schemas\":{\"AWSEvent\":{\"type\":\"object\",\"required\":[\"detail-type\",\"resources\",\"detail\",\"id\",\"source\",\"time\",\"region\",\"version\",\"account\"],\"x-amazon-events-detail-type\":\"CREATE_ACCOUNT\",\"x-amazon-events-source\":\”CreateAccount\",\"properties\":{\"detail\":{\"$ref\":\"#/components/schemas/CREATE_ACCOUNT\"},\"account\":{\"type\":\"string\"},\"detail-type\":{\"type\":\"string\"},\"id\":{\"type\":\"string\"},\"region\":{\"type\":\"string\"},\"resources\":{\"type\":\"array\",\"items\":{\"type\":\"object\"}},\"source\":{\"type\":\"string\"},\"time\":{\"type\":\"string\",\"format\":\"date-time\"},\"version\":{\"type\":\"string\"}}},\"CREATE_ACCOUNT\":{\"type\":\"object\",\"required\":[\"firstName\",\"surname\",\"id\",\"email\"],\"properties\":{\"email\":{\"type\":\"string\"},\"firstName\":{\"type\":\"string\"},\"id\":{\"type\":\"string\"},\"surname\":{\"type\":\"string\"}}}}}}",
    "LastModified": "2020-04-28T11:10:15+00:00",
    "SchemaArn": "arn:aws:schemas:us-east-1:432143214321:schema/discovered-schemas/[email protected]_ACCOUNT",
    "SchemaName": “[email protected]_CREATED",
    "SchemaVersion": "1",
    "Tags": {},
    "Type": "OpenApi3",
    "VersionCreatedDate": "2020-04-28T11:10:15+00:00"
}

Using the AWS Command Line Interface (CLI), the FraudCheck team can create a code binding if it isn’t already created, using the put-code-binding command, and then download the code binding to process that event:

aws schemas get-code-binding-source \
    --registry-name arn:aws:schemas:us-east-1:432143214321:registry/discovered-schemas \
    --schema-name CreateAccount@ACCOUNT_CREATED \
    --language Java8 CreateAccount.zip

Another option for the FraudCheck team is to copy and paste (after unescaping the JSON string) the Content of the discovered schema to create a new custom schema in their AWS account.

Once the schema is copied to their own account, the FraudCheck team can use the AWS Toolkit IDE plugins to view the schema, download code bindings, and generate serverless applications directly from their IDEs. The EventBridge team is working to add the capability to the AWS Toolkit to use a schema registry in a different account, making this step simpler. Stay tuned!

Often customers have a specific team, with a different AWS account, managing the event bus. For the sake of simplicity, in this post I assumed that the CreateAccount team was the one configuring the EventBridge event bus. With more accounts, you can simplify permissions using IAM to share resources with groups of AWS accounts in AWS Organizations.

Available Now
The EventBridge Schema Registry is available now in all commercial regions except Bahrain, Cape Town, Milan, Osaka, Beijing, and Ningxia. For more information on how to use resource policies for schema registries, please see the documentation.

Using Schema Registry resource policies, it is much easier to coordinate the work of different teams sharing information in an event-driven architecture.

Let me know what you are going to build with this!

Danilo

Building an automated knowledge repo with Amazon EventBridge and Zendesk

Post Syndicated from Benjamin Smith original https://aws.amazon.com/blogs/compute/building-an-automated-knowledge-repo-with-amazon-eventbridge-and-zendesk/

Zendesk Guide is a smart knowledge base that helps customers harness the power of institutional knowledge. It enables users to build a customizable help center and customer portal.

This post shows how to implement a bidirectional event orchestration pattern between AWS services and an Amazon EventBridge third-party integration partner. This example uses support ticket events to build a customer self-service knowledge repository. It uses the EventBridge partner integration with Zendesk to accelerate the growth of a customer help center.

The examples in this post are part of a serverless application called FreshTracks. This is built in Vue.js and demonstrates SaaS integrations with Amazon EventBridge. To test this example, ask a question on the Fresh Tracks application.

The backend components for this EventBridge integration with Zendesk have been extracted into a separate example application in this GitHub repo.

How the application works

Routing Zendesk events with Amazon EventBridge.

  1. A user searches the knowledge repository via a widget embedded in the web application.
  2. If there is no answer, the user submits the question via the web widget.
  3. Zendesk receives the question as a support ticket.
  4. Zendesk emits events when the support ticket is resolved.
  5. These events are streamed into a custom SaaS event bus in EventBridge.
  6. Event rules match events and send them downstream to an AWS Step Functions Express Workflow.
  7. The Express Workflow orchestrates Lambda functions to retrieve additional information about the event with the Zendesk API.
  8. A Lambda function uses the Zendesk API to publish a new help article from the support ticket data.
  9. The new article is searchable on the website widget for other users to read.

Before deploying this application, you must generate an API key from within Zendesk.

Creating the Zendesk API resource

The application uses the Zendesk API to act on your Zendesk account from AWS. Follow these steps to generate a Zendesk API token, which the application uses to authenticate its Zendesk API calls.

To generate an API token

  1. Log in to the Zendesk dashboard.
  2. Click the Admin icon in the sidebar, then select Channels > API.
  3. Click the Settings tab, and make sure that Token Access is enabled.
  4. Click the + button to the right of Active API Tokens.

    Creating a Zendesk API token.

  5. Copy the token, and store it securely. Once you close this window, the full token is not displayed again.
  6. Click Save to return to the API page, which shows a truncated version of the token.

    Zendesk API token.
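To verify that the token works before deploying the application, you can make a test request to the Zendesk API. This hedged example uses Zendesk’s email/token basic authentication convention; replace the subdomain, agent email, and token placeholders with your own values:

curl "https://{subdomain}.zendesk.com/api/v2/tickets.json" \
    -u "{agent_email}/token:{api_token}"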

Configuring Zendesk with Amazon EventBridge

Step 1. Configuring your Zendesk event source.

  1. Go to your Zendesk Admin Center and select Admin Center > Integrations.
  2. Choose Connect on the Events Connector for Amazon EventBridge integration to open the page where you configure your Zendesk event source.

    Zendesk integrations

  3. Enter your AWS account ID in the Amazon Web Services account ID field, and select the Region to receive events.
  4. Choose Save.

    Zendesk Amazon EventBridge configuration.

Step 2. Associate the Zendesk event source with a new event bus: 

  1. Log in to the AWS Management Console and navigate to Services > Amazon EventBridge > Partner event sources.
    New event source

  2. Select the radio button next to the new event source and choose Associate with event bus.

    Associating event source with event bus.

  3. Choose Associate.

Deploying the backend application

After associating the event source with a new partner event bus, you can deploy backend services to receive events.

To set up the example application, visit the GitHub repo and follow the instructions in the README.md file.

When deploying the application stack, make sure to provide the custom event bus name and the Zendesk API credentials with --parameter-overrides.

sam deploy --parameter-overrides ZendeskEventBusName=aws.partner/zendesk.com/123456789/default ZenDeskDomain=myZendeskDomain ZenDeskPassword=myAPIToken ZenDeskUsername=myZendeskAgentUsername

You can find the name of the new Zendesk custom event bus in the custom event bus section of the EventBridge console.

Routing events with rules

When a support ticket is updated in Zendesk, a number of individual events are streamed to EventBridge. These include an event for each of the following:

  • Agent Assignment Changed
  • Comment Created
  • Status Changed
  • Brand Changed
  • Subject Changed

An EventBridge rule is used to filter these events. The AWS Serverless Application Model (SAM) template defines the rule with the `AWS::Events::Rule` resource type. This routes the event downstream to an AWS Step Functions Express Workflow. The EventPattern is shown below:

  ZendeskNewWebQueryClosed: 
    Type: AWS::Events::Rule
    Properties: 
      Description: "New Web Query"
      EventBusName: 
         Ref: ZendeskEventBusName
      EventPattern: 
        account:
        - !Sub '${AWS::AccountId}'
        detail-type: 
        - "Support Ticket: Comment Created"
        detail:
          ticket_event:
            ticket:
              status: 
              - solved
              tags:
              - web_widget
              tags: 
              - guide
      Targets: 
        - RoleArn: !GetAtt [ MyStatesExecutionRole, Arn ]
          Arn: !Ref FreshTracksZenDeskQueryMachine
          Id: NewQuery

The ticket must have two specific tags (web_widget and guide) for this pattern to match. These are defined as separate fields to create an AND matching rule, instead of being declared in the same array field, which would create an OR rule. A new comment on a support ticket triggers the event.

The Step Functions Express Workflow

The application routes events to a Step Functions Express Workflow that is defined in the application’s SAM template:

FreshTracksZenDeskQueryMachine:
    Type: "AWS::StepFunctions::StateMachine"
    Properties:
      StateMachineType: EXPRESS
      DefinitionString: !Sub |
               {
                    "Comment": "Create a new article from a zendeskTicket",
                    "StartAt": "GetFullZendeskTicket",
                    "States": {
                      "GetFullZendeskTicket": {
                      "Comment": "Get Full Ticket Details",
                      "Type": "Task",
                      "ResultPath": "$.FullTicket",
                      "Resource": "${GetFullZendeskTicket.Arn}",
                      "Next": "GetFullZendeskUser"
                      },
                      "GetFullZendeskUser": {
                      "Comment": "Get Full User Details",
                      "Type": "Task",
                      "ResultPath": "$.FullUser",
                      "Resource": "${GetFullZendeskUser.Arn}",
                      "Next": "PublishArticle"
                      },
                      "PublishArticle": {
                      "Comment": "Publish as an article",
                      "Type": "Task",
                      "Resource": "${CreateZendeskArticle.Arn}",
                      "End": true
                      }
                    }
                }
      RoleArn: !GetAtt [ MyStatesExecutionRole, Arn ]

This application is suited to a Step Functions Express Workflow because it orchestrates short-duration, high-volume, event-based workloads. Each workflow task is idempotent and stateless. The Express Workflow carries the workload’s state by passing the output of one task to the input of the next. The Amazon States Language ResultPath definition is used to control where each task’s output is appended to the workflow’s state before it is passed to the next task.
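As an illustration of how ResultPath builds up state (the ticket values here are hypothetical), if the workflow input is the EventBridge event and the first task returns the full ticket, a ResultPath of $.FullTicket appends the task output under that key instead of replacing the state:

{
    "detail": { "ticket_event": { "ticket": { "id": 35436 } } },
    "FullTicket": {
        "id": 35436,
        "subject": "How do I wax my skis?",
        "description": "Asked via the web widget"
    }
}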

AWS Step Functions Express Workflow

Lambda functions

Each task in this Express Workflow invokes a Lambda function defined within the example application’s SAM template. The Lambda functions use the Node.js Axios package to make requests to the Zendesk API. The Zendesk API credentials are stored in the Lambda functions’ environment variables and accessed via `process.env`.

The first two Lambda functions in the workflow make a GET request to Zendesk. This retrieves additional data about the support ticket, the author, and the agent’s response.
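As a rough sketch of what one of these task functions might look like, the following assumes environment variables named after the deployment parameters shown earlier; the exact handler names and event shape in the example repo may differ:

// Hypothetical sketch of GetFullZendeskTicket: fetch full ticket details
const axios = require('axios');

exports.handler = async (event) => {
    const ticketId = event.detail.ticket_event.ticket.id;
    const url = `https://${process.env.ZenDeskDomain}.zendesk.com/api/v2/tickets/${ticketId}.json`;

    // Zendesk API token authentication uses "{email}/token" as the username
    const response = await axios.get(url, {
        auth: {
            username: `${process.env.ZenDeskUsername}/token`,
            password: process.env.ZenDeskPassword
        }
    });

    // The returned object is appended to workflow state via ResultPath
    return response.data.ticket;
};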

The final Lambda function makes a POST request to the Zendesk API. This creates and publishes a new article using this data. The permission_group and section defined in this function must be set to your Zendesk account’s default permission group ID and FAQ section ID.
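A hedged sketch of that final function follows. The Help Center create-article endpoint and body shape come from Zendesk’s API documentation, while the SectionId and PermissionGroupId environment variables are illustrative stand-ins for the hard-coded values described above:

// Hypothetical sketch of CreateZendeskArticle: publish a help article
const axios = require('axios');

exports.handler = async (event) => {
    const url = `https://${process.env.ZenDeskDomain}.zendesk.com/api/v2/help_center/sections/${process.env.SectionId}/articles.json`;

    const payload = {
        article: {
            title: event.FullTicket.subject,
            body: event.FullTicket.description,
            locale: 'en-us',
            permission_group_id: Number(process.env.PermissionGroupId),
            user_segment_id: null // null makes the article visible to everyone
        }
    };

    const response = await axios.post(url, payload, {
        auth: {
            username: `${process.env.ZenDeskUsername}/token`,
            password: process.env.ZenDeskPassword
        }
    });
    return response.data.article;
};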

AWS Lambda function code

Integrating with your front-end application

Follow the instructions in the Fresh Tracks repo on GitHub to deploy the front-end application. This application includes Zendesk’s web widget script in the index.html page. The widget has been customized using Zendesk’s JavaScript API. This is implemented in the navigation component to insert custom forms into the widget and prefill the email address field for authenticated users. The backend application starts receiving Zendesk emitted events immediately.

The video below demonstrates the implementation from end to end.

Conclusion

This post explains how to set up EventBridge’s third-party integration with Zendesk to capture events. The example backend application demonstrates how to filter these events and send them downstream to a Step Functions Express Workflow. The Express Workflow orchestrates a series of stateless Lambda functions to gather additional data about the event. Zendesk’s API is then used to publish a new help guide article from this data.

This pattern provides a framework for bidirectional event orchestration between AWS services, custom web applications, and third-party integration partners. It can be replicated and applied to any number of third-party integration partners.

This is implemented with minimal code to provide near real-time streaming of events and without adding latency to your application.

The possibilities are vast. I am excited to see how builders use this bidirectional serverless pattern to add even more value to their third-party services.

Start here to learn about other SaaS integrations with Amazon EventBridge.

Decoupling larger applications with Amazon EventBridge

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/decoupling-larger-applications-with-amazon-eventbridge/

Many applications start to grow in complexity as they mature, making it harder for developers to maintain code or add new features. This can lead to monolithic applications, where developers must know more about the entire architecture to make changes. Typically, this causes code to become more fragile, and the rate of development slows down.

This blog post shows how you can use an event-based architecture to decouple services and functional areas of applications. It uses the document repository solution as an example, comparing the architecture before and after shifting to an event-based approach. The new architecture offers both greater extensibility and simplicity for developers adding new functionality in the future. It can help alleviate the problems associated with monolithic applications.

The original version of this application uses Amazon S3 event notifications to invoke AWS Lambda functions to index content in the Amazon Elasticsearch Service:

Original document repository application architecture

There are some limitations with this design. First, there is a single source bucket for documents, which may not reflect production usage. Also, while it could be modified to allow new file types for indexing, adding new functionality such as translating documents would require refactoring. And despite having multiple Lambda functions, it’s packaged as a single application, which makes it harder to deploy changes.

The new design uses events to decouple each service used to process incoming S3 objects. It can also use one or more buckets as event sources, which you can change dynamically as needed. Most importantly, it can be easier to introduce changes and new functionality, since the application is no longer deployed as a mono-repo. The new architecture uses this design:

Decoupled architecture

  1. Setup and configuration of AWS resources.
  2. Parser function to filter and reformat S3 events for the application.
  3. Converter functions to operate on distinct file types.
  4. Analyzer functions for interpreting the content of the files.
  5. The Loader function imports the metadata into the Amazon Elasticsearch Service.

The code uses the AWS Serverless Application Model (SAM), enabling you to deploy the application easily in your own AWS account. This walkthrough creates resources covered in the AWS Free Tier, but you may incur costs for significant data usage. Additionally, it requires an Amazon Elasticsearch Service domain, which may incur cost on your AWS bill.

The resulting solution is five separate applications, which you deploy in stages. To set up the application, visit the GitHub repo and follow the instructions in the README.md file.

Setup and configuration

The SAM template in the setup directory creates the S3 buckets, and configures AWS CloudTrail to capture put events in these buckets. This is required as EventBridge consumes S3 events via CloudTrail. Now, when any object is stored in any of these S3 buckets, EventBridge receives an event.

This template also creates a customer managed IAM policy that grants read-only access to the source S3 buckets:

  MyManagedPolicy:
    Type: AWS::IAM::ManagedPolicy
    Properties:
      ManagedPolicyName: docrepo-s3-read-policy
      PolicyDocument: 
        Version: 2012-10-17
        Statement: 
          - Effect: Allow
            Action:
              - s3:GetObject
              - s3:ListBucket
              - s3:GetBucketLocation
              - s3:GetObjectVersion
              - s3:GetLifecycleConfiguration
            Resource:
              - !Sub 'arn:aws:s3:::${Dept1Bucket}/*'
              - !Sub 'arn:aws:s3:::${Dept2Bucket}/*'
              - !Sub 'arn:aws:s3:::${Dept3Bucket}/*'

This policy can be attached to any Lambda function that must read the contents from one of the S3 buckets. If the pool of source buckets changes in the future, you only need to modify this policy. Any downstream Lambda functions using the policy automatically gain access to the added buckets.
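For example, a downstream function can reference the policy in its SAM definition. In this sketch, the function itself is illustrative, and S3ReadPolicyArn is a hypothetical template parameter carrying the ARN of the docrepo-s3-read-policy created by the setup template:

  PdfConverterFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/
      Handler: pdfConverter.handler
      Runtime: nodejs12.x
      Policies:
        # Attach the shared customer managed read-only policy
        - !Ref S3ReadPolicyArn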

In the second setup application, the Parser service receives those S3 events and reformats the event for downstream services. Specifically, it creates a new attribute for the file type of the S3 object. After you deploy these two templates, creating any objects in the source S3 buckets generates the following event in the default event bus:

Parsing events from Amazon S3
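The reformatted event might look something like this hypothetical example. The detail-type and field names are illustrative, while the docRepo.s3 source matches the rules described later:

{
    "detail-type": "NewS3Object",
    "source": "docRepo.s3",
    "account": "123456789012",
    "region": "us-east-1",
    "detail": {
        "bucket": "docrepo-dept1bucket-example",
        "key": "quarterly-report.pdf",
        "type": "pdf"
    }
}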

Building the converter processes

This application uses converters to process incoming objects in the S3 buckets. One converter handles one file type. There are two converters required to replicate the original application’s functionality, for pdf and docx files. An EventBridge rule matches incoming events and triggers the appropriate Lambda function to convert the object. This diagram shows abridged input and output events for these functions:

  1. A matching EventBridge rule invokes the relevant converter function. The function converts the source file into raw text.
  2. The text is split into batches of 5,000 characters.
  3. The functions publish the text batches back to EventBridge, using new detail-type and source attributes.

The SAM template specifies the EventBridge rules, the permissions for EventBridge to invoke the Lambda functions, and the processing Lambda functions. The Lambda functions use the customer managed IAM policy created during the setup for read-only access to the originating S3 bucket. Each converter has its own logic for processing file types differently, and can produce different types of events if needed.
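The publishing step of a converter might look like this sketch, using the AWS SDK for JavaScript. The docRepo.converters source and NewTextBatch detail-type match the rules described in the next section; the payload fields are illustrative:

// Publish one converted text batch back to the default event bus
const AWS = require('aws-sdk');
const eventbridge = new AWS.EventBridge();

const publishTextBatch = async (sourceKey, batch) => {
    await eventbridge.putEvents({
        Entries: [{
            Source: 'docRepo.converters',
            DetailType: 'NewTextBatch',
            Detail: JSON.stringify({ key: sourceKey, text: batch })
        }]
    }).promise();
};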

The analyzer functions

In this workflow, any file type containing text is analyzed by Amazon Comprehend to detect entities. The AnalyzeText function is invoked by an EventBridge rule that filters for the NewTextBatch attribute in events from docRepo.converters.

Another EventBridge rule triggers the AnalyzeImage function. This rule filters for jpg and jpeg file types where the event source is docRepo.s3. This function uses Amazon Rekognition to identify labels in the images.

Both functions produce new events containing the entities and labels, using new detail-type and source attributes. These events are published back to the default bus on EventBridge:

Analyzers processing events

  1. A matching EventBridge rule invokes the relevant analyzer function. The function uses Amazon ML services to detect labels in images and entities in text.
  2. The functions publish the metadata back to EventBridge, using new detail-type and source attributes.

Loader function

The Loader function is invoked by an EventBridge rule that filters for events from the analyzer functions. This final function receives those events and loads the labels and entities metadata into the Amazon Elasticsearch Service:

Loader function processing events

Choosing between AWS Step Functions and Amazon EventBridge

In this application, there is a sequence of steps to the workflow that could also be handled by AWS Step Functions. Both services can simplify workflows in distributed applications and make it easier to maintain and modify serverless applications. In many cases, it makes sense to use both services for larger enterprise applications with complex business logic.

However, EventBridge enables you to separate processes into independent applications. It also allows other consumers to build custom logic using your events without impacting your application design or performance. In enterprise applications, this makes it much easier to innovate and develop new application features.

Benefits for developers

With the original monolithic application divided into five separate applications, it’s now easier for different teams to work on this project. It’s also easier and safer to deploy changes to a single microservice without needing to deploy the entire application. Developers must only understand their own service rather than the complete architecture of the application.

For example, to add more S3 buckets to the source list, you only need to modify the SAM template in the setup part of the application. The Parser function consumes put events from any number of buckets, and downstream functions consume events via EventBridge. To add a new file type, you only need to add a new converter function. Or to change the indexing provider, you create a new loader function to route the metadata to another service. The services of this application are independent, decoupled by EventBridge, and you can add more producers and consumers as required.

Traditionally, one of the challenges with event-based applications is tracking the format of events. Event schemas are typically hard to manage because any service can produce an event. The schema may also change as developers release new versions of a service. To help solve these issues, EventBridge has a feature called schema discovery that can automate the tracking and management of events in your application.

All the microservices in this application publish with a source attribute of docRepo. If you enable schema discovery, EventBridge quickly identifies these custom event schemas:

Schema discovery in Amazon EventBridge

The schemas are defined in JSON using the OpenAPI Specification. As you develop new features, you can download code bindings directly from these schemas. For type-safe languages, this allows you to use events as objects directly in your applications, helping to accelerate development. To learn more about how to use code bindings and schema discovery, watch this video:

Conclusion

Larger applications can quickly become monoliths. You can use event-based architectures to decouple services within applications, and maintain flexibility as your application grows. Amazon EventBridge is a serverless event bus that can help simplify your architecture, allowing each service to operate independently without depending on its event consumers.

In this post, I show how to rearchitect the Serverless Document Repository example into five smaller applications orchestrated using events. I explore the benefits of developing applications using this approach, including the ability to make changes more easily. I also show how EventBridge schema discovery can help automate event schema management.

To learn more about how to use Amazon EventBridge to decouple large applications, visit the Amazon EventBridge learning path.

Visualize user behavior with Auth0 and Amazon EventBridge

Post Syndicated from Benjamin Smith original https://aws.amazon.com/blogs/compute/visualize-user-behavior-with-auth0-and-amazon-eventbridge/

In this post, I show how to capture user events and monitor user behavior by using the Amazon EventBridge partner integration with Auth0. This enables you to gain insights to help deliver a more customized application experience for your users.

Auth0 is a flexible, drop-in solution that adds authentication and authorization services to your applications. The EventBridge integration automatically and continuously pushes Auth0 log events to your AWS account via a custom SaaS event bus.

The examples used in this post are implemented in a custom-built serverless application called FreshTracks. This is a demo application built in Vue.js, which I will use to demonstrate multiple SaaS integrations into AWS with EventBridge in this and future blog posts.

FreshTracks – a demo serverless web application with multiple SaaS Integrations.

The components for this EventBridge integration with Auth0 have been extracted into a separate example application in this GitHub repo.

How the application works

Routing Auth0 Events with Amazon EventBridge.

  1. Events are emitted from Auth0 when a user interacts with the login service on the front-end application.
  2. These events are streamed into a custom SaaS event bus in EventBridge.
  3. Event rules match events and send them downstream to a Lambda function target.
  4. The receiving Lambda function performs some data transformation before writing an object to S3.
  5. These objects are made available by a QuickSight data source manifest file and used as datapoints for QuickSight visuals.

Configuring the Auth0 EventBridge integration

To capture Auth0 emitted events in EventBridge, you must first configure Auth0 for use as the Event Source on your Auth0 Dashboard.

  1. Log in to the Auth0 Dashboard.
  2. Choose Logs > Streams.
  3. Choose + Create Stream.
  4. Choose Amazon EventBridge and enter a unique name for the new Amazon EventBridge Event Stream.
  5. Create the Event Source by providing your AWS Account ID and AWS Region. The Region you select must match the Region of the Amazon EventBridge bus.
  6. Choose Save.
Event Source Configuration on Auth0 dashboard

Auth0 provides you with an Event Source Name. Make sure to save your Event Source Name value since you need this at a later point to complete the integration.

Creating a custom event bus

  1. Go to the EventBridge partners tab in your AWS Management Console. Ensure the AWS Region matches where the Event Source was created.
  2. Paste the Event Source Name in the Partner event sources search box to find and choose the new Auth0 event source. Note: The event source remains in a pending state until it is associated with an event bus.

    Partner event source

  3. Choose the event source, then choose Associate with Event Bus.
  4. Choose Associate.

Deploying the application

Once you have associated the Event Source with a new partner event bus, you are ready to deploy backend services to receive and respond to these events.

To set up the example application, visit the GitHub repo and follow the instructions in the README.md file.

When deploying the application stack, make sure to provide the custom event bus name with --parameter-overrides.

sam deploy --parameter-overrides Auth0EventBusName=aws.partner/auth0.com/auth0username-0123344567-e5d2-4514-84f2-97dd4ff8aad0/auth0.logs

You can find the name of the new Auth0 custom event bus in the custom event bus section of the EventBridge console:

Custom event bus name

Custom event bus name

Routing events with rules

The AWS Serverless Application Model (SAM) template in the example application creates four event rules:

  1. Successful sign-in
  2. Successful signup
  3. Successful log-out
  4. Unsuccessful signup

These are defined with the `AWS::Events::Rule` resource type. Each of these rules routes to a single target Lambda function. For a successful sign-in, the rule’s event pattern matches events where detail.data.type is s. This refers to the Auth0 event type code for a successful sign-in. Every Auth0 event code is listed here.

SuccessfulSignIn: 
    Type: AWS::Events::Rule
    Properties: 
      Description: "Auth0 User Successfully signed in"
      EventBusName: 
         Ref: Auth0EventBusName
      EventPattern: 
        account:
        - !Sub '${AWS::AccountId}'
        detail:
          data:
            type:
            - s
      Targets: 
        - 
          Arn:
            Fn::GetAtt:
              - "SaveAuth0EventToS3" 
              - "Arn"
          Id: "SignInSuccessV1"

To respond to additional events, copy this event rule pattern and change the event code string for the event you want to match.
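For example, a sketch of a rule matching successful signups (Auth0 event code ss) could look like the following; the logical ID and target Id are illustrative:

SuccessfulSignUp: 
    Type: AWS::Events::Rule
    Properties: 
      Description: "Auth0 User Successfully signed up"
      EventBusName: 
         Ref: Auth0EventBusName
      EventPattern: 
        account:
        - !Sub '${AWS::AccountId}'
        detail:
          data:
            type:
            - ss
      Targets: 
        - 
          Arn:
            Fn::GetAtt:
              - "SaveAuth0EventToS3" 
              - "Arn"
          Id: "SignUpSuccessV1"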

Writing events to S3 with Lambda

The application routes events to a Lambda function, which performs some data transformation before writing an object to S3. The function code uses an environment variable named AuthLogBucket to store the S3 bucket name. The permissions to write to S3 are granted by a policy defined within the SAM template:

  SaveAuth0EventToS3:
    Type: AWS::Serverless::Function 
    Properties:
      CodeUri: src/
      Handler: saveAuth0EventToS3.handler
      Runtime: nodejs12.x
      MemorySize: 128
      Environment:
        Variables:
          AuthLogBucket: !Ref AuthZeroToEventBridgeUserActivitylogs
      Policies:
        - S3CrudPolicy:
            BucketName: !Ref AuthZeroToEventBridgeUserActivitylogs

The S3 object is a CSV file with context about each event. Each of the Auth0 event schemas is different. To maintain a consistent CSV file structure across different event types, this Lambda function explicitly defines each of the header and row values. An output string is constructed from the Auth0 event:

Lambda function output string

This string is placed into a new buffer and written to S3 with the AWS SDK for JavaScript, as referenced in GitHub here.
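Put together, the transformation and write steps look roughly like this sketch. The AuthLogBucket environment variable matches the SAM template above; the CSV columns and object key naming are simplified illustrations rather than the repo’s exact implementation:

// Hypothetical sketch: transform an Auth0 event to CSV and store it in S3
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

exports.handler = async (event) => {
    const data = event.detail.data;

    // Explicit headers keep the CSV structure consistent across event types
    const csv = 'id,type,date,clientName\n' +
        `${event.id},${data.type || ''},${data.date || ''},${data.client_name || ''}\n`;

    await s3.putObject({
        Bucket: process.env.AuthLogBucket,
        Key: `auth0/${event.id}.csv`,
        Body: Buffer.from(csv)
    }).promise();
};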

Sending events to the application

There is a test event in the /events directory of the example application. This contains an example of a successful sign-in event emitted from Auth0.

Sending a test Auth0 event to Lambda

Send a test event to the Lambda function using the AWS Command Line Interface.

Run the following command in the root directory of the example application, replacing {function-name} with the full name of your Lambda function.

aws lambda invoke --function-name {function-name} --invocation-type Event --payload file://events/event.json events/response.json --log-type Tail

Response:

{
    "StatusCode": 202
}

The response output appears in the terminal window. To confirm that an object is stored in S3, navigate to the S3 console. Choose the AuthZeroToEventBridgeUserActivitylogs bucket. You see a new auth0 directory and can open the CSV file that holds context about the event.

Object written to S3

Sending real Auth0 events from a front-end application

Follow the instructions in the Fresh Tracks repo on GitHub to deploy the front-end application. This application includes Auth0’s authentication flow. You can connect to your Auth0 application by entering your credentials in the `auth0_config.json` file:

{
  "domain": "<YOUR AUTH0 DOMAIN>",
  "clientId": "<YOUR AUTH0 CLIENT ID>"
}

The example backend application starts receiving Auth0 emitted events immediately.

To see the full Fresh Tracks application continue to the backend deployment instructions. This is not required for the examples in this blog post.

Building a QuickSight dashboard

You can visualize these Auth0 user events with an Amazon QuickSight dashboard. This provides a snapshot analysis that you can share with other QuickSight users for reporting purposes.

To use Auth0 events as metrics, create a separate calculated field for each event (for example, successful signup and successful login). An analysis could include multiple visuals, custom fields, conditional formatting, and events. This gives a snapshot of user interaction with the front-end application at any given time.

FreshTracks final dashboard example

The example application in the GitHub repo provides instructions on how to create a dashboard.

Conclusion

This post explains how to set up EventBridge’s third-party integration with Auth0 to capture events. The example backend application demonstrates how to filter these events, perform computations on them, save as S3 objects, and send to a downstream service.

The ability to build QuickSight story boards from these events and share visuals with key business stakeholders can provide a narrative about the analysis data. This is implemented with minimal code to provide near real-time streaming of events and without adding latency to your application.

The possibilities are vast. I am excited to see how builders use this serverless pattern to create their own visuals to build a better, more customized application experience for their users.

Start here to learn about other SaaS integrations with Amazon EventBridge.

Event-Based Processing for Asynchronous Communication

Post Syndicated from Roberto Iturralde original https://aws.amazon.com/blogs/architecture/event-based-processing-for-asynchronous-communication/

In the first post in this series on messaging patterns, we gave an overview of messaging and the benefits and challenges of both synchronous and asynchronous service communication. In this post, we will look at common characteristics to consider when evaluating messaging channel technologies. We will also introduce Amazon EventBridge, the AWS Serverless event bus.

What is an Event?

An event is the occurrence of an immutable message describing something that happened in the past. The event data payload typically contains details of the occurrence and associated metadata. Events are routed from a producer to potential consumers via a transportation channel, as defined in the previous post in this series. This event may be delivered to zero, one, or many consumers, depending on the number of consumers subscribed to that transportation channel, and possibly further limited to the subscribed consumers who have filtered their subscription to particular events.

Below is an example of an AWS event. AWS events are emitted by AWS services, are represented as JSON objects, require all the top-level metadata fields shown below, and contain a source-specific payload in the “detail” field.

Sample AWS event

  "version": "0",
  "id": "6a7e8feb-b491-4cf7-a9f1-bf3703467718",
  "detail-type": "EC2 Instance State-change Notification",
  "source": "aws.ec2",
  "account": "111122223333",
  "time": "2017-12-22T18:43:48Z",
  "region": "us-west-1",
  "resources": [
    "arn:aws:ec2:us-west-1:123456789012:instance/ i-1234567890abcdef0"
  ],
  "detail": {
    "instance-id": " i-1234567890abcdef0",
    "state": "terminated"
  }
}

Event Schemas

Looking at the above sample event and the documentation on AWS events, you’ll notice there is a schema for the top-level fields of all events and separate schemas for the source-specific attributes in the “detail” field. This is because multiple AWS services emit events to the Amazon EventBridge default message bus. Unlike HTTP APIs, which often have a published schema and validate requests against that schema, it’s common for event transportation channels to allow heterogeneous events through the channel, like the default bus in EventBridge.

To simplify the experience of defining, discovering, and developing against event schemas, Amazon EventBridge has a Schema Registry. The registry is pre-populated with OpenAPI schemas for events from existing AWS services and allows you to define your own schemas. It also supports downloading code bindings for popular programming languages for schemas in the registry.
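For example, you can search the pre-populated registry of AWS service event schemas from the AWS CLI:

aws schemas search-schemas \
    --registry-name aws.events \
    --keywords EC2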

Event channel characteristics

When evaluating an event channel technology against your business and technical requirements, keep in mind the following common characteristics. This list is not exhaustive, but may serve as a useful starting point for orienting yourself to each technology.

  • Ordering: Are events guaranteed to be delivered in the order they arrive?
  • Duplication: Can duplicate events be introduced? Will duplicate events in the channel be de-duplicated?
  • Fanout: Can each event be delivered to or processed by multiple consumers?
  • Push versus pull: Are events sent to consumers or do consumers need to fetch events?
  • Filtering: Can event consumers choose to receive a filtered subset of the events passing through the channel?
  • Serverless: Does the underlying infrastructure need to be managed and tuned by the channel owner?

Amazon EventBridge

Amazon EventBridge custom application

Amazon EventBridge is a serverless, fully managed event bus. Producers can be AWS services, supported Software as a Service (SaaS) partners, or your own applications that write JSON messages to an EventBridge event bus. Messages are delivered to consumers configured as the targets of event rules. Event rules can filter messages based on attributes in the JSON payload, and a single rule can route events to multiple targets. Event rules retry delivery of events to configured targets with exponential backoff for up to 24 hours. EventBridge does not guarantee that event order will be maintained, and promises at-least-once event delivery, meaning duplicate messages can be introduced. Consider EventBridge for use cases where you need rich event filtering, fanout of events to tens of consumers, message delivery or consumption across AWS accounts, to listen to events from a wide variety of AWS services, or to receive messages from SaaS partner systems.
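As a brief illustration of this content-based filtering, the following rule (the rule name is illustrative) matches only the terminated instance state-change events shown in the sample event above:

aws events put-rule \
    --name ec2-instance-terminated \
    --event-pattern '{"source": ["aws.ec2"], "detail-type": ["EC2 Instance State-change Notification"], "detail": {"state": ["terminated"]}}'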

Conclusion

In this post, we defined events and outlined criteria against which to evaluate event transportation technologies. We then briefly reviewed Amazon EventBridge against these criteria. For a deeper discussion on eventing technologies, I recommend watching the recording of the session Moving to Event-Driven Architectures by AWS Distinguished Engineer, Tim Bray, from AWS re:Invent 2019. For a deep dive on Amazon EventBridge, watch this AWS Tech Talk. For an example of integrating Amazon EventBridge into a serverless architecture, review this blog post that walks through an example application.