Tag Archives: Amazon EventBridge

Extending your SaaS platform with AWS Lambda

Post Syndicated from Hasan Tariq original https://aws.amazon.com/blogs/architecture/extending-your-saas-platform-with-aws-lambda/

Software as a service (SaaS) providers continuously add new features and capabilities to their products to meet their growing customer needs. As enterprises adopt SaaS to reduce the total cost of ownership and focus on business priorities, they expect SaaS providers to enable customization capabilities.

Many SaaS providers allow their customers (tenants) to provide customer-specific code that is triggered as part of various workflows by the SaaS platform. This extensibility model allows customers to customize system behavior and add rich integrations, while allowing SaaS providers to prioritize engineering resources on the core SaaS platform and avoid per-customer customizations.

To simplify the experience of enterprise developers building on SaaS platforms, SaaS providers are offering the ability to host tenants’ code inside the SaaS platform. This blog provides architectural guidance for running custom code on SaaS platforms using AWS serverless technologies and AWS Lambda, without the overhead of managing infrastructure on either the SaaS provider or customer side.

Vendor-hosted extensions

With vendor-hosted extensions, the SaaS platform runs the customer code in response to events that occur in the SaaS application. In this model, the heavy lifting of managing and scaling the code execution environment is the responsibility of the SaaS provider.

To host and run custom code, SaaS providers must consider isolating the environment that runs untrusted custom code from the core SaaS platform, as detailed in Figure 1. This introduces additional challenges to manage security, cost, and utilization.


Figure 1. Distribution of responsibility between Customer and SaaS platform with vendor-hosted extensions

Using AWS serverless services to run custom code

AWS serverless technologies remove the tasks of infrastructure provisioning and management. There are no servers to manage, and SaaS providers can take advantage of automatic scaling, high availability, and built-in security while paying only for what they use.

Example use case

Let’s take an example of a simple SaaS to-do list application that supports the ability to initiate custom code when a new to-do item is added to the list. This application is used by customers who supply custom code to enrich the content of newly added to-do list items. The requirements for the solution consist of:

  • Custom code provided by each tenant should run in isolation from all other tenants and from the SaaS core product
  • Track each customer’s usage and cost of AWS resources
  • Ability to scale per customer

Solution overview

The SaaS application in Figure 2 is the core application used by customers, and each customer is considered a separate tenant. For the sake of brevity, we assume that the customer code was already stored in an Amazon Simple Storage Service (Amazon S3) bucket as part of the onboarding. When an eligible event is generated in the SaaS application as a result of user action, like a new to-do item added, it gets propagated down to securely launch the associated customer code.


Figure 2. Example use case architecture

Walkthrough of a custom code run

Let’s detail the initiation flow of custom code when a user adds a new to-do item:

  1. An event is generated in the SaaS application when a user performs an action, like adding a new to-do list item. To extend the SaaS application’s behavior, this event is linked to the custom code. Each event contains a tenant ID and any additional data passed as part of the payload. Each of these events is an “initiation request” for the custom code Lambda function.
  2. Amazon EventBridge is used to decouple the SaaS application from event processing implementation specifics. EventBridge makes it easier to build event-driven applications at scale and leaves room to add more consumers in the future. If a downstream service fails unexpectedly, EventBridge retries sending the event a set number of times.
  3. EventBridge sends the event to an Amazon Simple Queue Service (Amazon SQS) queue as a message that is subsequently picked up by a Lambda function (Dispatcher) for further routing. Amazon SQS enables decoupling and scaling of microservices and also provides a buffer for the events that are awaiting processing.
  4. The Dispatcher polls the messages from the SQS queue and is responsible for routing the events to the respective tenants for further processing. The Dispatcher retrieves the tenant ID from the message, performs a lookup in a database (we recommend Amazon DynamoDB for low latency), and retrieves the tenant SQS queue Amazon Resource Name (ARN) to determine which queue to route the event to. To further improve performance, you can cache the tenant-to-queue mapping (see the sketch after this list).
  5. The tenant SQS queue acts as a message store buffer and is configured as an event source for a Lambda function. Using Amazon SQS as an event source for Lambda is a common pattern.
  6. Lambda executes the code uploaded by the tenant to perform the desired operation. Common utility and management code (including logging and telemetry code) is kept in Lambda layers that get added to every custom code Lambda function provisioned.
  7. After performing the desired operation on data, custom code Lambda returns a value back to the SaaS application. This completes the run cycle.
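
For illustration, here is a minimal Python sketch of the Dispatcher in step 4. It assumes a DynamoDB table with a tenantId partition key and a queueUrl attribute, and a tenantId field in the event detail; these names are assumptions for this sketch, not part of the original solution.

    import json
    import os
    import boto3

    dynamodb = boto3.resource("dynamodb")
    sqs = boto3.client("sqs")

    # Assumed DynamoDB schema: partition key "tenantId", attribute "queueUrl"
    table = dynamodb.Table(os.environ.get("TENANT_TABLE", "TenantRouting"))
    _queue_cache = {}  # cache the tenant-to-queue mapping across warm invocations


    def get_tenant_queue_url(tenant_id):
        if tenant_id not in _queue_cache:
            item = table.get_item(Key={"tenantId": tenant_id}).get("Item")
            if not item:
                raise ValueError(f"No queue registered for tenant {tenant_id}")
            _queue_cache[tenant_id] = item["queueUrl"]
        return _queue_cache[tenant_id]


    def handler(event, context):
        # The Dispatcher receives a batch of SQS messages, each wrapping an EventBridge event
        for record in event["Records"]:
            eb_event = json.loads(record["body"])
            tenant_id = eb_event["detail"]["tenantId"]  # field name is an assumption
            sqs.send_message(
                QueueUrl=get_tenant_queue_url(tenant_id),
                MessageBody=json.dumps(eb_event["detail"]),
            )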

This architecture allows SaaS applications to create a self-managed queue infrastructure for running custom code for tenants in parallel.

Tenant code upload

The SaaS platform can allow customers to upload code either through a user interface or through a command line interface that the SaaS provider supplies to developers. Uploaded code is saved to the custom code S3 bucket as a .zip archive that can be used to provision Lambda functions.

Custom code Lambda provisioning

The tenant environment includes a tenant SQS queue and a Lambda function that polls initiation requests from the queue. This Lambda function serves several purposes (a sketch of the wrapper follows this list), including:

  1. It polls messages from the SQS queue and constructs a JSON payload that is sent as input to the custom code.
  2. It “wraps” the custom code provided by the customer using boilerplate code, so that custom code is fully abstracted from the processing implementation specifics. For example, we do not want custom code to know that the payload it is getting is coming from Amazon SQS or be aware of the destination where launch results will be sent.
  3. Once custom code initiation is complete, it sends a notification with launch results back to the SaaS application. This can be done directly via EventBridge or Amazon SQS.
  4. This common code can be shared across tenants and deployed by the SaaS provider, either as a library or as a Lambda layer that gets added to the Lambda function.
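
The following is a hedged Python sketch of such a wrapper, assuming the tenant package exposes a process(payload) entry point and that results are returned through an EventBridge bus. The module, function, source, and bus names are illustrative assumptions.

    import json
    import os
    import boto3

    # Assumption: the tenant package exposes a process(payload) entry point
    import tenant_code

    events = boto3.client("events")
    RESULT_BUS = os.environ.get("RESULT_BUS_NAME", "saas-results-bus")  # illustrative


    def handler(event, context):
        for record in event["Records"]:
            payload = json.loads(record["body"])

            # The custom code only sees a plain JSON payload; it is unaware of SQS or EventBridge
            result = tenant_code.process(payload)

            # Send the launch results back to the SaaS application
            events.put_events(
                Entries=[{
                    "Source": "custom-code-runner",   # illustrative source name
                    "DetailType": "custom-code.result",
                    "Detail": json.dumps({"tenantId": payload.get("tenantId"), "result": result}),
                    "EventBusName": RESULT_BUS,
                }]
            )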

Each Lambda function execution environment is fully isolated using a combination of open-source and proprietary isolation technologies, which helps you address the risk of cross-contamination. By provisioning a separate Lambda function per tenant, you achieve the highest level of isolation and benefit from being able to track per-tenant costs.

Conclusion

In this blog post, we explored the need to extend SaaS platforms using custom code and why AWS serverless technologies—using Lambda and Amazon SQS—can be a good fit to accomplish that. We also looked at a solution architecture that can provide the necessary tenant isolation and is cost-effective for this use case.

For more information on building applications with Lambda, visit Serverless Land. For best practices on building SaaS applications, visit SaaS on AWS.

Introducing bidirectional event integrations with Salesforce and Amazon EventBridge

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/introducing-bidirectional-event-integrations-with-salesforce-and-amazon-eventbridge/

This post is written by Alseny Diallo, Prototype Solutions Architect and Rohan Mehta, Associate Cloud Application Architect.

AWS now supports Salesforce as a partner event source for Amazon EventBridge, allowing you to send Salesforce events to AWS. You can also configure Salesforce with EventBridge API Destinations and send EventBridge events to Salesforce. These integrations enable you to act on changes to your Salesforce data in real-time and build custom applications with EventBridge and over 100 built-in sources and targets.

In this blog post, you learn how to set up a bidirectional integration between Salesforce and EventBridge and explore use cases for working with Salesforce events. You also see an example application that processes Salesforce support case events, with automated workflows that detect sentiment using AWS AI/ML services and enrich support cases with customer order data.

Integration overview

Salesforce is a customer relationship management (CRM) platform that gives companies a single, shared view of customers across their marketing, sales, commerce, and service departments. Salesforce Event Relays for AWS enable bidirectional event flows between Salesforce and AWS through EventBridge.

Amazon EventBridge is a serverless event bus that makes it easier to build event-driven applications at scale using events generated from your applications, integrated software as a service (SaaS) applications, and AWS services. EventBridge partner event source integrations enable customers to receive events from over 30 SaaS applications and ingest them into their AWS applications.

Salesforce as a partner event source for EventBridge makes it easier to build event-driven applications that span customers’ data in Salesforce and applications running on AWS. Customers can send events from Salesforce to EventBridge and vice versa without having to write custom code or manage an integration.

EventBridge joins Amazon AppFlow as a way to integrate Salesforce with AWS. The Salesforce Amazon AppFlow integration is well suited for use cases that require ingesting large volumes of data, like a daily scheduled data transfer sending Salesforce records into an Amazon Redshift data warehouse or an Amazon S3 data lake. The Salesforce EventBridge integration is a good fit for real-time processing of changes to individual Salesforce records.

Use cases

Customers can act on new or modified Salesforce records through integrations with a variety of EventBridge targets, including AWS Lambda, AWS Step Functions, and API Gateway. The integration can enable use cases across industries that must act on customer events in real time.

  • Retailers can automatically unify their Salesforce data with AWS data sources. When a new customer support case is created in Salesforce, enrich the support case with recent order data from that customer retrieved from an orders database running on AWS.
  • Media and entertainment providers can augment their omnichannel experiences with AWS AI/ML services to increase customer engagement. When a new customer account is created in Salesforce, use Amazon Personalize and Amazon Simple Email Service to send a welcome email with personalized media recommendations.
  • Insurers can automate form processing workflows. When a new insurance claim form PDF is uploaded to Salesforce, extract the submitted information with Amazon Textract and orchestrate processing the claim information with AWS Step Functions.

Solution overview

The example application shows how the integration can enhance customer support experiences by unifying support tickets with customer order data, detecting customer sentiment, and automating support case workflows.

Reference architecture

  1. A new case is created in Salesforce and an event is sent to an EventBridge partner event bus.
  2. If the event matches the EventBridge rule, the rule sends the event to both the Enrich Case and Case Processor Workflows in parallel.
  3. The Enrich Case Workflow uses the Customer ID in the event payload to query the Orders table for the customer’s recent order. If this step fails, the event is sent to an Amazon SQS dead letter queue.
  4. The Enrich Case Workflow publishes a new event with the customer’s recent order to an EventBridge custom event bus (see the sketch after this list).
  5. The Case Processor Workflow performs sentiment analysis on the support case content and sends a customized text message to the customer. See the diagram below for details on the workflow.
  6. The Case Processor Workflow publishes a new event with the sentiment analysis results to the custom event bus.
  7. EventBridge rules match the events published to the associated rules: CaseProcessorEventRule and EnrichCaseAppEventRule.
  8. These rules send the events to EventBridge API Destinations. API Destinations sends the events to Salesforce HTTP endpoints to create two Salesforce Platform Events.
  9. Salesforce data is updated with the two Platform Events:
    1. The support case record is updated with the customer’s recent order details and the support case sentiment.
    2. If the support case sentiment is negative, a task is created for an agent to follow up with the customer.
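
As a reference for steps 4 and 6, here is a minimal Python sketch of publishing a result event to the custom event bus using the EventBridge PutEvents API. The bus name, source, and detail-type values are assumptions for illustration; the actual application defines its own.

    import json
    import boto3

    events = boto3.client("events")

    def publish_enriched_case(case_id, recent_order):
        # Publish the enrichment result so a rule such as EnrichCaseAppEventRule
        # can route it to the Salesforce API destination (names below are illustrative).
        events.put_events(
            Entries=[{
                "EventBusName": "salesforce-support-app",  # custom event bus (assumption)
                "Source": "support.enrich-case",            # assumption
                "DetailType": "EnrichCaseResult",           # assumption
                "Detail": json.dumps({"caseId": case_id, "recentOrder": recent_order}),
            }]
        )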

The Case Processor workflow uses Step Functions to process the Salesforce events.

Case processor workflow

  1. Detect the sentiment of the customer feedback using Amazon Comprehend. The result is positive, negative, neutral, or mixed (see the sketch after this list).
  2. Check if the customer phone number is a mobile number and can receive SMS using Amazon Pinpoint’s mobile number validation endpoint.
  3. If the customer did not provide a mobile number, bypass the SMS steps and put an event with the detected sentiment onto the custom event bus.
  4. If the customer provided a mobile number, send them an SMS with the appropriate message based on the sentiment of their case.
    1. If sentiment is positive or neutral, the message is thanking the customer for their feedback.
    2. If the sentiment is negative, the message offers additional support.
  5. The state machine then puts an event with the sentiment analysis results onto the custom event bus.
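
A simplified Python sketch of the sentiment and SMS steps follows. In the actual solution these run as separate Step Functions states; here they are collapsed into one function, and the Amazon Pinpoint application ID and message text are assumptions.

    import boto3

    comprehend = boto3.client("comprehend")
    pinpoint = boto3.client("pinpoint")

    APP_ID = "REPLACE_WITH_PINPOINT_APP_ID"  # assumption: an existing Pinpoint project

    def process_case(case_text, phone_number):
        # 1. Detect sentiment (POSITIVE, NEGATIVE, NEUTRAL, or MIXED)
        sentiment = comprehend.detect_sentiment(Text=case_text, LanguageCode="en")["Sentiment"]

        # 2. Validate that the number can receive SMS
        validation = pinpoint.phone_number_validate(
            NumberValidateRequest={"PhoneNumber": phone_number}
        )["NumberValidateResponse"]

        # 3./4. Send an SMS only if the number is a mobile number
        if validation.get("PhoneType") == "MOBILE":
            message = ("Thank you for your feedback." if sentiment != "NEGATIVE"
                       else "We are sorry about your experience; an agent will follow up shortly.")
            pinpoint.send_messages(
                ApplicationId=APP_ID,
                MessageRequest={
                    "Addresses": {validation["CleansedPhoneNumberE164"]: {"ChannelType": "SMS"}},
                    "MessageConfiguration": {"SMSMessage": {"Body": message, "MessageType": "TRANSACTIONAL"}},
                },
            )
        return sentiment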

Prerequisites

Environment setup

  1. Follow the instructions here to set up your Salesforce Event Relay. Once you have an event bus created with the partner event source, proceed to step 2.
  2. Copy the ARN of the event bus.
  3. Create a Salesforce Connected App. This is used for the API Destinations configuration to send updates back into Salesforce.
  4. You can create a new user within Salesforce with appropriate API permissions to update records. The user name and password are used by the API Destinations configuration.
  5. The example provided by Salesforce uses a Platform Event called “Carbon Comparison”. For this sample app, you create three custom platform events with the following configurations:
    1. Customer Support Case (Salesforce to AWS):
      Customer support case
    2. Processed Support Case (AWS to Salesforce):
      Processed Support case
    3. Enrich Case (AWS to Salesforce):
      Enrich case example
  6. This example application assumes that a custom Sentiment field is added to the Salesforce Case record type. See this link for how to create custom fields in Salesforce.
  7. The example application uses Salesforce Flows to trigger outbound platform events and handle inbound platform events. See this link for how to use Salesforce Flows to build event driven applications on Salesforce.
  8. Clone the AWS SAM template here.
    sam build
    sam deploy --guided

    For the parameter prompts, enter:

  • SalesforceOauthClientId and SalesforceOauthClientSecret: Use the values created with the Connected App in step 3.
  • SalesforceUsername and SalesforcePassword: Use the values created for the new user in step 4.
  • SalesforceOauthUrl: Salesforce URL for OAuth authentication
  • SalesforceCaseProcessorEndpointUrl: Salesforce URL for creating a new Processed Support Case Platform Event object, in this case: https://MyDomainName.my.salesforce.com/services/data/v54.0/sobjects/Processed_Support_Case__e
  • SFEnrichCaseEndpointUrl: Salesforce URL for creating a new Enrich Case Platform Event object, in this case: https://MyDomainName.my.salesforce.com/services/data/v54.0/sobjects/Enrich_Case__e
  • SalesforcePartnerEventBusArn: Use the value from step 2.
  • SalesforcePartnerEventPattern: The detail-type value should be the API name of the custom platform event, in this case: {"detail-type": ["Customer_Support_Case__e"]}

Conclusion

This blog shows how to act on changes to your Salesforce data in real-time using the new Salesforce partner event source integration with EventBridge. The example demonstrated how your Salesforce data can be processed and enriched with custom AWS applications and updates sent back to Salesforce using EventBridge API Destinations.

To learn more about EventBridge partner event sources and API Destinations, see the EventBridge Developer Guide. For more serverless resources, visit Serverless Land.

Securely retrieving secrets with AWS Lambda

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/securely-retrieving-secrets-with-aws-lambda/

AWS Lambda functions often need to access secrets, such as certificates, API keys, or database passwords. Storing secrets outside the function code in an external secrets manager helps to avoid exposing secrets in application source code. Using a secrets manager also allows you to audit and control access, and can help with secret rotation. Do not store secrets in Lambda environment variables, as these are visible to anyone who has access to view function configuration.

This post highlights some solutions to store secrets securely and retrieve them from within your Lambda functions.

AWS Partner Network (APN) member Hashicorp provides Vault to secure secrets and application data. Vault allows you to control access to your secrets centrally, across applications, systems, and infrastructure. You can store secrets in Vault and access them from a Lambda function to access a database, for example. The Vault Agent for AWS helps you authenticate with Vault, retrieve the database credentials, and then perform the queries. You can also use the Vault AWS Lambda extension to manage connectivity to Vault.

AWS Systems Manager Parameter Store enables you to store configuration data securely, including secrets, as parameter values. For information on Parameter Store pricing, see the documentation.
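
As a quick reference, here is a minimal Python (boto3) sketch of reading a SecureString parameter from Parameter Store; the parameter name is illustrative.

    import boto3

    ssm = boto3.client("ssm")

    def get_db_password():
        # SecureString parameters are decrypted when WithDecryption=True
        response = ssm.get_parameter(Name="/my-app/db-password", WithDecryption=True)
        return response["Parameter"]["Value"]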

AWS Secrets Manager allows you to replace hardcoded credentials in your code with an API call to Secrets Manager to retrieve the secret programmatically. You can generate, protect, rotate, manage, and retrieve secrets throughout their lifecycle. By default, Secrets Manager does not write or cache the secret to persistent storage. Secrets Manager supports cross-account access to secrets. For information on Secrets Manager pricing, see the documentation.
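
Similarly, a minimal Python sketch of retrieving a secret from Secrets Manager; the secret name and its JSON structure are assumptions.

    import json
    import boto3

    secrets = boto3.client("secretsmanager")

    def get_db_credentials():
        # Secret name is illustrative; secrets are often stored as JSON key/value pairs
        response = secrets.get_secret_value(SecretId="my-app/db-credentials")
        return json.loads(response["SecretString"])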

Parameter Store integrates directly with Secrets Manager as a pass-through service for references to Secrets Manager secrets. Use this integration if you prefer using Parameter Store as a consistent solution for calling and referencing secrets across your applications. For more information, see “Referencing AWS Secrets Manager secrets from Parameter Store parameters.”

For an example application to show Secrets Manager functionality, deploy the example detailed in “How to securely provide database credentials to Lambda functions by using AWS Secrets Manager”.

When to retrieve secrets

When Lambda first invokes your function, it creates a runtime environment. It runs the function’s initialization (init) code, which is the code outside the main handler. Lambda then runs the function handler code as the invocation. This receives the event payload and processes your business logic. Subsequent invocations can use the same runtime environment.

You can retrieve secrets during each function invocation from within your handler code. This ensures that the secret value is always up to date, but it can increase function duration and cost, because the function calls the secrets manager during every invocation. There may also be additional retrieval costs from Secrets Manager.


Retrieving secret during each invocation

You can reduce costs and improve performance by retrieving the secret during the function init process. During subsequent invocations using the same runtime environment, your handler code can use the same secret.


Retrieving secret during function initialization.

The Serverless Land pattern example shows how to retrieve a secret during the init phase using Node.js and top-level await.
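
The same idea in Python looks like the following minimal sketch: the secret is fetched once at module load (the init phase) and reused by the handler. The secret name and JSON key are assumptions.

    import json
    import boto3

    secrets_client = boto3.client("secretsmanager")

    # Retrieved once per runtime environment, during function initialization
    _SECRET = json.loads(
        secrets_client.get_secret_value(SecretId="my-app/db-credentials")["SecretString"]
    )

    def handler(event, context):
        # Subsequent invocations in the same environment reuse the cached value
        password = _SECRET["password"]  # key name is an assumption
        # ... connect to the database and process the event ...
        return {"statusCode": 200}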

If a secret may change between subsequent invocations, ensure that your handler can check for the secret validity and, if necessary, retrieve the secret again.


Retrieve changed secret during subsequent invocation.

You can also use Lambda extensions to retrieve secrets from Secrets Manager, cache them, and automatically refresh the cache based on a time value. The extension retrieves the secret from Secrets Manager before the init process and makes it available via a local HTTP endpoint. The function then retrieves the secret from the local HTTP endpoint, rather than directly from Secrets Manager, increasing performance. You can also share the extension with multiple functions, which can reduce function code. The extension handles refreshing the cache based on a configurable timeout value. This ensures that the function has the updated value, without handling the refresh in your function code, which increases reliability.


Using Lambda extensions to cache and refresh secret.

You can deploy the solution using the steps in Cache secrets using AWS Lambda extensions.
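
A hedged Python sketch of calling such a local endpoint is shown below. The port, path, query string, and response shape are assumptions; check the documentation of the extension you deploy for its actual interface.

    import json
    import os
    import urllib.request

    # Port and path are assumptions; the caching extension you deploy documents
    # the exact local endpoint it exposes.
    CACHE_PORT = os.environ.get("SECRETS_CACHE_PORT", "2773")

    def get_cached_secret(secret_id):
        url = f"http://localhost:{CACHE_PORT}/secretsmanager/get?secretId={secret_id}"
        with urllib.request.urlopen(url) as response:
            return json.loads(response.read())["SecretString"]

    def handler(event, context):
        secret = get_cached_secret("my-app/db-credentials")  # name is illustrative
        # ... use the secret ...
        return {"statusCode": 200}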

Lambda Powertools

Lambda Powertools provides a suite of utilities for Lambda functions to simplify the adoption of serverless best practices. AWS Lambda Powertools for Python and AWS Lambda Powertools for Java both provide a parameters utility that integrates with Secrets Manager.

from aws_lambda_powertools.utilities import parameters
def handler(event, context):
    # Retrieve a single secret
    value = parameters.get_secret("my-secret")
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyRequestEvent;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyResponseEvent;
import software.amazon.lambda.powertools.parameters.SecretsProvider;
import software.amazon.lambda.powertools.parameters.ParamManager;

public class AppWithSecrets implements RequestHandler<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent> {
    // Get an instance of the Secrets Provider
    SecretsProvider secretsProvider = ParamManager.getSecretsProvider();

    // Retrieve a single secret
    String value = secretsProvider.get("/my/secret");

    public APIGatewayProxyResponseEvent handleRequest(APIGatewayProxyRequestEvent input, Context context) {
        // The secret value is available for use in the handler
        return new APIGatewayProxyResponseEvent().withStatusCode(200);
    }
}

Rotating secrets

You should rotate secrets to limit the impact of misuse. Rotation replaces long-term secrets with short-term ones, which reduces the risk of compromise.

Secrets Manager has built-in functionality to rotate secrets on demand or according to a schedule. Secrets Manager has native integrations with Amazon RDS, Amazon DocumentDB, and Amazon Redshift, using a Lambda function to manage the rotation process for you. It deploys an AWS CloudFormation stack and populates the function with the Amazon Resource Name (ARN) of the secret. You specify the permissions to rotate the credentials, and how often you want to rotate the secret. You can view and edit Secrets Manager rotation settings in the Secrets Manager console.


Secrets Manager rotation settings

You can also create your own rotation Lambda function for other services.
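
As an illustration, the following Python sketch configures rotation for a secret with a custom rotation function; the secret ID and function ARN are placeholders for your own resources.

    import boto3

    secrets = boto3.client("secretsmanager")

    # Rotate the secret every 30 days using a custom rotation Lambda function
    secrets.rotate_secret(
        SecretId="my-app/db-credentials",
        RotationLambdaARN="arn:aws:lambda:us-east-1:123456789012:function:my-rotation-function",
        RotationRules={"AutomaticallyAfterDays": 30},
    )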

Auditing secrets access

You should continually review how applications are using your secrets to ensure that the usage is as you expect. You should also log any changes to them so you can investigate any potential issues, and roll back changes if necessary.

When using Hashicorp Vault, use Audit devices to log all requests and responses to Vault. Audit devices can append logs to a file, write to syslog, or write to a socket.

Secrets Manager supports logging API calls using AWS CloudTrail. CloudTrail monitors and records all API calls for Secrets Manager as events. This includes calls from code calling the Secrets Manager APIs and access via the Secrets Manager console. CloudTrail data is considered sensitive, so you should use AWS KMS encryption to protect it.

The CloudTrail event history shows the requests to secretsmanager.amazonaws.com.


Viewing CloudTrail access to Secrets Manager

You can use Amazon EventBridge to respond to alerts based on specific operations recorded in CloudTrail, such as secret rotation or secret deletion. You can also generate an alert if someone tries to use a secret version while it is pending deletion. This may help identify and alert you when an outdated certificate is used.
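
For example, the following Python sketch creates an EventBridge rule that matches Secrets Manager API calls recorded by CloudTrail and forwards them to an SNS topic; the rule name, event names, and topic ARN are assumptions.

    import json
    import boto3

    events = boto3.client("events")

    # Match Secrets Manager management events delivered by CloudTrail
    events.put_rule(
        Name="alert-on-secret-changes",
        EventPattern=json.dumps({
            "source": ["aws.secretsmanager"],
            "detail-type": ["AWS API Call via CloudTrail"],
            "detail": {
                "eventSource": ["secretsmanager.amazonaws.com"],
                "eventName": ["DeleteSecret", "RotateSecret"],
            },
        }),
    )

    events.put_targets(
        Rule="alert-on-secret-changes",
        Targets=[{"Id": "notify-security", "Arn": "arn:aws:sns:us-east-1:123456789012:security-alerts"}],
    )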

Securing secrets

You must tightly control access to secrets because of their sensitive nature. Create AWS Identity and Access Management (IAM) policies and resource policies to enable minimal access to secrets. You can use role-based, as well as attribute-based, access control. This can prevent credentials from being accidentally used or compromised. For more information, see “Authentication and access control for AWS Secrets Manager”.

Secrets Manager supports encryption at rest using AWS Key Management Service (AWS KMS) using keys that you manage. Secrets are encrypted in transit using TLS by default, which requires request signing.

You can access secrets from inside an Amazon Virtual Private Cloud (Amazon VPC) without requiring internet access. Use AWS PrivateLink and configure a Secrets Manager specific VPC endpoint.

Do not store plaintext secrets in Lambda environment variables. Ensure that you do not embed secrets directly in function code, commit these secrets to code repositories, or log the secret to CloudWatch.

Conclusion

Using a secrets manager to store secrets such as certificates, API keys or database passwords helps to avoid exposing secrets in application source code. This post highlights some AWS and third-party solutions, such as Hashicorp Vault, to store secrets securely and retrieve them from within your Lambda functions.

Secrets Manager is the preferred AWS solution for storing and managing secrets. I explain when to retrieve secrets, including using Lambda extensions to cache secrets, which can reduce cost and improve performance.

You can use the Lambda Powertools parameters utility, which integrates with Secrets Manager. Rotating secrets reduces the risk of compromise and you can audit secrets using CloudTrail and respond to alerts using EventBridge. I also cover security considerations for controlling access to your secrets.

For more serverless learning resources, visit Serverless Land.

How to track AWS account metadata within your AWS Organizations

Post Syndicated from Jonathan Nguyen original https://aws.amazon.com/blogs/architecture/how-to-track-aws-account-metadata-within-your-aws-organizations/

United States Automobile Association (USAA) is a San Antonio-based insurance, financial services, banking, and FinTech company supporting millions of military members and their families. USAA has partnered with Amazon Web Services (AWS) to digitally transform and build multiple USAA solutions that help keep members safe and save members’ money and time.

Why build an AWS account metadata solution?

The USAA Cloud Program developed a centralized solution for collecting all AWS account metadata to facilitate core enterprise functions, such as financial management, remediation of vulnerable and insecure configurations, and change release processes for critical application and infrastructure changes.

Companies without centralized metadata solutions may have distributed documents and wikis that contain account metadata, which have to be updated manually. Manually entering and updating this information generally leads to outdated or incorrect metadata and requires individuals to reach out to multiple teams to collect specific information.

Solution overview

USAA utilizes AWS Organizations and a series of GitLab projects to create, manage, and baseline all AWS accounts and infrastructure within the organization, including identity and access management, security, and networking components. Within their GitLab projects, each deployment uses a GitLab baseline version that determines what version of the project was provisioned within the AWS account.

During the creation and onboarding of new AWS accounts, which are created for each application team and use-case, there is specific data that is used for tracking and governance purposes, and applied across the enterprise. USAA’s Public Cloud Security team took an opportunity within a hackathon event to develop the solution depicted in Figure 1.

  1. AWS account is created conforming to a naming convention and added to AWS Organizations.

Metadata tracked per AWS account includes:

    • AWS account name
    • Points of contact
    • Line of business (LOB)
    • Cost center #
    • Application ID #
    • Status
    • Cloud governance record #
    • GitLab baseline version
  2. An Amazon EventBridge rule invokes AWS Step Functions when new AWS accounts are created.
  3. Step Functions invokes an AWS Lambda function to pull AWS account metadata and load it into a centralized Amazon DynamoDB table with Streams enabled to support automation (see the sketch after Figure 1).
  4. A private Amazon API Gateway is exposed to USAA’s internal network, which queries the DynamoDB table and provides AWS account metadata.

Figure 1. Overview of USAA architecture automation workflow to manage AWS account metadata
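
A simplified Python sketch of the Lambda function in step 3 is shown below. It assumes the state machine passes the new account ID in its input and that the table is keyed on accountId; in USAA's solution, additional attributes such as cost center, LOB, and GitLab baseline version come from internal systems during onboarding.

    import boto3

    organizations = boto3.client("organizations")
    dynamodb = boto3.resource("dynamodb")
    table = dynamodb.Table("account-metadata")  # table name is an assumption

    def handler(event, context):
        # Assumed input shape: the state machine passes the new account ID
        account_id = event["accountId"]
        account = organizations.describe_account(AccountId=account_id)["Account"]

        # Other tracked attributes (cost center, application ID, baseline version, and so on)
        # would be gathered from internal systems and added to this item.
        table.put_item(Item={
            "accountId": account_id,
            "accountName": account["Name"],
            "accountEmail": account["Email"],
            "status": account["Status"],
        })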

After the solution was deployed, USAA teams leveraged the data in multiple ways:

  1. User interface: a front-end user-interface querying the API Gateway to allow internal users on the USAA network to filter and view metadata for any AWS accounts within AWS Organizations.
  2. Event-driven automation: DynamoDB Streams capture any changes in the table and invoke a Lambda function, which compares the most recent version in GitLab with the GitLab baseline version recorded for the AWS account. For any outdated deployments, the Lambda function invokes the CI/CD pipeline for that AWS account to deploy a standardized set of IAM, infrastructure, and security resources and configurations.
  3. Incident response: the Cyber Threat Response team reduces mean-time-to-respond by developing automation to query the API Gateway to append points-of-contact, environment, and AWS account name for custom detections as well as Security Hub and Amazon GuardDuty findings.
  4. Financial management: Internal teams have integrated workflows to their applications to query the API Gateway to return cost center, LOB, and application ID to assist with financial reporting and tracking purposes. This replaces manually reviewing the AWS account metadata from an internal and manually updated wiki page.
  5. Compliance and vulnerability management: automated notification systems were developed to send consolidated reports to points-of-contact listed in the AWS account from the API Gateway to remediate non-compliant resources and configurations.

Conclusion

In this post, we reviewed how USAA enabled core enterprise functions and teams to collect, store, and distribute AWS account metadata by developing a secure and highly scalable serverless application natively in AWS. The solution has been leveraged for multiple use-cases, including internal application teams in USAA’s production AWS environment.

Automatically block suspicious DNS activity with Amazon GuardDuty and Route 53 Resolver DNS Firewall

Post Syndicated from Akshay Karanth original https://aws.amazon.com/blogs/security/automatically-block-suspicious-dns-activity-with-amazon-guardduty-and-route-53-resolver-dns-firewall/

In this blog post, we’ll show you how to use Amazon Route 53 Resolver DNS Firewall to automatically respond to suspicious DNS queries that are detected by Amazon GuardDuty within your Amazon Web Services (AWS) environment.

The Security Pillar of the AWS Well-Architected Framework includes incident response, stating that your organization should implement mechanisms to automatically respond to and mitigate the potential impact of security issues. Automating incident response helps you scale your capabilities, rapidly reduce the scope of compromised resources, and reduce repetitive work by security teams.

Use cases for Route 53 Resolver DNS Firewall

Route 53 Resolver DNS Firewall is a managed firewall that you can use to block DNS queries that are made for known malicious domains and to allow queries for trusted domains. It provides more granular control over the DNS querying behavior of resources within your VPCs.

Let’s discuss two use cases for Route 53 Resolver DNS Firewall:

Use of allow lists – If you have stricter security requirements around network security controls and want to deny all outbound DNS queries for domains that don’t match those on your lists of approved domains (known as allow lists), you can create such rules. This is called a walled garden approach to DNS security. These allow lists only include the domains for which resources within your Amazon Virtual Private Cloud (Amazon VPC) are allowed to make DNS queries through Amazon-provided DNS. This helps to ensure that the DNS queries containing the domains that your organization doesn’t trust are blocked.

Use of deny lists – If your organization prefers to allow all outbound DNS lookups within your accounts by default and only requires the ability to block DNS queries for known malicious domains, you can use DNS Firewall to create deny lists, which include all the malicious domain names that your organization is aware of. DNS Firewall also provides AWS Managed Rules, giving you the ability to configure protections against known DNS threats like command-and-control (C&C) bots. You can also add block lists from open-source third-party threat intelligence sources.

A few important points about the use of allow and deny lists:

  1. Broader use of allow lists is more effective at blocking a greater number of malicious DNS queries than a short deny list. For example, if your workloads only need access to .com domains, then allowing only .com will block many malicious domains that might be specific to certain countries. View a list of country code top-level domains (ccTLDs).
  2. If you use allow lists, you need to make sure that you keep up with the domains that your applications need to communicate with. Likewise, if you use deny lists, you need to keep up with updates to the lists.
  3. Allow lists and deny lists are not mutually exclusive models and can be used together. For example, let’s say that you have an allow list that only allows .com domains (with the intention of blocking several ccTLDs by default). You can also use the built-in AWS Managed Rules deny list to block known malicious .com domains for an additional layer of security.

Solution overview

Refer to the DNS Firewall documentation to familiarize yourself with its constructs and understand how it works. The automation example we provide in this blog post is focused on providing blocks or alerts for DNS queries with suspicious domain names. For example, consider the scenario where an Amazon Elastic Compute Cloud (Amazon EC2) instance queries a domain name that is associated with a known command-and-control server. As shown in Figure 1, when GuardDuty detects communication with the malicious domain, it initiates a series of steps. First, AWS Step Functions orchestrates the remediation response through a defined workflow, then DNS Firewall adds the suspicious domain to a deny list or alert list, and finally GuardDuty notifies the security operators of the attempted communication.


Figure 1: High-level solution overview

In this solution, the detection of threats by GuardDuty triggers the automated remediation procedure documented in this post. GuardDuty informs you of the status of your AWS environment by producing security findings. Each GuardDuty finding has an assigned severity level and value that reflects the potential risk the finding could pose to your network, as determined by our security engineers. The severity value can fall anywhere within the 0.1 to 8.9 range, with higher values indicating greater security risk. To help you determine a response to a potential security issue that is highlighted by a finding, GuardDuty breaks down this range into High, Medium, and Low severity levels. We have seen that many of the DNS-based GuardDuty findings fall into the High severity category, and these findings are often strongly indicative of potential compromise (for example, pre-ransomware activity).

In this blog post, we specifically focus on the following types of GuardDuty findings:

  • Backdoor:EC2/C&CActivity.B!DNS
  • Impact:EC2/MaliciousDomainRequest.Reputation
  • Trojan:EC2/DNSDataExfiltration

We’ve configured DNS Firewall to block only events with High severity by sending only those domains to the deny list. DNS Firewall sends the rest of the domains to an alert list.

This solution uses Step Functions and AWS Lambda so that incident response steps run in the correct order. Step Functions also provides retry and error-handling logic. Lambda functions interact with networking services to block traffic, and with databases to store data about blocked domain lists and AWS Security Hub finding Amazon Resource Names (ARNs).

How it works

Figure 2 shows the automated remediation workflow in detail.
 


Figure 2: Detailed workflow diagram

The solution is implemented as follows:

  1. GuardDuty detects communication attempts that include a suspicious domain. GuardDuty generates a finding, in JSON format, that includes details such as the EC2 instance ID involved (if applicable), account information, type of finding, domain, and other details. Following is a sample finding (some fields removed for brevity).
    {
      "schemaVersion": "2.0",
      "accountId": "123456789012",
      "id": "1234567890abcdef0",
      "type": "Backdoor:EC2/C&CActivity.B!DNS",
      "service": {
        "serviceName": "guardduty",
        "action": {
          "actionType": "DNS_REQUEST",
          "dnsRequestAction": {
            "domain": "guarddutyc2activityb.com",
            "protocol": "UDP",
            "blocked": false
          }
        }
      }
    }

  2. Security Hub ingests the finding generated by GuardDuty and consolidates it with findings from other AWS security services. Security Hub also publishes the contents of the finding to the default bus in Amazon EventBridge. Following is a snippet from a sample event published to EventBridge.
    {
      "id": "12345abc-ca56-771b-cd1b-710550598e37",
      "detail-type": "Security Hub Findings - Imported",
      "source": "aws.securityhub",
      "account": "123456789012",
      "time": "2021-01-05T01:20:33Z",
      "region": "us-east-1",
      "detail": {
        "findings": [
          {
            "ProductArn": "arn:aws:securityhub:us-east-1::product/aws/guardduty",
            "Types": ["Software and Configuration Checks/Backdoor:EC2.C&CActivity.B!DNS"],
            "LastObservedAt": "2021-01-05T01:15:01.549Z",
            "ProductFields": {
              "aws/guardduty/service/action/dnsRequestAction/blocked": "false",
              "aws/guardduty/service/action/dnsRequestAction/domain": "guarddutyc2activityb.com"
            }
          }
        ]
      }
    }

  3. EventBridge has a rule with an event pattern that matches GuardDuty events that contain the malicious domain name. When an event matching the pattern is published on the default bus, EventBridge routes that event to the designated target, in this case a Step Functions state machine. Following is a snippet of AWS CloudFormation code that defines the EventBridge rule.
    # EventBridge Event Rule - For Security Hub event published to EventBridge:
      SecurityHubtoFirewallStateMachineEvent:
        Type: "AWS::Events::Rule"
        Properties:
          Description: "Security Hub - GuardDuty findings with DNS Domain"
          EventPattern:
            source:
            - aws.securityhub
            detail:
              findings:
                ProductFields:
                  aws/guardduty/service/action/dnsRequestAction/blocked:
                    - "exists": true
          State: "ENABLED"
          Targets:
            -
              Arn: !GetAtt SecurityHubtoDnsFirewallStateMachine.Arn
              RoleArn: !GetAtt SecurityHubtoFirewallStateMachineEventRole.Arn
              Id: "GuardDutyEvent-StepFunctions-Trigger"
    

  4. The Step Functions state machine ingests the details of the Security Hub finding published in EventBridge and orchestrates the remediation response through a defined workflow. Figure 3 shows the state machine workflow.
     

    Figure 3: AWS Step Functions state machine workflow

  5. The first two steps in the state machine, getDomainFromDynamo and isDomainInDynamo, invoke the Lambda function CheckDomainInDynamoLambdaFunction that checks whether the flagged domain is already in the Amazon DynamoDB table. If the domain already exists in DynamoDB, then the workflow continues to check whether the domain is also in the domain list and adds it accordingly. If the domain is not in DynamoDB, then the workflow considers it a new addition and adds the domain to both domain lists, as well as the DynamoDB table.
  6. The next three steps in the state machine—getDomainFromDomainList, isDomainInDomainList, and addDomainToDnsFirewallDomainList—invoke a second Lambda function that checks and updates the DNS Firewall domain lists with the domain name (a sketch of this check-and-update logic appears after this list). Figure 4 shows an example of the DNS Firewall rules and associated domain list.
     

    Figure 4: Sample rules in a DNS Firewall rule group

    Figure 5 shows the domain lists.
     


    Figure 5: Domain lists

    The next step in the state machine, updateDynamoDB, invokes a third Lambda function that updates the DynamoDB table with the domain that was just added to the domain list. Figure 6 shows an example domain entry that gets stored inside the DynamoDB table.
     


    Figure 6: DynamoDB table entry

  7. The notifySuccess step of the state machine uses an Amazon Simple Notification Service (Amazon SNS) topic to send out a message that the automatic block or alert happened.
  8. If there was a failure in any of the previous steps, then the state machine runs the notifyFailure step. The state machine publishes a message on the SNS topic that the automated remediation workflow has failed to complete, and that manual intervention might be required.

Solution deployment and testing

To set up this solution, you’ll do the following steps:

  1. Verify prerequisites in your AWS account.
  2. Deploy the CloudFormation template.
  3. Create a test Security Hub event.
  4. Confirm the entry in the DNS Firewall rule group domain list.
  5. Confirm the SNS notification.
  6. Apply the rule group to your VPC by using DNS Firewall.

Step 1: Verify prerequisites in your AWS account

The sample solution we provide in this blog post requires that you activate both GuardDuty and Security Hub in your AWS account. If either of these services is not activated in your account, enable it before you deploy the solution.

Step 2: Deploy the CloudFormation template

For this next step, make sure that you deploy the template within the AWS account and the AWS Region where you want to monitor GuardDuty findings and block suspicious DNS activity. Depending on your architecture, you can deploy the solution one time centrally in a security account or deploy it repeatedly across multiple accounts.

To deploy the template

  1. Choose the Launch Stack button to launch a CloudFormation stack in your account:
    Select the Launch Stack button to launch the template

    Note: The stack will launch in the N. Virginia (us-east-1) Region. It takes approximately 15 minutes for the CloudFormation stack to complete. To deploy this solution into other AWS Regions, download the solution’s CloudFormation template and deploy it to the selected Region. Network Firewall isn’t currently available in all Regions. For more information about where it’s available, see the list of service endpoints.

  2. In the AWS CloudFormation console, select the Select Template form, and then choose Next.
  3. On the Specify Details page, provide the following input parameters. You can modify the default values to customize the solution for your environment.
    • AdminEmail – The email address to receive notifications. This must be a valid email address. There is no default value.
    • DnsFireWallAlertDomainListName – The name of the domain list for DNS Firewall that consists of domains that will be only alerted and not blocked. The default value is DemoAlertDomainListAutoUpdated.
    • DnsFireWallBlockDomainListName – The name of the domain list for DNS Firewall that consists of domains that will be blocked. The default value is DemoBlockedDomainListAutoUpdated.
    • DnsFirewallBlockAction – You can select NODATA or NXDOMAIN. NODATA implies that there is no response available if a DNS query from the VPC matches a domain in the block domain list. NXDOMAIN implies that the response is an error message, which indicates that a domain doesn’t exist. The default value is NODATA.

    Figure 7 shows an example of the values entered in the Parameters screen.


    Figure 7: Sample CloudFormation stack parameters

  4. After you’ve entered values for all of the input parameters, choose Next.
  5. On the Options page, keep the defaults, and then choose Next.
  6. On the Review page, in the Capabilities section, select the check box next to I acknowledge that AWS CloudFormation might create IAM resources. Then choose Create. Figure 8 shows what the CloudFormation capabilities acknowledgement prompt looks like.
     

    Figure 8: AWS CloudFormation capabilities acknowledgement

While the stack is being created, check the email inbox that corresponds to the value that you gave for the AdminEmail address parameter. Look for an email message with the subject “AWS Notification – Subscription Confirmation.” Choose the link to confirm the subscription to the SNS topic.

After the Status field for the CloudFormation stack changes to CREATE_COMPLETE, as shown in Figure 9, the solution is implemented and is ready for testing.
 


Figure 9: CloudFormation stack completed deployment

Step 3: Create a test Security Hub event

After the CloudFormation stack has completed deployment, you can test the functionality by creating a test event in the same format as would be published by Security Hub.

To create a test run of the solution

  1. In the AWS Management Console, choose Services, choose CloudFormation, and then for Stack, choose the stack name that you provided in Step 2: Deploy the CloudFormation template.
  2. In the Resources tab for the stack, look for the SecurityHubDnsFirewallStateMachine entry. It should appear as shown in Figure 10.
     

    Figure 10: CloudFormation stack resources

  3. Choose the link in the entry. You’ll be redirected to the Step Functions console, with the state machine already open. Choose Start execution.
     

    Figure 11: AWS Step Functions state machine

  4. To facilitate testing, we’ve provided a test event file. On the Start execution page, in the Input section, paste the C&CActivity.B!DNS finding sample as shown in Figure 12.
     

    Figure 12: Sample input for the Step Functions state machine execution

  5. Note the domain name guarddutyc2activityb.com for the remote host identified in the GuardDuty finding in the test event on line 57 of the sample. The solution should block or alert traffic from that domain name in the following steps.
  6. Choose Start execution to begin the processing of the test event.
  7. You can now track the state machine processing of the test event. The processing should complete within a few seconds. You can select different steps in the visual Graph inspector to view input and output data. Figure 13 shows the input to the addDomainToDnsFirewallDomainList step that launches a Lambda function that interacts with DNS Firewall.
     

    Figure 13: Step Functions state machine step details

Step 4: Confirm the entry in the DNS Firewall rule group

Now that a test event was processed by the state machine, you can check whether the DNS Firewall rule group would block traffic to the domain name identified in the GuardDuty finding.

To validate entries in the DNS Firewall rule group

  1. In the AWS Management Console, choose Services, and then choose VPC. In the DNS Firewall section in the left navigation bar, choose DNS Firewall rule groups.
  2. Choose the demoDnsFirewallRuleGroup rule group created by the solution, and you’ll be able to see the rules as shown in Figure 14.
     

    Figure 14: Select the DNS Firewall rule

  3. Choose the domain list associated with the BLOCK rule. Confirm that the rules blocking the traffic from the source and to the domain that you specified in the test event were created. The domain list should look similar to what is shown in Figure 15.
     

    Figure 15: Verify that the domain was added to the blocked domain list

Step 5: Confirm the SNS notification

In this step, you’ll view the SNS notification that was sent to the email address you set up.

To confirm the SNS notification

  • Review the email inbox for the value that you provided for the AdminEmail parameter and look for a message with the subject line “AWS Notification Message.” The contents of the message from SNS should be similar to the following.
    {"Blocked":"true","Input":{"ResponseMetadata":{"RequestId":"HOLOAAENUS3MN9B0DS6CO8BF4BVV4KQNSO5AEMVJF66Q9ASUAAJG","HTTPStatusCode":200,"HTTPHeaders":{"server":"Server","date":"Wed, 17 Nov 2021 08:20:38 GMT","content-type":"application/x-amz-json-1.0","content-length":"2","connection":"keep-alive","x-amzn-requestid":"HOLOAAENUS3MN9B0DS6CO8BF4BVV4KQNSO5AEMVJF66Q9ASUAAJG","x-amz-crc32":"2745614147"},"RetryAttempts":0}}}
    

Step 6: Apply the rule group to your VPC by using DNS Firewall

As part of the CloudFormation template deployment, two test VPCs have been created for you, to demonstrate that you can assign a single DNS Firewall rule group to multiple VPCs. You can also associate this rule group to your existing VPC of interest. To learn how to do this task, see Managing associations between your VPC and Route 53 Resolver DNS Firewall rule group. For visibility into DNS queries and for debugging purposes, the template creates log groups that accumulate DNS Resolver query logs.

After you’ve successfully tested the given sample that emulates C&CActivity.B!DNS, you can repeat steps 3 to 6 for the MaliciousDomainRequest.Reputation finding sample and the DNSDataExfiltration finding sample.

These samples are supplied for your convenience, and you will see the blocking action in a matter of minutes. Alternatively, you can use other ways to test, which might need about an hour for blocking action to happen. To initiate DNS C&C activity, you can make a DNS request from your instance (using dig for Linux or nslookup for Windows) against the test domain guarddutyc2activityb.com. Alternatively, you can use GuardDuty Tester, which generates DNS C&C activity and DNS exfiltration unauthorized events.

To take this solution one step further, you can implement automatic aging out of the domains that get added to the domain list. One way to do this is to use the Time to Live feature in DynamoDB and keep repopulating the domain list from DynamoDB at regular intervals of time. The benefit of this is that if the malicious nature of a domain in the domain list changes over time, the list will be kept up to date during this age out and repopulation process.
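
A minimal Python sketch of that approach follows, assuming a DynamoDB table keyed on domain with a TTL attribute named expiresAt; a separate scheduled job would then rebuild the DNS Firewall domain list from the items that remain.

    import time
    import boto3

    dynamodb = boto3.client("dynamodb")

    TABLE = "blocked-domains"   # table name is an assumption
    TTL_DAYS = 30               # how long a domain stays on the list (assumption)

    # One-time setup: enable TTL on an epoch-seconds attribute named "expiresAt"
    # (this call fails if TTL is already enabled on the table).
    dynamodb.update_time_to_live(
        TableName=TABLE,
        TimeToLiveSpecification={"Enabled": True, "AttributeName": "expiresAt"},
    )

    def record_blocked_domain(domain):
        # Each blocked domain ages out automatically once expiresAt passes
        dynamodb.put_item(
            TableName=TABLE,
            Item={
                "domain": {"S": domain},
                "expiresAt": {"N": str(int(time.time()) + TTL_DAYS * 86400)},
            },
        )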

Considerations

There are a few considerations that you should keep in mind regarding DNS Firewall:

  • DNS Firewall and AWS Network Firewall work together for improved domain-filtering capability across HTTP(S) traffic. A domain list that you configure in Network Firewall should reflect the domain list configured in DNS Firewall.
  • DNS Firewall filters based on the domain name. It doesn’t translate that domain name to an IP address to be blocked.
  • It’s a best practice to block outbound traffic to port 53 with network access control lists (network ACLs) or Network Firewall so that GuardDuty can monitor DNS queries.
  • DNS Firewall filters DNS queries to the Amazon Route 53 Resolver (also known as AmazonProvidedDNS or VPC .2 Resolver) in the VPC. So for traffic leaving the VPC, we recommend that you use DNS Firewall along with Network Firewall, which you can use to secure traffic that isn’t headed to Amazon Route 53 Resolver. Network Firewall can also block domain names that exist in network traffic leaving the Amazon VPC, such as in HTTP HOST headers, TLS Server Name Indication (SNI) fields, and so on.
  • You can use Network Firewall to block external encrypted DNS services so that these services can’t be used to circumvent your DNS Firewall policies.

Conclusion

In this blog post, you learned how to automatically block malicious domains by using Route 53 Resolver DNS Firewall and GuardDuty. You can use this sample solution to automatically block communication to suspicious hosts discovered by GuardDuty, and you can apply those blocks across all configured DNS Firewall firewalls within your account.

All of the code for this solution is available on GitHub. Feel free to play around with the code; we hope it helps you learn more about automated security remediation. You can adjust the code to better fit your unique environment or extend the code with additional steps.

If you have comments about this blog post, submit them in the Comments section below. If you have questions about using this solution, start a thread in the Route 53 Resolver forum or GuardDuty forums, or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Akshay Karanth

Akshay is a senior solutions architect at AWS. He helps digital native businesses learn, build, and grow in the AWS Cloud. Before AWS, he worked at companies such as Juniper Networks and Microsoft in various customer facing roles across networking and security domains. When not at work, Akshay enjoys hiking up a hard trail or cooking a fulfilling meal with his family.

Author

Rohit Aswani

Rohit is a specialist solutions architect focused on networking at AWS, where he helps customers build and design scalable, highly available, secure, resilient, and cost-effective networks. He holds an MS in telecommunication systems management from Northeastern University, specializing in computer networking.

Contributor

Special thanks to Fabrice Dall’ara who made significant contributions to this post.

ICYMI: Serverless Q2 2022

Post Syndicated from dboyne original https://aws.amazon.com/blogs/compute/icymi-serverless-q2-2022/

Welcome to the 18th edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share all the most recent product launches, feature enhancements, blog posts, webinars, Twitch live streams, and other interesting things that you might have missed!

In case you missed our last ICYMI, check out what happened last quarter here.

AWS Lambda

For Node.js developers, AWS Lambda now supports the Node.js 16.x runtime version. This offers new features, including the Stable timers promises API and RegExp match indices. There is also new documentation for TypeScript with Lambda.

Customers are rapidly adopting the new runtime version by updating to Node.js 16.x. To help keep Lambda functions secure, AWS continually updates Node.js 16 with all minor updates released by the Node.js community when using the zip archive format. Read the release blog post to learn more about building Lambda functions with Node.js 16.x.

A new Lambda custom runtime is now available for PowerShell. It makes it even easier to run Lambda functions written in PowerShell. Although Lambda has supported PowerShell since 2018, this new version simplifies the process and reduces the additional steps required during the development process.

To get started, see the GitHub repository which contains the code, examples and installation instructions.

PowerShell code in Lambda console

AWS Lambda Powertools is an open-source library to help customers discover and incorporate serverless best practices more easily. Powertools for Python went GA in July 2020, followed by Java in 2021, TypeScript in 2022, and .NET is coming soon. AWS Lambda Powertools crossed the 10M download milestone and TypeScript support has now moved from beta to a release candidate.

When building with Lambda, it’s important to develop solutions to handle retries and failures when the same event may be received more than once. Lambda Powertools provides a utility to handle idempotency within your functions.
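
As a minimal sketch of the Python flavor of this utility, assuming a DynamoDB table named IdempotencyTable already exists for storing results, the idempotency decorator wraps the handler so that repeated deliveries of the same event return the stored result instead of re-running the work:

from aws_lambda_powertools.utilities.idempotency import (
    DynamoDBPersistenceLayer,
    idempotent,
)

# Results are stored in DynamoDB, keyed on a hash of the incoming event
persistence_layer = DynamoDBPersistenceLayer(table_name="IdempotencyTable")

@idempotent(persistence_store=persistence_layer)
def handler(event, context):
    # Business logic runs once per unique event payload; duplicates within
    # the expiry window return the saved response instead of re-executing
    return {"order_id": event["order_id"], "status": "processed"}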

To learn more:

AWS Step Functions

AWS Step Functions launched a new opt-in console experience to help builders analyze, debug, and optimize Step Functions Standard Workflows. This allows you to debug workflow executions and analyze the payload as it passes through each state. To opt in to the new console experience and get started, follow these detailed instructions.

Events tab in Step Functions workflow

Amazon EventBridge

Amazon EventBridge released support for global endpoints in April 2022. Global endpoints provide a reliable way for you to improve availability and reliability of event-driven applications. Using global endpoints, you can fail over event ingestion automatically to another Region during service disruptions.

The new IngestionToInvocationStartLatency metric exposes the time to process events from the point at which they are ingested by EventBridge to the point of the first invocation. Amazon Route 53 uses this information to fail over event ingestion automatically to a secondary Region if the metric exceeds a configured threshold of 30 seconds for 5 consecutive minutes.
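
If you want to observe this behavior in your own account, a CloudWatch alarm on the metric with matching thresholds might look like the following boto3 sketch. The alarm name is a placeholder, and the assumption that the metric is reported in milliseconds in the AWS/Events namespace should be verified against the metric in your account.

import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when ingestion-to-invocation latency exceeds 30 seconds (30,000 ms)
# for 5 consecutive 1-minute periods, mirroring the failover threshold above.
cloudwatch.put_metric_alarm(
    AlarmName="eventbridge-ingestion-latency",
    Namespace="AWS/Events",
    MetricName="IngestionToInvocationStartLatency",
    Statistic="Average",
    Period=60,
    EvaluationPeriods=5,
    Threshold=30000,
    ComparisonOperator="GreaterThanThreshold",
    TreatMissingData="notBreaching",
)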

To learn more:

Amazon EventBridge global endpoints architecture diagram

Serverless Blog Posts

April

Apr 6 – Getting Started with Event-Driven Architecture

Apr 7 – Introducing global endpoints for Amazon EventBridge

Apr 11 – Building an event-driven application with Amazon EventBridge

Apr 12 – Orchestrating high performance computing with AWS Step Functions and AWS Batch

Apr 14 – Working with events and the Amazon EventBridge schema registry

Apr 20 – Handling Lambda functions idempotency with AWS Lambda Powertools

Apr 26 – Build a custom Java runtime for AWS Lambda

May

May 05 – Amazon EC2 DL1 instances Deep Dive

May 05 – Orchestrating Amazon S3 Glacier Deep Archive object retrieval using AWS Step Functions

May 09 – Benefits of migrating to event-driven architecture

May 09 – Debugging AWS Step Functions executions with the new console experience

May 12 – Node.js 16.x runtime now available in AWS Lambda

May 25 – Introducing the PowerShell custom runtime for AWS Lambda

June

Jun 01 – Testing Amazon EventBridge events using AWS Step Functions

Jun 02 – Optimizing your AWS Lambda costs – Part 1

Jun 02 – Optimizing your AWS Lambda costs – Part 2

Jun 02 – Extending PowerShell on AWS Lambda with other services

Jun 02 – Running AWS Lambda functions on AWS Outposts using AWS IoT Greengrass

Jun 14 – Combining Amazon AppFlow with AWS Step Functions to maximize application integration benefits

Jun 14 – Capturing GPU Telemetry on the Amazon EC2 Accelerated Computing Instances

Serverlesspresso goes global

Serverlesspresso in five countries

Serverlesspresso is a serverless event-driven application that allows you to order coffee from your phone.

Since building Serverlesspresso for re:Invent 2021, the Developer Advocate team has put in approximately 100 additional development hours to make the application a multi-tenant, event-driven serverless app.

This allowed us to run Serverlesspresso concurrently at five separate events across Europe on a single day in June, serving over 5,000 coffees. Each order is orchestrated by a single Step Functions workflow. To read more about how this application is built:

AWS Heroes EMEA Summit in Milan, Italy

AWS Heroes in Milan, Italy, 2022

The AWS Heroes program recognizes talented experts whose enthusiasm for knowledge-sharing has a real impact within the community. The EMEA-based Heroes gathered for a Summit on June 28 to share their thoughts, providing valuable feedback on topics such as containers, serverless and machine learning.

Serverless workflow collection added to Serverless Land

Serverless Land is a website that is maintained by the Serverless Developer Advocate team to help you learn with workshops, patterns, blogs and videos.

The Developer Advocate team has extended Serverless Land and introduced the new AWS Step Functions workflows collection.

Using the new collection you can explore common patterns built with Step Functions and use the 1-click deploy button to deploy straight into your AWS account.

Serverless Workflows Collection on Serverless Land

Videos

Serverless Office Hours – Tues 10AM PT

ServerlessLand YouTube Channel

Weekly live virtual office hours. In each session, we discuss a specific serverless topic or technology and then open the floor to your real-world serverless challenges and issues. Ask us anything about serverless technologies and applications.

YouTube: youtube.com/serverlessland
Twitch: twitch.tv/aws

April

May

June

FooBar Serverless YouTube channel

FooBar Serverless YouTube channel header

Marcia Villalba frequently publishes new videos on her popular serverless YouTube channel. You can view all of Marcia’s videos at https://www.youtube.com/c/FooBar_codes.

April

May

June

Still looking for more?

The Serverless landing page has more information. The Lambda resources page contains case studies, webinars, whitepapers, customer stories, reference architectures, and even more Getting Started tutorials.

You can also follow the Serverless Developer Advocacy team on Twitter to see the latest news, follow conversations, and interact with the team.

Building a low-code speech “you know” counter using AWS Step Functions

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/building-a-low-code-speech-you-know-counter-using-aws-step-functions/

This post is written by Doug Toppin, Software Development Engineer, and Kishore Dhamodaran, Solutions Architect.

In public speaking, filler phrases can distract the audience and reduce the value and impact of what you are telling them. Reviewing recordings of presentations can be helpful to determine whether presenters are using filler phrases. Instead of manually reviewing prior recordings, automation can process media files and perform a speech-to-text function. That text can then be processed to report on the use of filler phrases.

This blog explains how to use AWS Step Functions, Amazon EventBridge, Amazon Transcribe and Amazon Athena to report on the use of the common phrase “you know” in media files. These services can automate and reduce the time required to find the use of filler phrases.

Step Functions can automate and chain together multiple activities and other Amazon services. Amazon Transcribe is a speech-to-text service that uses media files as input and produces textual transcripts from them. Athena is an interactive query service that makes it easier to analyze data in Amazon S3 using standard SQL.

This blog shows a low-code, configuration-driven approach to implementing this solution. Low-code means writing little or no custom software to perform a function. Instead, you use a configuration-driven approach with service integrations, where state machine tasks call AWS services using existing SDKs, APIs, or interfaces. In this example, the configuration-driven approach uses Step Functions’ Amazon States Language (ASL) to tie actions together rather than traditional code. This requires fewer details for data management and error handling, combined with a visual user interface for composing the workflow. Because the actions and logic are clearly defined in the visual workflow, maintenance is reduced.

Solution overview

The following diagram shows the solution architecture.

Solution overview diagram

  1. You upload a media file to an Amazon S3 Media bucket.
  2. The media file upload to S3 triggers an EventBridge rule.
  3. The EventBridge rule starts the Step Functions state machine execution.
  4. The state machine invokes Amazon Transcribe to process the media file.
  5. The transcription output file is stored in the Amazon S3 Transcript bucket.
  6. The state machine invokes Athena to query the textual transcript for the filler phrase. This uses the AWS Glue table to describe the format of the transcription results file.
  7. The filler phrase count determined by Athena is returned and stored in the Amazon S3 Results bucket.

Prerequisites

  1. An AWS account and an AWS user or role with sufficient permissions to create the necessary resources.
  2. Access to the following AWS services: Step Functions, Amazon Transcribe, Athena, and Amazon S3.
  3. Latest version of the AWS Serverless Application Model (AWS SAM) CLI, which helps developers create and manage serverless applications in the AWS Cloud.
  4. Test media files (for example, the Official AWS Podcast).

Example walkthrough

  1. Clone the GitHub repository to your local machine.
  2. git clone https://github.com/aws-samples/aws-stepfunctions-examples.git
  3. Deploy the resources using AWS SAM. The deploy command processes the AWS SAM template file to create the necessary resources in AWS. Choose you-know as the stack name and the AWS Region that you want to deploy your solution to.
  4. cd aws-stepfunctions-examples/sam/app-low-code-you-know-counter/
    sam deploy --guided

Use the default parameters or replace with different values if necessary. For example, to get counts of a different filler phrase, replace the FillerPhrase parameter.

  • GlueDatabaseYouKnowP: Name of the AWS Glue database to create.
  • AthenaTableName: Name of the AWS Glue table that is used by Athena to query the results.
  • FillerPhrase: The filler phrase to check.
  • AthenaQueryPreparedStatementName: Name of the Athena prepared statement used to run SQL queries.
  • AthenaWorkgroup: The Athena workgroup to use.
  • AthenaDataCatalog: The data source for running the Athena queries.
SAM Deploy

Running the filler phrase counter

  1. Navigate to the Amazon S3 console and upload an mp3 or mp4 podcast recording to the bucket named bucket-{account number}-{Region}-you-know-media.
  2. Navigate to the Step Functions console. Choose the running state machine, and monitor the execution of the transcription state machine.
  3. State Machine Execution

  4. When the execution completes successfully, select the QueryExecutionSuccess task to examine the output and see the filler phrase count.
  5. State Machine Output

  6. Amazon Transcribe produces the transcript text of the media file. You can examine the output in the Results bucket. Using the S3 console, navigate to the bucket, choose the file matching the media file name and use ‘Query with S3 Select’ to view the content.
  7. If the transcription job does not execute, the state machine reports the failure and exits.
  8. State Machine Fail

Exploring the state machine

The state machine orchestrates the transcription processing:

State Machine Explore

The StartTranscriptionJob task starts the transcription job. The Wait state adds a 60-second delay before checking the status of the transcription job. The Choice state continues the polling loop until the job status changes to FAILED or COMPLETED.

When the job successfully completes, the AthenaStartQueryExecutionUsingPreparedStatement task starts the Athena query, and stores the results in the S3 results bucket. The AthenaGetQueryResults task retrieves the count from the resultset.

The TranscribeMediaBucket holds the media files to be uploaded. The configuration sends the upload notification event to EventBridge:

      
NotificationConfiguration:
  EventBridgeConfiguration:
    EventBridgeEnabled: true

The TranscribeResultsBucket has an associated policy to provide access to Amazon Transcribe. Athena stores the output from the queries performed by the state machine in the AthenaQueryResultsBucket.

When a media upload occurs, an EventBridge rule uses its native Step Functions integration to start the YouKnowTranscribeStateMachine execution. The rule matches an event object similar to:

{
  "version": "0",
  "id": "99a0cb40-4b26-7d74-dc59-c837f5346ac6",
  "detail-type": "Object Created",
  "source": "aws.s3",
  "account": "012345678901",
  "time": "2022-05-19T22:21:10Z",
  "region": "us-east-2",
  "resources": [
    "arn:aws:s3:::bucket-012345678901-us-east-2-you-know-media"
  ],
  "detail": {
    "version": "0",
    "bucket": {
      "name": "bucket-012345678901-us-east-2-you-know-media"
    },
    "object": {
      "key": "Podcase_Episode.m4a",
      "size": 202329,
      "etag": "624fce93a981f97d85025e8432e24f48",
      "sequencer": "006286C2D604D7A390"
    },
    "request-id": "B4DA7RD214V1QG3W",
    "requester": "012345678901",
    "source-ip-address": "172.0.0.1",
    "reason": "PutObject"
  }
}
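
For reference, a rule that matches this kind of event could use an event pattern like the following boto3 sketch. The bucket name is the sample value from the event above; in the demo application, the AWS SAM template creates the equivalent rule with the state machine as its target, so this is illustrative only.

import json
import boto3

events = boto3.client("events")

# Match S3 "Object Created" events for the media bucket (replace the bucket
# name with your own). A target pointing at the state machine, with an IAM
# role that allows states:StartExecution, completes the wiring.
pattern = {
    "source": ["aws.s3"],
    "detail-type": ["Object Created"],
    "detail": {"bucket": {"name": ["bucket-012345678901-us-east-2-you-know-media"]}},
}

events.put_rule(
    Name="you-know-media-uploaded",
    EventPattern=json.dumps(pattern),
    State="ENABLED",
)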

The state machine allows you to prepare parameters and use the direct SDK integrations to start the transcription job by calling the Amazon Transcribe service’s API. This integration means you don’t have to write custom code to perform this function. The event triggering the state machine execution contains the uploaded media file location.


  StartTranscriptionJob:
    Type: Task
    Comment: Start a transcribe job on the provided media file
    Parameters:
      Media:
        MediaFileUri.$: States.Format('s3://{}/{}', $.detail.bucket.name, $.detail.object.key)
      TranscriptionJobName.$: "$.detail.object.key"
      IdentifyLanguage: true
      OutputBucketName: !Ref TranscribeResultsBucket
    Resource: !Sub 'arn:${AWS::Partition}:states:${AWS::Region}:${AWS::AccountId}:aws-sdk:transcribe:startTranscriptionJob'

The SDK uses aws-sdk:transcribe:getTranscriptionJob to get the status of the job.


  GetTranscriptionJob:
    Type: Task
    Comment: Retrieve the status of an Amazon Transcribe job
    Parameters:
      TranscriptionJobName.$: "$.TranscriptionJob.TranscriptionJobName"
    Resource: !Sub 'arn:${AWS::Partition}:states:${AWS::Region}:${AWS::AccountId}:aws-sdk:transcribe:getTranscriptionJob'
    Next: TranscriptionJobStatus

The state machine uses a polling loop with a delay to check the status of the transcription job.


  TranscriptionJobStatus:
    Type: Choice
    Choices:
    - Variable: "$.TranscriptionJob.TranscriptionJobStatus"
      StringEquals: COMPLETED
      Next: AthenaStartQueryExecutionUsingPreparedStatement
    - Variable: "$.TranscriptionJob.TranscriptionJobStatus"
      StringEquals: FAILED
      Next: Failed
    Default: Wait

When the transcription job completes successfully, the filler phrase counting process begins.

An Athena prepared statement performs the query with the transcription job name as a runtime parameter. The AWS SDK starts the query and the state machine execution pauses, waiting for the results to return before progressing to the next state:

athena:startQueryExecution.sync

When the query completes, Step Functions uses the SDK integration to retrieve the results using athena:getQueryResults:

athena:getQueryResults

It creates an Athena prepared statement to pass the transcription job name as a parameter for the query execution:

  ResultsQueryPreparedStatement:
    Type: AWS::Athena::PreparedStatement
    Properties:
      Description: Create a statement that allows the use of a parameter for specifying an Amazon Transcribe job name in the Athena query
      QueryStatement: !Sub >-
        select cardinality(regexp_extract_all(results.transcripts[1].transcript, '${FillerPhrase}')) AS item_count from "${GlueDatabaseYouKnow}"."${AthenaTableName}" where jobname like ?
      StatementName: !Ref AthenaQueryPreparedStatementName
      WorkGroup: !Ref AthenaWorkgroup

There are several opportunities to enhance this tool. For example, you could add support for multiple filler phrases. You could build a larger application to upload media and retrieve the results. You could also take advantage of Amazon Transcribe’s real-time transcription API to display the results while a presentation is in progress, providing immediate feedback to the presenter.

Cleaning up

  1. Navigate to the Amazon Transcribe console. Choose Transcription jobs in the left pane, select the jobs created by this example, and choose Delete.
  2. Cleanup Delete

  3. Navigate to the S3 console. In the Find buckets by name search bar, enter “you-know”. This shows the list of buckets created for this example. Select the radio button next to each bucket individually and choose Empty.
  4. Cleanup S3

  5. Use the following command to delete the stack, and confirm the stack deletion.
  6. sam delete

Conclusion

Low-code applications can increase developer efficiency by reducing the amount of custom code required to build solutions. They can also enable non-developer roles to create automation to perform business functions by providing drag-and-drop style user interfaces.

This post shows how a low-code approach can build a tool chain using AWS services. The example processes media files to produce text transcripts and count the use of filler phrases in those transcripts. It shows how to process EventBridge data and how to invoke Amazon Transcribe and Athena using Step Functions state machines.

For more serverless learning resources, visit Serverless Land.

Sending Amazon EventBridge events to private endpoints in a VPC

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/sending-amazon-eventbridge-events-to-private-endpoints-in-a-vpc/

This post is written by Emily Shea, Senior GTM Specialist, Event-Driven Architectures.

Building with events can help you accelerate feature velocity and build scalable, fault tolerant applications. You can achieve loose coupling in your application using asynchronous communication via events. Loose coupling allows each development team to build and deploy independently and each component to scale and fail without impacting the others. This approach is referred to as event-driven architecture.

Amazon EventBridge helps you build event-driven architectures. You can publish events to the EventBridge event bus and EventBridge routes those events to targets. You can write rules to filter events and only send them to the interested targets. For example, an order fulfillment service may only be interested in events of type ‘new order created.’
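
As a quick illustration, a rule implementing that example filter can be created with a few lines of boto3. The bus name and rule name here are hypothetical:

import json
import boto3

events = boto3.client("events")

# Only events whose detail-type is "new order created" reach the targets
# attached to this rule (for example, the order fulfillment service).
events.put_rule(
    Name="order-fulfillment-new-orders",
    EventBusName="orders",  # hypothetical event bus name
    EventPattern=json.dumps({"detail-type": ["new order created"]}),
    State="ENABLED",
)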

EventBridge is serverless, so there is no infrastructure to manage and the service scales automatically. EventBridge has native integrations with over 100 AWS services and over 40 SaaS providers.

Amazon EventBridge has a native integration with AWS Lambda, and many AWS customers use events to trigger Lambda functions to process events. You may also want to send events to workloads running on Amazon EC2 or containerized workloads deployed with Amazon ECS or Amazon EKS. These services are deployed into an Amazon Virtual Private Cloud, or VPC.

For some use cases, you may be able to expose public endpoints for your VPC. You can use EventBridge API destinations to send events to any public HTTP endpoint. API destinations include features like OAuth support and rate limiting to control the number of events you are sending per second.

However, some customers are not able to expose public endpoints for security or compliance purposes. This tutorial shows you how to send EventBridge events to a private endpoint in a VPC using a Lambda function to relay events. This solution deploys the Lambda function connected to the VPC and uses IAM permissions to enable EventBridge to invoke the Lambda function. Learn more about Lambda VPC connectivity here.

In this blog post, you learn how to send EventBridge events to a private endpoint in a VPC. You set up an example application with an EventBridge event bus, a Lambda function to relay events, a Flask application running in an EKS cluster to receive events behind an Application Load Balancer (ALB), and a secret stored in Secrets Manager for authenticating requests. This application uses EKS and Secrets Manager to demonstrate sending and authenticating requests to a containerized workload, but the same pattern applies for other container orchestration services like ECS and your preferred secret management solution.

Continue reading for the full example application and walkthrough. If you have an existing application in a VPC, you can deploy just the event relay portion and input your VPC details as parameters.

Solution overview

Architecture

  1. An event is sent to the EventBridge bus.
  2. If the event matches a certain pattern (for example, if ‘detail-type’ is ‘inbound-event-sent’), an EventBridge rule uses EventBridge’s input transformer to format the event as an HTTP call.
  3. The EventBridge rule pushes the event to a Lambda function connected to the VPC and a CloudWatch Logs group for debugging.
  4. The Lambda function calls Secrets Manager and retrieves a secret key. It appends the secret key to the event headers and then makes an HTTP call to the ALB URL endpoint (see the sketch after this list).
  5. ALB routes this HTTP call to a node group in the EKS cluster. The Flask application running on the EKS cluster calls Secrets Manager, confirms that the secret key is valid, and then processes the event.
  6. The Lambda function receives a response from ALB.
    1. If the Flask application fails to process the event for any reason, the Lambda function raises an error. The function’s failure destination is configured to send the event and the error message to an SQS dead letter queue.
    2. If the Flask application successfully processes the event and the ‘return-response-event’ flag in the event was set to ‘true’, then the Lambda function publishes a new ‘outbound-event-sent’ event to the same EventBridge bus.
  7. Another EventBridge rule matches detail-type ‘outbound-event-sent’ events and routes these to the CloudWatch Logs group for debugging.
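
To make step 4 concrete, the relay function’s core logic could look like the following Python sketch. The environment variable names and the x-api-key header are illustrative assumptions; the deployed sample may use different names, and you should replace the shared-secret header with your own authentication method in production.

import json
import os
import urllib.request

import boto3

secrets = boto3.client("secretsmanager")

# Hypothetical environment variable names for the ALB URL and secret ARN
ALB_URL = os.environ["ALB_URL"]
SECRET_ARN = os.environ["SECRET_ARN"]

def handler(event, context):
    # Retrieve the shared secret used to authenticate with the Flask application
    secret_value = secrets.get_secret_value(SecretId=SECRET_ARN)["SecretString"]

    request = urllib.request.Request(
        ALB_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": secret_value},
        method="POST",
    )

    # Raise on errors so the function's failure destination (the SQS
    # dead-letter queue) receives the event and the error message
    with urllib.request.urlopen(request, timeout=10) as response:
        return {"statusCode": response.status}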

Prerequisites

To run the application, you must install the AWS CLI, Docker CLI, eksctl, kubectl, and AWS SAM CLI.

To clone the repository, run:

git clone https://github.com/aws-samples/eventbridge-events-to-vpc.git

Creating the EKS cluster

  1. In the example-vpc-application directory, use eksctl to create the EKS cluster using the config file.
    cd example-vpc-application
    eksctl create cluster --config-file eksctl_config.yaml

    This takes a few minutes. This step creates an EKS cluster with one node group in us-east-1. The EKS cluster has a service account with IAM permissions to access the Secrets Manager secret you create later.

  2. Use your AWS account’s default Amazon Elastic Container Registry (ECR) private registry to store the container image. First, follow these instructions to authenticate Docker to ECR. Next, run this command to create a new ECR repository. The create-repository command returns a repository URI (for example, 123456789.dkr.ecr.us-east-1.amazonaws.com/events-flask-app).
    aws ecr create-repository --repository-name events-flask-app 

    Use the repository URI in the following commands to build, tag, and push the container image to ECR.

    docker build --tag events-flask-app .
    docker tag events-flask-app:latest {repository-uri}:1
    docker push {repository-uri}:1
  3. In the Kubernetes deployment manifest file (/example-vpc-application/manifests/deployment.yaml), fill in your repository URI and container image version (for example, 123456789.dkr.ecr.us-east-1.amazonaws.com/events-flask-app:1).

Deploy the Flask application and Application Load Balancer

  1. Within the example-vpc-application directory, use kubectl to apply the Kubernetes manifest files. This step deploys the ALB, which takes time to create and you may receive an error message during the deployment (‘no endpoints available for service “aws-load-balancer-webhook-service”‘). Rerun the same command until the ALB is deployed and you no longer receive the error message.
    kubectl apply --kustomize manifests/
  2. Once the deployment is completed, verify that the Flask application is running by retrieving the Kubernetes pod logs. The first command retrieves a pod name to fill in for the second command.
    kubectl get pod --namespace vpc-example-app
    kubectl logs --namespace vpc-example-app {pod-name} --follow

    You should see the Flask application outputting ‘Hello from my container!’ in response to GET request health checks.

    Hello message

Get VPC and ALB details

Next, you retrieve the security group ID, private subnet IDs, and ALB DNS Name to deploy the Lambda function connected to the same VPC and private subnet and send events to the ALB.

  1. In the AWS Management Console, go to the VPC dashboard and find Subnets. Copy the subnet IDs for the two private subnets (for example, subnet name ‘eksctl-events-cluster/SubnetPrivateUSEAST1A’).
    Subnets
  2. In the VPC dashboard, under Security, find the Security Groups tab. Copy the security group ID for ‘eksctl-events-cluster/ClusterSharedNodeSecurityGroup’.
    Security groups
  3. Go to the EC2 dashboard. Under Load Balancing, find the Load Balancer tab. There is a load balancer associated with your VPC ID. Copy the DNS name for the load balancer, adding ‘http://’ as a prefix (for example, http://internal-k8s-vpcexamp-vpcexamp-c005e07d1a-1074647274.us-east-1.elb.amazonaws.com).
    Load balancer

Create the Secrets Manager VPC endpoint

You need a VPC endpoint for your application to call Secrets Manager.

  1. In the VPC dashboard, find the Endpoints tab and choose Create Endpoint. Select Secrets Manager as the service, and then select the VPC, private subnets, and security group that you copied in the previous step. Choose Create.

Deploy the event relay application

Deploy the event relay application using the AWS Serverless Application Model (AWS SAM) CLI:

  1. Open a new terminal window and navigate to the event-relay directory. Run the following AWS SAM CLI commands to build the application and step through a guided deployment.
    cd event-relay
    sam build
    sam deploy --guided

    The guided deployment process prompts for input parameters. Enter ‘event-relay-app’ as the stack name and accept the default Region. For other parameters, submit the ALB and VPC details you copied: Url (ALB DNS name), security group ID, and private subnet IDs. For the Secret parameter, pass any value. The AWS SAM template saves this value as a Secrets Manager secret to authenticate calls to the container application. This is an example of how to pass secrets in the event relay HTTP call. Replace this with your own authentication method in production environments.

  2. Accept the defaults for the remaining options. For ‘Deploy this changeset?’, select ‘y’. Here is an example of the deployment parameters.
    Parameters

Test the event relay application

Both the Flask application in a VPC and the event relay application are now deployed. To test the event relay application, keep the Kubernetes pod logs from a previous step open to monitor requests coming into the Flask application.

  1. You can open a new terminal window and run this AWS CLI command to put an event on the bus, or go to the EventBridge console, find your event bus, and use the Send events UI.
    aws events put-events \
    --entries '[{"EventBusName": "event-relay-bus" ,"Source": "eventProducerApp", "DetailType": "inbound-event-sent", "Detail": "{ \"event-id\": \"123\", \"return-response-event\": true }"}]'

    When the event is relayed to the Flask application, a POST request in the Kubernetes pod logs confirms that the application processed the event.

    Terminal response

  2. Navigate to the CloudWatch Logs groups in the AWS Management Console. In the ‘/aws/events/event-bus-relay-logs’ group, there are logs for the EventBridge events. In ‘/aws/lambda/EventRelayFunction’ stream, the Lambda function relays the inbound event and puts a new outbound event on the EventBridge bus.
  3. You can test the SQS dead letter queue by creating an error. For example, you can manually change the Lambda function code in the console to pass an incorrect value for the secret. After sending a test event, navigate to the SQS queue in the console and poll for messages. The message shows the error message from the Flask application and the full event that failed to process.

Cleaning up

In the VPC dashboard in the AWS Management Console, find the Endpoints tab and delete the Secrets Manager VPC endpoint. Next, run the following commands to delete the rest of the example application. Be sure to run the commands in this order as some of the resources have dependencies on one another.

sam delete --stack-name event-relay-app
kubectl --namespace vpc-example-app delete ingress vpc-example-app-ingress

From the example-vpc-application directory, run this command.

eksctl delete cluster --config-file eksctl_config.yaml

Conclusion

Event-driven architectures and EventBridge can help you accelerate feature velocity and build scalable, fault tolerant applications. This post demonstrates how to send EventBridge events to a private endpoint in a VPC using a Lambda function to relay events and emit response events.

To learn more, read Getting started with event-driven architectures and visit EventBridge tutorials on Serverless Land.

New – High Volume Outbound Communication with Amazon Connect Outbound Campaigns

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/new-high-volume-outbound-communication-with-amazon-connect-outbound-campaigns/

The new high-volume outbound communication capability in Amazon Connect, which was announced at Enterprise Connect last year, is now generally available to all. It is named Amazon Connect outbound campaigns.

If you haven’t heard about Amazon Connect, it is an easy-to-use cloud contact center service that helps companies of any size deliver superior customer service at lower cost. You can read the original blog post Jeff wrote at launch in 2017, with amazing Lego art 🙂

Contact centers not only receive calls and communications, but they also send outbound communications to customers. There are a variety of reasons to send outbound communication: appointment reminders, telemarketing, subscription renewals, and billing reminders. The vast majority of these communications are phone calls, and in many contact centers, agents make the calls manually using customer contact lists in external systems. Since customers only answer about ten percent of calls, these agents can spend nearly half of their time dialing and waiting. This can result in millions of dollars in lost productivity each year for a contact center with as few as 200 agents.

To help you address this challenge, today we are adding Amazon Connect outbound campaigns, a set of high-volume outbound communication capabilities that allows you to proactively reach more of your customers across voice, SMS, and email. This capability gives you a scalable way to proactively reach hundreds to millions of customers, increase your agents’ productivity, and lower your operational costs.

Amazon Connect outbound campaigns delivers a predictive phone dialer. The dialer includes an answering machine detection system powered by machine learning, which automatically detects answering machines for voice calls and passes calls to agents only when a human answers. The dialer also adjusts the call rate depending on factors such as the percentage of calls answered by humans, call duration, and agent availability. No integration is required to benefit from existing Amazon Connect features, such as automated workflows, routing, and machine learning capabilities like Contact Lens. You now have a single system for inbound and outbound communications.

To further refine the customer experience or use multiple channels in your campaigns, for example, to send an SMS or email message to your customers when they do not answer calls, you have the option to use Amazon Pinpoint. Amazon Pinpoint is a flexible and scalable outbound and inbound marketing communications service. It allows you to define customer segments, define the customer journey, define the contact strategy, and more. Amazon Pinpoint is the system handling high-volume SMS and email campaigns.

To better understand how Amazon Connect, Amazon Pinpoint, and other AWS services work together, you can refer to this very detailed blog post.

Let’s show you how it works
Imagine I am a contact center manager, and I want to create an outbound call campaign to target a selected list of customers.

I first import my customer contact list from a spreadsheet on Amazon S3. I may also import it from popular customer relationship management (CRM) and marketing automation applications, such as Marketo, Salesforce, Twilio’s Segment, ServiceNow, Shopify, Zendesk, and Amazon Pinpoint itself.

Amazon Connect outbound campaigns - import contact 2

Then I create a campaign and define some journey parameters: the communication channel, the start time, and the corresponding content, such as a call script, email template, or SMS message. At the scheduled start time, the journey is executed using Amazon Connect for calls or Amazon Pinpoint for SMS or emails, as specified.

Amazon Connect outbound campaigns - create campaign

When I configure the campaign to run in Predictive dial mode, as I mentioned before, the dialer automatically adjusts the dial rate based on the duration of calls and the real-time availability of agents. Once a call is answered, Amazon Connect distinguishes whether it is a live voice or a recorded message and routes the live customer to an available agent in the Amazon Connect agent application, where the agent can see the call script that I specified during setup, along with relevant customer information.

As explained earlier, I may use Amazon Pinpoint to define the customer journey. By doing so, I can combine voice, email, and SMS channels in the same outbound communication campaign to improve the efficiency of my agents and my customer’s experience. For example, a financial institution can use Amazon Connect to send an SMS notification to remind a customer of a missed payment and include a link to request a call back from an agent. When a call is requested, Amazon Connect automatically queues the call, dials the customer’s number, detects their voice, and connects an available agent to the customer.

Amazon Connect outbound campaigns - journey workflow

Amazon Pinpoint allows you to define the details of the customer journey.

Amazon Connect outbound campaigns - setup quiet times

As usual with AWS services, I can analyze contact events sent via Amazon EventBridge. EventBridge is a serverless event bus that makes it easier to build event-driven applications at scale using events generated from your applications, integrated software-as-a-service (SaaS) applications, and AWS services. When filtering or analyzing events posted to EventBridge, I can create metrics such as time to connect to an agent, duration of the contact, and call abandonment rate.
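
As a sketch of that kind of filtering, the following boto3 snippet creates a rule that matches events published by Amazon Connect on the default event bus. The rule name is a placeholder, and you should inspect a sample event before narrowing the pattern further.

import json
import boto3

events = boto3.client("events")

# "aws.connect" is the source on events that Amazon Connect publishes to
# EventBridge. Filter further on detail-type or contact attributes once you
# have inspected the events your instance emits.
events.put_rule(
    Name="connect-contact-events",
    EventPattern=json.dumps({"source": ["aws.connect"]}),
    State="ENABLED",
)
# Attach a target (for example, a Lambda function or a CloudWatch Logs group)
# with put_targets to compute metrics such as time to connect to an agent.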

These metrics help me understand the status of my campaign and ensure compliance with applicable regulations, such as maximum call abandonment rates. I also can use historical reports of these metrics to understand the effectiveness of all my communications campaigns over time.

Amazon Connect outbound campaigns - journey metrics

Speaking of compliance, we do not want anyone to abuse the system, intentionally or not, or to break any local compliance rules.

Access and Compliance
Using automated services to drive outbound communication campaigns is strictly regulated in several countries and territories. For example, the US adopted the Telephone Consumer Protection Act (TCPA) in 1991, and the United Kingdom’s Office of Communications has similar rules.

Amazon Connect outbound campaigns gives you the tools to stay compliant with these regulations and many others. However, just like with traditional IT security, it is a shared responsibility. It is your responsibility to use the service in a compliant manner. We are happy to assist you in addressing specific use cases.

Let’s share two examples to illustrate how Amazon Connect outbound campaigns can help you meet your compliance status: respect quiet time and monitor call abandonment rate.

The use of quiet times allows contact center managers to configure a schedule for channel communications based on the day of the week and the hours of the day. More precise delivery times mean your customers are more likely to engage with the communication, which increases metrics such as open rates for SMS and email, as well as pick-up rates for voice calls. It also allows contact center managers to follow country- and state-level voice dialing legislation. The following screenshot shows how you can configure quiet times using Amazon Pinpoint.

Amazon Connect outbound campaigns - quiet times

According to TCPA, call abandonment rate is the percentage of calls picked up by a live customer but not connected to a live agent within two seconds after the customer greeting. I found it interesting that in the UK, the time is measured from the start of the customer greeting, while in the US, it is measured from the end of the greeting. Amazon Connect outbound campaigns provides you with metrics, such as customerGreetingStart, customerGreetingStop, and connectedToAgent for each outbound communication. Contact center managers can use these to compute the abandonment rate and dial the outgoing communication channel up or down accordingly.
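
To illustrate how these timestamps can be combined, here is a small, hypothetical Python helper that computes an abandonment rate from per-contact records carrying customerGreetingStop and connectedToAgent timestamps. The two-second threshold measured from the end of the greeting follows the US interpretation described above, and the record layout is an assumption for illustration only.

from datetime import datetime, timedelta

ABANDON_THRESHOLD = timedelta(seconds=2)

def abandonment_rate(contacts):
    # contacts: records for calls picked up by a live customer, each with an
    # ISO-8601 'customerGreetingStop' and optional 'connectedToAgent' timestamp
    abandoned = 0
    for contact in contacts:
        greeting_stop = datetime.fromisoformat(contact["customerGreetingStop"])
        connected = contact.get("connectedToAgent")
        if connected is None:
            abandoned += 1
        elif datetime.fromisoformat(connected) - greeting_stop > ABANDON_THRESHOLD:
            abandoned += 1
    return abandoned / len(contacts) if contacts else 0.0

# Example: one call connected within two seconds, one never connected
print(abandonment_rate([
    {"customerGreetingStop": "2022-06-01T10:00:00", "connectedToAgent": "2022-06-01T10:00:01"},
    {"customerGreetingStop": "2022-06-01T10:05:00"},
]))  # 0.5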

Other metrics, configuration parameters, and AWS Lambda API integration allow contact center managers to consult a Do-Not-Call (DNC) registry, perform list scrubbing, and verify a customer’s local time zone or bank holiday calendars, to name a few.

Pricing and Availability
Amazon Connect outbound campaigns is available in US East (N. Virginia), US West (Oregon), Asia Pacific (Sydney), and Europe (London) AWS Regions. This allows you to start your outbound campaigns for customers in the USA, UK, Australia, and New Zealand.

As usual, pricing is based on your usage; you only pay for what you use with no upfront or minimum engagement. The key metric used for pricing is minutes of outbound calls. The pricing page has all the details.

And now, go build your contact centers.

— seb

Trigger an AWS Glue DataBrew job based on an event generated from another DataBrew job

Post Syndicated from Nipun Chagari original https://aws.amazon.com/blogs/big-data/trigger-an-aws-glue-databrew-job-based-on-an-event-generated-from-another-databrew-job/

Organizations today have continuous incoming data, and analyzing this data in a timely fashion is becoming a common requirement for data analytics and machine learning (ML) use cases. As part of this, you need clean data in order to gain insights that can enable enterprises to get the most out of their data for business growth and profitability. You can now use AWS Glue DataBrew, a visual data preparation tool that makes it easy to transform and prepare datasets for analytics and ML workloads.

As we build these data analytics pipelines, we can decouple the jobs by building event-driven analytics and ML workflow pipelines. In this post, we walk through how to trigger a DataBrew job automatically on an event generated from another DataBrew job using Amazon EventBridge and AWS Step Functions.

Overview of solution

The following diagram illustrates the architecture of the solution. We use AWS CloudFormation to deploy an EventBridge rule, an Amazon Simple Queue Service (Amazon SQS) queue, and Step Functions resources to trigger the second DataBrew job.

The steps in this solution are as follows:

  1. Import your dataset to Amazon Simple Storage Service (Amazon S3).
  2. DataBrew queries the data from Amazon S3 by creating a recipe and performing transformations.
  3. The first DataBrew recipe job writes the output to an S3 bucket.
  4. When the first recipe job is complete, it triggers an EventBridge event.
  5. A Step Functions state machine is invoked based on the event, which in turn invokes the second DataBrew recipe job for further processing.
  6. The event is delivered to the dead-letter queue if the rule in EventBridge can’t invoke the state machine successfully.
  7. DataBrew queries data from an S3 bucket by creating a recipe and performing transformations.
  8. The second DataBrew recipe job writes the output to the same S3 bucket.
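
The rule that step 4 refers to is created for you by the CloudFormation stack described later in this post. As a sketch of what it matches, the event pattern could look like the following boto3 snippet; the detail-type and field names are assumptions, so inspect an actual event from your first job before relying on them.

import json
import boto3

events = boto3.client("events")

# Match a successful run of the first recipe job. Verify the detail-type and
# state values against a real DataBrew job state change event in your account.
pattern = {
    "source": ["aws.databrew"],
    "detail-type": ["DataBrew Job State Change"],
    "detail": {"jobName": ["marketing-campaign-job1"], "state": ["SUCCEEDED"]},
}

events.put_rule(
    Name="databrew-job1-succeeded",
    EventPattern=json.dumps(pattern),
    State="ENABLED",
)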

Prerequisites

To use this solution, you need the following prerequisites:

Load the dataset into Amazon S3

For this post, we use the Credit Card customers sample dataset from Kaggle. This data consists of 10,000 customers, including their age, salary, marital status, credit card limit, credit card category, and more. Download the sample dataset and follow the instructions. We recommend creating all your resources in the same account and Region.

Create a DataBrew project

To create a DataBrew project, complete the following steps:

  1. On the DataBrew console, choose Projects and choose Create project.
  2. For Project name, enter marketing-campaign-project-1.
  3. For Select a dataset, select New dataset.
  4. Under Data lake/data store, choose Amazon S3.
  5. For Enter your source from S3, enter the S3 path of the sample dataset.
  6. Select the dataset CSV file.
  7. Under Permissions, for Role name, choose an existing IAM role created during the prerequisites or create a new role.
  8. For New IAM role suffix, enter a suffix.
  9. Choose Create project.

After the project is opened, a DataBrew interactive session is created. DataBrew retrieves sample data based on your sampling configuration selection.

Create the DataBrew jobs

Now we can create the recipe jobs.

  1. On the DataBrew console, in the navigation pane, choose Projects.
  2. On the Projects page, select the project marketing-campaign-project-1.
  3. Choose Open project and choose Add step.
  4. In this step, we choose Delete to drop the columns that aren’t required for this exercise.

You can choose from over 250 built-in functions to merge, pivot, and transpose the data without writing code.

  1. Select the columns to delete and choose Apply.
  2. Choose Create job.
  3. For Job name, enter marketing-campaign-job1.
  4. Under Job output settings, for File type, choose your final storage format (for this post, we choose CSV).
  5. For S3 location, enter your final S3 output bucket path.
  6. Under Settings, for File output storage, select Replace output files for each job run.
  7. Choose Save.
  8. Under Permissions, for Role name, choose an existing role created during the prerequisites or create a new role.
  9. Choose Create job.

Now we repeat the same steps to create another DataBrew project and DataBrew job.

  1. For this post, I named the second project marketing-campaign-project2 and named the job marketing-campaign-job2.
  2. When you create the new project, this time use the job1 output file location as the new dataset.
  3. For this job, we deselect Unknown and Uneducated in the Education_Level column.

Deploy your resources using CloudFormation

For a quick start of this solution, we deploy the resources with a CloudFormation stack. The stack creates the EventBridge rule, SQS queue, and Step Functions state machine in your account to trigger the second DataBrew job when the first job runs successfully.

  1. Choose Launch Stack:
  2. For DataBrew source job name, enter marketing-campaign-job1.
  3. For DataBrew target job name, enter marketing-campaign-job2.
  4. For both IAM role configurations, make the following choice:
    1. If you choose Create a new Role, the stack automatically creates a role for you.
    2. If you choose Attach an existing IAM role, you must populate the IAM role ARN manually in the following field or else the stack creation fails.
  5. Choose Next.
  6. Select the two acknowledgement check boxes.
  7. Choose Create stack.

Test the solution

To test the solution, complete the following steps:

  1. On the DataBrew console, choose Jobs.
  2. Select the job marketing-campaign-job1 and choose Run job.

This action automatically triggers the second job, marketing-campaign-job2, via EventBridge and Step Functions.

  1. When both jobs are complete, open the output link for marketing-campaign-job2.

You’re redirected to the Amazon S3 console to access the output file.

In this solution, we created a workflow that required minimal code. The first job triggers the second job, and both jobs deliver the transformed data files to Amazon S3.

Clean up

To avoid incurring future charges, delete all the resources created during this walkthrough:

  • IAM roles
  • DataBrew projects and their associated recipe jobs
  • S3 bucket
  • CloudFormation stack

Conclusion

In this post, we walked through how to use DataBrew along with EventBridge and Step Functions to run a DataBrew job that automatically triggers another DataBrew job. We encourage you to use this pattern for event-driven pipelines where you can build sequence jobs to run multiple jobs in conjunction with other jobs.


About the Authors

Nipun Chagari is a Senior Solutions Architect at AWS, where he helps customers build highly available, scalable, and resilient applications on the AWS Cloud. He is passionate about helping customers adopt serverless technology to meet their business objectives.

Prarthana Angadi is a Software Development Engineer II at AWS, where she has been expanding what is possible with code in order to make life more efficient for AWS customers.

Extending PowerShell on AWS Lambda with other services

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/extending-powershell-on-aws-lambda-with-other-services/

This post expands on the functionality introduced with the PowerShell custom runtime for AWS Lambda. The previous blog explains how the custom runtime approach makes it easier to run Lambda functions written in PowerShell.

You can add additional functionality to your PowerShell serverless applications by importing PowerShell modules, which are shareable packages of code. Build your own modules or import from the wide variety of existing vendor modules to manage your infrastructure and applications.

You can also take advantage of the event-driven nature of Lambda, which allows you to run Lambda functions in response to events. Events can include an object being uploaded to Amazon S3, a message placed on an Amazon SQS queue, a scheduled task using Amazon EventBridge, or an HTTP request from Amazon API Gateway. Lambda functions support event triggers from over 200 AWS services and software as a service (SaaS) applications.

Adding PowerShell modules

You can add PowerShell modules from a number of locations. These can include modules from the AWS Tools for PowerShell, from the PowerShell Gallery, or your own custom modules. Lambda functions access these PowerShell modules within specific folders within the Lambda runtime environment.

You can include PowerShell modules via Lambda layers, within your function code package, or in a container image. When using .zip archive functions, you can use layers to package modules and share them across functions. Layers reduce the size of uploaded deployment archives and can make it faster to deploy your code. You can attach up to five layers to your function, one of which must be the PowerShell custom runtime layer. You can include multiple modules per layer.

The custom runtime configures PowerShell’s PSModulePath environment variable, which contains the list of folder locations to search to find modules. The runtime searches the folders in the following order:

1. User supplied modules as part of function package

You can include PowerShell modules inside the published Lambda function package in a /modules subfolder.

2. User supplied modules as part of Lambda layers

You can publish Lambda layers that include PowerShell modules in a /modules subfolder. This allows you to share modules across functions and accounts. Lambda extracts layers to /opt within the Lambda runtime environment so the modules are located in /opt/modules. This is the preferred solution to use modules with multiple functions.

3. Default/user supplied modules supplied with PowerShell

You can also include additional default modules and add them within a /modules folder within the PowerShell custom runtime layer.

For example, the following function includes four Lambda layers. One layer includes the custom runtime. Three additional layers include further PowerShell modules; the AWS Tools for PowerShell, your own custom modules, and third-party modules. You can also include additional modules with your function code.

Lambda layers

Within your PowerShell code, you can load modules during the function initialization (init) phase. This initializes the modules before the handler function runs, which speeds up subsequent warm-start invocations.

Adding modules from the AWS Tools for PowerShell

This post shows how to use the AWS Tools for PowerShell to manage your AWS services and resources. The tools are packaged as a set of PowerShell modules that are built on the functionality exposed by the AWS SDK for .NET. You can follow similar packaging steps to add other modules to your functions.

The AWS Tools for PowerShell are available as three distinct packages:

The AWS.Tools package is the preferred modularized version, which allows you to load only the modules for the services you want to use. This reduces package size and function memory usage. The AWS.Tools cmdlets support auto-importing modules without having to call Import-Module first. However, specifically importing the modules during the function init phase is more efficient and can reduce subsequent invoke duration. The AWS.Tools.Common module is required and provides cmdlets for configuration and authentication that are not service specific.

The accompanying GitHub repository contains the code for the custom runtime, along with a number of example applications. There are also module build instructions for adding a number of common PowerShell modules as Lambda layers, including AWS.Tools.

Building an event-driven PowerShell function

The repository contains an example of an event-driven demo application that you can build using serverless services.

A clothing printing company must manage its t-shirt size and color inventory. The printers store t-shirt orders for each day in a CSV file. The inventory service is one service that must receive the CSV file. It parses the file and, for each order, records the details to manage stock deliveries.

The stores upload the files to S3. This automatically invokes a PowerShell Lambda function, which is configured to respond to the S3 ObjectCreated event. The Lambda function receives the S3 object location as part of the $LambdaInput event object. It uses the AWS Tools for PowerShell to download the file from S3. It parses the contents and, for each line in the CSV file, sends the individual order details as an event to an EventBridge event bus.

In this example, there is a single rule to log the event to Amazon CloudWatch Logs to show the received event. However, you could route each order, depending on the order details, to different targets. For example, you can send different color combinations to SQS queues, which the dyeing service can use to order dyes. You could send particular size combinations to another Lambda function that manages cloth orders.

Example event-driven application

The previous blog post shows how to use the AWS Serverless Application Model (AWS SAM) to build a Lambda layer, which includes only the AWS.Tools.Common module to run Get-AWSRegion. To build a PowerShell application to process objects from S3 and send events to EventBridge, you can extend this functionality by also including the AWS.Tools.S3 and AWS.Tools.EventBridge modules in a Lambda layer.

Lambda layers, including S3 and EventBridge

Building the AWS Tools for PowerShell layer

You could choose to add these modules and rebuild the existing layer. However, the example in this post creates a new Lambda layer to show how you can have different layers for different module combinations of AWS.Tools. The example also adds the Lambda layer Amazon Resource Name (ARN) to AWS Systems Manager Parameter Store to track deployed layers. This allows you to reference them more easily in infrastructure as code tools.

The repository includes build scripts for both Windows and non-Windows developers. Windows does not natively support Makefiles. When using Windows, you can use either Windows Subsystem for Linux (WSL), Docker Desktop, or native PowerShell.

When using Linux, macOS, WSL, or Docker, the Makefile builds the Lambda layers. After downloading the modules, it also extracts the additional AWS.Tools.S3 and AWS.Tools.EventBridge modules.

# Download AWSToolsLayer module binaries
curl -L -o $(ARTIFACTS_DIR)/AWS.Tools.zip https://sdk-for-net.amazonwebservices.com/ps/v4/latest/AWS.Tools.zip
mkdir -p $(ARTIFACTS_DIR)/modules

# Extract select AWS.Tools modules (AWS.Tools.Common required)
unzip $(ARTIFACTS_DIR)/AWS.Tools.zip 'AWS.Tools.Common/**/*' -d $(ARTIFACTS_DIR)/modules/
unzip $(ARTIFACTS_DIR)/AWS.Tools.zip 'AWS.Tools.S3/**/*' -d $(ARTIFACTS_DIR)/modules/
unzip $(ARTIFACTS_DIR)/AWS.Tools.zip 'AWS.Tools.EventBridge/**/*' -d $(ARTIFACTS_DIR)/modules/

When using native PowerShell on Windows to build the layer, the build-AWSToolsLayer.ps1 script performs the same file copy functionality as the Makefile. You can use this option for Windows without WSL or Docker.

### Extract entire AWS.Tools modules to stage area but only move over select modules
…
Move-Item "$PSScriptRoot\stage\AWS.Tools.Common" "$PSScriptRoot\modules\" -Force
Move-Item "$PSScriptRoot\stage\AWS.Tools.S3" "$PSScriptRoot\modules\" -Force
Move-Item "$PSScriptRoot\stage\AWS.Tools.EventBridge" "$PSScriptRoot\modules\" -Force

The Lambda function code imports the required modules in the function init phase.

Import-Module "AWS.Tools.Common"
Import-Module "AWS.Tools.S3"
Import-Module "AWS.Tools.EventBridge"

For other combinations of AWS.Tools, amend the example build-AWSToolsLayer.ps1 scripts to add the modules you require. You can use a similar download and copy process, or PowerShell’s Save-Module to build layers for modules from other locations.

Building and deploying the event-driven serverless application

Follow the instructions in the GitHub repository to build and deploy the application.

The demo application uses AWS SAM to deploy the following resources:

  1. PowerShell custom runtime.
  2. Additional Lambda layer containing the AWS.Tools.Common, AWS.Tools.S3, and AWS.Tools.EventBridge modules from AWS Tools for PowerShell. The layer ARN is stored in Parameter Store.
  3. S3 bucket to store CSV files.
  4. Lambda function triggered by S3 upload.
  5. Custom EventBridge event bus and rule to send events to CloudWatch Logs.

Testing the event-driven application

Use the AWS CLI or AWS Tools for PowerShell to copy the sample CSV file to S3. Replace BUCKET_NAME with your S3 SourceBucket Name from the AWS SAM outputs.

AWS CLI

aws s3 cp .\test.csv s3://BUCKET_NAME

AWS Tools for PowerShell

Write-S3Object -BucketName BUCKET_NAME -File .\test.csv

The S3 file copy action generates an S3 notification event. This invokes the PowerShell Lambda function, passing the S3 file location details as part of the function $LambdaInput event object.

The function downloads the S3 CSV file, parses the contents, and sends the individual lines to EventBridge, which logs the events to CloudWatch Logs.

Navigate to the CloudWatch Logs group /aws/events/demo-s3-lambda-eventbridge.

You can see the individual orders logged from the CSV file.

EventBridge logs showing CSV lines

Conclusion

You can extend PowerShell Lambda applications to provide additional functionality.

This post shows how to import your own or vendor PowerShell modules and explains how to build Lambda layers for the AWS Tools for PowerShell.

You can also take advantage of the event-driven nature of Lambda to run Lambda functions in response to events. The demo application shows how a clothing printing company builds a PowerShell serverless application to manage its t-shirt size and color inventory.

See the accompanying GitHub repository, which contains the code for the custom runtime, along with additional installation options and additional examples.

Start running PowerShell on Lambda today.

For more serverless learning resources, visit Serverless Land.

Testing Amazon EventBridge events using AWS Step Functions

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/testing-amazon-eventbridge-events-using-aws-step-functions/

This post is written by Siarhei Kazhura, Solutions Architect and Riaz Panjwani, Solutions Architect.

Amazon EventBridge is a serverless event bus that can be used to ingest and process events from a variety of sources, such as AWS services and SaaS applications. With EventBridge, developers can build loosely coupled and independently scalable event-driven applications.

With EventBridge, it can be useful to know when events fail to reach their intended destination. This can be caused by multiple factors, such as:

  1. Event pattern does not match the event rule
  2. Event transformer failure
  3. Event destination expects a different payload (for example, API destinations) and returns an error

EventBridge sends metrics to Amazon CloudWatch, which allows for the detection of failed invocations on a given event rule. You can also use EventBridge rules with a dead-letter queue (DLQ) to identify any failed event deliveries. The messages delivered to the queue contain additional metadata such as error codes, error messages, and the target ARN for debugging.
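
For example, a quick way to inspect what has landed in the DLQ is to read the messages and their attributes with boto3; the queue URL below is a placeholder:

import boto3

sqs = boto3.client("sqs")

# Placeholder queue URL for the EventBridge rule's DLQ
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/eventbridge-dlq"

response = sqs.receive_message(
    QueueUrl=QUEUE_URL,
    MaxNumberOfMessages=10,
    MessageAttributeNames=["All"],
)

for message in response.get("Messages", []):
    # Delivery metadata (error code, error message, rule and target ARNs)
    # arrives as message attributes; the original event is in the body
    print(message.get("MessageAttributes"))
    print(message["Body"])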

However, understanding why events fail to deliver is still a manual process. Checking CloudWatch metrics for failures and then inspecting the DLQ takes time. This is evident when developing new functionality, when you must constantly update the event matching patterns and event transformers, and run tests to see if they have the desired effect. The EventBridge sandbox functionality can help with manual testing, but this approach does not scale or help with automated event testing.

This post demonstrates how to automate testing for EventBridge events. It uses AWS Step Functions for orchestration, along with Amazon DynamoDB and Amazon S3 to capture the results of your events, Amazon SQS for the DLQ, and AWS Lambda to invoke the workflows and processing.

Overview

Using the solution provided in this post, users can track an event from its inception to delivery and identify where any issues or errors occur. This solution is also customizable, and can incorporate integration tests against events to test pattern matching and transformations.

Reference architecture

At a high level:

  1. The event testing workflow is exposed via an API Gateway endpoint, and users can send a request.
  2. This request is validated and routed to a Step Functions EventTester workflow, which performs the event test.
  3. The EventTester workflow creates a sample event based on the received payload, and performs multiple tests on the sample event.
  4. The sample event is matched against the rule that is being tested. The results are stored in an Amazon DynamoDB EventTracking table, and the transformed event payload is stored in the TransformedEventPayload Amazon S3 bucket.
  5. The EventTester workflow has an embedded AWS Step Functions workflow called EventStatusPoller. The EventStatusPoller workflow polls the EventTracking table.
  6. The EventStatusPoller workflow has a customizable 10-second timeout. Reaching the timeout may indicate that the event pattern does not match the rule; in that case, the workflow tests the event against the given pattern using the AWS SDK for EventBridge.
  7. After completing the tests, the response is formatted and sent back to the API Gateway. By default, the timeout is set to 15 seconds.
  8. API Gateway processes the response, strips the unnecessary elements, and sends the response back to the issuer. You can use this response to verify if the test event delivery is successful, or identify the reason a failure occurred.

EventTester workflow

After an API call, the test event is sent to the EventTester Express Workflow. This orchestrates the automated testing and returns the results of the test.

In this workflow:

1. The test event is sent to EventBridge to see if the event matches the rule and can be transformed. The result is stored in a DynamoDB table.
2. The PollEventStatus synchronous Express Workflow is invoked. It polls the DynamoDB table until a record with the event ID is found or it reaches the timeout. The configurable timeout is 15 seconds by default.
3. If a record is found, it checks the event status.

From here, there are three possible states. In the first state, if the event status has succeeded:

4. The response from the PollEventStatus workflow is parsed and the payload is formatted.
5. The payload is stored in an S3 bucket.
6. The final response is created, which includes the payload, the event ID, and the event status.
7. The execution is successful, and the final response is returned to the user.

In the second state, if no record is found in the table and the PollEventStatus workflow reaches the timeout:

8. The most likely explanation for reaching the timeout is that the event pattern does not match the rule, so the event is not processed. You can build a test to verify if this is the issue.
9. From the EventBridge SDK, the TestEventPattern call is made to see if the event pattern matches the rule (a minimal sketch of this call follows this list).
10. The results of the TestEventPattern call are checked.
11. If the event pattern does not match the rule, then the issue has been successfully identified and the response is created to be sent back to the user. If the event pattern matches the rule, then the issue has not been identified.
12. The response shows that this is an unexpected error.

In the third state, this acts as a catch-all to any other errors that may occur:

13. The response is created with the details of the unexpected error.
14. The execution has failed, and the final response is sent back to the user.
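
As a rough illustration of the TestEventPattern check in step 9, the following boto3 sketch tests a sample event against a hypothetical pattern. The pattern, event fields, and values are placeholders for illustration, not the solution's actual rule:

import json
import boto3

events = boto3.client("events")

# Hypothetical rule pattern and sample event, for illustration only
pattern = {"source": ["demo.orders"], "detail-type": ["Order Placed"]}

sample_event = {
    "id": "7bf73129-1428-4cd3-a780-95db273d1602",
    "detail-type": "Order Placed",
    "source": "demo.orders",
    "account": "123456789012",
    "time": "2022-01-01T00:00:00Z",
    "region": "us-east-1",
    "resources": [],
    "detail": {"orderId": 1234},
}

response = events.test_event_pattern(
    EventPattern=json.dumps(pattern),
    Event=json.dumps(sample_event),
)

# Result is True when the event matches the pattern
print("Pattern matches:", response["Result"])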

Event testing process

The following diagram shows how events are sent to EventBridge and their results are captured in S3 and DynamoDB. This is the first step of the EventTester workflow:

Event testing process

When the event is tested:

  1. The sample event is received and sent to the EventBridge custom event bus.
  2. A CatchAll rule is triggered, which captures all events on the custom event bus.
  3. All events from the CatchAll rule are sent to a CloudWatch log group, which allows for inspection of the original payload.
  4. The event is also propagated to the EventTesting rule. The event is matched against the rule pattern, and if successful the event is transformed based on the transformer provided.
  5. If the event is matched and transformed successfully, the Lambda function EventProcessor is invoked to process the transformed event payload. You can add additional custom code to this function for further testing of the event (for example, API integration with the transformed payload). A minimal sketch of this kind of function follows the list.
  6. The event status is updated to SUCCESS and the event metadata is saved to the EventTracking DynamoDB table.
  7. The transformed event payload is saved to the TransformedEventPayload S3 bucket.
  8. If there’s an error, EventBridge sends the event to the SQS DLQ.
  9. The Lambda function ErrorHandler polls the DLQ and processes the errors in batches.
  10. The event status is updated to ERROR and the event metadata is saved to the EventTracking DynamoDB table.
  11. The event payload is saved to the TransformedEventPayload S3 bucket.
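
To make steps 5–7 more concrete, here is a minimal sketch of what an EventProcessor-style function might do. The table name, bucket name, and key schema are assumptions for illustration, not the solution's actual resources:

import json
import os
import boto3

dynamodb = boto3.resource("dynamodb")
s3 = boto3.client("s3")

# Hypothetical resource names; the real ones are created by the solution's stack
table = dynamodb.Table(os.environ.get("EVENT_TRACKING_TABLE", "EventTracking"))
bucket = os.environ.get("PAYLOAD_BUCKET", "transformed-event-payload")


def handler(event, context):
    """Record a transformed event: status in DynamoDB, payload in S3."""
    event_id = event["id"]

    # Step 6: update the event status and metadata
    table.put_item(Item={"eventId": event_id, "status": "SUCCESS"})

    # Step 7: store the transformed payload for later inspection
    s3.put_object(
        Bucket=bucket,
        Key=f"payloads/{event_id}.json",
        Body=json.dumps(event).encode("utf-8"),
    )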

EventStatusPoller workflow

When the poller runs:

  1. It checks the DynamoDB table to see if the event has been processed (a rough sketch of this check appears after this list).
  2. The result of the poll is checked.
  3. If the event has not been processed, the workflow loops and polls the DynamoDB table again.
  4. If the event has been processed, the results of the event are passed to the next step in the Event Testing workflow.

Visit Composing AWS Step Functions to abstract polling of asynchronous services for additional details.
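
The check in step 1 is essentially a DynamoDB lookup. A rough Python equivalent of a single poll iteration, assuming a hypothetical eventId partition key and status attribute, looks like this:

import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("EventTracking")  # hypothetical table name


def get_event_status(event_id):
    """Return the tracked status for an event, or None if it has not been processed yet."""
    response = table.get_item(Key={"eventId": event_id})
    item = response.get("Item")
    return item["status"] if item else None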

Testing at scale

The EventTester workflow uses Express Workflows, which can handle testing high volume event workloads. For example, you can run the solution against large volumes of historical events stored in S3 or CloudWatch.

This can be achieved by using services such as Lambda or AWS Fargate to read the events in batches and run tests simultaneously. To achieve optimal performance, some performance tuning may be required depending on the scale and events that are being tested.

To minimize the cost of the demo, the DynamoDB table is provisioned with 5 read capacity units and 5 write capacity units. For a production system, consider using on-demand capacity, or update the provisioned table capacity.

Event sampling

In this implementation, the EventBridge EventTester can be used to periodically sample events from your system for testing:

  1. Any existing rules that must be tested are provisioned via the AWS CDK.
  2. The sampling rule is added to an existing event bus, and has the same pattern as the rule that is tested. This filters out events that are not processed by the tested rule.
  3. An SQS queue is used for buffering.
  4. A Lambda function processes events in batches, and can optionally implement sampling. For example, a 10% sampling rate takes one random message out of every 10 messages in a given batch (a sketch of this follows below).
  5. The event is tested against the endpoint provided. Note that the EventTesting rule is also provisioned via AWS CDK from the same code base as the tested rule. The tested rule is replicated into the EventTesting workflow.
  6. The result is returned to a Lambda function, and is then sent to CloudWatch Logs.
  7. A metric is set based on the number of ERROR responses in the logs.
  8. An alarm is configured when the ERROR metric crosses a provided threshold.

This sampling can complement existing metrics exposed for EventBridge via CloudWatch.
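
As a sketch of the sampling logic in step 4, the following Lambda handler forwards roughly 10% of the messages in each SQS batch to the tester endpoint. The endpoint URL and sampling rate are placeholders, not part of the published solution:

import random
import urllib.request

SAMPLE_RATE = 0.10  # test roughly 1 in 10 messages; adjust as needed
ENDPOINT = "https://example.execute-api.us-east-1.amazonaws.com/prod/test"  # placeholder


def handler(event, context):
    """Forward a random sample of each SQS batch to the EventTester endpoint."""
    for record in event["Records"]:
        if random.random() > SAMPLE_RATE:
            continue  # skip messages outside the sample
        request = urllib.request.Request(
            ENDPOINT,
            data=record["body"].encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(request) as response:
            print("Tester response status:", response.status)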

Solution walkthrough

To follow the solution walkthrough, visit the solution repository. The walkthrough explains:

  1. Prerequisites required.
  2. Detailed solution deployment walkthrough.
  3. Solution customization and testing.
  4. Cleanup process.
  5. Cost considerations.

Conclusion

This blog post outlines how to use Step Functions, Lambda, SQS, DynamoDB, and S3 to create a workflow that automates the testing of EventBridge events. With this example, you can send events to the EventBridge Event Tester endpoint to verify that event delivery is successful or identify the root cause for event delivery failures.

For more serverless learning resources, visit Serverless Land.

Monitoring and alerting break-glass access in an AWS Organization

Post Syndicated from Haresh Nandwani original https://aws.amazon.com/blogs/architecture/monitoring-and-alerting-break-glass-access-in-an-aws-organization/

Organizations building enterprise-scale systems require the setup of a secure and governed landing zone to deploy and operate their systems. A landing zone is a starting point from which your organization can quickly launch and deploy workloads and applications with confidence in your security and infrastructure environment as described in What is a landing zone?. Nationwide Building Society (Nationwide) is the world’s largest building society. It is owned by its 16 million members and exists to serve their needs. The Society is one of the UK’s largest providers for mortgages, savings and current accounts, as well as being a major provider of ISAs, credit cards, personal loans, insurance, and investments.

For one of its business initiatives, Nationwide utilizes AWS Control Tower to build and operate their landing zone which provides a well-established pattern to set up and govern a secure, multi-account AWS environment. Nationwide operates in a highly regulated industry and our governance assurance requires adequate control of any privileged access to production line-of-business data or to resources which have access to them. We chose for this specific business initiative to deploy our landing zone using AWS Organizations, to benefit from ongoing account management and governance as aligned with AWS implementation best practices. We also utilized AWS Single Sign-On (AWS SSO) to create our workforce identities in AWS once and manage access centrally across our AWS Organization. In this blog, we describe the integrations required across AWS Control Tower and AWS SSO to implement a break-glass mechanism that makes access reporting publishable to system operators as well as to internal audit systems and processes. We will outline how we used AWS SSO for our setup as well as the three architecture options we considered, and why we went with the chosen solution.

Sourcing AWS SSO access data for near real-time monitoring

In our setup, we have multiple AWS Accounts and multiple trails on each of these accounts. Users will regularly navigate across multiple accounts as they operate our infrastructure, and their journeys are marked across these multiple trails. Typically, AWS CloudTrail would be our chosen resource to clearly and unambiguously identify account or data access.  The key challenge in this scenario was to design an efficient and cost-effective solution to scan these trails to help identify and report on break-glass user access to account and production data. To address this challenge, we developed the following two architecture design options.

Option 1: A decentralized approach that uses AWS CloudFormation StackSets, Amazon EventBridge and AWS Lambda

Our solution entailed a decentralized approach by deploying a CloudFormation StackSet to create, update, or delete stacks across multiple accounts and AWS Regions with a single operation. The StackSet created Amazon EventBridge rules and target AWS Lambda functions. These functions post to EventBridge in our audit account. Our audit account has a set of Lambda functions, triggered from EventBridge by specific events, that format the event message and post it to Slack, our centralized communication platform for this implementation. Figure 1 depicts the overall architecture for this option.

Figure 1. De-centralized logging using Amazon EventBridge and AWS Lambda

Option 2: Use an organization trail in the Organization Management account

This option uses the centralized organization trail in the Organization Management account to source audit data. Details of how to create an organization trail can be found in the AWS CloudTrail User Guide. CloudTrail was configured to send log events to CloudWatch Logs. These events are then sent via Lambda functions to Slack using webhooks. We used a public Terraform module in this GitHub repository to build this Lambda-to-Slack integration. Figure 2 depicts the overall architecture for this option.

Figure 2. Centralized logging pattern using Amazon CloudWatch

This was our preferred option and is the one we finally implemented.

We also evaluated a third option, which was to use the centralized logging and auditing feature enabled by AWS Control Tower. Users authenticate and federate to target accounts from a central location, so it seemed possible to source this information from the centralized logs. However, these log events arrive as .gz-compressed JSON objects, which meant having to expand these archives repeatedly for inspection. We therefore decided against this option.

A centralized, economic, extensible solution to alert of SSO break-glass

Our requirement was to identify break-glass access across any of the access mechanisms supported by AWS, including CLI and User Portal access. To ensure we have comprehensive coverage across all access mechanisms, we identified all the events initiated for each access mechanism:

  1. User Portal/AWS Console access events
    • Authenticate
    • ListApplications
    • ListApplicationProfiles
    • Federate – this event contains the role that the user is federating into
  2. CLI access events
    • CreateToken
    • ListAccounts
    • ListAccountRoles
    • GetRoleCredentials – this event contains the role that the user is federating into

EventBridge is able to initiate actions after events only when the event is trying to perform changes (when the “readOnly” attribute on the event record body equals “false”).

The AWS support team was aware of this attribute and recommended that we change the data flow to one that can initiate actions after any kind of event, regardless of the value of its readOnly attribute. The solution in our case was to send the CloudTrail logs to CloudWatch Logs, which then initiates the Lambda function through a subscription filter that detects the desired event names in the log content.

The filter used is as follows:

{($.eventSource = sso.amazonaws.com) && ($.eventName = Federate||$.eventName = GetRoleCredentials)}

Due to the size of the queries required in CloudWatch Logs, we had to remove the subscription filters and parse the content of the log lines inside the Lambda function instead. To determine which accounts would initiate the notifications, we passed the list of accounts and roles to the function as an environment variable at runtime.
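
A minimal sketch of this kind of function is shown below. It decodes the CloudWatch Logs subscription payload and picks out the SSO events of interest; the environment variable name and the Slack posting step are assumptions, and this is not Nationwide's actual implementation:

import base64
import gzip
import json
import os

# Hypothetical environment variable: comma-separated account IDs to alert on
MONITORED_ACCOUNTS = set(filter(None, os.environ.get("MONITORED_ACCOUNTS", "").split(",")))
BREAK_GLASS_EVENTS = {"Federate", "GetRoleCredentials"}


def handler(event, context):
    """Decode a CloudWatch Logs subscription payload and pick out break-glass SSO events."""
    payload = json.loads(gzip.decompress(base64.b64decode(event["awslogs"]["data"])))

    for log_event in payload["logEvents"]:
        trail_event = json.loads(log_event["message"])
        if trail_event.get("eventSource") != "sso.amazonaws.com":
            continue
        if trail_event.get("eventName") not in BREAK_GLASS_EVENTS:
            continue
        if MONITORED_ACCOUNTS and trail_event.get("recipientAccountId") not in MONITORED_ACCOUNTS:
            continue
        # Format the message and post it to Slack here (omitted in this sketch)
        print("Break-glass access detected:", trail_event["eventName"])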

Considerations with cross-account SSO access

With direct federation, users get an access token. This is most visible in AWS SSO on the access portal ("chiclet") page as "Command line or programmatic access". SSO tokens have a limited lifetime (we use the default of 1 hour). A user does not have to get a new token to access a target resource until the one they are using expires. This means that a user may repeatedly access a target account using the same token during its lifetime. Although the token is made available on the access portal page, the GetRoleCredentials event does not occur until the token is used to authenticate an API call to the target AWS account.

Conclusion

In this blog, we discussed how AWS Control Tower and AWS Single Sign-on enabled Nationwide to build and govern a secure, multi-account AWS environment for one of their business initiatives and centralize access management across our implementation. The integration was important for us to accurately and comprehensively identify and audit break-glass access for our implementation. As a result, we were able to satisfy our security and compliance audit requirements for privileged access to our AWS accounts.

Use direct service integrations to optimize your architecture

Post Syndicated from Jerome Van Der Linden original https://aws.amazon.com/blogs/architecture/use-direct-service-integrations-to-optimize-your-architecture/

When designing an application, you must integrate and combine several AWS services in the most optimized way for an effective and efficient architecture:

  • Optimize for performance by reducing the latency between services
  • Optimize for cost, operability, and sustainability by avoiding unnecessary components and reducing workload footprint
  • Optimize for resiliency by removing potential points of failure
  • Optimize for security by minimizing the attack surface

As stated in the Serverless Application Lens of the Well-Architected Framework, “If your AWS Lambda function is not performing custom logic while integrating with other AWS services, chances are that it may be unnecessary.” In addition, Amazon API Gateway, AWS AppSync, AWS Step Functions, Amazon EventBridge, and Lambda Destinations can directly integrate with a number of services. These optimizations can offer you more value and less operational overhead.

This blog post will show how to optimize an architecture with direct integration.

Workflow example and initial architecture

Figure 1 shows a typical workflow for the creation of an online bank account. The customer fills out a registration form with personal information and adds a picture of their ID card. The application then validates ID and address, and scans if there is already an existing user by that name. If everything checks out, a backend application will be notified to create the account. Finally, the user is notified of successful completion.

Figure 1. Bank account application workflow

The workflow architecture is shown in Figure 2.

Figure 2. Initial account creation architecture

This architecture contains 13 Lambda functions. If you look at the code on GitHub, you can see that five of these Lambda functions are basic and perform simple operations.

Additional Lambda functions perform other tasks, such as verification and validation:

  • One function generates a presigned URL to upload ID card pictures to Amazon Simple Storage Service (Amazon S3)
  • One function uses the Amazon Textract API to extract information from the ID card
  • One function verifies the identity of the user against the information extracted from the ID card
  • One function performs simple HTTP request to a third-party API to validate the address

Finally, four functions concern the WebSocket (connect, message, and disconnect) and notifications to the user.

Opportunities for improvement

If you further analyze the code of the five basic functions (see startWorkflow on GitHub, for example), you will notice that there are actually three lines of fundamental code that start the workflow. The other 38 lines involve imports, input validation, error handling, logging, and tracing. Remember that all this code must be tested and maintained.

import os
import json
import boto3
from aws_lambda_powertools import Tracer
from aws_lambda_powertools import Logger
import re

logger = Logger()
tracer = Tracer()

sfn = boto3.client('stepfunctions')

PATTERN = re.compile(r"^arn:(aws[a-zA-Z-]*)?:states:[a-z]{2}((-gov)|(-iso(b?)))?-[a-z]+-\d{1}:\d{12}:stateMachine:[a-zA-Z0-9-_]+$")

if ('STATE_MACHINE_ARN' not in os.environ
    or os.environ['STATE_MACHINE_ARN'] is None
    or not PATTERN.match(os.environ['STATE_MACHINE_ARN'])):
    raise RuntimeError('STATE_MACHINE_ARN env var is not set or incorrect')

STATE_MACHINE_ARN = os.environ['STATE_MACHINE_ARN']

@logger.inject_lambda_context
@tracer.capture_lambda_handler
def handler(event, context):
    try:
        event['requestId'] = context.aws_request_id

        sfn.start_execution(
            stateMachineArn=STATE_MACHINE_ARN,
            input=json.dumps(event)
        )

        return {
            'requestId': event['requestId']
        }
    except Exception as error:
        logger.exception(error)
        raise RuntimeError('Internal Error - cannot start the creation workflow') from error

After running this workflow several times and reviewing the AWS X-Ray traces (Figure 3), we can see that it takes about 2–3 seconds when functions are warmed:

Figure 3. X-Ray traces when Lambda functions are warmed

Figure 3. X-Ray traces when Lambda functions are warmed

But the process takes around 10 seconds with cold starts, as shown in Figure 4:

Figure 4. X-Ray traces when Lambda functions are cold

Figure 4. X-Ray traces when Lambda functions are cold

We use an asynchronous architecture to avoid waiting time for the user, as this can be a long process. We also use WebSockets to notify the user when it’s finished. This adds some complexity, new components, and additional costs to the architecture. Now let’s look at how we can optimize this architecture.

Improving the initial architecture

Direct integration with Step Functions

Step Functions can directly integrate with some AWS services, including DynamoDB, Amazon SQS, and EventBridge, as well as more than 10,000 APIs from 200+ AWS services. With these integrations, you can replace Lambda functions when they do not provide value. We recommend using Lambda functions to transform data, not to transport data from one service to another.

In our bank account creation use case, there are four Lambda functions we can replace with direct service integrations (see large arrows in Figure 5):

  • Query a DynamoDB table to search for a user
  • Send a message to an SQS queue when the extraction fails
  • Create the user in DynamoDB
  • Send an event on EventBridge to notify the backend

Figure 5. Lambda functions that can be replaced

It is not as clear that we need to replace the other Lambda functions. Here are some considerations:

  • To extract information from the ID card, we use Amazon Textract. It is available through the SDK integration in Step Functions. However, the API’s response provides too much information. We recommend using a library such as amazon-textract-response-parser to parse the result. For this, you’ll need a Lambda function.
  • The identity cross-check performs a simple comparison between the data provided in the web form and the one extracted in the ID card. We can perform this comparison in Step Functions using a Choice state and several conditions. If the business logic becomes more complex, consider using a Lambda function.
  • To validate the address, we query a third-party API. Step Functions cannot directly call a third-party HTTP endpoint, but because it’s integrated with API Gateway, we can create a proxy for this endpoint.

If you only need to retrieve data from an API or make a simple API call, use the direct integration. If you need to implement some logic, use a Lambda function.

Direct integration with API Gateway

API Gateway also provides service integrations. In particular, we can start the workflow without using a Lambda function. In the console, select the integration type “AWS Service”, the AWS service “Step Functions”, the action “StartExecution”, and “POST” method, as shown in Figure 6.

Figure 6. API Gateway direct integration with Step Functions

After that, use a mapping template in the integration request to define the parameters as shown here:

{
  "stateMachineArn":"arn:aws:states:eu-central-1:123456789012:stateMachine: accountCreationWorkflow",
  "input":"$util.escapeJavaScript($input.json('$'))"
}

We can go further and remove the websockets and associated Lambda functions connect, message, and disconnect. By using Synchronous Express Workflows and the StartSyncExecution API, we can start the workflow and wait for the result in a synchronous fashion. API Gateway will then directly return the result of the workflow to the client.
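
Outside of API Gateway, the same synchronous behavior is available through the StartSyncExecution API. A minimal boto3 sketch, using a placeholder state machine ARN and input, looks like this:

import json
import boto3

sfn = boto3.client("stepfunctions")

# Placeholder Express state machine ARN and input
STATE_MACHINE_ARN = (
    "arn:aws:states:eu-central-1:123456789012:stateMachine:accountCreationWorkflow"
)

response = sfn.start_sync_execution(
    stateMachineArn=STATE_MACHINE_ARN,
    input=json.dumps({"firstName": "Jane", "lastName": "Doe"}),
)

# The call blocks until the Express Workflow finishes and returns its result
print(response["status"])
if response["status"] == "SUCCEEDED":
    print(json.loads(response["output"]))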

Final optimized architecture

After applying these optimizations, we have the updated architecture shown in Figure 7. It uses only two Lambda functions out of the initial 13. The rest have been replaced by direct service integrations or implemented in Step Functions.

Figure 7. Final optimized architecture

We were able to remove 11 Lambda functions and their associated fees. In this architecture, the cost is mainly driven by Step Functions, and the main price difference will be your use of Express Workflows instead of Standard Workflows. If you need to keep some Lambda functions, use AWS Lambda Power Tuning to configure your function correctly and benefit from the best price/performance ratio.

One of the main benefits of this architecture is performance. With the final workflow architecture, it now takes about 1.5 seconds when the Lambda function is warmed and 3 seconds on cold starts (versus up to 10 seconds previously), see Figure 8:

Figure 8. X-Ray traces for the final architecture

The process can now be synchronous. It reduces the complexity of the architecture and vastly improves the user experience.

An added benefit is that by reducing the overall complexity and removing the unnecessary Lambda functions, we have also reduced the risk of failures. These can be errors in the code, memory or timeout issues due to bad configuration, lack of permissions, network issues between components, and more. This increases the resiliency of the application and eases its maintenance.

Testing

Testability is an important consideration when building your workflow. Unit testing a Lambda function is straightforward, and you can use your preferred testing framework and validation methods. Adopting a hexagonal architecture also helps remove dependencies on the cloud.

When removing functions and using an approach with direct service integrations, you are by definition directly connected to the cloud. You still must verify that the overall process is working as expected, and validate these integrations.

You can achieve this kind of testing locally using Step Functions Local and the recently announced Mocked Service Integrations. By mocking service integrations, for example, retrieving an item in DynamoDB, you can validate the different paths of your state machine.

You also have to perform integration tests, but this is true whether you use direct integrations or Lambda functions.

Conclusion

This post describes how to simplify your architecture and optimize for performance, resiliency, and cost by using direct integrations in Step Functions and API Gateway. Although the number of Lambda functions was greatly reduced, some remain useful for handling more complex business logic and data transformation. Try this out now by visiting the GitHub repository.


Benefits of migrating to event-driven architecture

Post Syndicated from Talia Nassi original https://aws.amazon.com/blogs/compute/benefits-of-migrating-to-event-driven-architecture/

Two common options when building applications are request-response and event-driven architecture. In request-response architecture, an application’s components communicate via API calls. The client sends a request and expects a response before performing the next task. In event-driven architecture, the client generates an event and can immediately move on to its next task. Different parts of the application then respond to the event as needed.

In this post, you learn about reasons to consider moving from request-response architecture to an event-driven architecture.

Challenges with request-response architecture

When starting to build a new application, many developers default to a request-response architecture. A request-response architecture may tightly integrate components, and those components communicate via synchronous calls. While a request-response approach is often easier to get started with, it can become challenging as your application grows in complexity.

In this post, I review an example request-response ecommerce application and demonstrate the challenges of tightly coupled integrations. Then I show you how building the same application with an event-driven architecture can give you increased scalability, fault tolerance, and developer velocity.

Close coordination between microservices

In a typical ecommerce application that uses a synchronous API, the client makes a request to place an order and the order service sends the request downstream to an invoice service. If successful, the order service responds with a success message or confirmation number.

In this initial stage, this is a straightforward connection between the two services. The challenge comes when you add more services that integrate with the order service.

If you add a fulfillment service and a forecasting service, the order service has more responsibilities and more complexity. The order service must know how to call each service’s API, from the API call structure to the API’s retry semantics. If there are any backwards incompatible changes to the APIs, the order service team must update them. The system forwards heavy traffic spikes to the order service’s dependency, which may not have the same scaling capabilities. Also, dependent services may transmit errors back up the stack to the client.

Error handling and retries

Now, you add new downstream services for fulfillment and shipping orders to the ecommerce application.

In the happy path, everything works as expected: the order service triggers invoicing and payment systems, and updates forecasting. Once payment clears, this triggers the fulfillment and packing of the order, and then informs the shipping service to request tracking information.

However, if the fulfillment center cannot find the product because they are out of stock, then fulfillment might have to alert the invoice service, then reverse the payment or issue a refund. If fulfillment fails, then the system that triggers shipping might fail as well. Forecasting must also be updated to reflect the change. This remediation workflow is all just to address one of the many potential “unhappy paths” that can occur in this API-driven ecommerce application.

Close coordination between development teams

In a synchronously integrated application, teams must coordinate any new services that are added to the application. This can slow down each development team's ability to release new features. Imagine your team works on the payment service, but you weren't told that another team added a new rewards service. What happens now when the fulfillment service errors?

Fulfillment may orchestrate all the other services. Your payments team gets a message and you undo the payment, but you may not know who handles retries and error logic. If the rewards service changes vendors, gets a new API, and does not tell your team, you may not be aware of the change.

Ultimately, it can be hard to coordinate these orchestrations and workflows as systems become more complex and management adds more services. This is one reason that it can be beneficial to migrate to event-driven architecture.

Benefits of event-driven architecture

Event-driven architecture can help solve the problems of the close coordination of microservices, error handling and retries, and coordination between development teams.

Close coordination between microservices

In event-driven architecture, the publisher emits an event, which is acknowledged by the event bus. The event bus routes events to subscribers, which process events with self-contained business logic. There is no direct communication between publishers and subscribers.

Decoupled applications enable teams to act more independently, which can increase their velocity. For example, with an API-based integration, if your team wants to know about a change that happened in another team's microservice, you might have to ask that team to make an API call to your service. Consequently, you may have to account for authentication and coordinate with the other team over the structure of the API call. This causes back and forth between teams, which slows down development time. With an event-driven application, you can subscribe to events sent from that microservice, and the event bus (for example, Amazon EventBridge) takes care of routing the event and handling authentication.
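
As a rough sketch of the publisher side, the following code puts an event onto a custom EventBridge event bus. The bus name, source, and detail payload are illustrative only:

import json
import boto3

events = boto3.client("events")

# Publish an "Order Placed" event; names and values here are placeholders
response = events.put_events(
    Entries=[
        {
            "EventBusName": "orders",
            "Source": "com.example.checkout",
            "DetailType": "Order Placed",
            "Detail": json.dumps({"orderId": 1234, "items": ["tshirt-m-blue"]}),
        }
    ]
)

# Any entries that could not be published are reported here
print(response["FailedEntryCount"])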

Error handling and retries

Another reason to migrate to event-driven architecture is to handle unpredictable traffic. Ecommerce websites like Amazon.com have variable amounts of traffic depending on the day. Once you place an order, several things happen.

First, Amazon checks your credit card to make sure that funds are available. Then, Amazon has to pack the merchandise and load it onto trucks. That all happens in an Amazon fulfillment center. There is no synchronous API call for the Amazon backend to package and ship products. After the system confirms your payment, the front end puts together information describing the event, including your account number, credit card information, and what you bought, packages it into an event, and puts it onto a queue in the cloud. Later, another piece of software removes the event from the queue and starts the packaging and shipping.

The key point about this process is that these processes can all run at different rates. Normally, the rate at which customers place orders and the rate at which the warehouses can get the boxes packed are roughly equivalent. However, on busy days like Prime Day, customers place orders much more quickly than the warehouses can operate.

Ecommerce applications, like Amazon.com, must be able to scale up to handle unpredictable traffic. When a customer places an order, an event bus like Amazon EventBridge receives the event and all of the downstream microservices are able to select the order event for processing. Because each of the microservices can fail independently, there are no single points of failure.

Loose coordination between development teams

Event-driven architectures promote development team independence due to loose coupling between publishers and subscribers. Applications can subscribe to events with routing requirements and business logic that are separate from the publisher and other subscribers. This allows publishers and subscribers to change independently of each other, providing more flexibility to the overall architecture.

Decoupled applications also allow you to build new features faster. Adding new features or extending existing ones can be simpler with event-driven architectures because you either add new events, or modify existing ones. This process removes complexity in your application.

Conclusion

In this post, you learn about the challenges of developing applications with request-response architecture. In request-response architecture, the client must send a request and wait for a response before moving on to its next task. As applications grow in complexity, this tightly coupled architecture can cause issues. Event-driven architectures can increase scalability, fault tolerance, and developer velocity by decoupling components of your application.

For more serverless content, go to serverlessland.com.

Let’s Architect! Serverless architecture on AWS

Post Syndicated from Luca Mezzalira original https://aws.amazon.com/blogs/architecture/lets-architect-serverless-architecture-on-aws/

Serverless architecture and computing allow you and your teams to focus on delivering business value instead of investing time tweaking infrastructure characteristics. AWS not only provides serverless computing as a service, but also shares that half of the new applications built by Amazon use AWS Lambda, as noted by Andy Jassy in his 2020 re:Invent keynote.

In this post, we share insights into reimagining a serverless environment.

I Build Applications – Event-driven Architecture

Event-driven architecture is common in modern applications built with microservices, and it is the cornerstone for designing serverless workloads. It uses events to trigger and communicate between decoupled services.

With this video, you can learn how to start with a prototype and then scale to mass adoption using decoupled systems that run only in response to events, without needing to redesign the application. Danilo Poccia, Chief Evangelist at AWS, begins the session with the APIs, then gives an example of how to build an event-driven architecture using Amazon EventBridge. The session closes with how to understand what is happening in this exchange of events.

Event-driven communication with asynchronous invocation

Building modern cloud applications? Think integration

This re:Invent 2021 session explains modern cloud applications based on serverless or microservices, and how connections between components define important characteristics, like scalability, availability, and coupling.

How your systems are interconnected describes your system’s essential properties, such as resiliency and changeability. Gregor Hohpe, AWS Enterprise Strategist, shares tips on what to consider when integrating different services, such as lifecycle, level of control over the systems you are integrating, and how integration becomes an integral part of your software delivery cycle. The goal is to use the same method to integrate at the same speed as software deployment.

Integration approaches with Gregor Hohpe

Serverless architectural patterns and best practices

Serverless architectures require a mindset shift: existing patterns need to be revisited, and new patterns created using the new architecture style. For each pattern created by AWS, we provide operational, security, and reliability best practices and discuss potential challenges. We also demonstrate some patterns in reference architecture diagrams.

This session helps you identify services and applications to create serverless architectures and understand areas of potential savings, increased agility, and reliability in your organization. Heitor Lessa, Principal Solutions Architect at AWS, starts the session by identifying the benefits of Lambda Power Tuning: he details how to set up memory when there are hundreds of functions, then follows with best practices for the patterns created.

Best practices for serverless architecture

Best practices of advanced serverless developers

This session is an overview of architectural best practices, optimizations, and handy codes that can be used to build secure, scalable, and high-performance serverless applications.

Julian Wood, Senior Developer Advocate at AWS, provides recommended practices for implementing serverless applications inside your company, such as using Lambda to transform, not transport, data; avoiding monolithic services and functions; orchestrating workflows with Step Functions; and choreographing events. Julian also touches on the different ways you can invoke Lambda functions and what you should be aware of with each invocation model.

Three types of AWS Lambda invocation models

Building next-gen applications with event-driven architectures

Maintaining data consistency across multiple services can be challenging. It can also be difficult to work with large amounts of data in different data stores and locations. Teams building microservices architectures often find that integration with other applications and external services can make their workloads more monolithic and tightly coupled.

In this session, you can learn how to use event-based architectures to decouple and decentralize application components. Coupling is not one-dimensional, and it’s a trade-off to balance and optimize over time. This video demonstrates patterns based on message queues and events: for each pattern you can learn the advantages, the disadvantages, and the options for building it on AWS.

Sam Dengler, Principal Solutions Architect at AWS, explains the mental models to apply while designing choreography and orchestration in a scenario with microservices. The strategy adopted by Taco Bell for identifying their bounded contexts is also detailed, as well as the architecture built on Lambda for running the business logic and on AWS Step Functions for orchestration.

Choreography and orchestration are two modes of interaction in a microservices architecture

See you next time!

Thanks for joining our discussion on serverless architecting! If you want to deep dive into the topic, read all about Serverless on AWS!

See you in a couple of weeks when we discuss architecting for resilience!

Looking for more architecture content? AWS Architecture Center provides reference architecture diagrams, vetted architecture solutions, Well-Architected best practices, patterns, icons, and more!


Let’s Architect! Using open-source technologies on AWS

Post Syndicated from Luca Mezzalira original https://aws.amazon.com/blogs/architecture/lets-architect-using-open-source-technologies-on-aws/

With open-source technology, authors make software available to the public, who can view, use, or change it and add new features or support new capabilities. Open-source technology promotes collaboration across different teams, organizations, and people because the process often includes different perspectives and ideas, which typically results in a stronger solution.

It can be difficult to create a multi-use solution when building to solve for a specific challenge. With an open-source project or an initiative, multiple teams work together, which prevents coupling and makes the solution easier to generalize.

In this edition of Let’s Architect!, we show you some open-source technologies built with AWS and options for running well-known, open-source projects on AWS.

Firecracker: Secure and Fast microVMs for Serverless Computing

Firecracker was developed at AWS to improve the customer experience of services like AWS Lambda and AWS Fargate. This technology is used to deploy workloads in lightweight virtual machines (VMs), called microVMs. For example, when a new Lambda function is triggered in response to an event, AWS Lambda provisions a microVM (if none already exists) to handle the request. Behind the scenes, this is powered by Firecracker.

This video introduces Firecracker and the concept of virtual machine monitor as a technology to create and manage microVMs. This talk explains Firecracker’s foundation, the minimal device model, and how it interacts with various containers. You’ll learn about the performance, security, and utilization improvements enabled by Firecracker and how Firecracker is used for Lambda and Fargate.

An example host running Firecracker microVMs

Deep dive into AWS Cloud Development Kit

AWS Cloud Development Kit (CDK) is an open-source software development framework that allows you to define your cloud application resources using familiar programming languages. It uses object-oriented design to create resources and build an end-to-end process for application development from infrastructure and software-development perspectives.

This video introduces AWS CDK core concepts and demonstrates how to create custom resources and deploy them to the cloud. With AWS CDK, you can make deployments repeatable, automate operations through infrastructure as code, and use the software design patterns while coding your architecture.

AWS CDK is an open-source software development framework for defining cloud infrastructure as code

Using Apollo Server on AWS Lambda with Amazon EventBridge for real-time, event-driven streaming

Apollo Server is an open-source, spec-compliant GraphQL server that's compatible with any GraphQL client. This blog post covers how you can architect Apollo Server on AWS Lambda in an event-driven architecture. It shows you how to use Apollo Server on AWS Lambda, integrate it with REST and WebSocket APIs, and communicate asynchronously via an event bus.

Sample application: a chat app that receives a text message from the client and responds with French and German translations of the message

Observability the open-source way

Removing the undifferentiated heavy lifting for implementing open-source software can allow you to plug-and-play your favorite solutions with existing AWS services. This video addresses best practices and real-world use cases for Amazon Managed Service for Prometheus, Amazon Managed Grafana, and AWS Distro for OpenTelemetry to gain observability. Observability is fundamental to collect and analyze data coming from your architecture, understand the status of your system, and take action to improve application performance.

Setting up Amazon Managed Service for Prometheus

See you next time!

See you in a couple of weeks when we discuss strategies for running serverless applications on AWS!

Looking for more architecture content? AWS Architecture Center provides reference architecture diagrams, vetted architecture solutions, Well-Architected best practices, patterns, icons, and more!


Working with events and the Amazon EventBridge schema registry

Post Syndicated from Talia Nassi original https://aws.amazon.com/blogs/compute/working-with-events-and-amazon-eventbridge-schema-registry/

Event-driven architecture, at its core, is driven by producers creating events and subscribers being made aware of those events and acting upon them. An event is a data representation of something that happened elsewhere in the application or from an outside producer. When building event-driven applications, it is critical to determine what events exist in the application, who produces them, and who subscribes to and reacts to them.

The first step in identifying these events is to work through the process of event discovery. In this process, you decide which events the event source produces, and what parts of the application must know about those events. Event schemas describe the structure of an event and the fields it includes. If the event's contents match the event target's requirements, the service sends the event to the target. If you have an existing application and want its event schemas discovered automatically, you can enable EventBridge schema discovery. If you are building a new application, you can conduct an event discovery exercise.

An event bus connects the event to the subscriber. The event bus at AWS is Amazon EventBridge. Use EventBridge to choreograph interactions between event sources and event targets.

In this post, you learn how to perform an event discovery exercise with your team. It shows how to create a schema registry with EventBridge, and how to represent the event as an object in your code to use in your application.

Event discovery

In the event discovery phase, all business stakeholders of an application come together and write down all the possible events that can happen. For example, possible events for an ecommerce application include: Account Created, Item added to cart, Order Placed. Write events in Noun + Past Tense Verb format.

Events:

  • Account Created
  • Item Added
  • Order Placed

Notice that events are not technical or focused on implementation; rather they are real-world things that happened in your system. This is because everyone, from developers to product owners, must understand them. In the event discovery phase, events act as your business requirements. Later, developers then translate those requirements into code.

Once you have the events laid out, you decide who is interested in each event. For example, consider the events listed for the ecommerce application: Account Created, Item Added to cart, and Order Placed.

Events and their subscribers:

  • Account Created: Marketing team, Security team
  • Item Added: Inventory team
  • Order Placed: Fulfillment team

Whenever a user creates an account, the marketing team subscribes to that event to send the account holder promotions. The security team encrypts the account holder’s user name and password into a database to save the account credentials. Whenever a user places an order, the system sends an email notification with the order details to stakeholders. The system also sends a message to the fulfillment team to start packing and shipping the order. This approach decouples the interactions between all of these services. This decoupling is beneficial because it increases developer independence by reducing the dependencies on other teams to write integrations.

Once the initial event schema planning is complete, you want to ensure that developers can continue to build new features without needing to coordinate closely with other teams on event schemas. For this reason, EventBridge's schema registry and automated schema discovery can help developers quickly build new features based on their application's events.

You use EventBridge schema registries to search for, find, and track different schemas generated by your application. You can also automatically find schemas with the automated schema registry.

EventBridge schema registry and discovery

Applications can have many different types of events: events generated from AWS services, third-party SaaS applications, and your own custom applications.

With so many event sources, it can be challenging to know what to expect when consuming events. A schema represents the structure of an event. It describes what happened in the event, where the event came from, and the timestamp. The event schema is important for developers because it shows what data is contained in the event and allows them to write code based on that data.

For example, an Order Placed event might always contain a list of items in the order as an array, and a user ID as an integer. EventBridge helps automate the manual process of finding and documenting schemas. There are two capabilities to highlight: Schema registry and schema discovery.

A schema registry is a repository that stores a collection of schemas. You can use a schema registry to search for, find, and track different schemas used and generated by your application. AWS automatically stores schemas for all AWS sources for EventBridge in your schema registry. SaaS partner and custom schemas can be generated and added to the registry using the schema discovery feature. A schema registry enables you to use events as objects in your code more easily.

Adding an event to the EventBridge schema registry

In this tutorial, you create an Account Created event, which includes a user’s name and email address.

  1. Navigate to the Amazon EventBridge console and choose Schemas from the left panel. There are three types of schemas represented in the tabs: AWS event schema registry, discovered schema registry, and custom schema registry. When you choose AWS event schema registry, you can search for any AWS service or event that is supported by EventBridge. There, you can view the schema for that event.

  2. To create a custom schema registry for your application, navigate to the custom schema registry tab and choose Create registry.
  3. Enter a name for the registry and then choose Create.
  4. There are currently no schemas in the registry. Choose Create custom schema to create one.
  5. Choose your registry as the destination and call the schema “user”. You can choose to load the schema template using an OpenAPI format from the Load Template option. You can then manually enter data for each of the fields.
  6. Alternatively, you can have the service discover the schema from JSON. Remember that events are written in JSON. Choose the Discover from JSON tab and enter the following code: {"id": 1, "name": "Talia Nassi", "emailAddress": "[email protected]"}
  7. Choose Discover schema.
  8. EventBridge extrapolates the schema from this information. The schema shows that the ID is a number, the name is a string, and the email address is a string. Choose Create to create the schema (a boto3 sketch of the same discovery flow follows this list).
  9. When you choose your schema from the schema registry, you can see the structure of the event you just created.
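
If you prefer to script this flow instead of using the console, a rough boto3 sketch is shown below. It infers a schema from a sample event with GetDiscoveredSchema and stores the result with CreateSchema; the registry and schema names mirror the console example above, and the sample email address is a placeholder:

import json
import boto3

schemas = boto3.client("schemas")

registry_name = "myRegistry"  # the registry created in the console steps above
sample_event = {"id": 1, "name": "Talia Nassi", "emailAddress": "user@example.com"}  # placeholder address

# Infer an OpenAPI schema from the sample event, then store it in the registry
discovered = schemas.get_discovered_schema(
    Type="OpenApi3",
    Events=[json.dumps(sample_event)],
)

schemas.create_schema(
    RegistryName=registry_name,
    SchemaName="user",
    Type="OpenApi3",
    Content=discovered["Content"],
)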

Representing events as objects in your code with code bindings

Once a schema is added to the registry, you can download a code binding, which allows you to represent the event as an object in your code. You can take advantage of IDE features such as validation and auto-complete. Code bindings are available for Java, Python, or TypeScript programming languages. You can download bindings from the Amazon EventBridge Console, or directly from your IDE with the AWS Toolkit plugin for IntelliJ and Visual Studio Code.

Choose the programming language you prefer, then choose Download. This downloads the code binding to your local machine.
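Alternatively, the binding source can be fetched programmatically with the Schemas API. This is a rough sketch using boto3; the Language value ('Python36') and the need to wait for binding generation to finish are assumptions to verify against the PutCodeBinding and GetCodeBindingSource documentation:

import boto3

schemas = boto3.client('schemas')

# Ask the service to generate the binding for this schema
schemas.put_code_binding(
    RegistryName='myRegistry',
    SchemaName='user',
    Language='Python36'  # assumed language identifier; check the API reference
)

# Once generation completes (poll describe_code_binding if needed),
# download the binding source as a zip archive
source = schemas.get_code_binding_source(
    RegistryName='myRegistry',
    SchemaName='user',
    Language='Python36'
)
with open('user-binding.zip', 'wb') as f:
    f.write(source['Body'].read())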

You can also choose to download code bindings directly to your IDE with the AWS Toolkit. This tutorial uses VS Code but you can also use IntelliJ.

  1. Ensure you have VS Code installed.
  2. Navigate to the VS Code marketplace and search for AWS and install the AWS Toolkit. You may have to restart VS Code.
  3. Choose a profile to connect to AWS. Set the Region to the same Region that you created the schema in. You see this icon on the left panel when you access the AWS Explorer:
  4. Choose Schemas from the left panel, then choose your registry, myRegistry. Open the user schema by right-clicking it and choosing View Schema.
  5. You can now use this event object in your code.

Conclusion

In this post, you learn about event discovery, schema registry, and schema discovery. Event discovery is essential when creating event-driven applications because it allows the team to see which events are created by your application, and who needs to subscribe to those events.

Events have specific structures, called schemas. Your schema registry includes all of the schemas for your events. You can use the schema registry to search for events produced by other teams, which can make development faster. You learn how to create a custom schema registry, and how to download code bindings to use events in your code.

For more information, visit Serverless Land.

Building an event-driven application with Amazon EventBridge

Post Syndicated from Talia Nassi original https://aws.amazon.com/blogs/compute/building-an-event-driven-application-with-amazon-eventbridge/

In event-driven architecture, services interact with each other through events. An event is something that happened in your application (for example, an item was put into a cart, or a new order was placed). Events are JSON objects that carry information about something that happened in your application. In event-driven architecture, each component of the application raises an event whenever anything changes. Other components listen for events and decide what to do with them and how they would like to react.

When you build applications with event-driven architecture, you decouple your event sources and event targets. This can enable teams to act more independently, because your services are loosely coupled. When you add new features to your applications, you raise new events and then decide on the event source and event target. The event source is what emits the event, and the event target is what subscribes to or receives the event. Decoupling event sources and event targets can greatly speed up development time, and it can simplify making changes to your application.

Decoupling your application can allow for more seamless cross-team collaboration. For example, let’s say you are a developer at an ecommerce company and you are building a serverless ecommerce application. Your team is in charge of the account creation and authentication process. You build the login workflow, and raise an event when a new user creates an account.

When the event is raised, other teams can be alerted. The marketing team can listen for the Account Created event and act on it (for example, send promotional emails). In this decoupled architecture, event producers and consumers don’t have to know about each other. They only have to listen to events and act accordingly when they are interested in an event. This can speed up development by reducing the complexity caused by building new features.
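For example, the account team's login workflow could raise this event with a single PutEvents call. The following is an illustrative sketch; the source name, detail fields, and use of the default event bus are hypothetical choices, not part of the tutorial:

import json
import boto3

events = boto3.client('events')

# Raise an "Account Created" event that any interested team can subscribe to
events.put_events(
    Entries=[
        {
            'Source': 'com.example.accounts',   # hypothetical source name
            'DetailType': 'Account Created',
            'Detail': json.dumps({
                'userId': '1234',               # placeholder values
                'email': 'user@example.com'
            }),
            'EventBusName': 'default'
        }
    ]
)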

In AWS, events are choreographed through Amazon EventBridge rules. A rule matches incoming events from an event source and sends them to event targets for processing.

eventbridge architecture

EventBridge accepts events from many different event sources, including over 200 AWS services, custom events from your Lambda functions or applications, and third-party SaaS applications. You specify an action to take when EventBridge receives an event that matches the event pattern in the rule. When an event matches, Amazon EventBridge sends the event to the specified target and triggers the action defined in the rule.

To route events from these sources to the correct target, the events must be placed on a corresponding event bus. There are three types of event buses. The first type is the default bus, which is always available in every account and is where AWS service events are routed. The second type is a custom event bus, which you can create for your own applications to meet your business needs. Lastly, SaaS event buses are created when you configure SaaS applications as an event source.

There are many potential event targets. Event targets are what the event bus routes events to when a matching event occurs. Targets include AWS Lambda, Amazon Kinesis, AWS Step Functions, Amazon API Gateway, and even event buses in other accounts. This flexible design allows you to create a wide variety of integration patterns based on your specific needs.
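As an illustration of how a rule ties an event pattern to a target, here is a hedged sketch using boto3 that matches S3 "Object Created" events and routes them to a Lambda function. The rule name and function ARN are placeholders, and the function's resource policy must also allow events.amazonaws.com to invoke it:

import json
import boto3

events = boto3.client('events')

# Rule on the default bus that matches S3 "Object Created" events
events.put_rule(
    Name='s3-object-created',
    EventBusName='default',
    State='ENABLED',
    EventPattern=json.dumps({
        'source': ['aws.s3'],
        'detail-type': ['Object Created']
    })
)

# Route matching events to a Lambda function (placeholder ARN)
events.put_targets(
    Rule='s3-object-created',
    EventBusName='default',
    Targets=[
        {
            'Id': 'resize-function',
            'Arn': 'arn:aws:lambda:us-east-1:123456789012:function:resize'
        }
    ]
)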

Configuring events with Amazon EventBridge

This tutorial sends an event from Amazon S3 (the event source) to AWS Lambda (the event target) using an event rule.

In this tutorial, you learn how to configure events with Amazon EventBridge by deploying an AWS Serverless Application Model template. The AWS Serverless Application Model (AWS SAM) is an open-source framework for building serverless applications. It provides shorthand syntax to express functions, APIs, databases, and event source mappings. AWS SAM is an extension of AWS CloudFormation, which is the AWS infrastructure as code tool. You define resources using CloudFormation in your AWS SAM template and use the full suite of resources, intrinsic functions, and other template features that are available in AWS CloudFormation.

First, you upload images to an S3 bucket. This raises an event, which invokes a Lambda function, which resizes the image and places it in a different S3 bucket.
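To make that flow concrete, this is a minimal sketch of what such a resize function could look like. It is not the exact function deployed by the pattern; it assumes Pillow is packaged with the function and that a DESTINATION_BUCKET environment variable names the output bucket:

import io
import os

import boto3
from PIL import Image

s3 = boto3.client('s3')

def handler(event, context):
    # EventBridge delivers the S3 "Object Created" notification in event['detail']
    bucket = event['detail']['bucket']['name']
    key = event['detail']['object']['key']

    # Download the original image and resize it in memory
    original = s3.get_object(Bucket=bucket, Key=key)['Body'].read()
    image = Image.open(io.BytesIO(original))
    image.thumbnail((128, 128))

    buffer = io.BytesIO()
    image.save(buffer, format=image.format or 'PNG')

    # Write the resized image to a different bucket
    s3.put_object(
        Bucket=os.environ['DESTINATION_BUCKET'],
        Key=key,
        Body=buffer.getvalue()
    )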

Prerequisites:

  1. AWS SAM CLI (If you use AWS Cloud9, this is installed for you)

To configure events with Amazon EventBridge:

  1. Navigate to the Serverlessland Patterns Collection and choose the pattern Amazon S3 to Amazon EventBridge to AWS Lambda. This AWS SAM template deploys an S3 bucket, a Lambda function, an EventBridge rule, and the IAM resources required to run the application.
  2. Copy and paste the cloning instructions in your terminal.
  3. Run sam deploy --guided to deploy the pattern.
  4. You see the success message:
  5. Navigate to the EventBridge console and choose Rules from the left panel. Then choose the rule that was created by AWS SAM (starting with sam-app).
    The event source is S3, and the rule matches when an image is put into the source bucket in S3. Next, notice that the event target is the Lambda function that you created from the AWS SAM template.
  6. Navigate to the S3 console and choose Buckets on the left panel. Then choose the bucket that was created for you (starting with sam-app). Choose the Properties tab, and note that the integration with EventBridge is on.
  7. From the Objects tab, choose Upload, and upload an image.

  8. Navigate to the Lambda console and choose your Lambda function (starting with sam-app). Select the Monitor tab, and choose View Logs in CloudWatch.
  9. You can see the event that triggered the Lambda function in the logs:

Adding more event rules to your application

In the previous example, you added an EventBridge rule that routes events from S3 (the event source) to Lambda (the event target) using an event bus. Now, add another rule:

  1. From the EventBridge console, choose Rules, and then choose Create rule.
  2. Enter a name and description for the rule.
  3. Define the event pattern that is used to invoke the event targets. For Service provider, choose AWS, and for Service name choose S3. For Event type, choose Amazon S3 event notification, and in the event dropdown choose Object created. You are configuring the event source to be an object created in your S3 bucket.
  4. Select either the default AWS event bus or a custom event bus.
  5. Select the event target. In this example, configure an Amazon CloudWatch log group. Enter any name for the log group, which is created automatically.
  6. Choose Create.
  7. Upload an image to the S3 bucket, as shown in step 7 above.
  8. Navigate to the Amazon CloudWatch console and choose Log groups from the left panel. Choose the log group, and then choose a log stream.
  9. The event is logged to CloudWatch Logs.

Adding a second event rule did not change the event source’s behavior or affect other event targets.

Conclusion

This post is a brief introduction to event-driven architecture, and walks through a tutorial where you create an event-driven application with the Serverlessland Patterns Collection. You also add two different event rules to your event bus.

For more serverless learning resources, visit Serverless Land.

Introducing global endpoints for Amazon EventBridge

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/introducing-global-endpoints-for-amazon-eventbridge/

This post is written by Stephen Liedig, Sr Serverless Specialist SA.

Last year, AWS announced two new features for Amazon EventBridge that allow you to route events from any commercial AWS Region, and across your AWS accounts. This supported a wide range of use cases, allowing you to easily implement global event delivery and replication scenarios.

From today, EventBridge extends this capability with global endpoints. Global endpoints provide a simpler and more reliable way for you to improve the availability and reliability of event-driven applications. The feature allows you to fail over event ingestion automatically to a secondary Region during service disruptions. Global endpoints also provide optional managed event replication, simplifying your event bus configuration and reducing the risk of event loss during any service disruption.

This blog post explains how to configure global endpoints in your AWS account, update your applications to publish events to the endpoint, and how to test endpoint failover.

How global endpoints work

Customers building multi-Region architectures today are building more resilience by using self-managed replication via EventBridge’s cross-Region capabilities. With this architecture, events are sent directly to an event bus in a primary Region and replicated to another event bus in a secondary Region.

Architecture

This event flow can be interrupted if there is a service disruption. In this scenario, event producers in the primary Region cannot PutEvents to their event bus, and event replication to the secondary Region is impacted.

To put more resiliency around multi-Region architectures, you can now use global endpoints. Global endpoints solve these issues by introducing two core service capabilities:

  1. A global endpoint is a managed Amazon Route 53 DNS endpoint. It routes events to the event buses in either Region, depending on the health of the service in the primary Region.
  2. There is a new EventBridge metric called IngestionToInvocationStartLatency. This exposes the time to process events from the point at which they are ingested by EventBridge to the point the first invocation of a target in your rules is made. This is a service-level metric measured across all of your rules and provides an indication of the health of the EventBridge service. Any extended periods of high latency over 30 seconds may indicate a service disruption.

These two features provide you with the ability to fail over event ingestion automatically to the event bus in the secondary Region. The failover is triggered via a Route 53 health check that monitors a CloudWatch alarm that observes the IngestionToInvocationStartLatency metric in the primary Region.

If the metric exceeds the configured threshold of 30 seconds consecutively for 5 minutes, the alarm state changes to “ALARM”. This causes the Route 53 health check state to become unhealthy, and updates the routing of the global endpoint. All events from that point on are delivered to the event bus in the secondary Region.
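As a rough approximation of what this looks like in practice, an equivalent alarm could be created with boto3 as follows. The managed template that EventBridge provides has its own recommended settings, so treat the alarm name and missing-data behavior here as assumptions:

import boto3

cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')  # primary Region

# Alarm when average ingestion-to-invocation latency exceeds 30 seconds
# (30,000 ms) for 5 consecutive one-minute periods
cloudwatch.put_metric_alarm(
    AlarmName='LatencyFailuresAlarm',            # hypothetical name
    Namespace='AWS/Events',
    MetricName='IngestionToInvocationStartLatency',
    Statistic='Average',
    Period=60,
    EvaluationPeriods=5,
    Threshold=30000,
    ComparisonOperator='GreaterThanThreshold',
    TreatMissingData='breaching'                 # assumption; verify against the template
)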

The diagram below illustrates how a global endpoint reroutes events from the event bus in the primary Region to the event bus in the secondary Region when the CloudWatch alarm triggers the failover of the Route 53 health check.

Rerouting events with global endpoints

Once events are routed to the secondary Region, you have a couple of options:

  1. Continue processing events by deploying the same solution that processes events in the primary Region to the secondary Region.
  2. Create an EventBridge archive to persist all events coming through the secondary event bus. EventBridge archives provide you with a type of “Active/Archive” architecture allowing you to replay events to the event bus once the primary Region is healthy again.

When the global endpoint alarm returns to a healthy state, the health check updates the endpoint configuration, and begins routing events back to the primary Region.

Global endpoints can be optionally configured to replicate events across Regions. When enabled, managed rules are created on your primary and secondary event buses that define the event bus in the other Region as the rule target.

Under normal operating conditions, events being sent to the primary event bus are also sent to the secondary in near-real-time to keep both Regions synchronized. When a failover occurs, consumers in the secondary Region have an up-to-date state of the events processed before the failover, or the ability to replay messages that were delivered to the secondary Region before it happened.

When the secondary Region is active, the replication rule attempts to send events back to the event bus in the primary Region. If the event bus in the primary Region is not available, EventBridge attempts to redeliver the events for up to 24 hours, per its default event retry policy. As this is a managed rule, you cannot change this. To manage for a longer period, you can archive events being ingested by the secondary event bus.

How do you identify whether an event has been replicated from another Region? Events routed via global endpoints have identical resource fields, which contain the Amazon Resource Name (ARN) of the global endpoint that routed the event to the event bus. The region field shows the origin of the event. In the following example, the event is sent to the event bus in the primary Region and replicated to the event bus in the secondary Region. The region field of the event in the secondary Region is us-east-1, showing that the event originated from the event bus in the primary Region.

If there is a failover, events are routed to the secondary Region and replicated to the primary Region. Inspecting these events, you would expect to see us-west-2 as the source Region.

Replicated events

The two preceding events are identical except for the id. Event IDs can change across API calls so correlating events across Regions requires you to have an immutable, unique identifier. Consumers should also be designed with idempotency in mind. If you are replicating events, or replaying them from archives, this ensures that there are no side effects from duplicate processing.
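One common way to make a consumer idempotent is to record a business-level identifier (such as the order_id in the event detail) and skip events that have already been processed. The following is a minimal sketch using a conditional DynamoDB write; the table name and key are hypothetical:

import boto3

dynamodb = boto3.client('dynamodb')

def handle_order(event):
    # Business logic goes here
    pass

def process_once(event):
    order_id = event['detail']['order_id']
    try:
        # Succeeds only the first time this order_id is seen
        dynamodb.put_item(
            TableName='processed-events',            # hypothetical table
            Item={'order_id': {'S': order_id}},
            ConditionExpression='attribute_not_exists(order_id)'
        )
    except dynamodb.exceptions.ConditionalCheckFailedException:
        return  # duplicate (replicated or replayed) event, already handled

    handle_order(event)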

Setting up a global endpoint

To configure a global endpoint, define two event buses: one in the “primary” Region (this is the same Region you configure the endpoint in) and one in a “secondary” Region. To ensure that events are routed correctly, the event bus in the secondary Region must have the same name, in the same account, as the primary event bus.

  1. Create two event buses in different Regions with the same name. This is quickly set up using the AWS Command Line Interface (AWS CLI):
    Primary event bus:
    aws events create-event-bus --name orders-bus --region us-east-1
    Secondary event bus:
    aws events create-event-bus --name orders-bus --region us-west-2
  2. Open the Amazon EventBridge console in the Region where you want to create the global endpoint. This aligns with your primary Region. Navigate to the new global endpoints page and create a new endpoint.
    EventBridge console
  3. In the Endpoint details panel, specify a name for your global endpoint (for example, OrdersGlobalEndpoint) and enter a description.
  4. Select the event bus in the primary Region, orders-bus.
  5. Select the secondary event bus by choosing the Region it was created in from the dropdown.
    Create global endpoint
  6. Select the Route 53 health check for triggering failover and recovery. If you have not created one before, choose New health check. This opens an AWS CloudFormation console to create the “LatencyFailuresHealthCheck” health check and CloudWatch alarm in your account. EventBridge provides a template with recommended defaults for a CloudWatch alarm that is triggered when the average latency exceeds 30 seconds for 5 minutes.
    Endpoint configuration
  7. Once the CloudFormation stack is deployed, return to the EventBridge console and refresh the dropdown list of health checks. Select the physical ID of the health check you created.
    Failover and recovery
  8. Ensure that event replication is enabled, and create the endpoint.
    Event replication
  9. Once the endpoint is created, it appears in the console. The global endpoint URL contains the EndpointId, which you must specify in PutEvents API calls to publish events to the endpoint.
    Endpoint listed in console
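The same endpoint can also be created programmatically with the CreateEndpoint API. The following boto3 sketch shows the general shape of the call; the health check ARN, event bus ARNs, and IAM role are placeholders, and the parameter structure should be checked against the CreateEndpoint API reference:

import boto3

events = boto3.client('events', region_name='us-east-1')  # primary Region

events.create_endpoint(
    Name='OrdersGlobalEndpoint',
    RoutingConfig={
        'FailoverConfig': {
            'Primary': {
                # Route 53 health check created from the provided template
                'HealthCheck': 'arn:aws:route53:::healthcheck/11111111-2222-3333-4444-555555555555'
            },
            'Secondary': {'Route': 'us-west-2'}
        }
    },
    ReplicationConfig={'State': 'ENABLED'},
    # Role that allows EventBridge to replicate events between the two buses
    RoleArn='arn:aws:iam::123456789012:role/OrdersEndpointReplicationRole',
    EventBuses=[
        {'EventBusArn': 'arn:aws:events:us-east-1:123456789012:event-bus/orders-bus'},
        {'EventBusArn': 'arn:aws:events:us-west-2:123456789012:event-bus/orders-bus'}
    ]
)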

Testing failover

Once you have created an endpoint, you can test the configuration by creating “catch all” rules on the primary and secondary event buses. The simplest way to see events being processed is to create rules with a CloudWatch log group as the target.

Testing global endpoint failover is accomplished by inverting the Route 53 health check. You can do this using the console or any of the Route 53 APIs. Using the console, open the Route 53 health checks landing page and edit the “LatencyFailuresHealthCheck” associated with your global endpoint. Check “Invert health check status” and save to update the health check.
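If you want to script the failover test, the same inversion can be done with the UpdateHealthCheck API. A short sketch, where the health check ID is a placeholder:

import boto3

route53 = boto3.client('route53')

# Invert the health check to simulate a failure in the primary Region
route53.update_health_check(
    HealthCheckId='11111111-2222-3333-4444-555555555555',  # placeholder ID
    Inverted=True
)

# Set Inverted back to False when you finish testing to restore normal routing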

Within a few minutes, the health check changes state from “Healthy” to “Unhealthy” and you see events flowing to the event bus in the secondary Region.

Configure health check

Using the PutEvents API with global endpoints

To use global endpoints in your applications, you must update your current PutEvents API call. All AWS SDKs have been updated to include an optional EndpointId parameter that you must set when publishing events to a global endpoint. Even though you are no longer putting events directly on the event bus, the EventBusName must be defined to validate the endpoint configuration.

PutEvents SDK support for global endpoints requires the AWS Common Runtime (CRT) library, which is available for multiple programming languages, including Python, Node.js, and Java:

https://github.com/awslabs/aws-crt-python
https://github.com/awslabs/aws-crt-nodejs
https://github.com/awslabs/aws-crt-java

To install the awscrt module for Python using pip, run:

python3 -m pip install boto3 awscrt

This example shows how to send an event to a global endpoint using the Python SDK:

import json
import boto3
from datetime import datetime
import uuid
import random

# Create the EventBridge client in the primary Region. The awscrt package must
# be installed so that requests that include an EndpointId can be signed.
client = boto3.client('events', region_name='us-east-1')

detail = {
    "order_date": datetime.now().isoformat(),
    "customer_id": str(uuid.uuid4()),
    "order_id": str(uuid.uuid4()),
    "order_total": round(random.uniform(1.0, 1000.0), 2)
}

put_response = client.put_events(
    EndpointId="y6gho8g4kc.veo",  # the subdomain of your global endpoint URL
    Entries=[
        {
            'Source': 'com.aws.Orders',
            'DetailType': 'OrderCreated',
            'Detail': json.dumps(detail),
            'EventBusName': 'orders-bus'
        }
    ]
)

Event producers can suffer data loss if the PutEvents API call fails, even if you are using global endpoints. Global endpoints allow you to automate the rerouting of events to another event bus in another Region, but the health checks triggering the failover won’t be invoked for at least 5 minutes. It’s possible that your applications experience increased error rates for PutEvents operations before the failover occurs and events are routed to a healthy Region. To safeguard against message loss during this time, it’s best practice to use exponential retry and back-off patterns and a durable store-and-forward capability at the producer level.
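A hedged sketch of that producer-side safeguard, retrying failed PutEvents entries with exponential backoff before handing them to a durable store; the durable store itself is left as a placeholder for whatever store-and-forward mechanism you choose:

import time

import boto3

client = boto3.client('events')

def publish_with_retry(entries, endpoint_id, max_attempts=5):
    attempt = 0
    while entries and attempt < max_attempts:
        response = client.put_events(EndpointId=endpoint_id, Entries=entries)
        if response['FailedEntryCount'] == 0:
            return []
        # Keep only the entries that failed and retry them after a backoff
        entries = [
            entry for entry, result in zip(entries, response['Entries'])
            if 'ErrorCode' in result
        ]
        attempt += 1
        time.sleep(min(2 ** attempt, 30))
    # Anything still failing should go to a durable store for later delivery
    return entries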

Conclusion

This blog shows how to create an EventBridge global endpoint to improve the availability and reliability of event ingestion for event-driven applications. The example shows how to use the PutEvents API in the Python AWS SDK to publish events to a global endpoint.

To create a global endpoint using the API, see CreateEndpoint in the Amazon EventBridge API Reference. You can also create a global endpoint with AWS CloudFormation using an AWS::Events::Endpoint resource.

To learn more about EventBridge global endpoints, see the EventBridge Developer Guide. For more serverless learning resources, visit Serverless Land.