Tag Archives: Amazon EventBridge

Introducing cross-account targets for Amazon EventBridge Event Buses

Post Syndicated from Chris McPeek original https://aws.amazon.com/blogs/compute/introducing-cross-account-targets-for-amazon-eventbridge-event-buses/

This post is written by Anton Aleksandrov, Principal Solutions Architect, Serverless and Alexander Vladimirov, Senior Solutions Architect, Serverless

Today, Amazon EventBridge is announcing support for cross-account targets for Event Buses. This new capability allows you to send events directly to targets, such as Amazon Simple Queue Service (Amazon SQS), AWS Lambda, and Amazon Simple Notification Service (Amazon SNS), located in other accounts.

Previously, EventBridge supported cross-account event delivery from an event bus in one account to an event bus in another account. This launch extends that capability and allows you to configure the source event bus to deliver events directly to all EventBridge supported targets in other accounts, not just event buses. This removes the need for an additional event bus in the target account.

Overview

Event-driven architectures built with EventBridge allow you to create solutions spanning many company departments and business domains, while remaining asynchronous and loosely coupled. As solutions grow, you may need to send events across account boundaries.

For example, you may have a set of event buses hosted in multiple accounts that are dispatching security-related events to an Amazon SQS queue hosted in a centralized account for further asynchronous processing and analysis.

Previously, EventBridge rules allowed you to define targets only in the same account. The only target type that supported cross-account event delivery was another event bus. If you wanted to send events across account boundaries, you had to create event buses in both source and target accounts. You would then configure a rule on the source event bus to send events to the target bus, and another rule on the target event bus to deliver the event to a desired target in the target account. Alternatively, a Lambda function or SNS topic could be used as a bridging mechanism to send events across accounts.

The following diagram illustrates what an architecture of cross-account event delivery looked like before the new capability. A “bridging” component, like another event bus, SNS topic, or Lambda function, was required to send events from one account to another.


Figure 1: Delivering cross-account events from source bus to target bus

With this new EventBridge feature, you can deliver events from the source event bus directly to the desired targets in different accounts. This simplifies the architecture and permission model. It also reduces latency in your event-driven solutions by having fewer components processing events along the path from source to targets.


Figure 2: Delivering cross-account events to target directly

Configuring EventBridge delivery rule targets for cross-account event delivery

Enabling cross-account event delivery should be done with security in mind. You must establish mutual trust between the source and the target. Source event bus rules must have an AWS Identity and Access Management (IAM) role that allows them to send events to specific targets. This is achieved by attaching an execution role to the delivery rule targets.

Event delivery targets hosted in different accounts must have a resource access policy attached that explicitly allows receiving events from the execution role used in the source account. Due to this requirement, you can enable cross-account event delivery only for targets that support resource access policies, such as Amazon SQS queues, Amazon SNS topics, and AWS Lambda functions.

Having both an IAM role in the source account and resource policy in the target account allows you to have fine-grained control over which principals are allowed to use the PutEvents action and under which conditions. You can define service control policies (SCPs) to set organizational boundaries determining who can send and receive events in your organization.

As illustrated in the following diagram, consider that Team A owns the source account (Account A). Team A is responsible for setting up the source event bus, its execution role, rules, and targets. Teams B and C own the target accounts (Account B and Account C, respectively). Both teams manage their respective target accounts, for example by creating delivery targets, such as SQS queues, and granting permissions to accept events from the event bus in the source account. This approach enables Team A to manage the centralized source event bus for other teams, and Teams B and C to control who can send events to their targets. It provides a high degree of overall control and governance.


Figure 3: A cross-team collaboration sending events from source account to target account targets

The following example describes setting up cross-account event delivery to an SQS queue. You can apply the same technique to other target types as well, such as Lambda functions or SNS topics.

See the following diagram for a conceptual architectural layout and resource creation order.


Figure 4: Permissions required for cross-account event delivery

Assuming the source event bus already exists, there are three general steps in setting up cross-account event delivery:

  1. Target account – create the delivery target, such as an SQS queue.
  2. Source account – configure a rule for cross-account event delivery. Set the target SQS queue ARN as rule target, and attach an execution role with permissions to send messages to the target SQS Queue.
  3. Target account – apply a resource policy to the target SQS queue allowing the source event bus execution role to send events.
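The permissions behind steps 2 and 3 can be sketched as three policy documents. This is a minimal illustration under stated assumptions, not the templates from the sample repository: the account IDs, queue name, role name, and helper function names are hypothetical placeholders.

```python
import json

# Hypothetical placeholders; substitute the ARNs from your own accounts.
QUEUE_ARN = "arn:aws:sqs:us-east-1:222222222222:TargetSqsQueue"
ROLE_ARN = "arn:aws:iam::111111111111:role/SourceBusExecutionRole"


def execution_role_trust_policy():
    """Source account: let the EventBridge service assume the execution role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "events.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }


def execution_role_permissions(queue_arn):
    """Source account: allow the execution role to send to the target queue."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "sqs:SendMessage",
            "Resource": queue_arn,
        }],
    }


def queue_resource_policy(queue_arn, role_arn):
    """Target account: allow only the named execution role to send messages."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": role_arn},
            "Action": "sqs:SendMessage",
            "Resource": queue_arn,
        }],
    }


if __name__ == "__main__":
    print(json.dumps(queue_resource_policy(QUEUE_ARN, ROLE_ARN), indent=2))
```

Keeping the identity policy and the resource policy scoped to one queue ARN and one role ARN reflects the mutual-trust model described earlier: neither side grants more than the other expects.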

Showing cross-account delivery in action

Follow the instructions in this GitHub repository for provisioning the sample in your AWS accounts using AWS Serverless Application Model (AWS SAM). An event bus rule in the source account sends events directly to an SQS queue, a Lambda function, and an SNS topic in a target account. You must have two accounts for the sample to work.


Figure 5: The sample project architecture, delivering events cross-account to Lambda, SQS, and SNS.

Make sure you enter a valid email address as the SnsSubscriptionEmail value and confirm your email subscription once the target stack is deployed.

After deployment, open the EventBridge console in the source account. Navigate to the newly created event bus, which has “SourceEventBus” in its name. Use the Send Events functionality to publish sample events, as shown in the following screenshot. Make sure that the Source of your events is set to “test”.


Figure 6: Sending test event
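If you prefer to publish the test event programmatically rather than through the console, the following is a hedged sketch using the boto3 `put_events` call. The detail type and the bus name passed in are hypothetical; only the Source value "test" comes from the instructions above. Running `send_test_event` requires boto3 and credentials for the source account.

```python
import json


def build_test_entry(bus_name, detail):
    """Build one PutEvents entry whose Source is "test", as the sample expects."""
    return {
        "EventBusName": bus_name,
        "Source": "test",
        "DetailType": "sample-event",  # hypothetical detail type
        "Detail": json.dumps(detail),  # Detail must be a JSON string
    }


def send_test_event(bus_name, detail):
    """Publish the entry to the source event bus."""
    import boto3  # imported lazily so building entries needs no AWS SDK

    client = boto3.client("events")
    return client.put_events(Entries=[build_test_entry(bus_name, detail)])
```

For example, `send_test_event("<your SourceEventBus name>", {"message": "hello"})` would publish one event to the bus created by the source stack.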

Validate that the events are successfully delivered to all three cross-account targets. Open the target account in a different browser or an incognito window:

  1. Navigate to the SQS console. Open the newly created queue, which has “TargetSqsQueue” in its name.
  2. Choose Send and Receive messages, then choose Poll for messages. You can see the events sent in the previous step.
    Figure 7: Receiving test event with SQS
  3. Navigate to Amazon CloudWatch Logs. Open the Log Group for the newly created Lambda function, which has “TargetLambdaFunction” in its name. It shows events sent in the previous step.
    Figure 8: Receiving test event with Lambda
  4. Check your email. If you have confirmed the SNS topic subscription during the sample code deployment, it shows the events sent in the previous step.
    Figure 9: Receiving test event with SNS

Conclusion

The new EventBridge capability allows you to deliver events directly to targets across account boundaries. This capability helps to simplify your event-driven architectures, as well as improve latency by reducing the number of components processing your events as they travel from event buses to their destinations.

Refer to the EventBridge pricing page to learn more about cross-account event delivery costs.

For additional documentation, refer to Amazon EventBridge documentation. Get the sample code used in this blog from this GitHub repository.

For more serverless learning resources, visit Serverless Land.

Serverless ICYMI Q4 2024

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/serverless-icymi-q4-2024/

Welcome to the 27th edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. At the end of a quarter, we share the most recent product launches, feature enhancements, blog posts, webinars, live streams, and other interesting things that you might have missed!

In case you missed our last ICYMI, check out what happened in Q2 here.

Calendar showing October through December 2024

2024 Q4 calendar

Serverless at re:Invent 2024

AWS re:Invent 2024 had 60,000 in-person attendees and 400,000 online viewers for the keynotes. The conference delivered 1,900 sessions from 3,500 speakers and included 546 AWS service and feature announcements.

The serverless content consisted of two tracks: Serverless (SVS) and App Integration (API). These tracks included 70 unique sessions and attracted nearly 11,000 attendees. Serverlesspresso, the coffee shop powered by serverless technology, operated in two locations during the event: the Expo Hall and the certification lounge.

Crowd of people standing around the AWS re:Invent expo hall waiting to order coffee at the Serverlesspresso booth.

Serverlesspresso booth in the expo hall

Videos are available on Serverless Land YouTube.

AWS Lambda and Amazon Elastic Container Service (Amazon ECS) 10-year anniversary.

AWS marked significant milestones in serverless computing, celebrating 10 years of AWS Lambda and Amazon ECS. Lambda now serves over 1.5 million monthly customers and processes tens of trillions of requests each month. Amazon ECS launches more than 2.4 billion container tasks weekly and is used by over 65% of new AWS container customers.

AWS is commemorating this anniversary with insights from AWS Serverless Heroes, product leads, principal engineers, and AWS leadership sharing their perspectives on serverless evolution and future directions. These stories and insights are available at https://aws.amazon.com/serverless/10th-anniversary/.

AWS Lambda

The AWS Lambda team has spent a significant amount of time improving the Lambda development experience. Several enhancements have been made in the console as well as the local development experience.

Screen capture of the new AWS Lambda console with Code-OSS

Code-OSS as the new AWS Lambda inline editor

Lambda has launched a significant upgrade to its console by integrating Code-OSS, the open-source version of Visual Studio Code, delivering a familiar development experience directly in the cloud. The new Lambda Code Editor supports viewing larger function packages up to 50 MB, features a split-screen interface for simultaneous code editing and testing, and includes built-in Amazon Q Developer AI assistance for real-time coding suggestions. This enhancement comes at no additional cost and prioritizes accessibility with features like screen reader support and keyboard navigation. The update bridges the gap between cloud and local development by simplifying the process of downloading function code and AWS SAM templates, ultimately providing developers with a more streamlined and familiar serverless development experience. Watch the video explaining the changes in detail.

Additionally, the Lambda console enhances developer experience with two new features: a built-in CloudWatch Metrics Insights dashboard that surfaces key function metrics, and CloudWatch Logs Live Tail support for real-time log streaming and analysis, enabling faster troubleshooting without leaving the Lambda environment.

Screen capture of the new top 10 functions in the new AWS Lambda console

Top 10 Functions

Lambda now supports native JSON structured logging for .NET managed runtime applications, improving log searchability and analysis capabilities without requiring manual configuration of logging libraries.

Lambda has expanded its runtime support by adding Python 3.13 and Node.js 22 as both managed runtimes and container base images, providing access to the latest language features and ensuring long-term support through October 2029 and April 2027, respectively.

Lambda SnapStart capability is now available for Python and .NET runtimes, delivering sub-second startup performance for latency-sensitive applications by caching initialized execution environments.

Diagram of how SnapStart works compared to not having SnapStart

SnapStart support comparison

New CloudWatch metrics for Lambda Event Source Mappings provide enhanced visibility into event processing states for Amazon Simple Queue Service (SQS), Amazon Kinesis, and Amazon DynamoDB event sources, helping customers monitor and troubleshoot event processing issues.

Lambda introduces Provisioned Mode for Kafka event source mappings, allowing customers to optimize throughput by configuring dedicated event polling resources for applications with stringent performance requirements.

Finally, Lambda introduces an enhanced local development experience through the AWS Toolkit for Visual Studio Code, streamlining the serverless application development workflow. The update features a new Application Builder interface that guides developers through environment setup, offers sample applications, and provides quick-action buttons for common tasks like build, deploy, and invoke operations. Developers can now efficiently iterate on their code with features such as configurable build settings, step-through debugging, and the ability to sync local changes quickly to the cloud or perform full deployments. The toolkit integrates with AWS Infrastructure Composer for visual application building and includes comprehensive local testing capabilities with shareable test events. This enhancement simplifies the Lambda development process by enabling developers to author, test, debug, and deploy serverless applications without leaving their preferred IDE environment.

Screen capture of the getting started experience for serverless in a local IDE

Local IDE getting started

Amazon ECS and AWS Fargate

AWS enhances observability for containerized applications with CloudWatch Application Signals for Amazon ECS, adding infrastructure metrics correlation to existing traces and logs monitoring, enabling operators to identify and resolve performance issues across their application stack.

Amazon ECS adds service revision and deployment history tracking, allowing customers to monitor changes, track ongoing deployments, and debug deployment failures for long-running applications deployed after October 25, 2024.

A graph explaining the flow for service order and history

Service revisions and deployment history

Amazon ECS expands testing capabilities by supporting network fault injection experiments on AWS Fargate through AWS Fault Injection Service, enabling developers to verify application resilience using six different types of fault injection actions, including network disruptions and resource stress testing.

Amazon EventBridge

Amazon EventBridge announces significant performance improvements, reducing end-to-end latency by up to 94% from 2,235ms to 129.33ms at P99, enabling faster event processing for time-sensitive applications like fraud detection and gaming.

Amazon EventBridge and AWS Step Functions now integrate with private APIs through AWS PrivateLink and Amazon VPC Lattice, enabling secure connectivity between cloud and on-premises applications without custom networking code.

Screen capture of the Amazon EventBridge create connection screen showing the new Private option

Connections to Private APIs

EventBridge API destinations introduces proactive OAuth token refresh for public and private authorization endpoints, helping prevent delays and errors by automatically refreshing tokens before expiration.

AWS Step Functions

AWS Step Functions introduces the ability to export workflows as CloudFormation or SAM templates directly from the AWS console, enabling repeatable provisioning across accounts. Developers can export and customize templates from existing workflows, and use AWS Infrastructure Composer to visually connect workflows with other AWS resources.

Step Functions also adds Variables and JSONata support to enhance workflow development. Variables allow data assignment and reference between states, simplifying payload management, while JSONata provides advanced data transformation capabilities, including date formatting and mathematical operations. These features reduce the need for custom code and intermediate states, making it easier to build distributed serverless applications. Watch the in-depth video to learn more.

Screen capture of AWS Step Function workflow studio using JSONata and variables in an example

JSONata and variables

Amazon Kinesis

Amazon Kinesis introduces significant updates to its client libraries. The new Kinesis Client Library (KCL) 3.0 reduces compute costs by up to 33% through enhanced load balancing, while the Kinesis Producer Library (KPL) 1.0 improves performance and security. Both libraries now support AWS SDK for Java 2.x and eliminate dependencies on SDK for Java 1.x, enabling seamless upgrades without requiring application code changes.

Screen capture of CPU usage metrics

KCL 3.0 metrics

Amazon MQ

Amazon MQ adds support for AWS PrivateLink, enabling customers to access Amazon MQ API endpoints directly from their VPC through interface VPC endpoints, eliminating the need for internet access and providing enhanced security through AWS’s internal network infrastructure.

Amazon Finch

AWS announces general availability of Linux support for Finch, an open source container development tool that simplifies building, running, and publishing Linux containers across all major operating systems. The release includes support for the Finch Daemon with Docker API compatibility and is available through RPM packages for Amazon Linux 2 and Amazon Linux 2023.

Amazon Simple Queue Service (SQS)

Amazon SQS increases the in-flight message limit for FIFO queues from 20,000 to 120,000 messages, enabling higher concurrent message processing. This enhancement allows customers to scale their receivers and process up to six times more messages simultaneously, provided they have sufficient publish throughput.

Amazon Managed Streaming for Apache Kafka (Amazon MSK)

Amazon MSK now introduces Managed Streaming for Apache Flink blueprints to simplify real-time AI application development. The service enables vector-embedding generation through Amazon Bedrock, streamlining the integration of streaming data with generative AI models. Using a straightforward configuration process, users can generate and index vector embeddings in Amazon OpenSearch, while leveraging LangChain’s data chunking capabilities for enhanced data retrieval efficiency. The service handles all integration aspects between MSK, embedding models, and Amazon OpenSearch vector stores.

AWS Amplify

AWS Amplify launches the Amplify AI kit for Amazon Bedrock, providing fullstack developers with tools to integrate AI capabilities into web applications. The kit includes a customizable React UI component, secure Bedrock access, and context-sharing features, enabling developers to implement chat, search, and summarization functionalities without machine learning expertise.

AWS AppSync

AWS AppSync launches AppSync Events, enabling developers to broadcast real-time data to multiple subscribers through serverless WebSocket APIs. The service eliminates the need to build and manage WebSocket infrastructure while providing secure, scalable event broadcasting capabilities. Developers can create APIs that automatically scale and integrate with services like Amazon EventBridge. The system supports features such as channel namespaces, event handlers, and multiple authorization modes, and is available in all regions where AWS AppSync operates. Users only pay for API operations and real-time connection minutes used.

Screen capture from the AWS AppSync console to create a new Event API.

Creating an AppSync Event API

Amazon API Gateway

Amazon API Gateway now enables customers to manage private REST APIs using custom private DNS names. This highly requested feature allows API providers to use user-friendly domain names like private.example.com, while maintaining TLS encryption for security. The implementation process involves creating a private custom domain, configuring certificates through AWS Certificate Manager (ACM), mapping private APIs, and setting resource policies. The feature supports cross-account sharing through AWS Resource Access Manager (AWS RAM) and is now available in all AWS Regions, including AWS GovCloud (US).

Serverless blog posts

October

November

Serverless Office Hours

Image from YouTube from the latest four Serverless Office Hours

Serverless office hours videos

October

November

Still looking for more?

The Serverless landing page has more information. The Lambda resources page contains case studies, webinars, whitepapers, customer stories, reference architectures, and even more Getting Started tutorials.

You can also follow the Serverless Developer Advocacy team on X (formerly Twitter) to see the latest news, follow conversations, and interact with the team.

And finally, visit Serverless Land for all your serverless needs.

Implement a custom subscription workflow for unmanaged Amazon S3 assets published with Amazon DataZone

Post Syndicated from Somdeb Bhattacharjee original https://aws.amazon.com/blogs/big-data/implement-a-custom-subscription-workflow-for-unmanaged-amazon-s3-assets-published-with-amazon-datazone/

Organizational data is often fragmented across multiple lines of business, leading to inconsistent and sometimes duplicate datasets. This fragmentation can delay decision-making and erode trust in available data. Amazon DataZone, a data management service, helps you catalog, discover, share, and govern data stored across AWS, on-premises systems, and third-party sources. Although Amazon DataZone automates subscription fulfillment for structured data assets—such as data stored in Amazon Simple Storage Service (Amazon S3), cataloged with the AWS Glue Data Catalog, or stored in Amazon Redshift—many organizations also rely heavily on unstructured data. For these customers, extending the streamlined data discovery and subscription workflows in Amazon DataZone to unstructured data, such as files stored in Amazon S3, is critical.

For example, Genentech, a leading biotechnology company, has vast sets of unstructured gene sequencing data organized across multiple S3 buckets and prefixes. They need to enable direct access to these data assets for downstream applications efficiently, while maintaining governance and access controls.

In this post, we demonstrate how to implement a custom subscription workflow using Amazon DataZone, Amazon EventBridge, and AWS Lambda to automate the fulfillment process for unmanaged data assets, such as unstructured data stored in Amazon S3. This solution enhances governance and simplifies access to unstructured data assets across the organization.

Solution overview

For our use case, the data producer has unstructured data stored in S3 buckets, organized with S3 prefixes. We want to publish this data to Amazon DataZone as discoverable S3 data. On the consumer side, users need to search for these assets, request subscriptions, and access the data within an Amazon SageMaker notebook, using their own custom AWS Identity and Access Management (IAM) roles.

The proposed solution involves creating a custom subscription workflow that uses the event-driven architecture of Amazon DataZone. Amazon DataZone keeps you informed of key activities (events) within your data portal, such as subscription requests, updates, comments, and system events. These events are delivered through the EventBridge default event bus.

An EventBridge rule captures subscription events and invokes a custom Lambda function. This Lambda function contains the logic to manage access policies for the subscribed unmanaged asset, automating the subscription process for unstructured S3 assets. This approach streamlines data access while ensuring proper governance.

To learn more about working with events using EventBridge, refer to Events via Amazon EventBridge default bus.
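As an illustration of the rule described above, the following sketch shows an event pattern for the subscription events and a deliberately simplified matcher. The detail types are the ones the Lambda handler in this post checks for; the `matches` helper is hypothetical and covers only top-level exact-value matching, not the full EventBridge pattern language.

```python
# Event pattern for the rule: DataZone subscription lifecycle events.
SUBSCRIPTION_PATTERN = {
    "source": ["aws.datazone"],
    "detail-type": [
        "Subscription Created",
        "Subscription Cancelled",
        "Subscription Revoked",
    ],
}


def matches(pattern, event):
    """Return True when every pattern key lists the event's value for that key.

    Simplified on purpose: top-level keys and exact values only.
    """
    return all(event.get(key) in allowed for key, allowed in pattern.items())
```

An event with source `aws.datazone` and detail type `Subscription Created` matches this pattern, while events from other sources or with other detail types do not.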

The solution architecture is shown in the following diagram.

Custom subscription workflow architecture diagram

To implement the solution, we complete the following steps:

  1. As a data producer, publish an unstructured S3 based data asset as S3ObjectCollectionType to Amazon DataZone.
  2. For the consumer, create a custom AWS service environment in the consumer Amazon DataZone project and add a subscription target for the IAM role attached to a SageMaker notebook instance. Now, as a consumer, request access to the unstructured asset published in the previous step.
  3. When the request is approved, capture the subscription created event using an EventBridge rule.
  4. Invoke a Lambda function as the target for the EventBridge rule and pass the event payload to it.
  5. The Lambda function does two things:
    1. Fetches the asset details, including the Amazon Resource Name (ARN) of the S3 published asset and the IAM role ARN from the subscription target.
    2. Uses the information to update the S3 bucket policy granting List/Get access to the IAM role.
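The bucket policy update in the last step can be sketched as a pure function that adds List/Get statements for the subscribing role. This is a minimal sketch under stated assumptions, not the function from this post: the helper names are hypothetical, the existing policy is assumed to hold its statements in a list, and reading and writing the policy (for example with the S3 `get_bucket_policy` and `put_bucket_policy` calls) is left out.

```python
import copy


def split_s3_arn(s3_arn):
    """Split 'arn:aws:s3:::bucket/prefix/' into (bucket, prefix)."""
    path = s3_arn.split(":::", 1)[1]
    bucket, _, prefix = path.partition("/")
    return bucket, prefix


def grant_read_access(policy, s3_arn, role_arn):
    """Return a copy of `policy` with List/Get statements added for `role_arn`.

    `policy` may be None when the bucket has no policy yet.
    """
    bucket, prefix = split_s3_arn(s3_arn)
    if policy:
        updated = copy.deepcopy(policy)
    else:
        updated = {"Version": "2012-10-17", "Statement": []}
    updated.setdefault("Statement", [])
    updated["Statement"].extend([
        {
            # Listing is granted on the bucket, limited to the asset's prefix.
            "Effect": "Allow",
            "Principal": {"AWS": role_arn},
            "Action": "s3:ListBucket",
            "Resource": f"arn:aws:s3:::{bucket}",
            "Condition": {"StringLike": {"s3:prefix": [f"{prefix}*"]}},
        },
        {
            # Reads are granted on objects under the asset's prefix.
            "Effect": "Allow",
            "Principal": {"AWS": role_arn},
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
        },
    ])
    return updated
```

Returning a modified copy rather than mutating the input keeps the original policy available, which helps if the update has to be rolled back when a subscription is cancelled or revoked.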

Prerequisites

To follow along with the post, you should have an AWS account. If you don’t have one, you can sign up for one.

For this post, we assume you know how to create an Amazon DataZone domain and Amazon DataZone projects. For more information, see Create domains and Working with projects and environments in Amazon DataZone.

Also, for simplicity, we use the same IAM role for the Amazon DataZone admin (creating domains) as well as the producer and consumer personas.

Publish unstructured S3 data to Amazon DataZone

We have uploaded some sample unstructured data into an S3 bucket. This is the data that will be published to Amazon DataZone. You can use any unstructured data, such as an image or text file.

On the Properties tab of the S3 folder, note the ARN of the S3 bucket prefix.

Complete the following steps to publish the data:

  1. Create an Amazon DataZone domain in the account and navigate to the domain portal using the link for Data portal URL.

DataZone domain creation

  2. Create a new Amazon DataZone project (for this post, we name it unstructured-data-producer-project) for publishing the unstructured S3 data asset.
  3. On the Data tab of the project, choose Create data asset.

Data asset creation

  4. Enter a name for the asset.
  5. For Asset type, choose S3 object collection.
  6. For S3 location ARN, enter the ARN of the S3 prefix.

After you create the asset, you can add glossaries or metadata forms, but it’s not necessary for this post. Publish the data asset to make it discoverable within the Amazon DataZone portal.

Set up the SageMaker notebook and SageMaker instance IAM role

Create an IAM role which will be attached to the SageMaker notebook instance. For the trust policy, allow SageMaker to assume this role and leave the Permissions tab blank. We refer to this role as the instance-role throughout the post.

SageMaker instance role

Next, create a SageMaker notebook instance from the SageMaker console. Attach the instance-role to the notebook instance.

SageMaker instance

Set up the consumer Amazon DataZone project, custom AWS service environment, and subscription target

Complete the following steps:

  1. Log in to the Amazon DataZone portal and create a consumer project (for this post, we call it custom-blueprint-consumer-project), which will be used by the consumer persona to subscribe to the unstructured data asset.

Custom blueprint project name

We use the recently launched custom blueprints for AWS services for creating the environment in this consumer project. The custom blueprint allows you to bring your own environment IAM role to integrate your existing AWS resources with Amazon DataZone. For this post, we create a custom environment to directly integrate SageMaker notebook access from the Amazon DataZone portal.

  2. Before you create the custom environment, create the environment IAM role that will be used in the custom blueprint. The role should have a trust policy as shown in the following screenshot. For the permissions, attach the AWS managed policy AmazonSageMakerFullAccess. We refer to this role as the environment-role throughout the post.

Custom Environment role

  3. To create the custom environment, first enable the Custom AWS Service blueprint on the Amazon DataZone console.

Enable custom blueprint

  4. Open the blueprint to create a new environment as shown in the following screenshot.
  5. For Owning project, use the consumer project that you created earlier and for Permissions, use the environment-role.

Custom environment project and role

  6. After you create the environment, open it to create a customized URL for the SageMaker notebook access.

SageMaker custom URL

  7. Create a new custom AWS link and enter the URL from the SageMaker notebook.

You can find it by navigating to the SageMaker console and choosing Notebooks in the navigation pane.

  8. Choose Customize to add the custom link.

Add the custom link

  9. Next, create a subscription target in the custom environment to pass the instance role that needs access to the unstructured data.

A subscription target is an Amazon DataZone engineering concept that allows Amazon DataZone to fulfill subscription requests for managed assets by granting access based on the information defined in the target like domain-id, environment-id, or authorized-principals.

Currently, creation of subscription targets is only allowed using the AWS Command Line Interface (AWS CLI). You can use the command create-subscription-target to create the subscription target.

The following is an example JSON payload for the subscription target creation. Create it as a JSON file on your workstation (for this post, we call it blog-sub-target.json). Replace the domain ID and the environment ID with the corresponding values for your domain and environment.

{
    "domainIdentifier": "<<your-domain-id>>",
    "environmentIdentifier": "<<your-environment-id>>",
    "name": "custom-s3-target-consumerenv",
    "type": "GlueSubscriptionTargetType",
    "manageAccessRole": "<<provide the environment-role here>>",
    "applicableAssetTypes": ["S3ObjectCollectionAssetType"],
    "provider": "Custom Provider",
    "authorizedPrincipals": ["<<provide the instance-role here>>"],
    "subscriptionTargetConfig": [{
        "formName": "GlueSubscriptionTargetConfigForm",
        "content": "{\"databaseName\":\"customdb1\"}"
    }]
}

You can get the domain ID from the user name button in the upper right of the Amazon DataZone data portal; it’s in the format dzd_<<some-random-characters>>.

For the environment ID, you can find it on the Settings tab of the environment within your consumer project.
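If you would rather generate the payload file than edit it by hand, the following sketch builds the same structure from your values. The function names and the example arguments are hypothetical, and passing the roles as ARNs is an assumption; the field names and fixed values mirror the example payload above.

```python
import json


def build_subscription_target(domain_id, environment_id,
                              environment_role_arn, instance_role_arn):
    """Build the create-subscription-target payload from this post's example."""
    return {
        "domainIdentifier": domain_id,
        "environmentIdentifier": environment_id,
        "name": "custom-s3-target-consumerenv",
        "type": "GlueSubscriptionTargetType",
        "manageAccessRole": environment_role_arn,
        "applicableAssetTypes": ["S3ObjectCollectionAssetType"],
        "provider": "Custom Provider",
        "authorizedPrincipals": [instance_role_arn],
        "subscriptionTargetConfig": [{
            "formName": "GlueSubscriptionTargetConfigForm",
            # "content" is itself a JSON-encoded string, not a nested object.
            "content": json.dumps({"databaseName": "customdb1"},
                                  separators=(",", ":")),
        }],
    }


def write_payload(path, payload):
    """Write the payload as JSON, ready for --cli-input-json file://<path>."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2)
```

Encoding `content` with `json.dumps` avoids hand-escaping the inner quotes, which is an easy place to introduce a malformed payload.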

  1. Open an AWS CloudShell environment and upload the JSON payload file using the Actions option in the CloudShell terminal.
  2. You can now create a new subscription target using the following AWS CLI command:

aws datazone create-subscription-target --cli-input-json file://blog-sub-target.json

Create subscription target

  1. To verify the subscription target was created successfully, run the list-subscription-targets command from the AWS CloudShell environment:

aws datazone list-subscription-targets --domain-identifier <<domain-id>> --environment-identifier <<environment-id>>

Create a function to respond to subscription events

Now that you have the consumer environment and subscription target set up, the next step is to implement a custom workflow for handling subscription requests.

The simplest mechanism to handle subscription events is a Lambda function. The exact implementation may vary based on environment; for this post, we walk through the steps to create a simple function to handle subscription creation and cancellation.

  1. On the Lambda console, choose Functions in the navigation pane.
  2. Choose Create function.
  3. Select Author from scratch.
  4. For Function name, enter a name (for example, create-s3policy-for-subscription-target).
  5. For Runtime, choose Python 3.12.
  6. Choose Create function.

Author Lambda function

This should open the Code tab for the function and allow editing of the Python code for the function. Let’s look at some of the key components of a function to handle the subscription for unmanaged S3 assets.

Handle only relevant events

When the function gets invoked, we check to make sure it’s one of the events that’s relevant for managing access. Otherwise, the function can simply return a message without taking further action.

def lambda_handler(event, context):
    # Get the basic info about the event
    event_detail = event['detail']

    # Make sure it's one of the events we're interested in
    event_source = event['source']
    event_type = event['detail-type']

    if event_source != 'aws.datazone':
        return '{"Response" : "Not a DataZone event"}'
    elif event_type not in ['Subscription Created', 'Subscription Cancelled', 
                               'Subscription Revoked']:
        return '{"Response" : "Not a subscription created, cancelled, or revoked event"}'

These subscription events should include both the domain ID and a request ID (among other attributes). You can use these to look up the details of the subscription request in Amazon DataZone:

sub_request = dz.get_subscription_request_details(
    domainIdentifier=domain_id,
    identifier=sub_request_id
)
asset_listing = sub_request['subscribedListings'][0]['item']['assetListing']
form_data = json.loads(asset_listing['forms'])
asset_id = asset_listing['entityId']
asset_version = asset_listing['entityRevision']
asset_type = asset_listing['entityType']
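The lookup above needs the domain ID and the request ID from the incoming event. The following sketch shows one way to pull them out, using the same field paths that the completed function in this post reads. The helper name and the sample event values are hypothetical, and the sample event contains only the fields used here, not a full Amazon DataZone event.

```python
def extract_subscription_ids(event):
    """Pull the identifiers needed for the DataZone lookup out of the event."""
    detail = event["detail"]
    return {
        "domain_id": detail["metadata"]["domain"],
        "project_id": detail["metadata"]["owningProjectId"],
        "sub_request_id": detail["data"]["subscriptionRequestId"],
    }


# Hypothetical, trimmed-down event for local testing
sample_event = {
    "source": "aws.datazone",
    "detail-type": "Subscription Created",
    "detail": {
        "metadata": {"domain": "dzd_example", "owningProjectId": "proj-example"},
        "data": {"subscriptionRequestId": "req-example"},
    },
}

ids = extract_subscription_ids(sample_event)
```

A KeyError from this helper is a useful early signal that the event does not have the shape the function expects.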

Part of the subscription request should include the ARN for the S3 bucket in question, so you can retrieve that:

    # We only want to take action if this is an S3 asset
    if asset_type == 'S3ObjectCollectionAssetType':
        # Get the bucket ARN from the form info for the asset
        bucket_arn = form_data['S3ObjectCollectionForm']['bucketArn']
        
        #Get the principal from the subscription target
        principal = get_principal(domain_id,project_id)

        try:
            # Get the bucket name from the ARN                    
            bucket_name_with_prefix = bucket_arn.split(':')[5]
            bucket_name = bucket_name_with_prefix.split('/')[0]
           
        except IndexError:
            response = '{"Response" : "Could not find bucket name in ARN"}'
            return response
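As an illustration of the ARN handling, the following standalone sketch separates the bucket name from an optional prefix and rejects strings that are not S3 ARNs. The helper name split_s3_arn and the sample ARNs are hypothetical; it is a slightly stricter variant of the split shown above, not the code the function uses.

```python
def split_s3_arn(bucket_arn):
    """Return (bucket_name, prefix) for an ARN such as arn:aws:s3:::bucket/prefix/."""
    parts = bucket_arn.split(":", 5)
    # An S3 ARN has six colon-separated fields; the third is the service name
    if len(parts) != 6 or parts[2] != "s3" or not parts[5]:
        raise ValueError(f"Not an S3 ARN: {bucket_arn!r}")
    bucket_name, _, prefix = parts[5].partition("/")
    return bucket_name, prefix
```

Raising ValueError instead of relying on IndexError makes the failure explicit for inputs that have enough colons but are not S3 ARNs.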

You can also use the Amazon DataZone API calls to get the environment associated with the project making the subscription request for this S3 asset. After retrieving the environment ID, you can check which IAM principals have been authorized to access unmanaged S3 assets using the subscription target:

        list_sub_target = dz.list_subscription_targets(
            domainIdentifier=domain_id,
            environmentIdentifier=environment_id,
            maxResults=50,
            sortBy='CREATED_AT',
            sortOrder='DESCENDING'
            )
        
        print('asset type:', list_sub_target['items'][0]['applicableAssetTypes'])
        
        if list_sub_target['items'][0]['applicableAssetTypes'] == ['S3ObjectCollectionAssetType']:
            role_arn = list_sub_target['items'][0]['authorizedPrincipals']
            print('role arn',role_arn)

If this is a new subscription, add the relevant IAM principal to the S3 bucket policy by appending a statement that allows the desired S3 actions on this bucket for the new principal:

        if event_type == 'Subscription Created':
            if bucket_arn[-1] == '/':
                statement_block.append({
                    'Sid' : sid_string,
                    'Action': S3_ACTION_STRING,
                    'Resource': [
                        bucket_arn,
                        bucket_arn + '*'
                    ],
                    'Effect': 'Allow',
                    'Principal': {'AWS': principal}
                })

Conversely, if this is a subscription being revoked or cancelled, remove the previously added statement from the bucket policy to make sure the IAM principal no longer has access:

        elif event_type == 'Subscription Cancelled' or event_type == 'Subscription Revoked':
            # Remove the statement from the policy if it's there
            # Made sure to handle case where there's no Sid for a statement
            pruned_statement_block = []
            for statement in statement_block:
                if 'Sid' not in statement or statement['Sid'] != sid_string:
                    pruned_statement_block.append(statement)
            statement_block = pruned_statement_block

The completed function should be able to handle adding or removing principals like IAM roles or users to a bucket policy. Be sure to handle cases where there is no existing bucket policy or where a cancellation means removing the only statement in the policy, meaning the entire bucket policy is no longer needed.
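The edge cases described above can be reasoned about without calling Amazon S3 at all. The following sketch is a simplified, hypothetical variant of the statement handling that returns which write is needed (put or delete) for a given list of statements. The function names are assumptions, and the duplicate-Sid check is an addition for illustration rather than part of the function in this post.

```python
def make_sid(principal_arn, sub_request_id):
    """Build an alphanumeric Sid from the principal and the request ID."""
    return "".join(c for c in f"DZ{principal_arn}{sub_request_id}" if c.isalnum())


def plan_policy_write(statements, event_type, sid, new_statement=None):
    """Return ("put", statements) or ("delete", None) for the bucket policy."""
    if event_type == "Subscription Created":
        # Skip the append if this Sid is already present (for example, a retried event)
        if not any(s.get("Sid") == sid for s in statements):
            statements = statements + [new_statement]
    else:
        # Cancelled or revoked: keep every statement except the one we added
        statements = [s for s in statements if s.get("Sid") != sid]
    # An empty statement list means the whole policy should be deleted
    return ("put", statements) if statements else ("delete", None)
```

Starting from an empty list models a bucket with no existing policy; revoking the only statement yields ("delete", None), which corresponds to removing the bucket policy entirely.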

The following is an example of a completed function:

import json
import boto3
import os


dz = boto3.client('datazone')
s3 = boto3.client('s3')

# The list of actions to be permitted on the bucket in the newly granted policy
S3_ACTION_STRING = 's3:*'

def build_policy_statements(event_type, statement_block, principal, sub_request_id, bucket_arn):
        # Generate a Sid that should be unique
        sid_string = ''.join(c for c in f'DZ{principal}{sub_request_id}' if c.isalnum())
        # Add a new policy statement that gives the principal access to whole bucket.
        # If it turns out something other than bucket ARN is allowed in asset, we can
        # get more granular than that
        # Sid that should be unique in case we need to handle unsubscribe
        print('statement block :',statement_block)
        if event_type == 'Subscription Created':
            if bucket_arn[-1] == '/':
                statement_block.append({
                    'Sid' : sid_string,
                    'Action': S3_ACTION_STRING,
                    'Resource': [
                        bucket_arn,
                        bucket_arn + '*'
                    ],
                    'Effect': 'Allow',
                    'Principal': {'AWS': principal}
                })
            else:
                statement_block.append({
                    'Sid' : sid_string,
                    'Action': S3_ACTION_STRING,
                    'Resource': [
                        bucket_arn,
                        bucket_arn + '/*'
                    ],
                    'Effect': 'Allow',
                    'Principal': {'AWS': principal}
                })
        elif event_type == 'Subscription Cancelled' or event_type == 'Subscription Revoked':
            # Remove the statement from the policy if it's there
            # Made sure to handle case where there's no Sid for a statement
            pruned_statement_block = []
            for statement in statement_block:
                if 'Sid' not in statement or statement['Sid'] != sid_string:
                    pruned_statement_block.append(statement)
            statement_block = pruned_statement_block
           

        return statement_block

def lambda_handler(event, context):
    """Lambda function reacting to DataZone subscribe events

    Parameters
    ----------
    event: dict, required
        Event Bridge Events Format

    context: object, required
        Lambda Context runtime methods and attributes

    Returns
    ------
        Simple response indicating success or failure reason
    """
    # Get the basic info about the event
    event_detail = event['detail']

    # Make sure it's one of the events we're interested in
    event_source = event['source']
    event_type = event['detail-type']

    if event_source != 'aws.datazone':
        return '{"Response" : "Not a DataZone event"}'
    elif event_type not in ['Subscription Created', 'Subscription Cancelled', 
                               'Subscription Revoked']:
        return '{"Response" : "Not a subscription created, cancelled, or revoked event"}'

    
    # get the domain_id and other information
    domain_id = event_detail['metadata']['domain']
    project_id = event_detail['metadata']['owningProjectId']
    sub_request_id = event_detail['data']['subscriptionRequestId']
    listing_id = event_detail['data']['subscribedListing']['id']
    listing_version = event_detail['data']['subscribedListing']['version']
    
    print('domain-id',domain_id)
    print('project-id:',project_id)
    
    sub_request = dz.get_subscription_request_details(
        domainIdentifier = domain_id,
        identifier= sub_request_id
    )
   
    # Retrieve info about the asset from the request
    asset_listing = sub_request['subscribedListings'][0]['item']['assetListing']
    form_data = json.loads(asset_listing['forms'])
    asset_id = asset_listing['entityId']
    asset_version = asset_listing['entityRevision']
    asset_type = asset_listing['entityType']

    # We only want to take action if this is a S3 asset
    if asset_type == 'S3ObjectCollectionAssetType':
        # Get the bucket ARN from the form info for the asset
        bucket_arn = form_data['S3ObjectCollectionForm']['bucketArn']
        
        #Get the principal from the subscription target
        principal = get_principal(domain_id,project_id)

        try:
            # Get the bucket name from the ARN                    
            bucket_name_with_prefix = bucket_arn.split(':')[5]
            bucket_name = bucket_name_with_prefix.split('/')[0]
           
        except IndexError:
            response = '{"Response" : "Could not find bucket name in ARN"}'
            return response

        # Get the current bucket policy, or else make a blank one if there currently
        # is no policy
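        try:
            bucket_policy = json.loads(s3.get_bucket_policy(Bucket=bucket_name)['Policy'])
        except s3.exceptions.ClientError as e:
            # Check the error code explicitly so that other client errors
            # (such as access denied) are not mistaken for a missing policy
            if e.response['Error']['Code'] == 'NoSuchBucketPolicy':
                bucket_policy = {'Statement': []}
            else:
                response = '{"Response" : "Could not get bucket policy"}'
                return response
        except Exception:
            response = '{"Response" : "Could not get bucket policy"}'
            return response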
        
        # Gets new policy with the subscribing principal either added or removed based on
        # event type
        new_policy_statements = build_policy_statements(event_type, bucket_policy['Statement'], principal, 
                                               sub_request_id, bucket_arn)

            
        # Write back the new policy. This can fail if the new policy is too big
        # or if for some reason the function role doesn't have rights to do this
        # If we removed the only policy statement, then just delete the policy
        try: 
            if not new_policy_statements:
                s3.delete_bucket_policy(Bucket = bucket_name)
            else:
                bucket_policy['Statement'] = new_policy_statements
                policy_string = json.dumps(bucket_policy)
                print('policy string :',policy_string)
                s3.put_bucket_policy(
                    Bucket=bucket_name,
                    Policy = policy_string
                )
        except Exception as e: 
            response = f'{{"Response" : "Error updating bucket policy: {e.args}"}}'
            return response
        
        # If we got here everything went as planned
        response = f'{{"Response" : "Updated policy for {bucket_name}"}}'
    else:
        response = '{"Response" : "Not an S3 asset"}'


    return response

def get_principal(domain_id,project_id):
    # Call list environments to get the environment id
    listenv_request = dz.list_environments(
        domainIdentifier = domain_id,
        projectIdentifier= project_id
    )
    
    # In our example environment, there is only one of these
    environment_id = listenv_request['items'][0]['id']

    # Get the role we want to give access to from the subscription target info
    list_sub_target = dz.list_subscription_targets(
        domainIdentifier=domain_id,
        environmentIdentifier=environment_id,
        maxResults=50,
        sortBy='CREATED_AT',
        sortOrder='DESCENDING'
    )

    # Only the most recently created subscription target is inspected
    if list_sub_target['items'][0]['applicableAssetTypes'] == ['S3ObjectCollectionAssetType']:
        role_arn = list_sub_target['items'][0]['authorizedPrincipals']
    else:
        role_arn = []

    return role_arn

Because this Lambda function is intended to manage bucket policies, the role assigned to it will need a policy that allows the following actions on any buckets it is intended to manage:

  • s3:GetBucketPolicy
  • s3:PutBucketPolicy
  • s3:DeleteBucketPolicy
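As a sketch of what such a policy might look like, the following Python builds an IAM policy document scoped to specific buckets. The function name and the bucket name in the usage note are hypothetical, and the ARNs assume the standard aws partition; adapt the resource list to the buckets you actually manage.

```python
import json


def bucket_policy_admin_document(bucket_names):
    """Build an IAM policy document for the Lambda role, scoped to the given buckets."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": [
                "s3:GetBucketPolicy",
                "s3:PutBucketPolicy",
                "s3:DeleteBucketPolicy",
            ],
            # These are bucket-level actions, so the resource is the bucket ARN itself
            "Resource": [f"arn:aws:s3:::{name}" for name in bucket_names],
        }],
    }
```

Calling json.dumps(bucket_policy_admin_document(["my-unstructured-data-bucket"]), indent=2) gives JSON you could paste into the IAM console as an inline policy for the function's role.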

Now you have a function that is capable of editing bucket policies to add or remove the principals configured for your subscription targets, but you need something to invoke this function any time a subscription is created, cancelled, or revoked. In the next section, we cover how to use EventBridge to integrate this new function with Amazon DataZone.

Respond to subscription events in EventBridge

Amazon DataZone publishes information about each event that takes place within it to EventBridge. You can watch for any of these events and invoke actions based on matching predefined rules. In this case, we’re interested in asset subscriptions being created, cancelled, or revoked, because those will determine when we grant or revoke access to the data in Amazon S3.

  1. On the EventBridge console, choose Rules in the navigation pane.

The default event bus should automatically be present; we use it for creating the Amazon DataZone subscription rule.

  1. Choose Create rule.
  2. In the Rule detail section, enter the following:
    1. For Name, enter a name (for example, DataZoneSubscriptions).
    2. For Description, enter a description that explains the purpose of the rule.
    3. For Event bus, choose default.
    4. Turn on Enable the rule on the selected event bus.
    5. For Rule type, select Rule with an event pattern.
  3. Choose Next.

EventBridge rule

  1. In the Event source section, select AWS Events or EventBridge partner events as the source of the events.

Define Event source

  1. In the Creation method section, select Custom Pattern (JSON editor) to enable exact specification of the events needed for this solution.

Choose custom pattern

  1. In the Event pattern section, enter the following code:

{
"detail-type": ["Subscription Created", "Subscription Cancelled", "Subscription Revoked"],
"source": ["aws.datazone"]
}

Define custom pattern JSON
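To sanity-check which events this pattern selects before creating the rule, you can approximate the matching locally. The following sketch is a deliberately simplified stand-in that handles only top-level fields with lists of exact string values, which is all this pattern uses; it is not the EventBridge matching engine and ignores nested fields and matching operators. The function name and sample events are hypothetical.

```python
PATTERN = {
    "detail-type": ["Subscription Created", "Subscription Cancelled", "Subscription Revoked"],
    "source": ["aws.datazone"],
}


def matches_simple_pattern(pattern, event):
    """Return True if every pattern field is present in the event with an allowed value."""
    return all(event.get(field) in allowed for field, allowed in pattern.items())


created = {"source": "aws.datazone", "detail-type": "Subscription Created"}
other = {"source": "aws.datazone", "detail-type": "Asset Added To Catalog"}

created_matches = matches_simple_pattern(PATTERN, created)
other_matches = matches_simple_pattern(PATTERN, other)
```

Here the first sample event matches and the second does not, because its detail-type is not one of the three listed values.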

  1. Choose Next.

Now that we’ve defined the events to watch for, we can make sure those Amazon DataZone events get sent to the Lambda function we defined in the previous section.

  1. On the Select target(s) page, enter the following for Target 1:
    1. For Target types, select AWS service.
    2. For Select a target, choose Lambda function.
    3. For Function, choose create-s3policy-for-subscription-target.
  2. Choose Skip to Review and create.

Define event target

  1. On the Review and create page, choose Create rule.

Subscribe to the unstructured data asset

Now that you have the custom subscription workflow in place, you can test the workflow by subscribing to the unstructured data asset.

  1. In the Amazon DataZone portal, search for the unstructured data asset you published by browsing the catalog.

Search unstructured asset

  1. Subscribe to the unstructured data asset using the consumer project, which starts the Amazon DataZone approval workflow.

Subscribe to unstructured asset

  1. You should get a notification for the subscription request; follow the link and approve it.

When the subscription is approved, it will invoke the custom EventBridge Lambda workflow, which will create the S3 bucket policies for the instance role to access the S3 object. You can verify that by navigating to the S3 bucket and reviewing the permissions.

Access the subscribed asset from the Amazon DataZone portal

Now that the consumer project has been given access to the unstructured asset, you can access it from the Amazon DataZone portal.

  1. In the Amazon DataZone portal, open the consumer project and navigate to the Environments
  2. Choose the SageMaker-Notebook

Choose SageMaker notebook on the consumer project

  1. In the confirmation pop-up, choose Open custom.

Choose Custom

This will redirect you to the SageMaker notebook assuming the environment role. You can see the SageMaker notebook instance.

  1. Choose Open JupyterLab.

Open JupyterLab Notebook

  1. Choose conda_python3 to launch a new notebook.

Launch Notebook

  1. Add code to run get_object on the unstructured S3 data that you subscribed to earlier and run the cells.

Now, because the S3 bucket policy has been updated to allow the instance role access to the S3 objects, you should see the get_object call return an HTTPStatusCode of 200.
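The notebook cell could look something like the following sketch. The helper name is hypothetical, and the client is passed in as a parameter so the helper can be exercised without AWS access; in the notebook you would pass boto3.client("s3") along with your own bucket name and object key.

```python
def get_object_status(s3_client, bucket, key):
    """Call get_object and return the HTTP status code from the response metadata."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return response["ResponseMetadata"]["HTTPStatusCode"]


# In the SageMaker notebook (placeholders are yours to fill in):
#   import boto3
#   get_object_status(boto3.client("s3"), "<<your-bucket>>", "<<your-object-key>>")
```

If the bucket policy has not been updated for the instance role, the get_object call raises a client error instead of returning a status code.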

Multi-account implementation

In the instructions so far, we’ve deployed everything in a single AWS account, but in larger organizations, resources can be distributed throughout AWS accounts, often managed by AWS Organizations. The same pattern can be applied in a multi-account environment, with some minor additions. Instead of directly acting on a bucket, the Lambda function in the domain account can assume a role in other accounts that contain S3 buckets to be managed. In each account with an S3 bucket containing assets, create a role that allows editing the bucket policy and has a trust policy referencing the Lambda role in the domain account as a principal.
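The following sketch illustrates the two pieces this pattern needs: a trust policy for the role in each bucket account, and a helper that turns an assumed-role response into credentials for an S3 client. The function names, the session name, and the role ARNs are hypothetical; the STS client is passed in so the helper can be tested without AWS access.

```python
def trust_policy_for(lambda_role_arn):
    """Trust policy for the bucket-account role, naming the Lambda role as principal."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": lambda_role_arn},
            "Action": "sts:AssumeRole",
        }],
    }


def assumed_role_client_kwargs(sts_client, role_arn,
                               session_name="datazone-s3-subscription"):
    """Assume the cross-account role and return keyword arguments for boto3.client."""
    creds = sts_client.assume_role(
        RoleArn=role_arn, RoleSessionName=session_name
    )["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }
```

Inside the Lambda function, something like boto3.client("s3", **assumed_role_client_kwargs(boto3.client("sts"), role_arn)) would then act on the bucket in the other account. How the function decides which role ARN to use for a given bucket is left open here.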

Clean up

If you’ve finished experimenting and don’t want to incur any further cost for the resources deployed, you can clean up the components as follows:

  1. Delete the Amazon DataZone domain.
  2. Delete the Lambda function.
  3. Delete the SageMaker instance.
  4. Delete the S3 bucket that hosted the unstructured asset.
  5. Delete the IAM roles.

Conclusion

By implementing this custom workflow, organizations can extend the simplified subscription and access workflows provided by Amazon DataZone to their unstructured data stored in Amazon S3. This approach provides greater control over unstructured data assets, facilitating discovery and access across the enterprise.

We encourage you to try out the solution for your own use case, and share your feedback in the comments.


About the Authors

Somdeb Bhattacharjee is a Senior Solutions Architect specializing in data and analytics. He is part of the global Healthcare and Life Sciences industry at AWS, helping his customers modernize their data platform solutions to achieve their business outcomes.

Sam Yates is a Senior Solutions Architect in the Healthcare and Life Sciences business unit at AWS. He has spent most of the past two decades helping life sciences companies apply technology in pursuit of their missions to help patients. Sam holds BS and MS degrees in Computer Science.

Securely share AWS resources across VPC and account boundaries with PrivateLink, VPC Lattice, EventBridge, and Step Functions

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/securely-share-aws-resources-across-vpc-and-account-boundaries-with-privatelink-vpc-lattice-eventbridge-and-step-functions/

At some point, every AWS customer tells me that they have the desire to move into the future as quickly as possible. They want to simplify their modernization efforts, drive growth, and adapt to the cloud, while also reducing costs as they proceed. These customers typically have a large suite of legacy applications, possibly running on-premises, that are running on diverse technology stacks managed by disparate parts of the organization. To make things even more challenging, these organizations often have to meet stringent security and compliance requirements.

Prepare to Share
You can now share AWS resources such as Amazon Elastic Compute Cloud (Amazon EC2) instances, Amazon Elastic Container Service (Amazon ECS) and Amazon Elastic Kubernetes Service (Amazon EKS) container services, and your own HTTPS services across Amazon Virtual Private Cloud (Amazon VPC) and AWS account boundaries and use them to build event-driven apps via Amazon EventBridge and orchestrate workflows with AWS Step Functions. You can update your existing workloads, connect your modern cloud-native apps to on-premises legacy systems, with all communication routed across private endpoints and networks.

These new features build on Amazon VPC Lattice and AWS PrivateLink, and give you a lot of new options to design and control your network, along with some cool new ways to integrate and orchestrate across all of your technology stacks. For example, you can build hybrid event-driven architectures that make use of your existing on-premises applications.

Today, some customers use AWS Lambda functions or Amazon Simple Queue Service (Amazon SQS) queues to transfer data into VPCs. This undifferentiated heavy lifting can now be replaced with a simpler and more efficient solution.

Bringing all of this together, you get a set of services that will help you to accelerate your modernization efforts and simplify integration between your applications, regardless of where they are situated. EventBridge and Step Functions work hand-in-hand with PrivateLink and VPC Lattice to enable integration of public and private HTTPS-based applications into your event-driven architectures and workflows.

Here are the essential terms and concepts:

Resource Owner VPC – A VPC that has resources to be shared. The owner of this VPC creates a Resource Gateway with one or more associated Resource Configurations, then uses AWS Resource Access Manager (RAM) to share the Resource Configuration with the Resource Consumer, such as another AWS account, or a developer building event-driven architectures and workflows using EventBridge and Step Functions. Let’s define the Resource Owner as the person (maybe you) in your organization who is responsible for the care and feeding of this VPC.

Resource Gateway – Provides a point of ingress to a VPC so that clients can access resources in the Resource Owner VPC, as indicated by the Resource Configurations that are associated with the gateway. One Resource Gateway can make multiple resources available.

Resource – This can be an HTTPS endpoint, a database, a database cluster, an EC2 instance, an Application Load Balancer in front of multiple EC2 instances, an ECS service discoverable via AWS Cloud Map, an Amazon Elastic Kubernetes Service (Amazon EKS) service behind a Network Load Balancer, or a legacy service running in the Resource Owner VPC or running on premises across AWS Site-to-Site VPN or AWS Direct Connect.

Resource Configuration – Defines a set of resources that can be accessed through a particular Resource Gateway. The resources can be referenced by IP address, DNS name, or (for AWS resources) an ARN.

Resource Consumer – The person in your organization who is responsible for building applications that connect with and consume services provided by resources in a Resource Owner VPC.

Sharing Resources
You can put all of this power to use in a lot of different ways; I’ll focus on one for this post.

First, I will play the role of the Resource Owner. I click Resource gateways in the VPC Console, see that I don’t have a gateway, and click Create resource gateway to get started:

I assign a name (main-rg) and an IP address type, then pick the VPC and the private subnets where the gateway will have a presence (this is a one-shot selection that cannot be changed without creating a new Resource Gateway). I also choose up to five security groups to control inbound traffic:

I scroll down, assign any desired tags, and click Create resource gateway to proceed:

My new gateway is active within seconds; I nod in appreciation and click Create resource configuration to move ahead:

Now I need to create my first Resource Configuration. Let’s say that I have an HTTPS service running on an EC2 instance on a private subnet in my Resource Owner VPC. I assign a DNS name to the service and use an Amazon Route 53 Alias record which returns the IP address of the instance:

I am using a public hosted zone in this example. We are already working on support for private hosted zones.

With DNS all set up, I click Create resource configuration to move ahead. I enter a name (rc-service1), choose Resource as the type, and select the Resource Gateway that I created earlier:

I scroll down and define my EC2 instance as a resource, entering the DNS name and setting up sharing for ports 80 and 443:

Now I take a small detour, and hop over to the RAM Console to create a Resource Share so that other AWS accounts can access the resources (this is optional, and only relevant for cross-account scenarios). I could create one Resource Share for each service, but in most cases I would create one share and use it to package up a collection of related services. I’ll do that, and call it shared-services:

Returning from my detour, I refresh the list of resource shares, pick the one that I created, and click Create resource configuration:

The resource configuration is ready within seconds.

Recap and Planning Time
Before moving ahead, let’s do a quick recap and make some plans. Here’s what I (in the role of Resource Owner) have so far:

  • MainVPC – My Resource Owner VPC.
  • main-rg – A Resource Gateway in MainVPC.
  • rc-service1 – The Resource Configuration for main-rg.
  • service1 – An HTTPS service hosted on an EC2 instance in a private subnet of MainVPC, at a fixed IP address.

Ok, so what’s next?

Share – This is the first and most obvious use case. I can use AWS Resource Access Manager (RAM) to share the Resource Configuration with another AWS account and access the service from another VPC. On the other side (as the Resource Consumer), I take a couple of quick steps to connect to the service that has been shared with me:

  • Service Network – I can create a service network, add the Resource Configuration to the Service Network, and create a VPC endpoint in a VPC to connect to the service network.
  • Endpoint – I can create a VPC endpoint in a VPC and access the shared resource via the endpoint.

Modernize – I can remove my legacy Lambda or SQS integration to get rid of some undifferentiated heavy lifting.

Build – I can use EventBridge and Step Functions to build event-driven architectures and orchestrate applications. I’ll take this option!

Accessing Private Resources with EventBridge and Step Functions
EventBridge and Step Functions already make it easy to access public HTTPS endpoints such as those from SaaS providers like Slack, Salesforce, and Adobe. With today’s launch, consuming private HTTPS services is just as easy.

As a Resource Consumer, I simply create an EventBridge connection, reference a Resource Configuration that was shared with me, and call the service from my event-driven application. Everything that I already know still applies, and I now have the new-found power to access private services.

To create the EventBridge connection, I open the EventBridge console and click Connections in the Integration menu:

I review my existing connections (none so far), then click Create connection to move ahead:

I enter a name (MyService1) and a description for my connection, select Private as the API type, and choose the Resource Configuration that I created earlier:

Scrolling down, I need to configure the authorization for the service that I am connecting to. I select Custom configuration and Basic authorization, and enter the Username and Password for my service. I also add Action=Forecast to the query string (as you can see there are a lot of options for authorization), and click Create:

The connection is created and ready within minutes. Then I use it in my Step Functions workflows by using the HTTP Task, selecting the connection, entering the URL of my API endpoint, and choosing an HTTP method:

And that’s all there is to it: your Step Functions workflows can now make use of Private Resources!
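For readers who define workflows in code rather than in the console, the following sketch builds an HTTP Task state as a Python dictionary that can be serialized into a state machine definition. It assumes the JSONPath-style Parameters block; the function name, connection ARN, and endpoint URL are hypothetical placeholders, not values from this walkthrough.

```python
import json


def http_task_state(connection_arn, api_endpoint, method="GET"):
    """Build a Step Functions HTTP Task state that calls an endpoint through a connection."""
    return {
        "Type": "Task",
        "Resource": "arn:aws:states:::http:invoke",
        "Parameters": {
            "ApiEndpoint": api_endpoint,
            "Method": method,
            # The EventBridge connection supplies authorization and, for private
            # APIs, the Resource Configuration used to reach the service
            "Authentication": {"ConnectionArn": connection_arn},
        },
        "End": True,
    }


definition = {
    "StartAt": "CallService1",
    "States": {
        "CallService1": http_task_state(
            "arn:aws:events:us-east-1:111122223333:connection/MyService1/example-id",
            "https://service1.example.com/forecast",
        )
    },
}
definition_json = json.dumps(definition, indent=2)
```

The resulting definition_json string is the kind of document you would supply as the state machine definition.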

I can also use this connection as an EventBridge API destination target in Event Buses and Pipes.

Things to Know
Here are a couple of things to know about these cool new features:

Pricing – Existing pricing for Step Functions, EventBridge, PrivateLink, and VPC Lattice applies, including the per-GB charge for data transfer into the VPC.

Regions – You can create and use Resource Gateways and Resource Configurations in 21 AWS Regions: US East (Ohio, N. Virginia), US West (N. California, Oregon), Africa (Cape Town), Asia Pacific (Hong Kong, Mumbai, Osaka, Seoul, Singapore, Sydney, Tokyo), Canada (Central), Europe (Frankfurt, Ireland, London, Milan, Paris, Stockholm), Middle East (Bahrain), and South America (São Paulo).

In the Works – As I noted earlier, we are already working on support for private hosted zones. We are also planning to support access to other types of AWS resources through EventBridge and Step Functions.

Jeff;

Automating event validation with Amazon EventBridge Schema Discovery

Post Syndicated from Chris McPeek original https://aws.amazon.com/blogs/compute/automating-event-validation-with-amazon-eventbridge-schema-discovery/

This post is written by Kurt Tometich, Senior Solutions Architect, and Giedrius Praspaliauskas, Senior Solutions Architect, Serverless

Event-driven architectures face challenges with event validation due to unique domains, varying event formats, frequencies, and governance levels. Events are constantly evolving, requiring a balanced approach between speed and governance. This blog post describes approaches to consumer and producer event validation, focusing on automated solutions for producer validation using Amazon EventBridge and Amazon API Gateway.

Consumer and Producer Event Validation

In an event-driven system, events should be validated by both producers and consumers to maintain data integrity. The producers’ job is to create and send valid events before they are routed to consumers. Failing to do so can lead to data inconsistencies, downstream errors in processing and unnecessary costs. As a consumer, even if events come from a trusted source, validation should still be applied. Producers may change data format over time, data may become corrupt, or interfaces between the producer and consumer may alter it.

A common way to manage and route events is through an event bus. EventBridge is a serverless event bus that can perform discovery, versioning and consumption of event schemas. When schema discovery is enabled on an event bus, new schema versions are generated when the event structure changes. These schemas can be used to perform validation on events.

The EventBridge Schema registry stores schemas in OpenAPI or JSONSchema formats. Schemas can be added to the registry automatically through schema discovery or by manually uploading your schema to the registry through the AWS console or programmatically. Schema discovery automates the process of finding schemas and adding them to your registry. Schemas for AWS events are automatically added to the registry.

Once a schema is added to the registry, you can generate a code binding for the schema. This allows you to represent the event as a strongly typed object in your code. Code bindings are available for Golang, Java, Python, and TypeScript. If bindings are not available for your preferred language, you can download the schema and validate events with a third-party schema validation library, such as Ajv for JavaScript or the jsonschema library for Python.

If using code bindings, you can download them using the console, API, or within a supported IDE using the AWS Toolkit. Code bindings can be used like other code artifacts. If an AWS Lambda function is used as a consumer, add the code binding as a layer dependency. Bindings are not automatically synced to any artifact repositories, such as AWS CodeArtifact. The Lambda function code in this solution can be extended to automate binding uploads to your artifact repository.

The following diagram depicts a common producer (left) and consumer (right) event architecture on AWS. Producers send events through API Gateway or directly to an EventBridge event bus. It’s common to use API Gateway as a front door to provide authorization, validation and pre-processing of incoming events. Events going directly to EventBridge may also come from SaaS Partner Integrations (Salesforce, Jira, ServiceNow, etc.) or an application running in a private subnet using the AWS private network to connect to EventBridge. For these events, you can use third-party libraries to validate events prior to them arriving on EventBridge.

Common Architecture for Producer and Consumer Event Validation

Workflow steps:

  1. Producers send events through API Gateway or directly to EventBridge. API Gateway performs request validation, then parses the events and sends them to EventBridge if they pass validation. Invalid events that do not match the schema in API Gateway are rejected before reaching EventBridge. Events going directly to EventBridge are validated using third-party schema validation libraries (e.g. Ajv for JavaScript or the jsonschema library for Python).
  2. With schema discovery enabled on a custom event bus, that bus receives the event from an application and generates a new schema version in the registry. New schema versions are only created when the event structure changes. When a new schema version is created, a schema version created event is automatically emitted on the default EventBridge event bus. The default bus automatically receives AWS events. EventBridge rules can be configured to match all schema version changes or to filter on schema name, type, and other fields available on the event.
  3. Consumers define EventBridge rules to react to schema version change events. Consumers download the schema or code bindings from EventBridge and perform validation and parsing.
  4. Producers define EventBridge rules to react to schema version change events. The new schema is retrieved from the registry and either used in local development with third-party schema validation libraries, or a model in API Gateway is updated with the new schema directly. This step doesn’t exist as a native feature of EventBridge. The solution later in this post will demonstrate how to automate this step.
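The rules in steps 3 and 4 can be sketched as an event pattern. The snippet below is a minimal illustration, assuming the schema version events carry the source `aws.schemas` and the detail-type `Schema Version Created`, with `RegistryName`, `SchemaName`, and `Version` in the detail; verify these names against the events on your own default bus. The `matches` helper and the schema name are hypothetical, and the helper is only a simplified stand-in for EventBridge's exact-value filtering so the pattern can be exercised locally.

```python
import json

# Assumed shape of the rule's event pattern; verify field names against real events.
SCHEMA_VERSION_PATTERN = {
    "source": ["aws.schemas"],
    "detail-type": ["Schema Version Created"],
    "detail": {"RegistryName": ["discovered-schemas"]},
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified exact-value matcher: every pattern key must be present in the
    event, and the event's value must be one of the listed candidates."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern (for example "detail"): recurse into the event.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


sample_event = {
    "source": "aws.schemas",
    "detail-type": "Schema Version Created",
    "detail": {
        "RegistryName": "discovered-schemas",
        "SchemaName": "com.example.orders@OrderCreated",  # hypothetical name
        "Version": "2",
    },
}

print(json.dumps(SCHEMA_VERSION_PATTERN, indent=2))
print(matches(SCHEMA_VERSION_PATTERN, sample_event))
```

Adding a `SchemaName` list to the `detail` part of the pattern would narrow the rule to a single domain or event type.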

To scale this architecture to multiple event sources and API endpoints, you can create different models in API Gateway for each event schema. A model in API Gateway is a data schema that defines the structure and format of data for request and response payloads. Those models are then applied to different resources and methods defined on your APIs. The solutions below will demonstrate how event schemas can be automatically synced to models in API Gateway.
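To make the idea of a model concrete, here is a small sketch of what one might look like, written as a Python dictionary in JSON Schema draft 4 form. The `OrderCreated` model and its fields are hypothetical. The `violations` helper checks only the `required` keyword and basic property types; it is a hand-rolled stand-in used for illustration, not how API Gateway or a library such as jsonschema implements validation.

```python
# Hypothetical model for an "order created" request body (JSON Schema draft 4).
ORDER_MODEL = {
    "$schema": "http://json-schema.org/draft-04/schema#",
    "title": "OrderCreated",
    "type": "object",
    "required": ["orderId", "amount"],
    "properties": {
        "orderId": {"type": "string"},
        "amount": {"type": "number"},
    },
}


def _has_type(value, json_type: str) -> bool:
    """Map a few JSON Schema type names onto Python types."""
    if json_type == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    python_type = {"string": str, "object": dict, "boolean": bool}.get(json_type)
    return python_type is None or isinstance(value, python_type)


def violations(model: dict, body: dict) -> list[str]:
    """Check only `required` and per-property `type`; a real validator
    covers far more of the specification."""
    problems = [
        f"missing required field: {name}"
        for name in model.get("required", [])
        if name not in body
    ]
    for name, spec in model.get("properties", {}).items():
        if name in body and not _has_type(body[name], spec.get("type", "")):
            problems.append(f"wrong type for field: {name}")
    return problems


print(violations(ORDER_MODEL, {"orderId": "o-1", "amount": 10.5}))
print(violations(ORDER_MODEL, {"orderId": 7}))
```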

Solution Walkthrough

The following solutions use API Gateway to perform request validation and EventBridge schema discovery to automatically generate up-to-date schema versions. Both can be extended or modified to fit unique use cases. These solutions build upon the general producer and consumer validation architecture covered previously by incorporating automated solutions to downloading, processing and applying new schemas to API Gateway. Refer to the README.md file in the AWS Samples GitHub repository for pre-requisites, deployment instructions and testing.

Lambda Driven Schema Updater

The following architecture uses EventBridge schema discovery to generate new schema versions, then downloads, processes, and posts each new schema to an API Gateway model for request validation. The Lambda schema updater function is triggered by schema version changes. The function trigger can be enabled or disabled by updating the rule in the EventBridge console.

This solution is a good fit for quick updates with minimal processing. If complex testing and validation is required before updating a new schema, see the CI/CD driven schema updater solution covered later in this post. The rule in this solution triggers when a new schema version is added to the registry. To filter further, the rule can be modified or additional processing can be applied to the Lambda function. This provides flexibility in handling multiple domains or event types.

Architecture for Lambda Driven Schema Updater

Workflow Steps:

  1. Producers send events to an API Gateway endpoint or directly to EventBridge.
  2. API Gateway performs request validation on the body, modifies the event format, and sends the event to EventBridge. If the event does not match the schema, API Gateway rejects the request.
  3. A custom event bus receives the event, and an optional rule based on source can log all events for tracking and troubleshooting.
  4. With schema discovery enabled on the custom event bus, new event structures generate schema versions that are stored in the registry. If a new schema version is generated, consumers can download the latest schema and code bindings from the registry.
  5. The schema version creation rule will invoke the Lambda function.
  6. The function will download, process and update the API Gateway model with the new schema. A new schema version is only generated if the structure of the event changes.
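Steps 5 and 6 can be sketched as follows. This is a minimal illustration rather than the code from the sample repository. It assumes the triggering event carries `RegistryName`, `SchemaName`, and `Version` in its detail, and that the clients are the boto3 `schemas` and `apigateway` clients, passed in as parameters so the function can be exercised without AWS access. The function name, `rest_api_id`, and `model_name` are hypothetical.

```python
import json


def update_model_from_schema(schemas_client, apigw_client, event, rest_api_id, model_name):
    """Export the new schema version and patch it onto an API Gateway model."""
    detail = event["detail"]  # assumed field names; check a real event

    # Download: export the discovered schema as JSON Schema draft 4,
    # the format used by API Gateway models.
    exported = schemas_client.export_schema(
        RegistryName=detail["RegistryName"],
        SchemaName=detail["SchemaName"],
        SchemaVersion=detail["Version"],
        Type="JSONSchemaDraft4",
    )

    # Process: here only parse and re-serialize; real processing may do more.
    schema = json.loads(exported["Content"])

    # Update: replace the schema of the existing model.
    apigw_client.update_model(
        restApiId=rest_api_id,
        modelName=model_name,
        patchOperations=[
            {"op": "replace", "path": "/schema", "value": json.dumps(schema)}
        ],
    )
    return schema
```

Inside a Lambda function, the two clients would typically be created once outside the handler with `boto3.client("schemas")` and `boto3.client("apigateway")`.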

CI/CD Driven Schema Updater

The alternative approach uses a CI/CD pipeline to control schema changes. Instead of the Lambda function directly applying the new schema to the API Gateway model, it downloads, processes, and stores the schema in a repository. The CI/CD pipeline references the stored schema, performing additional tests and checks before the schema is promoted and enforced. This provides more control over the schema update process, though it introduces some additional complexity. The following diagram describes the CI/CD driven update process. The solution can be adapted to other artifact repositories and CI/CD systems.

Architecture for CI/CD Driven Schema Updater

Workflow steps:

  1. Producers send events to an API Gateway endpoint or directly to EventBridge.
  2. API Gateway performs request validation against the body, modifies the event format, and sends the event to EventBridge.
  3. A custom event bus receives the event, and an optional rule based on source can log all events for tracking and troubleshooting.
  4. With discovery enabled on the custom event bus, schema versions are produced and stored in the registry.
  5. The schema version creation rule invokes the Lambda function.
  6. The function downloads, processes, and stores the new schema in a repository of choice (e.g. S3, Git, or an artifact repository).
  7. The CI/CD pipeline updates the model in API Gateway and runs any necessary tests.
  8. The consumer downloads the schema and code bindings from the appropriate repositories.
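Step 6 could look like the sketch below when the repository is an S3 bucket. It is an illustration under stated assumptions: the key layout, the function names, and the bucket are hypothetical, and the S3 client (a boto3 `s3` client in practice) is passed in so the logic can be tested without AWS access.

```python
import json


def schema_object_key(schema_name: str, version: str) -> str:
    """Build a versioned key such as schemas/<name>/v<version>.json."""
    # Replace characters that are awkward in object keys (assumed convention).
    safe_name = schema_name.replace("@", "_").replace("/", "_")
    return f"schemas/{safe_name}/v{version}.json"


def store_schema(s3_client, bucket: str, schema_name: str, version: str, schema: dict) -> str:
    """Write the schema to S3 so a CI/CD pipeline can pick it up."""
    key = schema_object_key(schema_name, version)
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        # Stable formatting keeps diffs between versions readable.
        Body=json.dumps(schema, indent=2, sort_keys=True).encode("utf-8"),
        ContentType="application/json",
    )
    return key
```

Keeping one object per schema version lets the pipeline compare the new version with the previous one before promoting it.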

Conclusion

Event validation can be challenging, but leveraging schema discovery and request validation minimizes custom logic and overhead. EventBridge can discover new schemas from events, while API Gateway validates incoming requests. This approach streamlines validation, improves data quality, and reduces the maintenance burden of manual validation.

For more information on event driven architectures, you can view additional resources on AWS Samples and Serverless Land.

The serverless attendee’s guide to AWS re:Invent 2024

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/the-serverless-attendees-guide-to-aws-reinvent-2024/

AWS re:Invent 2024 offers an extensive selection of serverless and application integration content.

AWS re:Invent Banner

For detailed descriptions and schedule, visit the AWS re:Invent Session Catalog.

Join AWS serverless experts and community members at the AWS Modern Apps and Open Source Zone in the AWS Expo Village. This serves as a hub for serverless discussions at re:Invent. While you are there, enjoy a free coffee and learn about serverless architectures at the Serverlesspresso booth. There are two Serverlesspresso booths this year; the second one is at the Certificate Lounge. The AWS Expo Village also includes Serverless and Serverless Containers booths.

Don’t have a ticket yet? Join us in Las Vegas from December 2-6, 2024 by registering for re:Invent 2024.

This guide organizes the sessions into categories to help you find the content that is most relevant to you.

Session Types

  • Breakout Sessions are lecture-style presentations covering architecture, best practices, and deep dives into AWS services.
  • Workshops are 2-hour hands-on sessions where you work through tasks in AWS accounts using AWS services. Laptops are required and AWS credits are provided.
  • Chalk Talks are highly interactive 60-minute sessions with smaller audiences, focused on technical deep dives with whiteboards for architectural discussions.
  • Builders’ Sessions are 60-minute small-group sessions led by an AWS expert who guides you through a technical problem using AWS services.
  • Code Talks are 60-minute live coding sessions where AWS experts show how to build solutions using AWS services.

Leadership session: Nick Coult, Usman Khalid, Kathleen deValk

  • SVS211: Celebrating 10 years of pioneering serverless and containers – Breakout.
    • Explore how serverless has evolved to help organizations drive the highest performance, availability, and security at low costs.

Getting started sessions

Are you new to serverless or taking your first steps? Hear from AWS experts and customers on best practices and strategies for building serverless workloads. Get hands-on with services by attending a workshop or builders session. Create the next great “to do” app or add a new customer experience for a theme park.

  • SVS202: Thinking serverless – Chalk Talk
    • Learn how to approach building solutions with a serverless mindset by breaking down business problems into serverless building blocks.
  • SVS205: Building a serverless web application for a theme park – Workshop
    • Learn how to build a complete serverless web application for a theme park called Innovator Island.
  • SVS201: Getting started with serverless patterns – Workshop
    • Learn how to recognize and apply common serverless patterns by building production-ready code for a serverless application.
  • SVS204: Write less code: Building applications with a serverless mindset – Builders Session
    • Get more value by using built-in integrations between AWS services through configuration rather than writing glue code.
  • SVS207: Effectively model costs for your serverless applications – Chalk Talk
    • Gain insights into modeling the cost of serverless applications on AWS by considering request loads, payload sizes, and service pricing.
  • API201: The AWS Step Functions workshop – Workshop
    • Learn about the features of AWS Step Functions through hands-on interactive modules.
  • API204: Building event-driven architectures – Workshop
    • Learn about the basics of event-driven design using examples involving Amazon SNS, Amazon SQS, AWS Lambda, Amazon EventBridge, and more.
  • API205: Unlock the power of an exceptional serverless developer experience – Code Talk
    • Learn how to accelerate your serverless development with AWS tools, including Amazon Q Developer integrated into IDEs.
  • SEG209: Getting started building serverless SaaS architectures
    • Discover how to build your first serverless application, and learn how to handle multi-tenant architectures for SaaS applications.

Understanding serverless architectures

  • SVS208: Balance consistency and developer freedom with platform engineering – Breakout
    • Learn how platform teams can provide opinionated security, cost, observability, reliability, and sustainability patterns while maintaining developer flexibility.
  • SVS209: Containers or serverless functions: A path for cloud-native success – Breakout
    • Explore the fundamental differences between containers and serverless functions through real-world scenarios and insights into choosing the right approach.
  • OPN301: Level up your serverless applications with Powertools for AWS Lambda – Workshop
    • Learn why Powertools for AWS Lambda can be the developer toolkit of choice for serverless workloads.
  • DEV341: From single to multi-tenant: Scaling a mission-critical serverless app
    • Explore how to transition a mission-critical application from a single-tenant to a multi-tenant architecture.
  • DEV337: Zero to production serverless in 8 weeks
    • Hear about a real-world project journey, from concept to production in only eight weeks. Expect practical insights, mistakes, tips, and how using the right technologies and development process can deliver results fast.

Building event-driven applications

  • API204: Building event-driven architectures – Workshop
    • Learn about the basics of event-driven design using examples involving Amazon SNS, Amazon SQS, AWS Lambda, Amazon EventBridge, and more.
  • API206: How event-driven architectures can go wrong and how to fix them – Chalk Talk
    • Explore common event-driven pitfalls including YOLO events, god events, observability soup, event loops, and surprise bills.
  • DEV321: Choosing the right serverless compute services
    • Learn when to use AWS serverless compute services like AWS Lambda and Amazon ECS on AWS Fargate and how to integrate them into your application architectures.
  • API307: Event-driven architectures at scale: Manage millions of events – Breakout
    • Discover proven patterns for building high-scale event-driven systems that can be effectively managed across a distributed organization with Amazon EventBridge.
  • SVS206: Building an event sourcing system using AWS serverless technologies – Chalk Talk
    • Explore strategies for building effective event sourcing architectures using AWS serverless technologies to store application state as an append-only event log.
  • COP408: Coding for serverless observability
    • Join this code talk to learn best practices for collecting signals from your serverless applications. Dive deep into techniques to effectively instrument your applications to provide you with optimal observability.

Incorporating orchestration

  • API201: The AWS Step Functions workshop – Workshop
    • Learn about the features of AWS Step Functions through hands-on interactive modules.
  • API203: Building common orchestrated workflows with AWS Step Functions – Builders Session
    • Build three orchestrated workflows, including streamlined data processing with Distributed Map state, external system integration using callback, and implementing the saga pattern.
  • API207: Optimize data processing with built-in AWS Step Functions features – Chalk Talk
    • Learn to optimize your serverless data processing workflows at scale using AWS Step Functions features, including intrinsic functions and Distributed Map state.
  • API402: Building advanced workflows with AWS Step Functions – Breakout
    • Learn how you can use generative AI to generate state machines automatically from textual descriptions and chat with your workflow to optimize it.

Understanding integration patterns

  • API208: Building an integration strategy for the future – Breakout
    • Boost productivity and create better customer experiences by building a modern integration strategy using AWS application, data, and file integration services.
  • API306: Integration patterns for distributed systems – Breakout
    • Learn about common design trade-offs for distributed systems and how to navigate them with design patterns, illustrated with real-world examples.
  • API311: Application integration for platform builders – Breakout
    • Explore the implementation of application integration using serverless components in enterprise environments.

Building APIs and frontends

  • SVS203: Create your first API from scratch with OpenAPI and Amazon API Gateway – Builders Session
    • Learn how to design and provision complete APIs using infrastructure as code following the OpenAPI specification.
  • API303: Building modern API architectures: Which front door should I use? – Chalk Talk
    • Explore options for building modern APIs including REST, GraphQL, and real-time APIs along with their benefits and drawbacks.
  • API304: Building rate-limited solutions on AWS – Chalk Talk
    • Learn some of the best ways to build rate limiting into your systems for improved reliability.
  • API305: Asynchronous frontends: Building seamless event-driven experiences – Breakout
    • Explore patterns to enable asynchronous, event-driven integrations with the frontend designed for architects and frontend, backend, and full-stack engineers.

Diving deep into advanced topics

  • SVS401: Best practices for serverless developers – Breakout
    • Discover architectural best practices, optimizations, and useful shortcuts for building production-ready serverless workloads.
  • SVS403: From serverful to serverless Java – Workshop
    • Learn how to bring your traditional Java Spring application to AWS Lambda with minimal effort and iteratively apply optimizations.
  • SVS406: Scale streaming workloads with AWS Lambda – Chalk Talk
    • Learn how to implement parallel processing techniques for ordered and unordered use cases to address throughput limitations in streaming data processing.

Processing data

  • SVS404: Building serverless distributed data processing workloads – Workshop
    • Learn how serverless technologies like AWS Step Functions and AWS Lambda can help you simplify management and scaling of distributed data processing.
  • API401: Multi-tenant Amazon SQS queues: Mitigating noisy neighbors – Chalk Talk
    • Explore advanced strategies for managing multi-tenant Amazon SQS queues and effective mitigation techniques, including shuffle sharding and overflow queues.
  • SVS321: AWS Lambda and Apache Kafka for real-time data processing applications – Breakout
    • Gain practical insights into building scalable, serverless data processing applications by integrating AWS Lambda with Apache Kafka.

Incorporating generative AI

  • API209: Generative AI at scale: Serverless workflows for enterprise-ready apps – Workshop
    • Learn to build enterprise-ready, scalable generative AI applications that can scale from serving 100 to 100,000 users.
  • API310: Build a meeting summarization solution with generative AI & serverless – Code Talk
    • See live coding of a serverless application for producing meeting summaries with generative AI using Amazon Transcribe and Amazon Bedrock, orchestrated with AWS Step Functions.
  • SVS319: Unlock the power of generative AI with AWS Serverless – Breakout
    • Learn to harness AWS Serverless to build robust, cost-effective generative AI applications. Explore using AWS Step Functions to orchestrate complex AI workflows.
  • SVS325: Secure access to enterprise generative AI with serverless AI gateway – Chalk Talk
    • Explore how to architect a serverless AI gateway on AWS to securely integrate and consume large language models from multiple providers.

Additional resources

For social activities see the Unofficial list of AWS re:Invent Conference and Vendor Parties.

If you are attending re:Invent, connect at our AWS Modern Apps and Open Source Zone in the AWS Expo Village. The AWS Expo Village also includes Serverless and Serverless Containers booths.

If you cannot join us in person, breakout sessions will be available on our YouTube channel after the event.

We look forward to seeing you at re:Invent 2024! For more serverless learning resources, visit Serverless Land.

How CyberArk is streamlining serverless governance by codifying architectural blueprints

Post Syndicated from Anton Aleksandrov original https://aws.amazon.com/blogs/architecture/how-cyberark-is-streamlining-serverless-governance-by-codifying-architectural-blueprints/

This post was co-written with Ran Isenberg, Principal Software Architect at CyberArk and an AWS Serverless Hero.

Serverless architectures enable agility and simplified cloud resource management. Organizations embracing serverless architectures build robust, distributed cloud applications. As organizations grow and the number of development teams increases, maintaining architectural consistency, standardization, and governance across projects becomes crucial.

In this post, you will discover how CyberArk, a leading identity security company, efficiently implements serverless architecture governance, reduces duplicative efforts, and saves months of development time by codifying architectural blueprints. This approach helps to prevent redundant efforts and promotes uniform architectural standards, facilitating the seamless adoption of organizational best practices and governance across diverse teams.

Overview

The risk of duplicative efforts and architectural inconsistencies is particularly pronounced in large organizations, especially for requirements unrelated to specific business domains owned by individual teams. Diverse approaches to Infrastructure-as-Code, CI/CD, observability, and security can lead to inconsistent implementations across teams. Application developers should focus on delivering business value efficiently, rather than navigating the complexities of building and operating distributed architectures while adhering to organizational best practices. To achieve this, you need an approach that empowers developers and provides guardrails to ensure vetted architectural patterns are consistently applied. This solution should enable accelerated delivery without sacrificing agility and innovation.

Some organizations maintain an internal wiki that consolidates architectural guidance. While well-intentioned, relying solely on documentation assumes development teams diligently follow the guidelines, which often requires manual validation and limits scalability. To overcome this limitation, organizations should adopt a scalable approach that codifies, automates, and promotes architectural best practices. This mechanism allows developers to focus on delivering business-domain value and drives standardized operational excellence, governance, and adherence to organizational policies.

Introducing serverless blueprints

CyberArk's engineering team, with over 900 developers, was looking for ways to ensure that its serverless services are built on vetted architectural and security best practices, with fully automated enforcement of governance controls. The solution came in the form of codified architecture blueprints and automated tooling.

Serverless architectures are composed of loosely coupled services, integrated based on the application requirements. Application developers use IaC tools such as AWS CDK and HashiCorp Terraform to define their serverless architectures and integration patterns. CyberArk has augmented the IaC with governance tools, such as cdk-nag, AWS Config, and AWS Control Tower. With these complementary tools in place, they’ve built serverless blueprints that include architectural definitions based on organizational best practices, as well as automatically applied governance controls.

To illustrate this, consider a simple serverless architecture pattern. In this common pattern, an SQS queue serves as the event source for a Lambda function, which parses incoming messages and updates an Amazon S3 bucket.

Figure 1. A simple serverless architecture with SQS Queue, Lambda function, and S3 Bucket

While this pattern seems simple, turning it into an enterprise-ready service requires additional effort. You must consider aspects like resiliency, security, governance, observability, and coding best practices. Let’s examine several examples codified in architectural blueprints at CyberArk.

Error-handling best practices

Your services should be resilient. Retries can help to overcome occasional network hiccups, but you also need to handle scenarios in which your function consistently fails to process particular messages (known as poison messages), for example, because of a code bug. This can lead to endless processing loops, data loss, and potential extra charges. To address this, a blueprint can implement a failure handling mechanism with a dead letter queue, alerting, and redrive. This pattern is straightforward to implement and adds extra resiliency to your architecture. It is also generic and does not contain any business domain code. This is a typical example of an architectural pattern that can be codified in a blueprint and reused across development teams.

Figure 2. The simple serverless architecture with added resiliency best practices
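To see why a dead letter queue breaks the endless processing loop, consider the following in-memory simulation. It is a simplified sketch, not SQS itself: the `drain` function, the `max_receive_count` parameter (modeled loosely on the idea of a redrive policy), and the sample handler are all hypothetical.

```python
from collections import deque


def drain(messages, handler, max_receive_count=3):
    """Simulate a queue with a redrive policy: a message that fails
    max_receive_count times moves to the dead letter queue."""
    queue = deque((message, 0) for message in messages)
    processed, dead_letters = [], []
    while queue:
        message, receives = queue.popleft()
        receives += 1
        try:
            handler(message)
            processed.append(message)
        except Exception:
            if receives >= max_receive_count:
                # Stop retrying and park the message for inspection and redrive.
                dead_letters.append(message)
            else:
                # Make the message visible again for another attempt.
                queue.append((message, receives))
    return processed, dead_letters


def handler(message):
    """Hypothetical handler that always fails on one particular message."""
    if message == "poison":
        raise ValueError("cannot parse message")


processed, dead_letters = drain(["a", "poison", "b"], handler)
print(processed, dead_letters)
```

Without the receive limit, the poison message would be retried until it expires; with it, healthy messages keep flowing and the failed one can be alerted on and redriven after the bug is fixed.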

Security best practices

Another example is securing S3 buckets. Organizations must enforce S3 security best practices, such as enabling access logs, blocking public access, and enabling encryption at rest. Codifying these guardrails in architectural blueprints adds an extra layer that allows your developers to comply with organization standards without having to explicitly implement adherence to each best practice and policy on their own.

Figure 3. The simple serverless architecture with added security best practices

The following code snippet uses AWS CDK to create an S3 bucket with common best practices:

from aws_cdk import aws_s3 as s3

def _create_bucket(self, server_access_logs_bucket: s3.Bucket, is_production_env: bool) -> s3.Bucket:
    # Create an S3 bucket encrypted at rest with S3-managed keys
    bucket = s3.Bucket(
        self,
        constants.BUCKET_NAME,
        versioned=is_production_env,  # enable versioning in production only
        encryption=s3.BucketEncryption.S3_MANAGED,
        block_public_access=s3.BlockPublicAccess.BLOCK_ALL,  # block all public access
        enforce_ssl=True,  # reject requests that do not use TLS
        server_access_logs_bucket=server_access_logs_bucket,  # deliver access logs here
        # redacted
    )
    return bucket

Additional security best practices you can codify in your blueprints include least-privilege access, VPC attachment and code signing for sensitive Lambda functions, and KMS keys for encryption.

Lambda best practices

Your Lambda functions are another example of where blueprints can help. By providing a function blueprint implementing the baseline for capabilities like observability, idempotency, and batch processing out-of-the-box, you enable developers to focus on their business domain code.

Figure 4. Layered view of a Lambda function in CyberArk’s serverless architecture blueprint

CyberArk embeds Powertools for AWS Lambda, a toolkit that implements serverless best practices to increase developer velocity, into their blueprints. The following code snippets embed Powertools for enabling enhanced observability and implementing batch processing.

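# CDK code
from aws_cdk import aws_lambda as _lambda  # "lambda" is a reserved word in Python

lambda_function = _lambda.Function(
    environment={
        constants.POWERTOOLS_SERVICE_NAME: constants.SERVICE_NAME,
        constants.POWER_TOOLS_LOG_LEVEL: 'INFO',
    },
    tracing=_lambda.Tracing.ACTIVE,  # enable AWS X-Ray active tracing
    layers=["powertools-layer"],  # placeholder; pass a layer version construct here
    log_format=_lambda.LogFormat.JSON.value,  # structured JSON logs
    system_log_level=_lambda.SystemLogLevel.INFO.value,
    # redacted
)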

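# Function handler code
from aws_lambda_powertools import Logger, Metrics, Tracer
from aws_lambda_powertools.utilities.batch import BatchProcessor, EventType, process_partial_response
from aws_lambda_powertools.utilities.typing import LambdaContext

# Service name and metrics namespace are read from POWERTOOLS_* environment variables
logger, metrics, tracer = Logger(), Metrics(), Tracer()

# OrderSqsRecord and record_handler are defined elsewhere in the service (redacted)
processor = BatchProcessor(event_type=EventType.SQS, model=OrderSqsRecord)

@logger.inject_lambda_context
@metrics.log_metrics
@tracer.capture_lambda_handler(capture_response=False)
def lambda_handler(event, context: LambdaContext):
    # Process each record and report only the failed items back to Lambda
    return process_partial_response(
        event=event,
        record_handler=record_handler,
        processor=processor,
        context=context,
    )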
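To illustrate the idea behind batch processing with partial responses, the following dependency-free sketch shows roughly what such a handler returns. It is a simplified stand-in, not the Powertools implementation: the `handle_batch` function and the sample record handler are hypothetical. The response shape with `batchItemFailures` and `itemIdentifier` is the one Lambda expects when `ReportBatchItemFailures` is enabled on the SQS event source mapping.

```python
def handle_batch(event, record_handler):
    """Process each SQS record and report only the failed ones back to Lambda."""
    failures = []
    for record in event.get("Records", []):
        try:
            record_handler(record)
        except Exception:
            # Only this message becomes visible again; the rest are deleted.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}


def record_handler(record):
    """Hypothetical business logic that rejects empty message bodies."""
    if not record["body"]:
        raise ValueError("empty body")


sample_event = {
    "Records": [
        {"messageId": "m-1", "body": "order-1"},
        {"messageId": "m-2", "body": ""},
    ]
}
print(handle_batch(sample_event, record_handler))
```

Reporting failures per item avoids reprocessing the whole batch when a single record fails.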

Governance controls

Blueprints are not static; they evolve as you adopt new best practices and governance policies. Developers start with a vetted blueprint but can deviate from it as they evolve their serverless apps. To enable continuous adherence, it is important to use a combination of organizational governance tools, such as AWS Control Tower and Service Control Policies, and architecture blueprints that embed governance controls automatically enforced by CI/CD. This ensures that any architectural modification is validated against organizational standards.

AWS defines proactive controls as mechanisms that prevent developers from deploying resources that violate governance policies. Detective controls are mechanisms that detect, log, and alert on resource or configuration changes that violate governance policies.

Figure 5. Applying governance controls at all stages of CI/CD

Depending on the IaC tool, you can leverage different types of governance tools for proactive control enforcement. The following screenshot shows a proactive control violation identified during CI/CD via the cdk-nag framework. cdk-nag throws an error for the stack deployment because the Lambda execution role was assigned wildcard permissions.

Figure 6. Exception thrown by cdk-nag for using wildcard permissions
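The kind of check behind such a proactive control can be sketched in a few lines. The function below is a hypothetical, simplified illustration of flagging wildcard permissions in an IAM policy document during CI/CD; it is not how cdk-nag implements its rules, and the sample policy is invented for the example.

```python
def wildcard_statements(policy: dict) -> list[dict]:
    """Return Allow statements whose Action or Resource contains a wildcard."""

    def as_list(value):
        # IAM allows a single value or a list in these positions.
        return value if isinstance(value, list) else [value]

    flagged = []
    for statement in as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        values = as_list(statement.get("Action", [])) + as_list(statement.get("Resource", []))
        if any("*" in str(value) for value in values):
            flagged.append(statement)
    return flagged


sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:*", "Resource": "arn:aws:s3:::example-bucket"},
        {"Effect": "Allow", "Action": ["sqs:ReceiveMessage"], "Resource": "arn:aws:sqs:us-east-1:111122223333:orders"},
    ],
}

violations = wildcard_statements(sample_policy)
if violations:
    print(f"{len(violations)} statement(s) use wildcard permissions")
```

A CI/CD step could fail the build whenever the returned list is not empty, which stops the deployment before the resource is created.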

See the practical guide for implementing serverless governance.

Sample code

Ran Isenberg has open-sourced a sample Lambda Handler Cookbook blueprint illustrating some of the patterns CyberArk has adopted.

Additional serverless architecture patterns you might consider implementing in your blueprints are server-side encryption for an Amazon SNS topic with an encrypted Amazon SQS queue subscribed, auto-adjusting provisioned concurrency for Lambda functions, secure Serverless Aurora Cluster with bastion host, and more.

See more patterns implemented at serverlessland.com and cdkpatterns.com.

Conclusion

Translating architectural and security best practices into modular IaC definitions, such as CDK constructs or Terraform modules, is a scalable and reusable technique that allows CyberArk to reduce duplicative efforts and save months of development time. Using IaC tools like AWS CDK or Terraform, augmented with governance tools like cdk-nag or checkov, enabled CyberArk to share implementation best practices and encode governance policies into architectural blueprints. Development teams adopting these blueprints do not need to reinvent the wheel, each trying to solve the same problem on their own. Instead, they leverage the knowledge codified in the blueprint.

Further reading

Monitoring best practices for event delivery with Amazon EventBridge

Post Syndicated from Chris McPeek original https://aws.amazon.com/blogs/compute/monitoring-best-practices-for-event-delivery-with-amazon-eventbridge/

This post is written by Maximilian Schellhorn, Senior Solutions Architect and Michael Gasch, Senior Product Manager, EventBridge

Amazon EventBridge is a serverless event router that allows you to decouple your applications, using events to communicate important changes between event producers and consumers (targets). With EventBridge, producers publish events through an event bus, where you can configure rules to filter, transform, and route your events to a variety of targets such as AWS Lambda functions, Amazon Kinesis Data Streams, and public HTTPS endpoints (API destinations).

In event-driven architectures, the flow of sending and receiving events is asynchronous. There is no direct feedback to the producer when targets are invoked or if the invocation was successful. Therefore, to make sure business logic executes reliably in event-driven applications, it’s essential to get an understanding of your event delivery behavior with metrics, such as the number of delivery retries, failed delivery attempts, and the time it takes to deliver events. These metrics allow you to monitor the health of your event-driven architectures, and understand and mitigate event delivery issues caused by underperforming, undersized or unresponsive targets.

This post discusses how to monitor event delivery with EventBridge metrics to detect common event delivery issues and increase the reliability of your event-driven architectures on AWS.

Background

EventBridge is a multi-tenant system that handles more than 2.6 trillion events per month as of February 2024. EventBridge maintains fairness and availability under high load using mechanisms to detect and isolate noisy neighbors. As part of the AWS shared responsibility model, you are responsible for monitoring and responding to target-related issues to achieve reliable event delivery. For example, an underprovisioned Kinesis data stream or a throttled API destination as a target will lead to delivery retries, delays, and failures.

Solution overview

EventBridge provides a variety of metrics to observe, troubleshoot, and optimize event delivery. For example, counter-based metrics such as InvocationAttempts, SuccessfulInvocationAttempts, RetryInvocationAttempts, and FailedInvocations allow you to observe throttling and calculate error rates. Latency-based metrics such as IngestionToInvocationSuccessLatency provide insights into event delivery and delays.

In the following sections, we demonstrate the behavior of these metrics through an example application and discuss best practices for reliable event delivery. The example is composed of three key components, as numbered in the following architecture:

  1. An HTTP load generator to simulate different load patterns through Amazon API Gateway.
  2. An EventBridge event bus and a rule with an API destination target, throttled at 50 requests per second to simulate an under-scaled resource.
  3. A dead-letter queue (DLQ) that makes sure events are retained in case of invocations that fail permanently.

Example application architecture.

The load generator creates varying load over multiple phases. To observe the number of incoming events, use the EventBridge metrics MatchedEvents or TriggeredRules on the rule name dimension, as illustrated in the following graph.

Number of incoming events visualized in CloudWatch Metrics.

The following use cases focus on monitoring event delivery. Therefore, cases where event producers are not able to publish events due to permission errors or are experiencing throttling quotas on PutEvents are not covered.

Use case 1: Detecting event delivery issues due to target rate limiting

In this use case, event delivery will experience retries due to an under-scaled API destination target. The API processes all requests successfully. The load generator runs in three phases:

  • First, it warms up with a low number of requests and slowly increases the load while staying below the API destination rate limit of 50 requests per second
  • In the second phase, the load generator increases to 100 requests per second, exceeding the configured invocation rate on the API destination
  • Finally, the load generator slows to 50 requests per second, and eventually finishes

The following graph was created via CloudWatch Metrics and illustrates this scenario.

Load pattern of the example application.

EventBridge supports new rule name dimensions for selected metrics, making it straightforward to observe invocations (event delivery) per rule. The following metrics are recommended:

  • InvocationAttempts – The overall number of times EventBridge attempts to invoke the target, including retries
  • SuccessfulInvocationAttempts – The number of invocation attempts that were successful
  • RetryInvocationAttempts – The number of invocation attempts that originated from retries

The following graph visualizes the metrics within the first phase of the example scenario. In this phase, the load stays below the configured rate limit of the target. When events are delivered successfully without throttling or errors, InvocationAttempts and SuccessfulInvocationAttempts are equivalent and RetryInvocationAttempts is 0 (the metric is only emitted if there are retries).

EventBridge metrics during the first phase without throttling or errors.

In the second phase (06:55), the load generator creates more events than the target can handle, exceeding the API destination invocation rate limit. This is reflected in the graph by InvocationAttempts and MatchedEvents increasing, while SuccessfulInvocationAttempts stays at the configured API destination rate limit. At the beginning of the phase, RetryInvocationAttempts is 0 because retries due to rate limiting from API destinations are not immediately executed, but delayed with exponential backoff. After the delay, RetryInvocationAttempts starts increasing (06:58), as shown in the following graph.

EventBridge metrics during the ramp-up phase of the load generator.

Because InvocationAttempts also includes retries, the overall number of InvocationAttempts is higher than the incoming MatchedEvents.

Lastly, during the cool-down period, when the number of incoming events decreases significantly (07:03), more retry attempts succeed, and therefore InvocationAttempts and RetryInvocationAttempts decrease. Even though there are no more new incoming events (07:05), there are still events being retried, and these retries eventually finish (07:14).

EventBridge metrics during the cool down phase of the load generator.
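
The relationship between these counters can be sketched with a toy model. The following Python snippet is an illustration only, not how EventBridge is implemented: it assumes a fixed retry delay instead of exponential backoff, and the function name simulate_delivery and its parameters are hypothetical. It shows why InvocationAttempts exceeds MatchedEvents once the target throttles, and why retries continue after new events stop arriving.

```python
from collections import defaultdict


def simulate_delivery(load_per_second, rate_limit=50, retry_delay=1):
    """Toy model of a rate-limited target with delayed retries.

    load_per_second: new matched events arriving in each second.
    rate_limit: invocations the target accepts per second.
    retry_delay: fixed delay in seconds before a throttled event is retried
                 (a simplification; the post describes exponential backoff).
    """
    if rate_limit < 1 or retry_delay < 1:
        raise ValueError("rate_limit and retry_delay must be at least 1")

    pending = defaultdict(int)  # second -> events scheduled for retry
    totals = {
        "MatchedEvents": 0,
        "InvocationAttempts": 0,
        "SuccessfulInvocationAttempts": 0,
        "RetryInvocationAttempts": 0,
    }
    second = 0
    # Keep going after the load ends until every retry has succeeded.
    while second < len(load_per_second) or pending:
        new = load_per_second[second] if second < len(load_per_second) else 0
        retries = pending.pop(second, 0)
        attempts = new + retries
        succeeded = min(attempts, rate_limit)
        throttled = attempts - succeeded
        if throttled:
            pending[second + retry_delay] += throttled

        totals["MatchedEvents"] += new
        totals["InvocationAttempts"] += attempts
        totals["SuccessfulInvocationAttempts"] += succeeded
        totals["RetryInvocationAttempts"] += retries
        second += 1

    totals["SecondsUntilDrained"] = second
    return totals


if __name__ == "__main__":
    # Two seconds at 100 events per second against a limit of 50 per second.
    print(simulate_delivery([100, 100], rate_limit=50, retry_delay=1))
```

In this model, every matched event is eventually delivered, but the total number of attempts is twice the number of matched events, and delivery finishes two seconds after the load stops.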

Based on the observations during this scenario, we can calculate the overall custom metric SuccessfulInvocationRate. If you consider retries as a first sign of degraded system state, you can calculate this rate as SuccessfulInvocationAttempts/InvocationAttempts. For example, in Amazon CloudWatch, you can use metric math. Depending on your requirements, you can set up CloudWatch alarms to create notifications when a certain threshold is hit.

Custom SuccessfulInvocationRate metric generated with CloudWatch metric math.
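
As a minimal sketch of the metric math described above, the following Python function builds a query list in the shape that the CloudWatch GetMetricData API expects for its MetricDataQueries parameter. The function name and the rule name in the usage line are hypothetical, and the dimension set is an assumption: a rule on a custom event bus may also need an EventBusName dimension.

```python
def build_success_rate_queries(rule_name, period_seconds=60):
    """Build metric math queries for a SuccessfulInvocationRate percentage.

    Assumption: the metrics are read from the AWS/Events namespace using only
    the RuleName dimension. Adjust the dimensions for your own setup.
    """

    def metric(query_id, metric_name):
        return {
            "Id": query_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/Events",
                    "MetricName": metric_name,
                    "Dimensions": [{"Name": "RuleName", "Value": rule_name}],
                },
                "Period": period_seconds,
                "Stat": "Sum",
            },
            # Only the computed rate is returned, not the raw counters.
            "ReturnData": False,
        }

    return [
        metric("attempts", "InvocationAttempts"),
        metric("successes", "SuccessfulInvocationAttempts"),
        {
            "Id": "success_rate",
            "Expression": "100 * successes / attempts",
            "Label": "SuccessfulInvocationRate",
            "ReturnData": True,
        },
    ]


if __name__ == "__main__":
    for query in build_success_rate_queries("my-example-rule"):
        print(query["Id"])
```

With boto3, a list like this could be passed as the MetricDataQueries argument of the CloudWatch client's get_metric_data call, together with a start and end time.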

Although an occasional decrease of SuccessfulInvocationRate due to temporary traffic spikes or invocation errors can be considered normal, a constant mismatch is an indication of a misconfigured target and needs to be addressed as part of the shared responsibility model.

Use case 2: Detecting and handling event delivery failures

By default, EventBridge retries delivering an event for 24 hours and up to 185 times. After all retry attempts are exhausted, the event is dropped or sent to a DLQ. See Using dead-letter queues to process undelivered events in EventBridge for more information on how to configure a DLQ with EventBridge. These events can be visualized through the FailedInvocations or InvocationsSentToDlq metrics. Because FailedInvocations doesn’t consider retries that eventually succeed as failed invocations, this metric wasn’t visible in the previous example.

The following graph represents the same application and load pattern, but the EventBridge rule is configured with a maximum of three retries. During the first phase, there are no failed attempts because the load stays below the throttling limit.

EventBridge metrics with FailedInvocations after the maximum number of retries is exceeded.

In the second phase, you can observe FailedInvocations starting after the three configured retries have been exhausted. Because the example application has a DLQ configured, InvocationsSentToDlq can provide the same insight, and can be used for alerting.

If you’re experiencing a large amount of FailedInvocations or InvocationsSentToDlq, it’s recommended to investigate if the target is properly scaled and able to receive the given traffic. For cases where retries are expected, the retry policy should be configured accordingly.
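
A retry policy and DLQ are set per target. The following Python sketch builds one target entry in the shape used by the EventBridge PutTargets API (RetryPolicy and DeadLetterConfig). The function name, the default values, and the ARNs in the usage example are hypothetical placeholders, not values from the example application.

```python
def build_target(target_id, target_arn, dlq_arn,
                 max_retries=3, max_event_age_seconds=3600):
    """Build a single PutTargets entry with a retry policy and a DLQ.

    The bounds checked here reflect the defaults mentioned in the post:
    up to 185 retries and up to 24 hours (86,400 seconds) of event age.
    """
    if not 0 <= max_retries <= 185:
        raise ValueError("max_retries must be between 0 and 185")
    if not 60 <= max_event_age_seconds <= 86400:
        raise ValueError("max_event_age_seconds must be between 60 and 86400")

    return {
        "Id": target_id,
        "Arn": target_arn,
        "RetryPolicy": {
            "MaximumRetryAttempts": max_retries,
            "MaximumEventAgeInSeconds": max_event_age_seconds,
        },
        # Events that exhaust their retries are sent to this SQS queue.
        "DeadLetterConfig": {"Arn": dlq_arn},
    }


if __name__ == "__main__":
    # Hypothetical ARNs for illustration only.
    print(build_target(
        "example-target",
        "arn:aws:events:us-east-1:111122223333:api-destination/example",
        "arn:aws:sqs:us-east-1:111122223333:example-dlq",
    ))
```

With boto3, an entry like this could be passed in the Targets list of the EventBridge client's put_targets call.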

Use case 3: Detecting event delivery delays

The metrics outlined in the previous scenarios provided an overview of how to monitor your event delivery by the total number of retries or failures during a given time period. However, EventBridge also provides a metric that lets you observe the end-to-end latency (the time it takes from event ingestion to successful delivery to the target).

This can be achieved with the new IngestionToInvocationSuccessLatency metric. This metric surfaces effects from retries and delayed delivery, for example due to timeouts and slow responses from targets. In the following graph, you can observe 50th and 99th percentiles (p50 and p99) for IngestionToInvocationSuccessLatency on the right Y axis. During the second phase of the load generator, where invocations exceed the number of events the target can process, retries occur. Therefore, the overall time until events are delivered successfully to the target increases to almost 10 minutes (597,621ms, p99).

Combination of counter based metrics and latency based metrics.
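
To see why p50 and p99 are read together, consider this small Python sketch. It uses a nearest-rank percentile over made-up latency samples; the sample values are illustrative, and CloudWatch may compute percentiles differently. A small share of heavily retried events barely moves the median but dominates the 99th percentile.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


if __name__ == "__main__":
    # Illustrative data: 98 fast deliveries and 2 deliveries delayed by retries.
    latencies_ms = [120] * 98 + [597_621] * 2
    print("p50:", percentile(latencies_ms, 50))
    print("p99:", percentile(latencies_ms, 99))
```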

IngestionToInvocationSuccessLatency includes the time the target takes to successfully respond to event delivery. This allows you to monitor the end-to-end latency between EventBridge and your target, and detect performance variations and degradations of targets, even when there is no target throttling or errors. For example, the following graph displays constant successful invocations while the latency increases due to longer response times of the target over a 5-minute period (starting at 09:07).

Visualization of increased target latency without errors or retries.

Conclusion

In this post, we explored best practices for observing event delivery with EventBridge. By using key metrics like SuccessfulInvocationAttempts, RetryInvocationAttempts, and FailedInvocations, you can gain visibility and identify issues early. With CloudWatch metric math, you can calculate a SuccessfulInvocationRate metric, allowing you to define thresholds and alerts on a single key metric.

Furthermore, the new IngestionToInvocationSuccessLatency metric provides insights into the end-to-end event delivery latency between EventBridge and your targets, enabling you to detect and respond to performance degradation. It’s recommended to combine these key metrics into a holistic overview, such as using CloudWatch dashboards. By setting up appropriate alarms and taking a proactive approach to observability, you can mitigate event delivery problems and build resilient, scalable, event-driven applications on AWS with EventBridge. Navigate to Monitoring Amazon EventBridge to get an overview of the available metrics and how to get started.

Try these metrics out with your own use case!

To find more serverless patterns, check out Serverless Land.

AWS Weekly Roundup: Amazon Q Business, AWS CloudFormation, Amazon WorkSpaces update, and more (Aug 5, 2024)

Post Syndicated from Matheus Guimaraes original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-q-business-aws-cloudformation-amazon-workspaces-update-and-more-aug-5-2024/

Summer is reaching its peak for some of us around the globe, and many are heading out to their favorite holiday destinations to enjoy some time off. I just came back from holidays myself and I couldn’t help thinking about the key role that artificial intelligence (AI) plays in our modern world to help us scale the operation of simple things like traveling. Passport and identity verifications were quick, and thanks to the new airport security system rolling out across the world, so were my bag checks. I watched my backpack with a smile as it rolled along the security check belt with my computer, tablet, and portable game consoles all nicely tucked inside without any fuss.

If it wasn’t for AI, we wouldn’t be able to scale operations to keep up with population growth or the enormous volumes of data we generate on a daily basis. The advent of generative AI took this even further by unlocking the ability to put all this data to use in all kinds of creative ways, driving a new wave of exciting innovations that continues to elevate modern products and services.

This new landscape can be challenging for companies that are learning how generative AI can help them grow or succeed, such as startups. This is why I’m so excited about the AWS GenAI Lofts taking place in the coming months around the world.

The AWS GenAI Lofts are collaborative spaces available in different cities around the world for a number of weeks. Startups, developers, investors, and industry experts can meet here while having access to AWS AI experts, and attend talks, workshops, fireside chats, and Q&As with industry leaders. All lofts are free and are carefully curated to offer something for everyone to help you accelerate your journey with AI. There are lofts scheduled in Bengaluru (July 29-Aug 9), San Francisco (Aug 14-Sept 27), Sao Paulo (Sept 2-Nov 20), London (Sept 30-Oct 25), Paris (Oct 8-Nov 25), and Seoul (Nov, pending exact dates). I highly encourage you to have a look at the agendas of a loft near you and drop in to learn more about GenAI and connect with others.

Last week’s launches
Here are some launches that got my attention last week.

Amazon Q Business cross-Region IdC — Amazon Q Business is a generative AI-powered assistant that deeply understands your business by providing connectors that you can easily set up to unify data from various sources such as Amazon S3, Microsoft 365, and more. You can then generate content, answer questions, and even automate tasks that are relevant and specific to your business. Q Business integrates with AWS IAM Identity Center to ensure that data can only be accessed by those who are authorized to do so. Previously, the IAM Identity Center instance had to be located in the same Region as the Q Business application. Now, you can connect to one in a different Region.

Git sync status changes publish to Amazon EventBridge — AWS CloudFormation Git sync is a very handy feature that can help streamline your DevOps operations by allowing you to automatically update your AWS CloudFormation stacks whenever you commit changes to the template or deployment file in source control. As of last week, any sync status change is published in near real-time as an event to EventBridge. This enables you to take your GitOps workflow further and stay on top of your Git repositories or resource sync status changes.

Some of AWS Pinpoint’s capabilities are now under AWS End User Messaging — AWS Pinpoint’s SMS, MMS, push, and text to voice capabilities have been shuffled and are now offered through their own service called AWS End User Messaging. There is no impact to existing applications and no changes to APIs, the AWS Command Line Interface (AWS CLI), or IAM policies. However, the new name is now reflected on the AWS Management Console, AWS Billing console dashboard, documentation, and other places.

Amazon WorkSpaces updates — Microsoft Visual Studio Professional and Microsoft Visual Studio Enterprise 2022 are now added to the list of available license-included applications on WorkSpaces Personal. Additionally, Amazon WorkSpaces Thin Client has received Carbon Trust verification. As verified by the Carbon Trust, the total lifecycle carbon emission is 77kg CO2e and 50% of the product is made from recycled materials.

GenAI for the Public Sector — There have been two significant launches that may interest those in the public sector looking into getting started with generative AI. Amazon Bedrock is now a FedRAMP High authorized service in the AWS GovCloud (US-West) Region. Additionally, both Llama 3 8B and Llama 3 70B are now also available in that Region, making this a perfect opportunity to start experimenting with Bedrock and Llama 3 if you have workloads in the AWS GovCloud (US-West) Region.

Customers in Germany can now sign up for AWS using their bank account — That means no debit or credit card is needed to create AWS accounts if you have a billing address in Germany. This can help simplify payment of AWS invoices for some businesses, as well as make it easier for others to get started on AWS.

Learning Materials

These are my recommended learning materials for this week.

AWS Skill Builder — This is more of a broad recommendation, but I’m still surprised that so many people have never heard of AWS Skill Builder or have not tried it yet. There is so much learning you can do for free, including a lot of hands-on courses. In July alone, AWS Skill Builder has launched 25 new digital training products including AWS SimulLearn and AWS Cloud Quest: Generative AI, which are game-based learning experiences. Speaking of that, did you know that if you need to renew your Cloud Practitioner certification you can do it simply by playing the AWS Cloud Quest: Recertify Cloud Practitioner game?

Get started with agentic code interpreter — Earlier last month we released a new capability on Agents for Amazon Bedrock which allows agents to dynamically generate and execute code within a secure sandboxed environment. As usual, my colleague Mike Chambers has created a great video and blog post on community.aws showing how you can start using it today.

That’s it for this week. Check back next Monday for another Weekly Roundup!

Automate data loading from your database into Amazon Redshift using AWS Database Migration Service (DMS), AWS Step Functions, and the Redshift Data API

Post Syndicated from Ritesh Sinha original https://aws.amazon.com/blogs/big-data/automate-data-loading-from-your-database-into-amazon-redshift-using-aws-database-migration-service-dms-aws-step-functions-and-the-redshift-data-api/

Amazon Redshift is a fast, scalable, secure, and fully managed cloud data warehouse that makes it simple and cost-effective to analyze all your data using standard SQL and your existing ETL (extract, transform, and load), business intelligence (BI), and reporting tools. Tens of thousands of customers use Amazon Redshift to process exabytes of data per day and power analytics workloads such as BI, predictive analytics, and real-time streaming analytics.

As more and more data is being generated, collected, processed, and stored in many different systems, making the data available for end-users at the right place and right time is a very important aspect for data warehouse implementation. A fully automated and highly scalable ETL process helps minimize the operational effort that you must invest in managing the regular ETL pipelines. It also provides timely refreshes of data in your data warehouse.

You can approach the data integration process in two ways:

  • Full load – This method involves completely reloading all the data within a specific data warehouse table or dataset
  • Incremental load – This method focuses on updating or adding only the changed or new data to the existing dataset in a data warehouse

This post discusses how to automate ingestion of source data that changes completely and has no way to track the changes. This is useful for customers who want to use this data in Amazon Redshift; some examples of such data are products and bills of materials without tracking details at the source.

We show how to build an automatic extract and load process from various relational database systems into a data warehouse for full load only. A full load is performed from SQL Server to Amazon Redshift using AWS Database Migration Service (AWS DMS). When Amazon EventBridge receives a full load completion notification from AWS DMS, ETL processes are run on Amazon Redshift to process data. AWS Step Functions is used to orchestrate this ETL pipeline. Alternatively, you could use Amazon Managed Workflows for Apache Airflow (Amazon MWAA), a managed orchestration service for Apache Airflow that makes it straightforward to set up and operate end-to-end data pipelines in the cloud.

Solution overview

The workflow consists of the following steps:

  1. The solution uses an AWS DMS migration task that replicates the full load dataset from the configured SQL Server source to a target Redshift cluster in a staging area.
  2. AWS DMS publishes the REPLICATION_TASK_STOPPED event to EventBridge when the replication task is complete, which invokes an EventBridge rule.
  3. EventBridge routes the event to a Step Functions state machine.
  4. The state machine calls a Redshift stored procedure through the Redshift Data API, which loads the dataset from the staging area to the target production tables. With this API, you can also access Redshift data with web-based service applications, including AWS Lambda.

The following architecture diagram highlights the end-to-end solution using AWS services.

In the following sections, we demonstrate how to create the full load AWS DMS task, configure the ETL orchestration on Amazon Redshift, create the EventBridge rule, and test the solution.

Prerequisites

To complete this walkthrough, you must have the following prerequisites:

  • An AWS account
  • A SQL Server database configured as a replication source for AWS DMS
  • A Redshift cluster to serve as the target database
  • An AWS DMS replication instance to migrate data from source to target
  • A source endpoint pointing to the SQL Server database
  • A target endpoint pointing to the Redshift cluster

Create the full load AWS DMS task

Complete the following steps to set up your migration task:

  1. On the AWS DMS console, choose Database migration tasks in the navigation pane.
  2. Choose Create task.
  3. For Task identifier, enter a name for your task, such as dms-full-dump-task.
  4. Choose your replication instance.
  5. Choose your source endpoint.
  6. Choose your target endpoint.
  7. For Migration type, choose Migrate existing data.

  8. In the Table mapping section, under Selection rules, choose Add new selection rule.
  9. For Schema, choose Enter a schema.
  10. For Schema name, enter a name (for example, dms_sample).
  11. Keep the remaining settings as default and choose Create task.

The following screenshot shows your completed task on the AWS DMS console.
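
If you create the task through the API or CLI instead of the console, the selection rule from the steps above is expressed as a table-mappings JSON document. The following Python sketch builds such a document. The function name is hypothetical, and including every table in the schema (the % wildcard) is an assumption about the console defaults.

```python
import json


def build_table_mappings(schema_name, table_pattern="%"):
    """Return an AWS DMS table-mappings document with one selection rule
    that includes the tables matching table_pattern in schema_name."""
    mappings = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "1",
                "object-locator": {
                    "schema-name": schema_name,
                    # "%" is the DMS wildcard for any table name.
                    "table-name": table_pattern,
                },
                "rule-action": "include",
            }
        ]
    }
    return json.dumps(mappings, indent=2)


if __name__ == "__main__":
    print(build_table_mappings("dms_sample"))
```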

Create Redshift tables

Create the following tables on the Redshift cluster using the Redshift query editor:

  • dbo.dim_cust – Stores customer attributes:
CREATE TABLE dbo.dim_cust (
    cust_key integer ENCODE az64,
    cust_id character varying(10) ENCODE lzo,
    cust_name character varying(100) ENCODE lzo,
    cust_city character varying(50) ENCODE lzo,
    cust_rev_flg character varying(1) ENCODE lzo
)
DISTSTYLE AUTO;
  • dbo.fact_sales – Stores customer sales transactions:
CREATE TABLE dbo.fact_sales (
    order_number character varying(20) ENCODE lzo,
    cust_key integer ENCODE az64,
    order_amt numeric(18,2) ENCODE az64
)
DISTSTYLE AUTO;
  • dbo.fact_sales_stg – Stores daily customer incremental sales transactions:
CREATE TABLE dbo.fact_sales_stg (
    order_number character varying(20) ENCODE lzo,
    cust_id character varying(10) ENCODE lzo,
    order_amt numeric(18,2) ENCODE az64
)
DISTSTYLE AUTO;

Use the following INSERT statements to load sample data into the sales staging table:

insert into dbo.fact_sales_stg(order_number,cust_id,order_amt) values ('100','100',200);
insert into dbo.fact_sales_stg(order_number,cust_id,order_amt) values ('101','100',300);
insert into dbo.fact_sales_stg(order_number,cust_id,order_amt) values ('102','101',25);
insert into dbo.fact_sales_stg(order_number,cust_id,order_amt) values ('103','101',35);
insert into dbo.fact_sales_stg(order_number,cust_id,order_amt) values ('104','102',80);
insert into dbo.fact_sales_stg(order_number,cust_id,order_amt) values ('105','102',45);

Create the stored procedures

In the Redshift query editor, create the following stored procedures to process customer and sales transaction data:

  • sp_load_cust_dim() – This procedure populates the customer dimension. In this sample, it truncates the table, reloads three sample customers, and sets the cust_rev_flg flag:
CREATE OR REPLACE PROCEDURE dbo.sp_load_cust_dim()
LANGUAGE plpgsql
AS $$
BEGIN
    -- Reload the customer dimension from scratch
    truncate table dbo.dim_cust;
    insert into dbo.dim_cust(cust_key,cust_id,cust_name,cust_city) values (1,'100','abc','chicago');
    insert into dbo.dim_cust(cust_key,cust_id,cust_name,cust_city) values (2,'101','xyz','dallas');
    insert into dbo.dim_cust(cust_key,cust_id,cust_name,cust_city) values (3,'102','yrt','new york');
    -- Flag customers based on their city
    update dbo.dim_cust
    set cust_rev_flg=case when cust_city='new york' then 'Y' else 'N' end
    where cust_rev_flg is null;
END;
$$
  • sp_load_fact_sales() – This procedure transforms the incremental order data by joining it with the customer dimension, and populates the customer key from the dimension table in the final sales fact table:
CREATE OR REPLACE PROCEDURE dbo.sp_load_fact_sales()
LANGUAGE plpgsql
AS $$
BEGIN
    --Process Fact Sales
    insert into dbo.fact_sales
    select
        sales_fct.order_number,
        cust.cust_key as cust_key,
        sales_fct.order_amt
    from dbo.fact_sales_stg sales_fct
    --join to customer dim
    inner join dbo.dim_cust cust on sales_fct.cust_id=cust.cust_id;
END;
$$
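
The key lookup performed by the second procedure can be tried locally. The following Python sketch uses an in-memory SQLite database as a stand-in for Redshift, with simplified table definitions and illustrative rows that are not the sample data from this post. It shows that the inner join replaces cust_id with cust_key and silently drops staging rows whose cust_id has no match in the dimension.

```python
import sqlite3


def build_demo_connection():
    """Create simplified versions of the three tables with illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_cust (cust_key INTEGER, cust_id TEXT, cust_name TEXT);
        CREATE TABLE fact_sales_stg (order_number TEXT, cust_id TEXT, order_amt NUMERIC);
        CREATE TABLE fact_sales (order_number TEXT, cust_key INTEGER, order_amt NUMERIC);

        INSERT INTO dim_cust VALUES (1, '100', 'abc'), (2, '101', 'xyz');
        -- The last staging row references a customer that is not in dim_cust.
        INSERT INTO fact_sales_stg VALUES
            ('A1', '100', 200), ('A2', '101', 25), ('A3', '999', 80);
        """
    )
    return conn


def load_fact_sales(conn):
    """Same join logic as sp_load_fact_sales, written for SQLite."""
    conn.execute(
        """
        INSERT INTO fact_sales (order_number, cust_key, order_amt)
        SELECT stg.order_number, cust.cust_key, stg.order_amt
        FROM fact_sales_stg AS stg
        INNER JOIN dim_cust AS cust ON stg.cust_id = cust.cust_id
        """
    )
    return conn.execute(
        "SELECT order_number, cust_key, order_amt FROM fact_sales ORDER BY order_number"
    ).fetchall()


if __name__ == "__main__":
    for row in load_fact_sales(build_demo_connection()):
        print(row)
```

If unmatched orders must not be lost, a left join with a default key is a common alternative, but that is a design decision outside the scope of this walkthrough.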

Create the Step Functions state machine

Complete the following steps to create the state machine redshift-elt-load-customer-sales. This state machine is invoked as soon as the AWS DMS full load task for the customer table is complete.

  1. On the Step Functions console, choose State machines in the navigation pane.
  2. Choose Create state machine.
  3. For Template, choose Blank.
  4. On the Actions dropdown menu, choose Import definition to import the workflow definition of the state machine.

  5. Open your preferred text editor and save the following code in a file with the ASL extension (for example, redshift-elt-load-customer-sales.ASL). Provide your Redshift cluster ID and the secret ARN for your Redshift cluster.
{
  "Comment": "State Machine to process ETL for Customer Sales Transactions",
  "StartAt": "Load_Customer_Dim",
  "States": {
    "Load_Customer_Dim": {
      "Type": "Task",
      "Parameters": {
        "ClusterIdentifier": "redshiftcluster-abcd",
        "Database": "dev",
        "Sql": "call dbo.sp_load_cust_dim()",
        "SecretArn": "arn:aws:secretsmanager:us-west-2:xxx:secret:rs-cluster-secret-abcd"
      },
      "Resource": "arn:aws:states:::aws-sdk:redshiftdata:executeStatement",
      "Next": "Wait on Load_Customer_Dim"
    },
    "Wait on Load_Customer_Dim": {
      "Type": "Wait",
      "Seconds": 30,
      "Next": "Check_Status_Load_Customer_Dim"
    },
    "Check_Status_Load_Customer_Dim": {
      "Type": "Task",
      "Next": "Choice",
      "Parameters": {
        "Id.$": "$.Id"
      },
      "Resource": "arn:aws:states:::aws-sdk:redshiftdata:describeStatement"
    },
    "Choice": {
      "Type": "Choice",
      "Choices": [
        {
          "Not": {
            "Variable": "$.Status",
            "StringEquals": "FINISHED"
          },
          "Next": "Wait on Load_Customer_Dim"
        }
      ],
      "Default": "Load_Sales_Fact"
    },
    "Load_Sales_Fact": {
      "Type": "Task",
      "End": true,
      "Parameters": {
        "ClusterIdentifier": "redshiftcluster-abcd",
        "Database": "dev",
        "Sql": "call dbo.sp_load_fact_sales()",
        "SecretArn": "arn:aws:secretsmanager:us-west-2:xxx:secret:rs-cluster-secret-abcd"
      },
      "Resource": "arn:aws:states:::aws-sdk:redshiftdata:executeStatement"
    }
  }
}
  6. Choose Choose file and upload the ASL file to create a new state machine.
  7. For State machine name, enter a name for the state machine (for example, redshift-elt-load-customer-sales).
  8. Choose Create.

After the successful creation of the state machine, you can verify the details as shown in the following screenshot.

The following diagram illustrates the state machine workflow.

The state machine includes the following steps:

  • Load_Customer_Dim – Performs the following actions:
    • Passes the stored procedure sp_load_cust_dim to the execute-statement API to run in the Redshift cluster to load the incremental data for the customer dimension
    • Sends the identifier of the SQL statement back to the state machine
  • Wait on Load_Customer_Dim – Waits for 30 seconds
  • Check_Status_Load_Customer_Dim – Invokes the Data API’s describeStatement to get the status of the SQL statement
  • Choice – Routes the next step of the ETL workflow depending on the statement status:
    • FINISHED – Runs the Load_Sales_Fact step, which passes the stored procedure sp_load_fact_sales to the execute-statement API to run in the Redshift cluster; this loads the incremental data for fact sales and populates the corresponding keys from the customer dimension
    • All other statuses – Goes back to the Wait on Load_Customer_Dim step to wait for the SQL statement to finish

The state machine redshift-elt-load-customer-sales loads the dim_cust, fact_sales_stg, and fact_sales tables when invoked by the EventBridge rule.
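
The submit, wait, and check loop of the state machine can also be written as a short Python sketch. The function below expects a client object that exposes execute_statement and describe_statement, which are the method names of the boto3 redshift-data client; the client is passed in so that the sketch itself has no AWS dependency. The function name and its parameters are hypothetical. Unlike the state machine above, which returns to the wait step for every status other than FINISHED, this sketch stops with an error when the statement reports FAILED or ABORTED.

```python
import time

TERMINAL_FAILURES = {"FAILED", "ABORTED"}


def run_statement_and_wait(client, sql, cluster_id, database, secret_arn,
                           poll_seconds=30, sleep=time.sleep):
    """Submit a SQL statement through the Data API and poll until it ends.

    Returns the statement identifier once the status is FINISHED.
    """
    response = client.execute_statement(
        ClusterIdentifier=cluster_id,
        Database=database,
        SecretArn=secret_arn,
        Sql=sql,
    )
    statement_id = response["Id"]

    while True:
        # Equivalent of the Wait state before each status check.
        sleep(poll_seconds)
        description = client.describe_statement(Id=statement_id)
        status = description["Status"]
        if status == "FINISHED":
            return statement_id
        if status in TERMINAL_FAILURES:
            reason = description.get("Error", "no error message returned")
            raise RuntimeError(f"Statement {statement_id} {status}: {reason}")
        # Any other status (for example, still running): keep polling.
```

With boto3 installed and credentials configured, the client could be created with boto3.client("redshift-data") and the function called once per stored procedure, first for sp_load_cust_dim and then for sp_load_fact_sales.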

As an optional step, you can set up event-based notifications on completion of the state machine to invoke any downstream actions, such as Amazon Simple Notification Service (Amazon SNS) or further ETL processes.

Create an EventBridge rule

EventBridge sends event notifications to the Step Functions state machine when the full load is complete. You can also turn event notifications on or off in EventBridge.

Complete the following steps to create the EventBridge rule:

  1. On the EventBridge console, in the navigation pane, choose Rules.
  2. Choose Create rule.
  3. For Name, enter a name (for example, dms-test).
  4. Optionally, enter a description for the rule.
  5. For Event bus, choose the event bus to associate with this rule. If you want this rule to match events that come from your account, select AWS default event bus. When an AWS service in your account emits an event, it always goes to your account’s default event bus.
  6. For Rule type, choose Rule with an event pattern.
  7. Choose Next.
  8. For Event source, choose AWS events or EventBridge partner events.
  9. For Method, select Use pattern form.
  10. For Event source, choose AWS services.
  11. For AWS service, choose Database Migration Service.
  12. For Event type, choose All Events.
  13. For Event pattern, enter the following JSON expression, which looks for the REPLICATION_TASK_STOPPED status for the AWS DMS task:
{
  "source": ["aws.dms"],
  "detail": {
    "eventId": ["DMS-EVENT-0079"],
    "eventType": ["REPLICATION_TASK_STOPPED"],
    "detailMessage": ["Stop Reason FULL_LOAD_ONLY_FINISHED"],
    "type": ["REPLICATION_TASK"],
    "category": ["StateChange"]
  }
}

  14. For Target type, choose AWS service.
  15. For AWS service, choose Step Functions state machine.
  16. For State machine name, enter redshift-elt-load-customer-sales.
  17. Choose Create rule.

The following screenshot shows the details of the rule created for this post.
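
To build intuition for how the event pattern in the rule selects events, the following Python sketch implements a simplified matcher. It covers only exact-value matching on scalar fields and nested objects, which is all this pattern uses; EventBridge supports many more operators that are not modeled here. The function name and the sample event are illustrative, and a real AWS DMS event contains additional fields.

```python
def matches(pattern, event):
    """Return True if every field in the pattern is present in the event
    and the event's value is one of the values listed in the pattern."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the nested event object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


PATTERN = {
    "source": ["aws.dms"],
    "detail": {
        "eventId": ["DMS-EVENT-0079"],
        "eventType": ["REPLICATION_TASK_STOPPED"],
        "detailMessage": ["Stop Reason FULL_LOAD_ONLY_FINISHED"],
        "type": ["REPLICATION_TASK"],
        "category": ["StateChange"],
    },
}

if __name__ == "__main__":
    # Illustrative event; fields not named in the pattern are ignored.
    sample_event = {
        "source": "aws.dms",
        "region": "us-west-2",
        "detail": {
            "eventId": "DMS-EVENT-0079",
            "eventType": "REPLICATION_TASK_STOPPED",
            "detailMessage": "Stop Reason FULL_LOAD_ONLY_FINISHED",
            "type": "REPLICATION_TASK",
            "category": "StateChange",
        },
    }
    print(matches(PATTERN, sample_event))
```

Because detailMessage must equal the full-load stop reason, a task that stops for any other reason does not invoke the state machine.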

Test the solution

Run the task and wait for the workload to complete. This workflow moves the full volume data from the source database to the Redshift cluster.

The following screenshot shows the load statistics for the customer table full load.

AWS DMS provides notifications when an AWS DMS event occurs, for example the completion of a full load or if a replication task has stopped.

After the full load is complete, AWS DMS sends events to the default event bus for your account. The following screenshot shows an example of invoking the target Step Functions state machine using the rule you created.

We configured the Step Functions state machine as a target in EventBridge. This enables EventBridge to invoke the Step Functions workflow in response to the completion of an AWS DMS full load task.

Validate the state machine orchestration

When the entire customer sales data pipeline is complete, you may go through the entire event history for the Step Functions state machine, as shown in the following screenshots.

Limitations

The Data API and Step Functions AWS SDK integration offers a robust mechanism to build highly distributed ETL applications with minimal developer overhead. Consider the following limitations when using the Data API and Step Functions:

Clean up

To avoid incurring future charges, delete the Redshift cluster, AWS DMS full load task, AWS DMS replication instance, and Step Functions state machine that you created as part of this post.

Conclusion

In this post, we demonstrated how to build an ETL orchestration for full loads from operational data stores using the Redshift Data API, EventBridge, Step Functions with AWS SDK integration, and Redshift stored procedures.

To learn more about the Data API, see Using the Amazon Redshift Data API to interact with Amazon Redshift clusters and Using the Amazon Redshift Data API.


About the authors

Ritesh Kumar Sinha is an Analytics Specialist Solutions Architect based out of San Francisco. He has helped customers build scalable data warehousing and big data solutions for over 16 years. He loves to design and build efficient end-to-end solutions on AWS. In his spare time, he loves reading, walking, and doing yoga.

Praveen Kadipikonda is a Senior Analytics Specialist Solutions Architect at AWS based out of Dallas. He helps customers build efficient, performant, and scalable analytic solutions. He has worked with building databases and data warehouse solutions for over 15 years.

Jagadish Kumar (Jag) is a Senior Specialist Solutions Architect at AWS focused on Amazon OpenSearch Service. He is deeply passionate about Data Architecture and helps customers build analytics solutions at scale on AWS.

Serverless ICYMI Q2 2024

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/serverless-icymi-q2-2024/

Welcome to the 26th edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share all the most recent product launches, feature enhancements, blog posts, webinars, live streams, and other interesting things that you might have missed!

In case you missed our last ICYMI, check out what happened last quarter here.

Calendar

EDA Day – London 2024

The AWS Serverless DA team hosted the third Event-Driven Architecture (EDA) Day in London on May 14th. This event brought together prominent figures in the event-driven architecture community, AWS, and customer speakers.

EDA Day covered 13 sessions, 2 workshops, and a Q&A panel. David Boyne was the keynote speaker with a talk “Complexity is the Gotcha of Event-Driven Architecture”. There were AWS speakers including Matthew Meckes, Natasha Wright, Julian Wood, Gillian Armstrong, Josh Kahn, Veda Raman, and Uma Ramadoss. There was also an impressive lineup of guest speakers: Daniele Frasca, David Anderson, Ryan Cormack, Sarah Hamilton, Sheen Brisals, Marcin Sodkiewicz, and Ben Ellerby.

Videos are available on YouTube

EDA Day London

The future of Serverless

There has been a lot of talk about the future of serverless, with this year being the 10th anniversary of AWS Lambda. Eric Johnson addresses the topic in his ServerlessDays Milan keynote, “Now serverless is all grown up, what’s next”.

AWS Lambda

AWS launched support for the latest release of Ruby. The Ruby 3.3 runtime is based on the new Amazon Linux 2023 runtime and also provides access to the latest Ruby language features.

There is a new guide on how to retrieve data about Lambda functions that use a deprecated runtime.

Learn how to run code after returning a response from an AWS Lambda function. This post shows how to return a synchronous function response as soon as possible, yet also perform additional asynchronous work after you send the response. For example, you may store data in a database or send information to a logging system.

See how you can use the circuit-breaker pattern with Lambda extensions and Amazon DynamoDB. The circuit breaker pattern can help prevent cascading failures and improve overall system stability.

Circuit-breaker pattern
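To make the pattern concrete, here is a minimal in-memory sketch of a circuit breaker. It is a generic illustration under stated assumptions, not the implementation from the linked post, which uses Lambda extensions and DynamoDB to hold state. The class name, the thresholds, and the injectable `clock` parameter are all hypothetical.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Hypothetical in-memory breaker; state is lost when the process ends."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                # Open: fail fast instead of calling the unhealthy dependency.
                raise CircuitOpenError("circuit is open; skipping call")
            # Timeout elapsed: half-open, so let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Failing fast while the circuit is open is what limits cascading failures: callers stop waiting on a dependency that is already known to be unhealthy.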

Lambda functions now scale up to 12X faster in the AWS GovCloud (US) Regions.

Powertools for AWS Lambda (Python) adds support for Agents for Amazon Bedrock.

The AWS SDK for JavaScript v2 enters maintenance mode on September 8, 2024 and reaches end-of-support on September 8, 2025.

Amazon CloudWatch Logs introduced Live Tail streaming CLI support.

Amazon ECS and AWS Fargate

You can now secure Amazon Elastic Container Service (Amazon ECS) workloads on AWS Fargate with customer managed keys (CMKs). Once you add your keys to AWS Key Management Service (AWS KMS), you can use these to encrypt the underlying ephemeral storage of an Amazon ECS task on AWS Fargate.

Windows containers on AWS Fargate now start faster, up to 42% for Windows Server 2022 Core. AWS has optimized the Windows Server AMIs, introduced EC2 fast launch with pre-provisioned snapshots, and reduced network latency.

Amazon ECS Service Connect is a networking capability to simplify service discovery, connectivity, and traffic observability for Amazon ECS. You can now proactively scale Amazon ECS services by using custom metrics.

ECS Service Connect custom metrics

AWS Step Functions

The AWS Step Functions TestState API allows you to test individual states independently and to integrate testing into your preferred development workflows. Learn how to accelerate workflow development to iterate faster.

Step Functions TestState API

Amazon EventBridge

Amazon EventBridge Pipes now supports event delivery through AWS PrivateLink. You can send events from an event source located in an Amazon Virtual Private Cloud (VPC) to a Pipes target without traversing the public internet.

Amazon Timestream for LiveAnalytics is now an EventBridge Pipes target. Timestream for LiveAnalytics is a fast, scalable, purpose-built time series database that makes it easy to store and analyze trillions of time series data points per day.

EventBridge has a new console dashboard which provides a centralized view of your resources, metrics, and quotas. The console has an improved Learn page and other console enhancements. When using the CloudFormation template export for Pipes, you can also generate the IAM role. There is a new Rules tab in the Event Bus detail page, and the monitoring tab in the Rule detail page now includes additional metrics.

EventBridge Scheduler has some new API request metrics for improved observability.

Generative AI

Amazon Bedrock is a fully managed Generative AI service that offers a choice of high-performing foundation models (FMs) from leading AI companies through a single API. Bedrock now supports new models, including Anthropic’s Claude 3.5, AI21 Labs’ Jamba-Instruct, and Amazon Titan Text Premier.

The new Bedrock Converse API provides a consistent way to invoke Amazon Bedrock models and simplifies multi-turn conversations. There is also a JavaScript tutorial to walk you through sending requests to the Converse API using the JavaScript SDK.

Amazon Q Developer is now generally available. Amazon Q Developer, part of the Amazon Q family, is a generative AI–powered assistant for software development. Amazon Q is available in the AWS Management Console and as an integrated development environment (IDE) extension for Visual Studio Code, Visual Studio, and JetBrains IDEs. Amazon Q Developer has knowledge of your AWS account resources and can help understand your costs.

Amazon Q list Lambda functions

You can use Amazon Q Developer to develop code features and transform code to upgrade Java applications. Amazon Q Developer also offers inline completions in the command line. For more information, see Reimagining software development with the Amazon Q Developer Agent.

Amazon Q code features

Knowledge Bases for Amazon Bedrock now lets you configure Guardrails and inference parameters, and offers observability logs.

Storage and data

Amazon S3 no longer charges for several HTTP error codes if initiated from outside your individual AWS account or AWS Organization.

You can automatically detect malware in new object uploads to S3 with Amazon GuardDuty.

Amazon Elastic File System (Amazon EFS) now supports up to 1.5 GiB/s of throughput per client, a 3x increase over the previous limit of 500 MiB/s.

Discover architectural patterns for real-time analytics using Amazon Kinesis Data Streams in part 1 and part 2 and see how to optimize write throughput.

Amazon API Gateway

Amazon API Gateway now allows you to increase the integration timeout beyond the prior limit of 29 seconds. You can raise the integration timeout for Regional and private REST APIs, but this might require a reduction in your account-level throttle quota limit. This launch can help with workloads that require longer timeouts, such as Generative AI use cases with Large Language Models (LLMs).

You can also now use Amazon Verified Permissions to secure API Gateway REST APIs when using an OpenID Connect (OIDC) compliant identity provider. You can now control access based on user attributes and group memberships, without writing code.

AWS AppSync

You can now invoke your AWS AppSync data sources in an event-driven manner. Previously, you could only invoke Lambda functions synchronously from AWS AppSync. AWS AppSync can now trigger Lambda functions in Event mode, asynchronously decoupling the API response from the Lambda invocation, which helps with long-running operations.

AWS AppSync now passes application request headers to Lambda custom authorizer functions. You can make authorization decisions based on the value of the authorization header, and the value of other headers that were sent with the request from the application client.

Learn best practices for AWS AppSync GraphQL APIs. See how to optimize the security, performance, coding standards, and deployment of your AWS AppSync API. AWS AppSync also has increased quotas and new metrics.

AWS Amplify

AWS Amplify Gen 2 is now generally available. This now provides a code-first developer experience for building full-stack apps using TypeScript. Amplify Gen 2 allows you to express app requirements like the data models, business logic, and authorization rules in TypeScript.

AWS Amplify Gen2

Amplify has a new experience for file storage. This post explores using Lambda to create serverless functions for Amplify using TypeScript. There are also new team environment workflows.

Serverless blog posts

April

May

June

Serverless container blog posts

April

May

June

Serverless Office Hours

April

May

June

Containers from the Couch

April

May

FooBar Serverless

April

February

June

Still looking for more?

The Serverless landing page has more information. The Lambda resources page contains case studies, webinars, whitepapers, customer stories, reference architectures, and even more Getting Started tutorials.

You can also follow the Serverless Developer Advocacy team on X (formerly Twitter) to see the latest news, follow conversations, and interact with the team.

And finally, visit the Serverless Land and Containers on AWS websites for all your serverless and serverless container needs.

Disaster recovery strategies for Amazon MWAA – Part 2

Post Syndicated from Chandan Rupakheti original https://aws.amazon.com/blogs/big-data/disaster-recovery-strategies-for-amazon-mwaa-part-2/

Amazon Managed Workflows for Apache Airflow (Amazon MWAA) is a fully managed orchestration service that makes it straightforward to run data processing workflows at scale. Amazon MWAA takes care of operating and scaling Apache Airflow so you can focus on developing workflows. However, although Amazon MWAA provides high availability within an AWS Region through features like Multi-AZ deployment of Airflow components, recovering from a Regional outage requires a multi-Region deployment.

In Part 1 of this series, we highlighted challenges for Amazon MWAA disaster recovery and discussed best practices to improve resiliency. In particular, we discussed two key strategies: backup and restore and warm standby. In this post, we dive deep into the implementation for both strategies and provide a deployable solution to realize the architectures in your own AWS account.

The solution for this post is hosted on GitHub. The README in the repository offers tutorials as well as further workflow details for both backup and restore and warm standby strategies.

Backup and restore architecture

The backup and restore strategy involves periodically backing up Amazon MWAA metadata to Amazon Simple Storage Service (Amazon S3) buckets in the primary Region. The backups are replicated to an S3 bucket in the secondary Region. In case of a failure in the primary Region, a new Amazon MWAA environment is created in the secondary Region and hydrated with the backed-up metadata to restore the workflows.

The project uses the AWS Cloud Development Kit (AWS CDK) and is set up like a standard Python project. Refer to the detailed deployment steps in the README file to deploy it in your own accounts.

The following diagram shows the architecture of the backup and restore strategy and its key components:

  • Primary Amazon MWAA environment – The environment in the primary Region hosts the workflows
  • Metadata backup bucket – The bucket in the primary Region stores periodic backups of Airflow metadata tables
  • Replicated backup bucket – The bucket in the secondary Region syncs metadata backups through Amazon S3 cross-Region replication
  • Secondary Amazon MWAA environment – This environment is created on-demand during recovery in the secondary Region
  • Backup workflow – This workflow periodically backs up Airflow metadata to the S3 buckets in the primary Region
  • Recovery workflow – This workflow monitors the primary Amazon MWAA environment and initiates failover when needed in the secondary Region

 

Figure 1: The backup restore architecture

There are essentially two workflows that work in conjunction to achieve the backup and restore functionality in this architecture. Let’s explore both workflows in detail and the steps as outlined in Figure 1.

Backup workflow

The backup workflow is responsible for periodically taking a backup of your Airflow metadata tables and storing them in the backup S3 bucket. The steps are as follows:

  • [1.a] You can deploy the provided solution from your continuous integration and delivery (CI/CD) pipeline. The pipeline includes a DAG deployed to the DAGs S3 bucket, which performs backup of your Airflow metadata. This is the bucket where you host all of your DAGs for your environment.
  • [1.b] The solution enables cross-Region replication of the DAGs bucket. Any new changes to the primary Region bucket, including DAG files, plugins, and requirements.txt files, are replicated to the secondary Region DAGs bucket. However, for existing objects, a one-time replication needs to be performed using S3 Batch Replication.
  • [1.c] The DAG deployed to take metadata backup runs periodically. The metadata backup doesn’t include some of the auto-generated tables and the list of tables to be backed up is configurable. By default, the solution backs up variable, connection, slot pool, log, job, DAG run, trigger, task instance, and task fail tables. The backup interval is also configurable and should be based on the Recovery Point Objective (RPO), which is the data loss time during a failure that can be sustained by your business.
  • [1.d] Similar to the DAGs bucket, the backup bucket is also synced using cross-Region replication, through which the metadata backup becomes available in the secondary Region.
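The relationship between the backup interval and the RPO in step 1.c can be made concrete with a small calculation. The sketch below is illustrative only: the `BackupPlan` class, the replication-lag parameter, and the snake_case table identifiers are assumptions, not part of the solution in the repository.

```python
from dataclasses import dataclass
from datetime import timedelta

# Tables the post lists as backed up by default. The snake_case identifiers
# are assumed names for illustration, not taken from the solution's config.
DEFAULT_BACKUP_TABLES = (
    "variable", "connection", "slot_pool", "log", "job",
    "dag_run", "trigger", "task_instance", "task_fail",
)


@dataclass(frozen=True)
class BackupPlan:
    interval: timedelta          # how often the backup DAG runs
    replication_lag: timedelta   # assumed worst-case cross-Region replication delay
    tables: tuple = DEFAULT_BACKUP_TABLES

    def worst_case_data_loss(self) -> timedelta:
        # A failure just before the next backup reaches the secondary Region
        # loses one full interval of changes plus the replication delay.
        return self.interval + self.replication_lag

    def meets_rpo(self, rpo: timedelta) -> bool:
        return self.worst_case_data_loss() <= rpo


plan = BackupPlan(interval=timedelta(minutes=30), replication_lag=timedelta(minutes=15))
print(plan.worst_case_data_loss())           # 0:45:00
print(plan.meets_rpo(timedelta(hours=1)))    # True
print(plan.meets_rpo(timedelta(minutes=30))) # False
```

With these example numbers, a 30-minute backup interval does not satisfy a 30-minute RPO once replication delay is counted, so the interval would need to be shorter than the RPO itself.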

Recovery workflow

The recovery workflow runs periodically in the secondary Region monitoring the primary Amazon MWAA environment. It has two functions:

  • Store the environment configuration of the primary Amazon MWAA environment in the secondary backup bucket, which is used to recreate an identical Amazon MWAA environment in the secondary Region during failure
  • Perform the failover when a failure is detected

The following are the steps for when the primary Amazon MWAA environment is healthy (see Figure 1):

  • [2.a] The Amazon EventBridge scheduler starts the AWS Step Functions workflow on a provided schedule.
  • [2.b] The workflow, using AWS Lambda, checks Amazon CloudWatch in the primary Region for the SchedulerHeartbeat metrics of the primary Amazon MWAA environment. The environment in the primary Region sends heartbeats to CloudWatch every 5 seconds by default. However, to not invoke a recovery workflow spuriously, we use a default aggregation period of 5 minutes to check the heartbeat metrics. Therefore, it can take up to 5 minutes to detect a primary environment failure.
  • [2.c] Assuming that the heartbeat was detected in 2.b, the workflow makes the cross-Region GetEnvironment call to the primary Amazon MWAA environment.
  • [2.d] The response from the GetEnvironment call is stored in the secondary backup S3 bucket to be used in case of a failure in the subsequent iterations of the workflow. This makes sure the latest configuration of your primary environment is used to recreate a new environment in the secondary Region. The workflow completes successfully after storing the configuration.
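The health check in step 2.b can be sketched as a small decision function. This is a simplified illustration under stated assumptions, not the Lambda function from the solution: the function names and the `min_heartbeats` threshold are hypothetical, and the input is assumed to be the Sum statistic of each SchedulerHeartbeat datapoint that CloudWatch returned for the aggregation window.

```python
from typing import Sequence

HEARTBEAT_INTERVAL_SECONDS = 5        # default heartbeat cadence described in the post
AGGREGATION_PERIOD_SECONDS = 5 * 60   # default aggregation window described in the post


def expected_heartbeats(period_seconds: int = AGGREGATION_PERIOD_SECONDS,
                        interval_seconds: int = HEARTBEAT_INTERVAL_SECONDS) -> int:
    """Heartbeats a healthy scheduler would emit in one aggregation window."""
    return period_seconds // interval_seconds


def is_healthy(heartbeat_sums: Sequence[float], min_heartbeats: int = 1) -> bool:
    """Treat the environment as healthy if enough heartbeats were seen.

    An empty sequence (no datapoints at all) counts as unhealthy.
    """
    return sum(heartbeat_sums) >= min_heartbeats


def next_step(heartbeat_sums: Sequence[float]) -> str:
    """Name of the branch the recovery workflow would take (hypothetical labels)."""
    return "store_environment_config" if is_healthy(heartbeat_sums) else "start_failover"


print(expected_heartbeats())   # 60
print(next_step([58.0]))       # store_environment_config
print(next_step([]))           # start_failover
```

Checking a 5-minute window rather than individual heartbeats tolerates a few missed beats, which is the trade-off the post describes: fewer spurious failovers in exchange for up to 5 minutes of detection delay.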

The following are the steps for the case when the primary environment is unhealthy (see Figure 1):

  • [2.a] The EventBridge scheduler starts the Step Functions workflow on a provided schedule.
  • [2.b] The workflow, using Lambda, checks CloudWatch in the primary Region for the scheduler heartbeat metrics and detects failure. The scheduler heartbeat check using the CloudWatch API is the recommended approach to detect failure. However, you can implement a custom strategy for failure detection in the Lambda function, such as deploying a DAG to periodically send custom metrics to CloudWatch or other data stores as heartbeats and using the function to check those metrics. With the current CloudWatch-based strategy, the unavailability of the CloudWatch API may spuriously invoke the recovery flow.
  • [2.c] Skipped
  • [2.d] The workflow reads the previously stored environment details from the backup S3 bucket.
  • [2.e] The environment details read in the previous step are used to recreate an identical environment in the secondary Region using the CreateEnvironment API call. The API also needs other secondary Region specific configurations such as VPC, subnets, and security groups that are read from the user-supplied configuration file or environment variables during the solution deployment. The workflow waits in a polling loop until the environment becomes available and then invokes the DAG to restore metadata from the backup S3 bucket. This DAG is deployed to the DAGs S3 bucket as a part of the solution deployment.
  • [2.f] The DAG for restoring metadata completes hydrating the newly created environment and notifies the Step Functions workflow of completion using the task token integration. The new environment now starts running the active workflows and the recovery completes successfully.
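Step 2.e combines the stored primary configuration with settings specific to the secondary Region. The following sketch shows one way that merge could look. It is an assumption-laden illustration rather than the solution's code: the `build_create_params` helper and the keys of the `secondary` dictionary are hypothetical, and the set of response fields to drop is an assumption that is likely incomplete for a real GetEnvironment response.

```python
import copy

# Fields that describe runtime state rather than desired configuration.
# This list is an assumption for illustration and may be incomplete.
READ_ONLY_FIELDS = {"Arn", "CreatedAt", "LastUpdate", "Status", "WebserverUrl", "ServiceRoleArn"}


def build_create_params(stored_env: dict, secondary: dict) -> dict:
    """Derive CreateEnvironment-style parameters from a stored environment.

    stored_env: the "Environment" object saved from an earlier GetEnvironment call.
    secondary:  hypothetical user-supplied settings for the secondary Region.
    """
    params = {
        key: copy.deepcopy(value)
        for key, value in stored_env.items()
        if key not in READ_ONLY_FIELDS
    }
    # Networking and the DAGs bucket are Region specific, so they always
    # come from the secondary Region settings, never from the primary.
    params["NetworkConfiguration"] = {
        "SubnetIds": list(secondary["subnet_ids"]),
        "SecurityGroupIds": list(secondary["security_group_ids"]),
    }
    params["SourceBucketArn"] = secondary["dags_bucket_arn"]
    return params
```

Keeping the Region-specific overrides in one place makes it easier to see which parts of the recreated environment intentionally differ from the primary.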

Considerations

Consider the following when using the backup and restore method:

  • Recovery Time Objective – From failure detection to workflows running in the secondary Region, failover can take over 30 minutes. This includes new environment creation, Airflow startup, and metadata restore.
  • Cost – This strategy avoids the overhead of running a passive environment in the secondary Region. Costs are limited to periodic backup storage, cross-Region data transfer charges, and minimal compute for the recovery workflow.
  • Data loss – The RPO depends on the backup frequency. There is a design trade-off to consider here. Although shorter intervals between backups can minimize potential data loss, too frequent backups can adversely affect the performance of the metadata database and consequently the primary Airflow environment. Also, the solution can’t recover an actively running workflow midway. All active workflows are started fresh in the secondary Region based on the provided schedule.
  • Ongoing management – The Amazon MWAA environment and dependencies are automatically kept in sync across Regions in this architecture. As specified in Step 1.b of the backup workflow, the DAGs S3 bucket needs a one-time replication of the existing objects for the solution to work.

Warm standby architecture

The warm standby strategy involves deploying identical Amazon MWAA environments in two Regions. Periodic metadata backups from the primary Region are used to rehydrate the standby environment in case of failover.

The project uses the AWS CDK and is set up like a standard Python project. Refer to the detailed deployment steps in the README file to deploy it in your own accounts.

The following diagram shows the architecture of the warm standby strategy and its key components:

  • Primary Amazon MWAA environment – The environment in the primary Region hosts the workflows during normal operation
  • Secondary Amazon MWAA environment – The environment in the secondary Region acts as a warm standby ready to take over at any time
  • Metadata backup bucket – The bucket in the primary Region stores periodic backups of Airflow metadata tables
  • Replicated backup bucket – The bucket in the secondary Region syncs metadata backups through Amazon S3 cross-Region replication
  • Backup workflow – This workflow periodically backs up Airflow metadata to the S3 buckets in both Regions
  • Recovery workflow – This workflow monitors the primary environment and initiates failover to the secondary environment when needed

 

Figure 2: The warm standby architecture

Similar to the backup and restore strategy, the backup workflow (Steps 1a–1d) periodically backs up critical Amazon MWAA metadata to S3 buckets in the primary Region, which are synced to the secondary Region.

The recovery workflow runs periodically in the secondary Region monitoring the primary environment. On failure detection, it initiates the failover procedure. The steps are as follows (see Figure 2):

  • [2.a] The EventBridge scheduler starts the Step Functions workflow on a provided schedule.
  • [2.b] The workflow checks CloudWatch in the primary Region for the scheduler heartbeat metrics and detects failure. If the primary environment is healthy, the workflow completes without further actions.
  • [2.c] The workflow invokes the DAG to restore metadata from the backup S3 bucket.
  • [2.d] The DAG for restoring metadata completes hydrating the passive environment and notifies the Step Functions workflow of completion using the task token integration. The passive environment starts running the active workflows on the provided schedules.

Because the secondary environment is already warmed up, the failover is faster with recovery times in minutes.
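The task token handshake in step 2.d can be sketched as follows. This is a minimal illustration, not the restore DAG from the solution: the two helper functions and the shape of the summary payload are hypothetical, and `sfn_client` is assumed to behave like the client returned by `boto3.client("stepfunctions")`, whose `send_task_success` and `send_task_failure` calls report the outcome of a task that is waiting on a token.

```python
import json


def notify_restore_complete(sfn_client, task_token: str, summary: dict) -> None:
    """Tell the waiting Step Functions task that the metadata restore succeeded.

    The output must be a JSON string; `summary` is a hypothetical payload.
    """
    sfn_client.send_task_success(taskToken=task_token, output=json.dumps(summary))


def notify_restore_failed(sfn_client, task_token: str, error: str, cause: str) -> None:
    """Tell the waiting Step Functions task that the metadata restore failed."""
    sfn_client.send_task_failure(taskToken=task_token, error=error, cause=cause)
```

Passing the client in as a parameter keeps the helpers easy to exercise with a stub. In the restore DAG, the final task would call one of them with the token that the workflow supplied when it invoked the DAG.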

Considerations

Consider the following when using the warm standby method:

  • Recovery Time Objective – With a warm standby ready, the RTO can be as low as 5 minutes. This includes just the metadata restore and reenabling DAGs in the secondary Region.
  • Cost – This strategy has an added cost of running similar environments in two Regions at all times. With auto scaling for workers, the warm instance can maintain a minimal footprint; however, the web server and scheduler components of Amazon MWAA will remain active in the secondary environment at all times. The trade-off is significantly lower RTO.
  • Data loss – Similar to the backup and restore model, the RPO depends on the backup frequency. Faster backup cycles minimize potential data loss but can adversely affect performance of the metadata database and consequently the primary Airflow environment.
  • Ongoing management – This approach comes with some management overhead. Unlike the backup and restore strategy, any changes to the primary environment configurations need to be manually reapplied to the secondary environment to keep the two environments in sync. Automated synchronization of the secondary environment configurations is left as future work.

Shared considerations

Although the backup and restore and warm standby strategies differ in their implementation, they share some common considerations:

  • Periodically test failover to validate recovery procedures, RTO, and RPO.
  • Enable Amazon MWAA environment logging to help debug issues during failover.
  • Use the AWS CDK or AWS CloudFormation to manage the infrastructure definition. For more details, see the following GitHub repo or Quick start tutorial for Amazon Managed Workflows for Apache Airflow, respectively.
  • Automate deployments of environment configurations and disaster recovery workflows through CI/CD pipelines.
  • Monitor key CloudWatch metrics like SchedulerHeartbeat to detect primary environment failures.

Conclusion

In this series, we discussed how backup and restore and warm standby strategies offer configurable data protection based on your RTO, RPO, and cost requirements. Both use periodic metadata replication and restoration to minimize the area of effect of Regional outages.

Which strategy resonates more with your use case? Feel free to try out our solution and share any feedback or questions in the comments section!


About the Authors

Chandan Rupakheti is a Senior Solutions Architect at AWS. His main focus at AWS lies in the intersection of Analytics, Serverless, and AdTech services. He is a passionate technical leader, researcher, and mentor with a knack for building innovative solutions in the cloud. Outside of his professional life, he loves spending time with his family and friends, as well as listening to and playing music.

Parnab Basak is a Senior Solutions Architect and a Serverless Specialist at AWS. He specializes in creating new solutions that are cloud native using modern software development practices like serverless, DevOps, and analytics. Parnab works closely in the analytics and integration services space helping customers adopt AWS services for their workflow orchestration needs.

AWS Weekly Roundup – Application Load Balancer IPv6, Amazon S3 pricing update, Amazon EC2 Flex instances, and more (May 20, 2024)

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-application-load-balancer-ipv6-amazon-s3-pricing-update-amazon-ec2-flex-instances-and-more-may-20-2024/

AWS Summit season is in full swing around the world, with last week’s events in Bengaluru, Berlin, and Seoul, where my blog colleague Channy delivered one of the keynotes.

AWS Summit Seoul Keynote

Last week’s launches
Here are some launches that got my attention:

Amazon S3 will no longer charge for several HTTP error codes – A customer reported how he was charged for Amazon S3 API requests he didn’t initiate and which resulted in AccessDenied errors. The Amazon Simple Storage Service (Amazon S3) service team updated the service to no longer charge for such API requests. As always when talking about pricing, the exact wording is important, so please read the What’s New post for the details.

Introducing Amazon EC2 C7i-flex instances – These instances deliver up to 19 percent better price performance compared to C6i instances. Using C7i-flex instances is the easiest way for you to get price performance benefits for a majority of compute-intensive workloads. The new instances are powered by the 4th generation Intel Xeon Scalable custom processors (Sapphire Rapids) that are available only on AWS and offer 5 percent lower prices compared to C7i.

Application Load Balancer launches IPv6 only support for internet clients – Application Load Balancer now allows customers to provision load balancers without IPv4s for clients that can connect using just IPv6s. To connect, clients can resolve AAAA DNS records that are assigned to Application Load Balancer. The Application Load Balancer is still dual stack for communication between the load balancer and targets. With this new capability, you have the flexibility to use either IPv4s or IPv6s for your application targets while avoiding IPv4 charges for clients that don’t require it.

Amazon VPC Lattice now supports TLS Passthrough – We announced the general availability of TLS passthrough for Amazon VPC Lattice, which allows customers to enable end-to-end authentication and encryption using their existing TLS or mTLS implementations. Prior to this launch, VPC Lattice supported HTTP and HTTPS listener protocols only, which terminate TLS and perform request-level routing and load balancing based on information in HTTP headers.

Amazon DocumentDB zero-ETL integration with Amazon OpenSearch Service – This new integration provides you with advanced search capabilities, such as fuzzy search, cross-collection search and multilingual search, on your Amazon DocumentDB (with MongoDB compatibility) documents using the OpenSearch API. With a few clicks in the AWS Management Console, you can now synchronize your data from Amazon DocumentDB to Amazon OpenSearch Service, eliminating the need to write any custom code to extract, transform, and load the data.

Amazon EventBridge now supports customer managed keys (CMK) for event buses – This capability allows you to encrypt your events using your own keys instead of an AWS owned key (which is used by default). With support for CMK, you now have more fine-grained security control over your events, satisfying your company’s security requirements and governance policies.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS news
Here are some additional news items, open source projects, and Twitch shows that you might find interesting:

The Four Pillars of Managing Email Reputation – Dustin Taylor is the manager of anti-abuse and email deliverability for Amazon Simple Email Service (SES). He wrote a remarkable post exploring the Amazon SES approach to managing domain and IP reputation. Maintaining a high reputation ensures optimal recipient inboxing. His post outlines how Amazon SES protects its network reputation to help you deliver high-quality email consistently. A worthy read, even if you’re not sending email at scale. I learned a lot.

Build On Generative AI – Season 3 of your favorite weekly Twitch show about all things generative artificial intelligence (AI) is in full swing! Streaming every Monday, 9:00 AM US PT, my colleagues Tiffany and Darko discuss different aspects of generative AI and invite guest speakers to demo their work.

AWS open source news and updates – My colleague Ricardo writes this weekly open source newsletter, in which he highlights new open source projects, tools, and demos from the AWS Community.

Upcoming AWS events

AWS Summits – Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Hong Kong (May 22), Milan (May 23), Stockholm (June 4), and Madrid (June 5).

AWS re:Inforce – Explore 2.5 days of immersive cloud security learning in the age of generative AI at AWS re:Inforce, June 10–12 in Pennsylvania.

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Midwest | Columbus (June 13), Sri Lanka (June 27), Cameroon (July 13), Nigeria (August 24), and New York (August 28).

Browse all upcoming AWS led in-person and virtual events and developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— seb

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Governing and securing AWS PrivateLink service access at scale in multi-account environments

Post Syndicated from Anandprasanna Gaitonde original https://aws.amazon.com/blogs/security/governing-and-securing-aws-privatelink-service-access-at-scale-in-multi-account-environments/

Amazon Web Services (AWS) customers have been adopting AWS PrivateLink for secure communication with AWS services, their own internal services, and third-party services in the AWS Cloud. As these environments scale, the number of PrivateLink connections outbound to external services and inbound to internal services increases, and these connections are spread out across multiple accounts in virtual private clouds (VPCs). While AWS Identity and Access Management (IAM) policies allow you to control access to individual PrivateLink services, customers want centralized governance for the use of PrivateLink in adherence with organizational standards and security needs.

This post provides an approach for centralized governance for PrivateLink based services across your multi-account environment. It provides a way to create preventative controls through the use of service control policies (SCPs) and detective controls through event-driven automation. This allows your application teams to consume internal and external services while adhering to organization policies and provides a mechanism for centralized control as your AWS environment grows.

Scenarios faced by customers

Figure 1 shows an example customer environment comprising a multi-account structure created through AWS Organizations or using AWS Control Tower. There are separate organizational units (OUs) pertaining to different business units (BUs) with respective accounts. The business services’ account hosts several backend services that are utilized by consuming applications for their functionality. Since these services provide functionality to more than one internal application and will require access across VPC and account boundaries, these are exposed through AWS PrivateLink. One such service is shown in the business services account.

The customer has partners that provide services for integration with the customer’s application stack. The approved partner account provides a service that is approved for use by the cloud administration team. The NotApproved partner account provides services that are not approved within the customer’s organization. The customer has another OU dedicated to application teams. The application 1 account has an application that consumes the business service of the approved partner account. It is also planning to use the service from the NotApproved partner, which should be blocked. The application in the application 2 account is planning on using AWS services through interface endpoints as well as the approved partner account through PrivateLink integration.

Note: Throughout this post, “organization” is used to refer to an organization that you create and manage through AWS Organizations.

Figure 1: A multi-account customer environment

Current challenges

Access to individual PrivateLink connections can be controlled through IAM policies. At scale, however, different teams adopt PrivateLink for incoming and outgoing connections, and the number of VPC endpoint policies to create and manage increases. Customers therefore want centralized guardrails to manage PrivateLink resources as the environment grows. For our example, the customer would like to put the following controls in place:

Preventative controls:

Use case 1:

  • Allow creation of VPC endpoints and allow access only to PrivateLink enabled AWS services.
  • Allow creation of VPC endpoints and initiating connection only to approved PrivateLink enabled third-party services.
  • Allow creation of VPC endpoints and initiating connection only to internal business services owned by accounts in the same organization.

Use case 2:

  • Allow only a cloud admin role to add permissions to connect to an endpoint service to prevent connections from external clients to internal VPC endpoint services.

Detective controls:

Use case 3:

  • Detect if connections are made by external AWS accounts (not belonging to the customer’s organization) to PrivateLink services exposed for internal use by the customer’s AWS accounts.

Use case 4:

  • Detect if connections are made to PrivateLink services exposed by AWS accounts not belonging to the customer’s organization.

This post presents a solution that uses SCPs, AWS CloudTrail, and AWS Config to achieve governance. When the solution is deployed in your account, the following components are created as part of the architecture, as shown in Figure 2.

Figure 2: Resources deployed in the customer environment by the solution

The following architecture is now in place:

  • SCPs to provide preventative controls for the PrivateLink connections.
  • Amazon EventBridge rules that are configured to trigger based on events from API calls captured by CloudTrail in specified accounts within specified OUs.
  • EventBridge rules in member accounts to send events to the event bus in the Audit account, and a central EventBridge rule in that account to trigger an AWS Lambda function based on PrivateLink related API calls.
  • A Lambda function that receives the events and validates if the VPC endpoint API call is allowed for the PrivateLink service and notifies a cloud administrator if a policy is violated.
  • An AWS Config rule that checks if PrivateLink enabled VPC endpoint services created within your AWS accounts have enabled auto accept of client connections and disabled notifications.
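To make the EventBridge piece more concrete, the following is a minimal sketch of what an event pattern for PrivateLink-related API calls could look like, together with a simplified matcher. The pattern is an assumption for illustration and is not taken from the solution's template; the matcher handles only exact-value lists and nested objects, not the full EventBridge pattern syntax.

```python
# Hypothetical event pattern for the central rule. The event names are the EC2
# API actions that CloudTrail records for PrivateLink endpoint activity.
PRIVATELINK_PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["ec2.amazonaws.com"],
        "eventName": ["CreateVpcEndpoint", "AcceptVpcEndpointConnections"],
    },
}


def matches(pattern, event):
    """Return True if every field named in the pattern matches the event.

    Simplified: supports exact-value lists and nested objects only.
    """
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            # Nested pattern: recurse into the corresponding event object.
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


# Trimmed-down sample event; real CloudTrail events carry many more fields.
sample_event = {
    "source": "aws.ec2",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {"eventSource": "ec2.amazonaws.com", "eventName": "CreateVpcEndpoint"},
}

print(matches(PRIVATELINK_PATTERN, sample_event))
```

A pattern of this shape keeps the rule narrow, so the Lambda function is invoked only for the API calls that matter for PrivateLink governance.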

Use cases and solution approach

This section walks through each use case and how the solution components are used to address each use case.

Preventative control

Use case 1: Allowing the creation of a VPC endpoint connection to only AWS services and approved internal and third-party PrivateLink services

This solution allows creating a VPC endpoint for only approved partner PrivateLink services, PrivateLink services internal to the organization, and AWS services. This is implemented using an SCP and can be enforced at the individual account or OU level. The approved partner services, as well as the internal accounts that can host allowed PrivateLink services, can be specified during the solution deployment. Application teams operating in AWS accounts within the customer environment can then create VPC endpoints to PrivateLink services of approved partners or AWS services. However, they will not be able to create a VPC endpoint to an unapproved PrivateLink service. This is shown in Figure 3.

Figure 3: Allowed and disallowed paths in PrivateLink connections by SCP

The SCP that implements this preventative control is shown in the following code snippet. In this example SCP, AllowedPrivateLinkPartnerService-ServiceName refers to the service name of the allowed partner PrivateLink service. The SCP also allows the creation of VPC endpoints to internal PrivateLink services that are hosted in AllowedPrivateLinkAccount. Make sure that this SCP does not interfere with other policies you have created within your organization. The solution currently uses the ec2:VpceServiceName and ec2:VpceServiceOwner condition keys to identify the PrivateLink service of AWS services or a third-party partner. These condition keys can be used in an SCP to control the creation of VPC endpoints:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Condition": {
        "StringNotEquals": {
          "ec2:VpceServiceName": [
            "AllowedPrivateLinkPartnerService-ServiceName"
          ],
          "ec2:VpceServiceOwner": [
            "AllowedPrivateLinkAccount",
            "amazon"
          ]
        }
      },
      "Action": [
        "ec2:CreateVpcEndpoint"
      ],
      "Resource": "arn:aws:ec2:*:*:vpc-endpoint/*",
      "Effect": "Deny",
      "Sid": "SCPDenyPrivateLink"
    }
  ]
}
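Because both condition keys sit under a single StringNotEquals operator, the Deny takes effect only when neither the service name nor the service owner is on the allowed list. The sketch below models that logic in plain Python to show which requests the policy would block. It is a simplified illustration, not IAM's policy evaluation engine, and the service names and account IDs are placeholders.

```python
def scp_denies_endpoint_creation(service_name, service_owner,
                                 allowed_service_names, allowed_owners):
    """Model the Deny condition of the SCP.

    The two keys are combined with AND, and each key is true only when the
    request value matches none of the listed values. The request is therefore
    denied only if the service name is not approved AND the owner is not approved.
    """
    name_not_allowed = service_name not in allowed_service_names
    owner_not_allowed = service_owner not in allowed_owners
    return name_not_allowed and owner_not_allowed


# Placeholder values standing in for the solution's parameters.
ALLOWED_NAMES = {"com.amazonaws.vpce.us-east-1.vpce-svc-0approved"}
ALLOWED_OWNERS = {"111122223333", "amazon"}

requests = {
    "AWS service": ("com.amazonaws.us-east-1.kms", "amazon"),
    "internal service": ("com.amazonaws.vpce.us-east-1.vpce-svc-0internal", "111122223333"),
    "approved partner": ("com.amazonaws.vpce.us-east-1.vpce-svc-0approved", "999999999999"),
    "unapproved partner": ("com.amazonaws.vpce.us-east-1.vpce-svc-0unknown", "888888888888"),
}

for label, (name, owner) in requests.items():
    denied = scp_denies_endpoint_creation(name, owner, ALLOWED_NAMES, ALLOWED_OWNERS)
    print(f"{label}: {'denied' if denied else 'allowed'}")
```

Only the unapproved partner request satisfies both negated tests, so it is the only one the Deny statement blocks.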

Use case 2: Allow only a cloud admin role to add permissions to connect to an endpoint service

This solution makes sure that PrivateLink services that are owned and created in AWS accounts of the customer cannot be connected to consumers unless it is allowed by the cloud administrator role. The cloud administrator can then make sure that only legitimate internal AWS accounts are allowed access to that service and restrict access from other accounts outside of the customer’s organization. This is achieved through the use of a service control policy that will restrict modifications of permissions of the PrivateLink endpoint service. This makes sure that individual teams are not able to use the Allow principals configuration to open access to other entities directly, and only a cloud administrator role with the right permissions can make that change.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Statement1",
      "Effect": "Deny",
      "Action": [
        "ec2:ModifyVpcEndpointServicePermissions"
      ],
      "Resource": [
        "*"
      ],
      "Condition": {
        "ArnNotLike": {
          "aws:PrincipalArn": [
            "arn:aws:iam::*:role/CloudNetworkAdmin"
          ]
        }
      }
    }
  ]
}

This policy helps achieve the access control shown in Figure 4. The cloud administrator uses the Allow principals configuration of the business services PrivateLink service to provide access only to the application 1 account. The SCP allows only the cloud administrator to make the modification and prevents other team members from bypassing that process and granting a nonapproved client application account access to the internal PrivateLink service.

Figure 4: Centralized control on access to the internal PrivateLink service to the customer’s own accounts
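One detail worth checking when adapting this policy is how the principal ARN is compared. The role ARN uses * in place of the account ID, and IAM expands that wildcard only for wildcard-aware operators such as ArnLike and ArnNotLike; exact string operators treat it as a literal character. The sketch below illustrates the difference using Python's fnmatch as a rough stand-in for IAM's wildcard matching. The account ID and the second role name are placeholders.

```python
from fnmatch import fnmatchcase

ADMIN_ROLE_PATTERN = "arn:aws:iam::*:role/CloudNetworkAdmin"


def is_cloud_admin(principal_arn, pattern=ADMIN_ROLE_PATTERN):
    """Wildcard-aware comparison, approximating IAM's ArnLike behavior."""
    return fnmatchcase(principal_arn, pattern)


def deny_modify_permissions(principal_arn):
    """The Deny applies to every principal that is not the admin role."""
    return not is_cloud_admin(principal_arn)


admin = "arn:aws:iam::111122223333:role/CloudNetworkAdmin"
developer = "arn:aws:iam::111122223333:role/AppTeamDeveloper"

print(deny_modify_permissions(admin))      # admin role is exempt from the Deny
print(deny_modify_permissions(developer))  # any other role is denied
print(admin == ADMIN_ROLE_PATTERN)         # exact comparison never matches the wildcard
```

With an exact comparison, no real ARN equals the pattern, so the Deny would also apply to the cloud administrator role.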

Detective controls

For detective controls, we discuss two use cases that are deployed as part of the solution and can be enabled and disabled based on the test that you want to perform.

Use case 3: Detecting if connections are made by external AWS accounts (not belonging to the customer’s organization) to PrivateLink services exposed by the customer’s AWS accounts

In this use case, the customer would like to detect if connections are made to their business services from accounts outside of its organization. The solution uses individual member account trails for capturing API calls across the multi-account structure and cross-account EventBridge integration. When a PrivateLink service connection is accepted, CloudTrail in the member account records the AcceptVpcEndpointConnections API call, and the event is sent to the event bus in the audit account. This triggers a Lambda function that captures the information of the entity requesting the connection and the details of the PrivateLink service, and sends a notification to the cloud administrator. This is shown in Figure 5.

Figure 5: Detecting the creation of a VPC endpoint or accepting a PrivateLink service connection using CloudTrail events in EventBridge
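The core check that such a Lambda function performs can be sketched as follows. This is an illustration under stated assumptions rather than the solution's actual code: it assumes the function has already looked up the endpoint connections of the service (for example, through the EC2 DescribeVpcEndpointConnections API, which reports a VpcEndpointOwner for each connection) and the list of account IDs in the organization. The function names and sample IDs are hypothetical.

```python
def find_external_connections(connections, org_account_ids):
    """Return the endpoint connections whose owner is outside the organization.

    `connections` is a list of dicts with "VpcEndpointId" and "VpcEndpointOwner"
    keys, following the shape of DescribeVpcEndpointConnections results.
    """
    return [c for c in connections if c["VpcEndpointOwner"] not in org_account_ids]


def build_notification(service_id, external):
    """Format a plain-text message for the cloud administrator."""
    lines = [f"External connections detected for {service_id}:"]
    lines += [
        f"- {c['VpcEndpointId']} owned by account {c['VpcEndpointOwner']}"
        for c in external
    ]
    return "\n".join(lines)


# Placeholder data: one internal consumer and one external consumer.
org_accounts = {"111122223333", "444455556666"}
connections = [
    {"VpcEndpointId": "vpce-0internal", "VpcEndpointOwner": "444455556666"},
    {"VpcEndpointId": "vpce-0external", "VpcEndpointOwner": "999999999999"},
]

external = find_external_connections(connections, org_accounts)
if external:
    print(build_notification("vpce-svc-0example", external))
```

In a deployed function, the message would be published to the notification channel that the cloud administrator subscribes to instead of being printed.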

Custom AWS Config rule for detective control

This detective control mechanism works in cases where PrivateLink services are configured to manually accept client connections. If the endpoint service is configured to automatically accept connections, CloudTrail will not generate an event when a connection is accepted. AWS PrivateLink allows customers to configure connection notifications that are sent to an Amazon Simple Notification Service (Amazon SNS) topic. Cloud administrators receive the notifications if they are subscribed to the SNS topic. However, if the notification configuration is removed by the member account, the cloud administrator has no visibility into new connections and cannot effectively apply governance requirements.

This solution employs an AWS Config rule to detect if PrivateLink services are created with the Auto Accept Connections setting enabled or without a connection notification configuration and flag it as noncompliant.

This is depicted in Figure 6.

Figure 6: Custom AWS Config rule and SNS notification deployed as part of the solution

When a PrivateLink service is created by one of the business services teams, an AWS Config organization rule in the audit account will detect the event, and the custom Lambda function will check if the connection notification configuration is present. If not, then the AWS Config rule will flag the resource as noncompliant. Cloud administrators can view these in the AWS Config dashboard or receive notifications configured through AWS Config.
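The evaluation logic of such a custom rule can be sketched as follows. This is a minimal illustration, not the solution's Lambda code: it assumes the service configuration and its connection notifications have already been retrieved (the field names follow the EC2 DescribeVpcEndpointServiceConfigurations and DescribeVpcEndpointConnectionNotifications responses), and the function name and sample IDs are hypothetical.

```python
def evaluate_endpoint_service(service_config, notifications):
    """Return an AWS Config compliance type for one VPC endpoint service.

    The service is noncompliant if it auto-accepts connections or if it has
    no enabled connection notification.
    """
    service_id = service_config["ServiceId"]
    acceptance_required = service_config.get("AcceptanceRequired", False)
    has_notification = any(
        n.get("ServiceId") == service_id
        and n.get("ConnectionNotificationState") == "Enabled"
        for n in notifications
    )
    if acceptance_required and has_notification:
        return "COMPLIANT"
    return "NON_COMPLIANT"


# Placeholder data for two endpoint services.
notifications = [
    {"ServiceId": "vpce-svc-0good", "ConnectionNotificationState": "Enabled"},
]
good = {"ServiceId": "vpce-svc-0good", "AcceptanceRequired": True}
auto_accept = {"ServiceId": "vpce-svc-0auto", "AcceptanceRequired": False}

print(evaluate_endpoint_service(good, notifications))
print(evaluate_endpoint_service(auto_accept, notifications))
```

Treating a missing notification and auto-accept as separate failure conditions means a service must satisfy both requirements before it is reported as compliant.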

Use case 4: Detecting if connections are made to PrivateLink services exposed by AWS accounts not belonging to the customer’s organization.

Using the same approach as presented in use case 3, connections made to PrivateLink services exposed by AWS accounts outside of the customer’s organization can be detected through the CloudTrail event for the CreateVpcEndpoint API call. This event is sent to the centralized event bus, and the Lambda function checks it against the criteria and notifies the cloud administrator.

Deploy and test the solution

This section walks through how to deploy and test our recommended solution.

Prerequisites

To deploy the solution, first follow these steps.

  1. In your AWS Organizations multi-account environment, go to the management account and enable trusted access for AWS CloudFormation, enable trusted access for AWS Config, and enable trusted access for CloudTrail.
  2. Identify an account in your organization to serve as the audit account and set it up as a delegated administrator for CloudFormation, AWS Config, and CloudTrail. To do this:
    1. Register a delegated administrator for CloudFormation.
    2. Perform the steps mentioned in step 1 of this post to register a delegated administrator for AWS Config.
    3. Register a delegated admin for CloudTrail.
  3. The solution uses the deployment of CloudFormation StackSets with self-managed permissions to set up the resources in the audit account. In order to enable this, create AWSCloudFormationStackSetAdministrationRole in the management account and AWSCloudFormationStackSetExecutionRole in the audit account by using the steps in the topic Grant self-managed permissions.
  4. In a separate AWS account that is different than your multi-account environment, create two PrivateLink VPC endpoint services as explained in the documentation. You can use this template to create a test PrivateLink VPC endpoint service. These will serve as two partner services, one of which is allowed, and another is untrusted and not allowed. Make note of their service names.

Figure 7: Simulated partner services (approved and not approved) in a separate test account

Deploying the solution

  1. Go to the management account of your AWS Organizations multi-account environment and use this CloudFormation template to deploy the solution, or choose the following Launch Stack button:

    Launch stack

    CloudFormation stacks can be deployed using the AWS CloudFormation console or using the AWS CLI.

  2. This initially displays the Create stack page. Leave the details entered by default, and then choose Next.
  3. On the Specify stack details page, enter the details for the input parameters for this solution. The following table shows the details that you will provide when setting up the CloudFormation template on the Specify stack details page on the CloudFormation console.

    AWSOrganizationsId – Identifier for your organization. This can be obtained from your management account as described in the AWS Organizations User Guide.
    AdminRoleArn – Role of the persona who is allowed to modify PrivateLink endpoint permissions.
    AllowedPrivateLinkAccounts – AWS account IDs of accounts in your OU that host PrivateLink services.
    AllowedPrivateLinkPartnerServices – Service names of the approved PrivateLink services from partners. If you want to test with a simulated partner PrivateLink service, use the service name of the PrivateLink service created in Step 4 of the prerequisites to which connections should be allowed. The unique service name of the partner’s PrivateLink service is provided by the partner to the customer so that they can connect to it.
    AuditAccountId – AWS account ID of the audit account in your multi-account environment.
    PLOrganizationUnit – OU identifier for the organizational unit where the solution will perform preventative and detective control.
    Figure 8: CloudFormation template input parameters for the solution as it appears on the console

  4. Choose Next and keep the defaults for the rest of the fields. Then, on the Review and create page, choose Submit to finish deploying the solution.
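If you prefer to script the deployment, the input parameters can be assembled in the list-of-dictionaries shape that CloudFormation APIs and SDKs expect for stack parameters. The following is a small sketch under stated assumptions: the parameter names come from the table above, all values are placeholders, and the helper name is hypothetical. The call that actually creates the stack is omitted.

```python
REQUIRED_PARAMETERS = (
    "AWSOrganizationsId",
    "AdminRoleArn",
    "AllowedPrivateLinkAccounts",
    "AllowedPrivateLinkPartnerServices",
    "AuditAccountId",
    "PLOrganizationUnit",
)


def to_stack_parameters(values):
    """Convert a name-to-value mapping into CloudFormation's parameter list.

    Raises ValueError if any parameter required by the solution is missing.
    """
    missing = [name for name in REQUIRED_PARAMETERS if name not in values]
    if missing:
        raise ValueError(f"Missing parameters: {', '.join(missing)}")
    return [
        {"ParameterKey": name, "ParameterValue": values[name]}
        for name in REQUIRED_PARAMETERS
    ]


# Placeholder values only; replace them with the details of your environment.
parameters = to_stack_parameters({
    "AWSOrganizationsId": "o-exampleorgid",
    "AdminRoleArn": "arn:aws:iam::111122223333:role/CloudNetworkAdmin",
    "AllowedPrivateLinkAccounts": "111122223333",
    "AllowedPrivateLinkPartnerServices": "com.amazonaws.vpce.us-east-1.vpce-svc-0approved",
    "AuditAccountId": "444455556666",
    "PLOrganizationUnit": "ou-exampleouid",
})

for item in parameters:
    print(item["ParameterKey"], "=", item["ParameterValue"])
```

Validating the parameter set before deployment catches a missing value early, rather than partway through stack creation.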

Testing the solution

Once the solution is deployed successfully, follow these steps to test the solution:

  1. For an account specified in the AllowedPrivateLinkAccounts parameter, create a VPC endpoint service as explained in the topic Create a service powered by AWS PrivateLink. Instead of creating this manually, use this CloudFormation template to create a test VPC endpoint service.
  2. Sign in to a member account within the OU that you specified in the CloudFormation template.
  3. From the member account, create a VPC endpoint connection to the internal PrivateLink service created in the account from Step 1. This connection will set up successfully because it is internal to the organization and therefore allowed by the SCP policy, and is not flagged to the cloud administrator as violating organization policy.
  4. From the member account, create a VPC endpoint connection to an AWS service that supports PrivateLink, such as AWS Key Management Service (AWS KMS). This connection will set up successfully because it is to an AWS service and therefore allowed by the SCP, and it is not flagged to the cloud administrator as violating organization policy.
  5. From the member account, create a VPC endpoint connection to the approved partner PrivateLink service created in Step 4 of the prerequisites. This connection will set up successfully because it is to an approved partner service and therefore allowed by the SCP, and it is not flagged to the cloud administrator as violating organization policy.
  6. From the member account, create a VPC endpoint connection to the PrivateLink service created in Step 4 of the prerequisites and that is not an allowed partner service. This connection will fail because it is not allowed by the SCP policy.
  7. From an account outside of your organization, create a VPC endpoint connection to the internal PrivateLink service created in Step 1. The connection setup is successful, but the cloud administrator will see the internal PrivateLink service as NOT COMPLIANT because the connection from external clients is considered to be not compliant with organization requirements in this solution. This information allows the cloud admin to quickly find the noncompliant resource and work with the PrivateLink service owner team to remediate the issue.
  8. From the member account, create another VPC endpoint service without configuring the notification configuration, and leave the Acceptance required field unchecked. Navigate to the AWS Config console in the audit account and go to Aggregator->Rules. Check the evaluation of the rule starting with “OrgConfigRule-pl-governance-rule….” Once the evaluation is complete, it will indicate that this VPC endpoint service is NOT COMPLIANT, whereas the service created in Step 1 will show as COMPLIANT.

Considerations

  • The solution described here takes the approach of allowing all VPC endpoint connections from within a customer’s organization to the PrivateLink services in specified accounts and detecting and notifying all external ones. This can be modified based on your specific use cases and requirements.
  • The solution uses AWS Config rules that are applied to specific accounts of your organization, even though the solution is applied at an OU level. The AWS Config rules created in this solution are scoped to evaluate VPC endpoint services and will incur charges accordingly. Refer to the AWS Config pricing page to understand usage-based pricing for the service.
  • Other services, such as AWS Lambda and Amazon EventBridge, also incur usage-based charges. Verify that these resources are deleted when you no longer need them to prevent unnecessary charges.
  • SCPs only affect member accounts. They do not apply to the management account, so actions denied through an SCP in your multi-account environment will still be allowed in the management account.

Cleanup

You can delete the solution by following these steps to avoid unnecessary charges:

  • Delete the CloudFormation stack created as part of Step 4 of the prerequisites.
  • Delete the CloudFormation stack of the main solution deployed in the management account as part of the Deploying the solution section.
  • Delete the CloudFormation stack created as part of Step 1 of Testing the solution.

Summary

As customers adopt AWS PrivateLink throughout their environment, the mechanisms discussed in this post provide a way for administrators to govern and secure their PrivateLink services at scale. This approach can help you create a scalable solution where interconnections are aligned to the organization’s guidelines and security requirements. While this solution presents an approach to governance, customers can tailor this solution to their unique organizational requirements.

 

Author

Anandprasanna Gaitonde

Anand is a Principal Solutions Architect at AWS, responsible for helping customers design and operate Well-Architected solutions to help them adopt the AWS Cloud successfully. He focuses on AWS networking and serverless technologies to design and develop solutions in the cloud across industry verticals. He holds a master of engineering in computer science and a postgraduate degree in software enterprise management.

Siva Devabakthini

Siva is a Senior Solutions Architect at AWS who covers hyperscale customers in the AWS Digital Native Business segment. He focuses on AWS security, data analytics, and artificial intelligence and machine learning (AI/ML) technologies to design and develop solutions in the cloud. Outside of work, Siva loves traveling, trying different cuisines, and being outdoors with his family.

Emmanuel Isimah

Emmanuel is a Senior Solutions Architect at AWS who covers hyperscale customers in the enterprise retail space. He has a background in networking, security, and containers. Emmanuel helps customers build and secure innovative cloud solutions, solving their business problems by using data-driven approaches. Emmanuel’s areas of depth include security and compliance, containers, and networking.

AWS Weekly Roundup: Amazon Q, Amazon QuickSight, AWS CodeArtifact, Amazon Bedrock, and more (May 6, 2024)

Post Syndicated from Matheus Guimaraes original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-q-amazon-quicksight-aws-codeartifact-amazon-bedrock-and-more-may-6-2024/

April has been packed with new releases! Last week continued that trend with many new releases supporting a variety of domains such as security, analytics, DevOps, and many more, as well as more exciting new capabilities within generative AI.

If you missed the AWS Summit London 2024, you can now watch the sessions on demand, including the keynote by Tanuja Randery, VP & Marketing Director, EMEA, and many of the break-out sessions which will continue to be released over the coming weeks.

Last week’s launches
Here are some of the highlights that caught my attention this week:

Manual and automatic rollback from any stage in AWS CodePipeline – You can now roll back any stage, other than Source, to any previously known good state if you use a V2 pipeline in AWS CodePipeline. You can configure automatic rollback, which uses the source changes from the most recent successful pipeline execution in the case of failure, or you can initiate a manual rollback for any stage from the console, API, or SDK and choose which pipeline execution you want to use for the rollback.

AWS CodeArtifact now supports RubyGems – Ruby community, rejoice, you can now store your gems in AWS CodeArtifact! You can integrate it with RubyGems.org, and CodeArtifact will automatically fetch any gems requested by the client and store them locally in your CodeArtifact repository. That means that you can have a centralized place for both your first-party and public gems so developers can access their dependencies from a single source.

Create a repository in AWS CodeArtifact and choose “rubygems-store” to connect your repository to RubyGems.org on the “Public upstream repositories” dropdown.

Amazon EventBridge Pipes now supports event delivery through AWS PrivateLink – You can now deliver events to an Amazon EventBridge Pipes target without traversing the public internet by using AWS PrivateLink. You can poll for events in a private subnet in your Amazon Virtual Private Cloud (VPC) without having to deploy any additional infrastructure to keep your traffic private.

Amazon Bedrock launches continue. You can now run scalable, enterprise-grade generative AI workloads with Cohere Command R & R+. And Amazon Titan Text V2 is now optimized for improving Retrieval-Augmented Generation (RAG).

AWS Trusted Advisor – last year we launched Trusted Advisor APIs enabling you to programmatically consume recommendations. A new API is available now that you can use to exclude resources from recommendations.

Amazon EC2 – there have been two new great launches this week for EC2 users. You can now mark your AMIs as “protected” to avoid them being deregistered by accident. You can also now easily discover your active AMIs by simply describing them.

Amazon CodeCatalyst – you can now view your git commit history in the CodeCatalyst console.

General Availability
Many new services and capabilities became generally available this week.

Amazon Q in QuickSight – Amazon Q has brought generative BI to Amazon QuickSight, giving you the ability to build beautiful dashboards automatically by using natural language, and it’s now generally available. To get started, head to the QuickSight Pricing page to explore all options or start a 30-day free trial, which allows up to 4 users per QuickSight account to use all the new generative AI features.

With the new generative AI features enabled by Amazon Q in Amazon QuickSight you can use natural language queries to build, sort and filter dashboards. (source: AWS Documentation)

Amazon Q Business (GA) and Amazon Q Apps (Preview) – Also generally available now is Amazon Q Business which we launched last year at AWS re:Invent 2023 with the ability to connect seamlessly with over 40 popular enterprise systems, including Microsoft 365, Salesforce, Amazon Simple Storage Service (Amazon S3), Gmail, and so many more. This allows Amazon Q Business to know about your business so your employees can generate content, solve problems, and take actions that are specific to your business.

We have also launched support for custom plug-ins, so now you can create your own integrations with any third-party application.

With general availability of Amazon Q Business we have also launched the ability to create your own custom plugins to connect to any third-party API.

Another highlight of this release is the launch of Amazon Q Apps, which enables you to quickly generate an app from your conversation with Amazon Q Business, or by describing what you would like it to generate for you. All guardrails from Amazon Q Business apply, and it’s easy to share your apps with colleagues through an admin-managed library. Amazon Q Apps is in preview now.

Check out Channy Yun’s post for a deeper dive into Amazon Q Business and Amazon Q Apps, which guides you through these new features.

Amazon Q Developer – you can use Q Developer to completely change your developer flow. It has all the capabilities of what was previously known as Amazon CodeWhisperer, such as Q&A, diagnosing common errors, generating code including tests, and many more. Now it has expanded, so you can use it to generate SQL, and build data integration pipelines using natural language. In preview, it can describe resources in your AWS account and help you retrieve and analyze cost data from AWS Cost Explorer.

For a full list of AWS announcements, be sure to keep an eye on the ‘What’s New with AWS?’ page.

Other AWS news
Here are some additional projects, blog posts, and news items that you might find interesting:

AWS open source news and updates – My colleague Ricardo writes about open source projects, tools, and events from the AWS Community.

Discover Claude 3 – If you’re a developer looking for a good source to get started with Claude 3, then I recommend this great post from my colleague Haowen Huang: Mastering Amazon Bedrock with Claude 3: Developer’s Guide with Demos.

Upcoming AWS events
Check your calendars and sign up for upcoming AWS events:

AWS Summits – Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Singapore (May 7), Seoul (May 16–17), Hong Kong (May 22), Milan (May 23), Stockholm (June 4), and Madrid (June 5).

AWS re:Inforce – Explore 2.5 days of immersive cloud security learning in the age of generative AI at AWS re:Inforce, June 10–12 in Pennsylvania.

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Turkey (May 18), Midwest | Columbus (June 13), Sri Lanka (June 27), Cameroon (July 13), Nigeria (August 24), and New York (August 28).

GOTO EDA Day London – Join us in London on May 14 to learn about event-driven architectures (EDA) for building highly scalable, fault tolerant, and extensible applications. This conference is organized by GOTO, AWS, and partners.

Browse all upcoming AWS led in-person and virtual events and developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Matheus Guimaraes

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Accelerate security automation using Amazon CodeWhisperer

Post Syndicated from Brendan Jenkins original https://aws.amazon.com/blogs/security/accelerate-security-automation-using-amazon-codewhisperer/

In an ever-changing security landscape, teams must be able to quickly remediate security risks. Many organizations look for ways to automate the remediation of security findings that are currently handled manually. Amazon CodeWhisperer is an artificial intelligence (AI) coding companion that generates real-time, single-line or full-function code suggestions in your integrated development environment (IDE) to help you quickly build software. By using CodeWhisperer, security teams can expedite the process of writing security automation scripts for various types of findings that are aggregated in AWS Security Hub, a cloud security posture management (CSPM) service.

In this post, we present some of the current challenges with security automation and walk you through how to use CodeWhisperer, together with Amazon EventBridge and AWS Lambda, to automate the remediation of Security Hub findings. Before reading further, please read the AWS Responsible AI Policy.

Current challenges with security automation

Many approaches to security automation, including Lambda and AWS Systems Manager Automation, require software development skills. Furthermore, manually writing remediation code can be time-consuming for security professionals. To help overcome these challenges, CodeWhisperer serves as a force multiplier for qualified security professionals with development experience to quickly and effectively generate code to help remediate security findings.

Security professionals should still cultivate software development skills to implement robust solutions. Engineers should thoroughly review and validate any generated code, as manual oversight remains critical for security.

Solution overview

Figure 1 shows how the findings that Security Hub produces are ingested by EventBridge, which then invokes Lambda functions for processing. The Lambda code is generated with the help of CodeWhisperer.

Figure 1: Diagram of the solution

Security Hub integrates with EventBridge so you can automatically process findings with other services such as Lambda. To begin remediating the findings automatically, you can configure rules to determine where to send findings. This solution will do the following:

  1. Ingest an Amazon Security Hub finding into EventBridge.
  2. Use an EventBridge rule to invoke a Lambda function for processing.
  3. Use CodeWhisperer to generate the Lambda function code.

It is important to note that there are two types of automation for Security Hub finding remediation:

  • Partial automation, which is initiated when a human worker selects the Security Hub findings manually and applies the automated remediation workflow to the selected findings.
  • End-to-end automation, which means that when a finding is generated within Security Hub, this initiates an automated workflow to immediately remediate without human intervention.

Important: When you use end-to-end automation, we highly recommend that you thoroughly test the efficiency and impact of the workflow in a non-production environment before moving forward with implementation in a production environment.
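
For the end-to-end case, one possible approach is an EventBridge rule that matches imported findings directly instead of a custom action. The following is a minimal sketch and is not part of the walkthrough in this post. The helper name `build_end_to_end_pattern` is hypothetical, and the filter fields chosen here (`Compliance.Status`, `Compliance.SecurityControlId`, `Workflow.Status`, `RecordState`) are assumptions about which findings you would want to remediate automatically.

```python
import json


def build_end_to_end_pattern(control_id: str) -> dict:
    """Hypothetical helper: build a pattern matching new, failed findings for one control."""
    return {
        "source": ["aws.securityhub"],
        "detail-type": ["Security Hub Findings - Imported"],
        "detail": {
            "findings": {
                # Assumed filters: only active, failed findings that nobody has triaged yet.
                "Compliance": {
                    "Status": ["FAILED"],
                    "SecurityControlId": [control_id],
                },
                "Workflow": {"Status": ["NEW"]},
                "RecordState": ["ACTIVE"],
            }
        },
    }


pattern = build_end_to_end_pattern("S3.14")
print(json.dumps(pattern, indent=2))
```

Narrow filters like these limit how many findings can start the workflow, which matters when no human reviews the finding first.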

Prerequisites

To follow along with this walkthrough, make sure that you have the following prerequisites in place:

Implement security automation

In this scenario, you have been tasked with making sure that versioning is enabled across all Amazon Simple Storage Service (Amazon S3) buckets in your AWS account. Additionally, you want to do this in a way that is programmatic and automated so that it can be reused in different AWS accounts in the future.

To do this, you will perform the following steps:

  1. Generate the remediation script with CodeWhisperer
  2. Create the Lambda function
  3. Integrate the Lambda function with Security Hub by using EventBridge
  4. Create a custom action in Security Hub
  5. Create an EventBridge rule to target the Lambda function
  6. Run the remediation

Generate a remediation script with CodeWhisperer

The first step is to use VS Code to create a script so that CodeWhisperer generates the code for your Lambda function in Python. You will use this Lambda function to remediate the Security Hub findings generated by the [S3.14] S3 buckets should use versioning control.

Note: The underlying model of CodeWhisperer is powered by generative AI, and the output of CodeWhisperer is nondeterministic. As such, the code recommended by the service can vary by user. By modifying the initial code comment to prompt CodeWhisperer for a response, customers can change the corresponding output to help meet their needs. Customers should subject all code generated by CodeWhisperer to typical testing and review protocols to verify that it is free of errors and is in line with applicable organizational security policies. To learn about best practices on prompt engineering with CodeWhisperer, see this AWS blog post.

To generate the remediation script

  1. Open a new VS Code window, and then open or create a new folder for your file to reside in.
  2. Create a Python file called cw-blog-remediation.py as shown in Figure 2.
     
    Figure 2: New VS Code file created called cw-blog-remediation.py

  3. Add the following imports to the Python file.
    import json
    import boto3

  4. Because you have the context added to your file, you can now prompt CodeWhisperer by using a natural language comment. In your file, below the import statements, enter the following comment and then press Enter.
    # Create lambda function that turns on versioning for an S3 bucket after the function is triggered from Amazon EventBridge

  5. Accept the first recommendation that CodeWhisperer provides by pressing Tab to use the Lambda function handler, as shown in Figure 3.
    Figure 3: Generation of Lambda handler

  6. To get the recommendation for the function from CodeWhisperer, press Enter. Make sure that the recommendation you receive looks similar to the following. CodeWhisperer is nondeterministic, so its recommendations can vary.
    import json
    import boto3
    
    # Create lambda function that turns on versioning for an S3 bucket after function is triggered from Amazon EventBridge
    def lambda_handler(event, context):
        s3 = boto3.client('s3')
        bucket = event['detail']['requestParameters']['bucketName']
        response = s3.put_bucket_versioning(
            Bucket=bucket,
            VersioningConfiguration={
                'Status': 'Enabled'
            }
        )
        print(response)
        return {
            'statusCode': 200,
            'body': json.dumps('Versioning enabled for bucket ' + bucket)
        }
    

  7. Take a moment to review the user actions and keyboard shortcut keys. Press Tab to accept the recommendation.
  8. You can change the function body to fit your use case. To get the Amazon Resource Name (ARN) of the S3 bucket from the EventBridge event, replace the bucket variable with the following line:
    bucket = event['detail']['findings'][0]['Resources'][0]['Id']

  9. To prompt CodeWhisperer to extract the bucket name from the bucket ARN, use the following comment:
    # Take the S3 bucket name from the ARN of the S3 bucket

    Your function code should look similar to the following:

    import json
    import boto3
    
    # Create lambda function that turns on versioning for an S3 bucket after function is triggered from Amazon EventBridge
    def lambda_handler(event, context):
        s3 = boto3.client('s3')
        bucket = event['detail']['findings'][0]['Resources'][0]['Id']
        # Take the S3 bucket name from the ARN of the S3 bucket
        bucket = bucket.split(':')[5]
    
        response = s3.put_bucket_versioning(
            Bucket=bucket,
            VersioningConfiguration={
                'Status': 'Enabled'
            }
        )
        print(response)
        return {
            'statusCode': 200,
            'body': json.dumps('Versioning enabled for bucket ' + bucket)
        }
    

  10. Create a .zip file for cw-blog-remediation.py. Find the file in your local file manager, right-click the file, and select compress/zip. You will use this .zip file in the next section of the post.
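
The ARN-to-bucket-name step in the function above can also be exercised locally before you upload the code. The following is a minimal sketch, assuming the finding carries an S3 bucket ARN in `Resources[0].Id`. The helper names and the sample event are hypothetical and trimmed to the fields the handler reads.

```python
def bucket_name_from_arn(arn: str) -> str:
    """Return the bucket name from an S3 bucket ARN such as arn:aws:s3:::my-bucket."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "s3" or not parts[5]:
        raise ValueError(f"Not an S3 bucket ARN: {arn!r}")
    return parts[5]


def bucket_from_event(event: dict) -> str:
    """Read the first resource of the first finding, as the Lambda handler does."""
    resource_id = event["detail"]["findings"][0]["Resources"][0]["Id"]
    return bucket_name_from_arn(resource_id)


# Hypothetical event, reduced to the fields used above.
sample_event = {
    "detail": {
        "findings": [
            {"Resources": [{"Type": "AwsS3Bucket", "Id": "arn:aws:s3:::amzn-s3-demo-bucket"}]}
        ]
    }
}

print(bucket_from_event(sample_event))  # prints "amzn-s3-demo-bucket"
```

Validating the ARN before calling Amazon S3 means that a finding for a different resource type fails with a clear error instead of an unexpected API call.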

Create the Lambda function

The next step is to use the automation script that you generated to create the Lambda function that will enable versioning on applicable S3 buckets.

To create the Lambda function

  1. Open the AWS Lambda console.
  2. In the left navigation pane, choose Functions, and then choose Create function.
  3. Select Author from Scratch and provide the following configurations for the function:
    1. For Function name, enter sec_remediation_function.
    2. For Runtime, select Python 3.12.
    3. For Architecture, select x86_64.
    4. For Permissions, select Create a new role with basic Lambda permissions.
  4. Choose Create function.
  5. To upload your local code to Lambda, select Upload from and then .zip file, and then upload the file that you zipped.
  6. Verify that you created the Lambda function successfully. In the Code source section of Lambda, you should see the code from the automation script displayed in a new tab, as shown in Figure 4.
     
    Figure 4: Source code that was successfully uploaded

  7. Choose the Code tab.
  8. Scroll down to the Runtime settings pane and choose Edit.
  9. For Handler, enter cw-blog-remediation.lambda_handler for your function handler, and then choose Save, as shown in Figure 5.
     
    Figure 5: Updated Lambda handler

  10. For security purposes, and to follow the principle of least privilege, you should also add an inline policy to the Lambda function’s role to perform the tasks necessary to enable versioning on S3 buckets.
    1. In the Lambda console, navigate to the Configuration tab and then, in the left navigation pane, choose Permissions. Choose the Role name, as shown in Figure 6.
       
      Figure 6: Lambda role in the AWS console

    2. In the Add permissions dropdown, select Create inline policy.
       
      Figure 7: Create inline policy

    3. Choose JSON, add the following policy to the policy editor, and then choose Next.
      {
          "Version": "2012-10-17",
          "Statement": [
              {
                  "Sid": "VisualEditor0",
                  "Effect": "Allow",
                  "Action": "s3:PutBucketVersioning",
                  "Resource": "*"
              }
          ]
      }

    4. Name the policy PutBucketVersioning and choose Create policy.
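
The policy above allows s3:PutBucketVersioning on every bucket. If you already know which buckets the function should manage, a tighter variant can list them explicitly. The following is a minimal sketch of generating such a policy document. The helper name, the Sid, and the bucket names are hypothetical, and this variant is an assumption, not what the walkthrough configures.

```python
import json


def versioning_policy(bucket_names: list[str]) -> dict:
    """Hypothetical helper: allow s3:PutBucketVersioning on the named buckets only."""
    if not bucket_names:
        raise ValueError("At least one bucket name is required")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowPutBucketVersioning",
                "Effect": "Allow",
                "Action": "s3:PutBucketVersioning",
                # Bucket-level actions use the bucket ARN, with no object path.
                "Resource": [f"arn:aws:s3:::{name}" for name in bucket_names],
            }
        ],
    }


policy = versioning_policy(["amzn-s3-demo-bucket1", "amzn-s3-demo-bucket2"])
print(json.dumps(policy, indent=4))
```

You can paste the printed JSON into the policy editor in place of the wildcard version.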

Create a custom action in Security Hub

In this step, you will create a custom action in Security Hub.

To create the custom action

  1. Open the Security Hub console.
  2. In the left navigation pane, choose Settings, and then choose Custom actions.
  3. Choose Create custom action.
  4. Provide the following information, as shown in Figure 8:
    • For Name, enter TurnOnS3Versioning.
    • For Description, enter Action that will turn on versioning for a specific S3 bucket.
    • For Custom action ID, enter TurnOnS3Versioning.
       
      Figure 8: Create a custom action in Security Hub

  5. Choose Create custom action.
  6. Make a note of the Custom action ARN. You will need this ARN when you create a rule to associate with the custom action in EventBridge.
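
If you prefer to script the console steps above, the Security Hub API offers `create_action_target`, which returns the custom action ARN. The following is a minimal sketch. The wrapper function is hypothetical, and the stub client with its placeholder Region and account ID exists only so the example runs without AWS credentials.

```python
def create_custom_action(securityhub_client, name: str, description: str, action_id: str) -> str:
    """Create a Security Hub custom action and return its ARN."""
    response = securityhub_client.create_action_target(
        Name=name,
        Description=description,
        Id=action_id,
    )
    return response["ActionTargetArn"]


class StubSecurityHubClient:
    """Offline stand-in for boto3.client("securityhub"); returns a placeholder ARN."""

    def create_action_target(self, Name, Description, Id):
        return {
            "ActionTargetArn": f"arn:aws:securityhub:us-east-1:111122223333:action/custom/{Id}"
        }


action_arn = create_custom_action(
    StubSecurityHubClient(),
    name="TurnOnS3Versioning",
    description="Action that will turn on versioning for a specific S3 bucket",
    action_id="TurnOnS3Versioning",
)
print(action_arn)
```

In a real account, you would pass `boto3.client("securityhub")` instead of the stub.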

Create an EventBridge rule to target the Lambda function

The next step is to create an EventBridge rule to capture the custom action. You will define an EventBridge rule that matches events (in this case, findings) from Security Hub that were forwarded by the custom action that you defined previously.

To create the EventBridge rule

  1. Navigate to the EventBridge console.
  2. On the right side, choose Create rule.
  3. On the Define rule detail page, give your rule a name and description that represents the rule’s purpose—for example, you could use the same name and description that you used for the custom action. Then choose Next.
  4. Scroll down to Event pattern, and then do the following:
    1. For Event source, make sure that AWS services is selected.
    2. For AWS service, select Security Hub.
    3. For Event type, select Security Hub Findings – Custom Action.
    4. Select Specific custom action ARN(s) and enter the ARN for the custom action that you created earlier.
       
    Figure 9: Specify the EventBridge event pattern for the Security Hub custom action workflow

    As you provide this information, the Event pattern updates.

  5. Choose Next.
  6. On the Select target(s) step, in the Select a target dropdown, select Lambda function. Then from the Function dropdown, select sec_remediation_function.
  7. Choose Next.
  8. On the Configure tags step, choose Next.
  9. On the Review and create step, choose Create rule.
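
The event pattern that the console builds for a custom action rule has roughly the shape shown in the following sketch. The ARN is a placeholder, and the `matches` function is a deliberately simplified, hypothetical matcher that handles only flat patterns like this one. It is not how EventBridge evaluates patterns internally.

```python
ACTION_ARN = "arn:aws:securityhub:us-east-1:111122223333:action/custom/TurnOnS3Versioning"  # placeholder


def custom_action_pattern(action_arn: str) -> dict:
    """Pattern for findings sent to EventBridge by one Security Hub custom action."""
    return {
        "source": ["aws.securityhub"],
        "detail-type": ["Security Hub Findings - Custom Action"],
        "resources": [action_arn],
    }


def matches(pattern: dict, event: dict) -> bool:
    """Simplified check: every pattern key must list a value present in the event."""
    for key, allowed in pattern.items():
        value = event.get(key)
        if isinstance(value, list):
            # For array fields, one matching element is enough.
            if not any(item in allowed for item in value):
                return False
        elif value not in allowed:
            return False
    return True


pattern = custom_action_pattern(ACTION_ARN)
event = {
    "source": "aws.securityhub",
    "detail-type": "Security Hub Findings - Custom Action",
    "resources": [ACTION_ARN],
    "detail": {"findings": []},
}
print(matches(pattern, event))  # prints True
```

Because the pattern names a specific custom action ARN, findings sent by other custom actions do not invoke the Lambda function.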

Run the automation

Your automation is set up and you can now test the automation. This test covers a partial automation workflow, since you will manually select the finding and apply the remediation workflow to one or more selected findings.

Important: As we mentioned earlier, if you decide to make the automation end-to-end, you should assess the impact of the workflow in a non-production environment. Additionally, you may want to consider creating preventative controls if you want to minimize the risk of event occurrence across an entire environment.

To run the automation

  1. In the Security Hub console, on the Findings tab, add a filter by entering Title in the search box and selecting that filter. Select IS and enter S3 general purpose buckets should have versioning enabled (case sensitive). Choose Apply.
  2. In the filtered list, choose the Title of an active finding.
  3. Before you start the automation, check the current configuration of the S3 bucket to confirm that your automation works. Expand the Resources section of the finding.
  4. Under Resource ID, choose the link for the S3 bucket. This opens a new tab on the S3 console that shows only this S3 bucket.
  5. In your browser, go back to the Security Hub tab (don’t close the S3 tab—you will need to return to it), and on the left side, select this same finding, as shown in Figure 10.
     
    Figure 10: Filter out Security Hub findings to list only S3 bucket-related findings

  6. In the Actions dropdown list, choose the name of your custom action.
     
    Figure 11: Choose the custom action that you created to start the remediation workflow

  7. When you see a banner that displays Successfully started action…, go back to the S3 browser tab and refresh it. Verify that the S3 versioning configuration on the bucket has been enabled, as shown in Figure 12.
     
    Figure 12: Versioning successfully enabled
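
As an alternative to refreshing the console, you can check the result programmatically with the S3 `get_bucket_versioning` call. The following is a minimal sketch. The helper function and the stub client are hypothetical, and the stub only imitates the two response shapes that matter here so the example runs without AWS credentials.

```python
def is_versioning_enabled(s3_client, bucket: str) -> bool:
    """Return True if the bucket reports versioning Status 'Enabled'."""
    response = s3_client.get_bucket_versioning(Bucket=bucket)
    # Buckets that never had versioning configured return no Status key at all.
    return response.get("Status") == "Enabled"


class StubS3Client:
    """Offline stand-in for boto3.client("s3") with a fixed set of versioned buckets."""

    def __init__(self, versioned_buckets):
        self._versioned = set(versioned_buckets)

    def get_bucket_versioning(self, Bucket):
        return {"Status": "Enabled"} if Bucket in self._versioned else {}


stub = StubS3Client(versioned_buckets={"amzn-s3-demo-bucket"})
print(is_versioning_enabled(stub, "amzn-s3-demo-bucket"))
print(is_versioning_enabled(stub, "amzn-s3-demo-bucket-unversioned"))
```

With a real client, pass `boto3.client("s3")` and the name of the bucket from the finding.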

Conclusion

In this post, you learned how to use CodeWhisperer to produce AI-generated code for custom remediations for a security use case. We encourage you to experiment with CodeWhisperer to create Lambda functions that remediate other Security Hub findings that might exist in your account, such as the enforcement of lifecycle policies on S3 buckets with versioning enabled, or using automation to remove multiple unused Amazon EC2 elastic IP addresses. The ability to automatically enable versioning on S3 buckets is just one of many use cases where CodeWhisperer can generate code to help you remediate Security Hub findings.

To sum up, CodeWhisperer acts as a tool that can help boost the productivity of security experts who have coding abilities, assisting them to swiftly write code to address security issues. However, security specialists should continue building their software development capabilities to implement robust solutions. Engineers should carefully review and test any generated code, since human oversight is still vital for security.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Brendan Jenkins

Brendan is a Solutions Architect at AWS who works with enterprise customers, providing them with technical guidance and helping them achieve their business goals. He specializes in DevOps and machine learning (ML) technology.

Chris Shea

Chris is an AWS Solutions Architect serving enterprise customers in the PropTech and AdTech industry verticals, providing guidance and the tools that customers need for success. His areas of interest include AI for DevOps and AI/ML technology.

Tim Manik

Tim is a Solutions Architect at AWS working with enterprise customers on migrations and modernizations. He specializes in cybersecurity and AI/ML and is passionate about bridging the gap between the two fields.

Angel Tolson

Angel is a Solutions Architect at AWS working with small to medium size businesses, providing them with technical guidance and helping them achieve their business goals. She is particularly interested in cloud operations and networking.

AWS Weekly Roundup: Amazon EC2 G6 instances, Mistral Large on Amazon Bedrock, AWS Deadline Cloud, and more (April 8, 2024)

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-mistral-large-aws-clean-rooms-ml-aws-deadline-cloud-and-more-april-8-2024/

We’re just two days away from AWS Summit Sydney (April 10–11) and a month away from the AWS Summit season in Southeast Asia, starting with the AWS Summit Singapore (May 7) and the AWS Summit Bangkok (May 30). If you happen to be in Sydney, Singapore, or Bangkok around those dates, please join us.

Last Week’s Launches
If you haven’t read last week’s Weekly Roundup yet, Channy wrote about the AWS Chips Taste Test, a new initiative from Jeff Barr as part of April Fools’ Day.

Here are some launches that caught my attention last week:

New Amazon EC2 G6 instances — We announced the general availability of Amazon EC2 G6 instances powered by NVIDIA L4 Tensor Core GPUs. G6 instances can be used for a wide range of graphics-intensive and machine learning use cases. G6 instances deliver up to 2x higher performance for deep learning inference and graphics workloads compared to Amazon EC2 G4dn instances. To learn more, visit the Amazon EC2 G6 instance page.

Mistral Large is now available in Amazon Bedrock — Veliswa wrote about the availability of the Mistral Large foundation model, as part of the Amazon Bedrock service. You can use Mistral Large to handle complex tasks that require substantial reasoning capabilities. In addition, Amazon Bedrock is now available in the Paris AWS Region.

Amazon Aurora zero-ETL integration with Amazon Redshift now in additional Regions — Zero-ETL integration announcements were my favourite launches last year. This zero-ETL integration simplifies the process of transferring data between the two services, allowing customers to move data between Amazon Aurora and Amazon Redshift without the need for manual Extract, Transform, and Load (ETL) processes. With this announcement, zero-ETL integration between Amazon Aurora and Amazon Redshift is now supported in 11 additional Regions.

Announcing AWS Deadline Cloud — If you’re working in films, TV shows, commercials, games, and industrial design and handling complex rendering management for teams creating 2D and 3D visual assets, then you’ll be excited about AWS Deadline Cloud. This new managed service simplifies the deployment and management of render farms for media and entertainment workloads.

AWS Clean Rooms ML is Now Generally Available — Last year, I wrote about the preview of AWS Clean Rooms ML. In that post, I elaborated on a new capability of AWS Clean Rooms that helps you and your partners apply machine learning (ML) models on your collective data without copying or sharing raw data with each other. Now, AWS Clean Rooms ML is available for you to use.

Knowledge Bases for Amazon Bedrock now supports private network policies for OpenSearch Serverless — Here’s exciting news for those of you who are building with Amazon Bedrock. Now, you can implement Retrieval-Augmented Generation (RAG) with Knowledge Bases for Amazon Bedrock using Amazon OpenSearch Serverless (OSS) collections that have a private network policy.

Amazon EKS extended support for Kubernetes versions now generally available — If you’re running Kubernetes version 1.21 and higher, with this Extended Support for Kubernetes, you can stay up-to-date with the latest Kubernetes features and security improvements on Amazon EKS.

AWS Lambda Adds Support for Ruby 3.3 — Coding in Ruby? Now, AWS Lambda supports Ruby 3.3 as its runtime. This update allows you to take advantage of the latest features and improvements in the Ruby language.

Amazon EventBridge Console Enhancements — The Amazon EventBridge console has been updated with new features and improvements, making it easier for you to manage your event-driven applications with a better user experience.

Private Access to the AWS Management Console in Commercial Regions — If you need to restrict access to personal AWS accounts from the company network, you can use AWS Management Console Private Access. With this launch, you can use AWS Management Console Private Access in all commercial AWS Regions.

From community.aws 
community.aws is a home for us builders to share what we learn from building on AWS. Here are my top 3 posts from last week:

Other AWS News 
Here are some additional news items, open-source projects, and Twitch shows that you might find interesting:

Build On Generative AI – Join Tiffany and Darko to learn more about generative AI, see their demos and discuss different aspects of generative AI with the guest speakers. Streaming every Monday on Twitch, 9:00 AM US PT.

AWS open source news and updates – If you’re looking for various open-source projects and tools from the AWS community, please read the AWS open-source newsletter maintained by my colleague, Ricardo.

Upcoming AWS events
Check your calendars and sign up for these AWS events:

AWS Summits – Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Amsterdam (April 9), Sydney (April 10–11), London (April 24), Singapore (May 7), Berlin (May 15–16), Seoul (May 16–17), Hong Kong (May 22), Milan (May 23), Dubai (May 29), Thailand (May 30), Stockholm (June 4), and Madrid (June 5).

AWS re:Inforce – Explore cloud security in the age of generative AI at AWS re:Inforce, June 10–12 in Pennsylvania for two-and-a-half days of immersive cloud security learning designed to help drive your business initiatives.

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Poland (April 11), Bay Area (April 12), Kenya (April 20), and Turkey (May 18).

You can browse all upcoming in-person and virtual events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Donnie

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Serverless ICYMI Q1 2024

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/serverless-icymi-q1-2024/

Welcome to the 25th edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share all the most recent product launches, feature enhancements, blog posts, webinars, live streams, and other interesting things that you might have missed!

In case you missed our last ICYMI, check out what happened last quarter here.

2024 Q1 calendar

Adobe Summit

At the Adobe Summit, the AWS Serverless Developer Advocacy team showcased a solution developed for the NFL using AWS serverless technologies and Adobe Photoshop APIs. The system automates image processing tasks, including background removal and dynamic resizing, by integrating AWS Step Functions, AWS Lambda, Amazon EventBridge, and AI/ML capabilities via Amazon Rekognition. This solution reduced image processing time from weeks to minutes and saved the NFL significant costs. Combining cloud-based serverless architectures with advanced machine learning and API technologies can optimize digital workflows for cost-effective and agile digital asset management.

Adobe Summit ServerlessVideo

ServerlessVideo is a demo application to stream live videos and also perform advanced post-video processing. It uses several AWS services, including Step Functions, Lambda, EventBridge, Amazon ECS, and Amazon Bedrock in a serverless architecture that makes it fast, flexible, and cost-effective. The team used ServerlessVideo to interview attendees about the conference experience and Adobe and partners about how they use Adobe. Learn more about the project and watch videos from Adobe Summit 2024 at video.serverlessland.com.

AWS Lambda

AWS launched support for the latest long-term support release of .NET 8, which includes API enhancements, improved Native Ahead of Time (Native AOT) support, and improved performance.

AWS Lambda .NET 8

Learn how to compare design approaches for building serverless microservices. This post covers the trade-offs to consider with various application architectures. See how you can apply single responsibility, Lambda-lith, and read and write functions.

The AWS Serverless Java Container has been updated. This makes it easier to modernize a legacy Java application written with frameworks such as Spring, Spring Boot, or JAX-RS/Jersey in Lambda with minimal code changes.

AWS Serverless Java Container

Lambda has improved the responsiveness for configuring Event Source Mappings (ESMs) and Amazon EventBridge Pipes with event sources such as self-managed Apache Kafka, Amazon Managed Streaming for Apache Kafka (MSK), Amazon DocumentDB, and Amazon MQ.

Chaos engineering is a popular practice for building confidence in system resilience. However, many existing tools assume the ability to alter infrastructure configurations, and cannot be easily applied to the serverless application paradigm. You can use the AWS Fault Injection Service (FIS) to automate and manage chaos experiments across different Lambda functions to provide a reusable testing method.

Amazon ECS and AWS Fargate

Amazon Elastic Container Service (Amazon ECS) now provides managed instance draining as a built-in feature of Amazon ECS capacity providers. This allows Amazon ECS to safely and automatically drain tasks from Amazon Elastic Compute Cloud (Amazon EC2) instances that are part of an Amazon EC2 Auto Scaling Group associated with an Amazon ECS capacity provider. This simplification allows you to remove custom lifecycle hooks previously used to drain Amazon EC2 instances. You can now perform infrastructure updates such as rolling out a new version of the ECS agent by seamlessly using Auto Scaling Group instance refresh, with Amazon ECS ensuring workloads are not interrupted.

Credentials Fetcher makes it easier to run containers that depend on Windows authentication when using Amazon EC2. Credentials Fetcher now integrates with Amazon ECS, using either the Amazon EC2 launch type, or AWS Fargate serverless compute launch type.

Amazon ECS Service Connect is a networking capability to simplify service discovery, connectivity, and traffic observability for Amazon ECS. You can now more easily integrate certificate management to encrypt service-to-service communication using Transport Layer Security (TLS). You do not need to modify your application code, add additional network infrastructure, or operate service mesh solutions.

Amazon ECS Service Connect

Running distributed machine learning (ML) workloads on Amazon ECS allows ML teams to focus on creating, training and deploying models, rather than spending time managing the container orchestration engine. Amazon ECS provides a great environment to run ML projects as it supports workloads that use NVIDIA GPUs and provides optimized images with pre-installed NVIDIA Kernel drivers and Docker runtime.

See how to build preview environments for Amazon ECS applications with AWS Copilot. AWS Copilot is an open source command line interface that makes it easier to build, release, and operate production ready containerized applications.

Learn techniques for automatic scaling of your Amazon Elastic Container Service (Amazon ECS) container workloads to enhance the end user experience. This post explains how to use AWS Application Auto Scaling, which helps you configure automatic scaling of your Amazon ECS service. You can also use Amazon ECS Service Connect and AWS Distro for OpenTelemetry (ADOT) in Application Auto Scaling.

AWS Step Functions

AWS workloads sometimes require access to data stored in on-premises databases and storage locations. Traditional solutions to establish connectivity to the on-premises resources require inbound rules to firewalls, a VPN tunnel, or public endpoints. Discover how to use the MQTT protocol (AWS IoT Core) with AWS Step Functions to dispatch jobs to on-premises workers to access or retrieve data stored on-premises.

You can use Step Functions to orchestrate many business processes. Many industries are required to provide audit trails for decision and transactional systems. Learn how to build a serverless pipeline to create a reliable, performant, traceable, and durable pipeline for audit processing.

Amazon EventBridge

Amazon EventBridge now supports publishing events to AWS AppSync GraphQL APIs as native targets. The new integration allows you to publish events easily to a wider variety of consumers and simplifies updating clients with near real-time data.

Amazon EventBridge publishing events to AWS AppSync

Discover how to send and receive CloudEvents with EventBridge. CloudEvents is an open-source specification for describing event data in a common way. You can publish CloudEvents directly to EventBridge, filter and route them, and use input transformers and API Destinations to send CloudEvents to downstream AWS services and third-party APIs.

AWS Application Composer

AWS Application Composer lets you create infrastructure as code templates by dragging and dropping cards on a virtual canvas. These represent CloudFormation resources, which you can wire together to create permissions and references. Application Composer has now expanded to the VS Code IDE as part of the AWS Toolkit. This now includes a generative AI partner that helps you write infrastructure as code (IaC) for all 1100+ AWS CloudFormation resources that Application Composer now supports.

AWS AppComposer generate suggestions

Amazon API Gateway

Learn how to consume private Amazon API Gateway APIs using mutual TLS (mTLS). mTLS helps prevent man-in-the-middle attacks and protects against threats such as impersonation attempts, data interception, and tampering.

Serverless at AWS re:Invent

Serverless at AWS reInvent

Visit the Serverless Land YouTube channel to find a list of serverless and serverless container sessions from re:Invent 2023. Hear from experts like Chris Munns and Julian Wood in their popular session, Best practices for serverless developers, or Nathan Peck and Jessica Deen in Deploying multi-tenant SaaS applications on Amazon ECS and AWS Fargate.

Serverless blog posts

January

February

March

Serverless container blog posts

January

February

December

Serverless Office Hours

January

February

March

Containers from the Couch

January

February

March

FooBar Serverless

January

February

March

Still looking for more?

The Serverless landing page has more information. The Lambda resources page contains case studies, webinars, whitepapers, customer stories, reference architectures, and even more Getting Started tutorials.

You can also follow the Serverless Developer Advocacy team on Twitter to see the latest news, follow conversations, and interact with the team.

And finally, visit the Serverless Land and Containers on AWS websites for all your serverless and serverless container needs.

Sending and receiving CloudEvents with Amazon EventBridge

Post Syndicated from David Boyne original https://aws.amazon.com/blogs/compute/sending-and-receiving-cloudevents-with-amazon-eventbridge/

Amazon EventBridge helps developers build event-driven architectures (EDA) by connecting loosely coupled publishers and consumers using event routing, filtering, and transformation. CloudEvents is an open-source specification for describing event data in a common way. Developers can publish CloudEvents directly to EventBridge, filter and route them, and use input transformers and API Destinations to send CloudEvents to downstream AWS services and third-party APIs.

Overview

Event design is an important aspect of any event-driven architecture, yet developers often overlook it. This leads to unwanted side effects such as exposed implementation details, a lack of standards, and version incompatibility.

Without event standards, it can be difficult to integrate events or streams of messages between systems, brokers, and organizations. Each system has to understand the event structure or rely on custom-built solutions for versioning or validation.

CloudEvents is a specification for describing event data in common formats to provide interoperability between services, platforms, and systems using Cloud Native Computing Foundation (CNCF) projects. As CloudEvents is a CNCF graduated project, many third-party brokers and systems adopt this specification.

Using CloudEvents as a standard format to describe events makes integration easier and you can use open-source tooling to help build event-driven architectures and future proof any integrations. EventBridge can route and filter CloudEvents based on common metadata, without needing to understand the business logic within the event itself.

CloudEvents supports two implementation modes, structured mode and binary mode, and a range of protocols including HTTP, MQTT, AMQP, and Kafka. When publishing events to an EventBridge bus, you can structure events as CloudEvents and route them to downstream consumers. You can use input transformers to transform any event into the CloudEvents specification. Events can also be forwarded to public APIs using EventBridge API destinations, which support both structured and binary mode encodings, enhancing interoperability with external systems.

Standardizing events using Amazon EventBridge

When publishing events to an EventBridge bus, EventBridge uses its own event envelope and represents events as JSON objects. EventBridge requires that you define top-level fields, such as detail-type and source. You can use any event/payload in the detail field.

This example event shows an OrderPlaced event from the orders-service that is unstructured without any event standards. The data within the event contains the order_id, customer_id and order_total.

{
  "version": "0",
  "id": "dbc1c73a-c51d-0c0e-ca61-ab9278974c57",
  "account": "1234567890",
  "time": "2023-05-23T11:38:46Z",
  "region": "us-east-1",
  "detail-type": "OrderPlaced",
  "source": "myapp.orders-service",
  "resources": [],
  "detail": {
    "data": {
      "order_id": "c172a984-3ae5-43dc-8c3f-be080141845a",
      "customer_id": "dda98122-b511-4aaf-9465-77ca4a115ee6",
      "order_total": "120.00"
    }
  }
}

Publishers may also choose to add an additional metadata field along with the data field within the detail field to help define a set of standards for their events.

{
  "version": "0",
  "id": "dbc1c73a-c51d-0c0e-ca61-ab9278974c58",
  "account": "1234567890",
  "time": "2023-05-23T12:38:46Z",
  "region": "us-east-1",
  "detail-type": "OrderPlaced",
  "source": "myapp.orders-service",
  "resources": [],
  "detail": {
    "metadata": {
      "idempotency_key": "29d2b068-f9c7-42a0-91e3-5ba515de5dbe",
      "correlation_id": "dddd9340-135a-c8c6-95c2-41fb8f492222",
      "domain": "ORDERS",
      "time": "1707908605"
    },
    "data": {
      "order_id": "c172a984-3ae5-43dc-8c3f-be080141845a",
      "customer_id": "dda98122-b511-4aaf-9465-77ca4a115ee6",
      "order_total": "120.00"
    }
  }
}

This additional event information helps downstream consumers, improves debugging, and can be used to manage idempotency. While this approach offers practical benefits, it re-solves problems that the CloudEvents specification already addresses.

Publishing CloudEvents using Amazon EventBridge

When publishing events to EventBridge, you can use CloudEvents structured mode. A structured-mode message is where the entire event (attributes and data) is encoded in the message body, according to a specific event format. A binary-mode message is where the event data is stored in the message body, and event attributes are stored as part of the message metadata.

CloudEvents has a list of required fields but also offers flexibility with optional attributes and extensions. CloudEvents also offers a solution to implement idempotency, requiring that the combination of id and source must uniquely identify an event, which can be used as the idempotency key in downstream implementations.

{
  "version": "0",
  "id": "dbc1c73a-c51d-0c0e-ca61-ab9278974c58",
  "account": "1234567890",
  "time": "2023-05-23T12:38:46Z",
  "region": "us-east-1",
  "detail-type": "OrderPlaced",
  "source": "myapp.orders-service",
  "resources": [],
  "detail": {
    "specversion": "1.0",
    "id": "bba4379f-b764-4d90-9fb2-9f572b2b0b61",
    "source": "myapp.orders-service",
    "type": "OrderPlaced",
    "data": {
      "order_id": "c172a984-3ae5-43dc-8c3f-be080141845a",
      "customer_id": "dda98122-b511-4aaf-9465-77ca4a115ee6",
      "order_total": "120.00"
    },
    "time": "2024-01-01T17:31:00Z",
    "dataschema": "https://us-west-2.console.aws.amazon.com/events/home?region=us-west-2#/registries/discovered-schemas/schemas/myapp.orders-service%40OrderPlaced",
    "correlationid": "dddd9340-135a-c8c6-95c2-41fb8f492222",
    "domain": "ORDERS"
  }
}

By incorporating the required fields, the OrderPlaced event is now CloudEvents compliant. The event also contains optional and extension fields for additional information. Optional fields such as dataschema can be useful for brokers and consumers to retrieve a URI path to the published event schema. This example event references the schema in the EventBridge schema registry, so downstream consumers can fetch the schema to validate the payload.
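As a minimal sketch of how a publisher might assemble such an event, the following Python builds a PutEvents entry whose detail field carries a structured-mode CloudEvent. The function names build_cloudevent_entry and publish are hypothetical, and the source and type values are taken from the example above. The publish helper assumes boto3 is installed and AWS credentials are configured.

```python
import json
import uuid
from datetime import datetime, timezone


def build_cloudevent_entry(order, bus_name="default"):
    """Return a PutEvents entry whose detail is a structured-mode CloudEvent."""
    cloudevent = {
        "specversion": "1.0",              # required attribute
        "id": str(uuid.uuid4()),           # required, unique per source
        "source": "myapp.orders-service",  # required attribute
        "type": "OrderPlaced",             # required attribute
        "time": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "data": order,
    }
    return {
        "EventBusName": bus_name,
        "Source": cloudevent["source"],
        "DetailType": cloudevent["type"],
        "Detail": json.dumps(cloudevent),  # Detail must be a JSON string
    }


def publish(entry):
    """Send the entry to EventBridge (hypothetical helper, needs credentials)."""
    import boto3  # imported lazily so the builder works without boto3
    return boto3.client("events").put_events(Entries=[entry])
```

Keeping the builder separate from the network call makes the event shape easy to unit test before anything is sent to a bus.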

Mapping existing events into CloudEvents using input transformers

When you define a target in EventBridge, input transformations allow you to modify the event before it reaches its destination. Input transformers are configured per target, allowing you to convert events when your downstream consumer requires the CloudEvents format and you want to avoid duplicating information.

Input transformers allow you to map EventBridge fields, such as id, region, detail-type, and source, into corresponding CloudEvents attributes.

This example shows how to transform any EventBridge event into CloudEvents format using input transformers, so the target receives the required structure.

{
  "version": "0",
  "id": "dbc1c73a-c51d-0c0e-ca61-ab9278974c58",
  "account": "1234567890",
  "time": "2024-01-23T12:38:46Z",
  "region": "us-east-1",
  "detail-type": "OrderPlaced",
  "source": "myapp.orders-service",
  "resources": [],
  "detail": {
    "order_id": "c172a984-3ae5-43dc-8c3f-be080141845a",
    "customer_id": "dda98122-b511-4aaf-9465-77ca4a115ee6",
    "order_total": "120.00"
  }
}

Using this input transformer and input template, EventBridge transforms the event schema into the CloudEvents specification for downstream consumers.

Input transformer for CloudEvents:

{
  "id": "$.id",
  "source": "$.source",
  "type": "$.detail-type",
  "time": "$.time",
  "data": "$.detail"
}

Input template for CloudEvents:

{
  "specversion": "1.0",
  "id": "<id>",
  "source": "<source>",
  "type": "<type>",
  "time": "<time>",
  "data": <data>
}

This example shows the event payload that is received by downstream targets, which is mapped to the CloudEvents specification.

{
  "specversion": "1.0",
  "id": "dbc1c73a-c51d-0c0e-ca61-ab9278974c58",
  "source": "myapp.orders-service",
  "type": "OrderPlaced",
  "time": "2024-01-23T12:38:46Z",
  "data": {
    "order_id": "c172a984-3ae5-43dc-8c3f-be080141845a",
    "customer_id": "dda98122-b511-4aaf-9465-77ca4a115ee6",
    "order_total": "120.00"
  }
}

For more information on using input transformers with CloudEvents, see this pattern on Serverless Land.
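The same field mapping can be expressed in a few lines of Python, which may be useful for a consumer that receives untransformed EventBridge events. This is only an illustration of the mapping, not how EventBridge evaluates input transformers, and the function name to_cloudevent is hypothetical.

```python
def to_cloudevent(eb_event):
    """Map an EventBridge envelope to the CloudEvents attributes used above."""
    return {
        "specversion": "1.0",
        "id": eb_event["id"],
        "source": eb_event["source"],
        "type": eb_event["detail-type"],
        "time": eb_event["time"],
        "data": eb_event["detail"],
    }


# Example input, trimmed to the fields that the mapping reads.
sample = {
    "id": "dbc1c73a-c51d-0c0e-ca61-ab9278974c58",
    "time": "2024-01-23T12:38:46Z",
    "detail-type": "OrderPlaced",
    "source": "myapp.orders-service",
    "detail": {"order_total": "120.00"},
}
converted = to_cloudevent(sample)
```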

Transforming events into CloudEvents using API destinations

EventBridge API destinations allows you to trigger HTTP endpoints based on matched rules to integrate with third-party systems using public APIs. You can route events to APIs that support the CloudEvents format by using input transformations and custom HTTP headers to convert EventBridge events to CloudEvents. API destinations now supports custom content-type headers. This allows you to send structured or binary CloudEvents to downstream consumers.

Sending binary CloudEvents using API destinations

When sending binary CloudEvents over HTTP, you must use the HTTP binding specification and set the necessary CloudEvents headers. These headers tell the downstream consumer that the incoming payload uses the CloudEvents format. The body of the request is the event itself.

CloudEvents headers are prefixed with ce-. You can find the list of headers in the HTTP protocol binding documentation.

This example shows the headers for a binary event:

POST /order HTTP/1.1 
Host: webhook.example.com
ce-specversion: 1.0
ce-type: OrderPlaced
ce-source: myapp.orders-service
ce-id: bba4379f-b764-4d90-9fb2-9f572b2b0b61
ce-time: 2024-01-01T17:31:00Z
ce-dataschema: https://us-west-2.console.aws.amazon.com/events/home?region=us-west-2#/registries/discovered-schemas/schemas/myapp.orders-service%40OrderPlaced
ce-correlationid: dddd9340-135a-c8c6-95c2-41fb8f492222
ce-domain: ORDERS
Content-Type: application/json; charset=utf-8

This example shows the body for a binary event:

{
  "order_id": "c172a984-3ae5-43dc-8c3f-be080141845a",
  "customer_id": "dda98122-b511-4aaf-9465-77ca4a115ee6",
  "order_total": "120.00"
}

For more information when using binary CloudEvents with API destinations, explore this pattern available on Serverless Land.
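To make the split between headers and body concrete, here is a small sketch that turns a CloudEvent dictionary into the pieces of a binary-mode HTTP request. The function name to_binary_http is hypothetical. In the HTTP binding, every context attribute, including extension attributes, is sent as a ce- prefixed header, while datacontenttype maps to Content-Type. The sketch assumes JSON data and omits the percent-encoding of header values that the binding describes.

```python
import json


def to_binary_http(cloudevent):
    """Split a CloudEvent into (headers, body) for binary content mode."""
    headers = {"Content-Type": "application/json; charset=utf-8"}
    for name, value in cloudevent.items():
        if name == "data":
            continue  # the data goes in the body, not in a header
        if name == "datacontenttype":
            headers["Content-Type"] = str(value)
        else:
            headers["ce-" + name] = str(value)
    body = json.dumps(cloudevent.get("data", {}))
    return headers, body
```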

Sending structured CloudEvents using API destinations

To support structured mode with CloudEvents, you must specify the content-type as application/cloudevents+json; charset=UTF-8, which tells the API consumer that the payload of the event is adhering to the CloudEvents specification.

POST /order HTTP/1.1
Host: webhook.example.com
Content-Type: application/cloudevents+json; charset=utf-8

{
    "specversion": "1.0",
    "id": "bba4379f-b764-4d90-9fb2-9f572b2b0b61",
    "source": "myapp.orders-service",
    "type": "OrderPlaced",      
    "data": {
      "order_id": "c172a984-3ae5-43dc-8c3f-be080141845a",
      "customer_id": "dda98122-b511-4aaf-9465-77ca4a115ee6",
      "order_total": "120.00"
    },
    "time": "2024-01-01T17:31:00Z",
    "dataschema": "https://us-west-2.console.aws.amazon.com/events/home?region=us-west-2#/registries/discovered-schemas/schemas/myapp.orders-service%40OrderPlaced",
    "correlationid": "dddd9340-135a-c8c6-95c2-41fb8f492222",
    "domain":"ORDERS"
}
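On the receiving side, a consumer can use the content type to recognize a structured-mode event and then check the required attributes. The following is a rough sketch with hypothetical function names (parse_structured and idempotency_key). It checks only the four required attributes and derives an idempotency key from source and id, as described earlier.

```python
import json

REQUIRED_ATTRIBUTES = ("specversion", "id", "source", "type")


def parse_structured(content_type, body):
    """Parse a structured-mode CloudEvent from an HTTP request body."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type != "application/cloudevents+json":
        raise ValueError("not a structured-mode CloudEvent: " + media_type)
    event = json.loads(body)
    missing = [name for name in REQUIRED_ATTRIBUTES if name not in event]
    if missing:
        raise ValueError("missing required attributes: " + ", ".join(missing))
    return event


def idempotency_key(event):
    """The combination of source and id uniquely identifies an event."""
    return (event["source"], event["id"])
```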

Conclusion

Carefully designing events plays an important role when building event-driven architectures to integrate producers and consumers effectively. The open-source CloudEvents specification helps developers to standardize integration processes, simplifying interactions between internal systems and external partners.

EventBridge allows you to use a flexible payload structure within an event’s detail property to standardize events. You can publish structured CloudEvents directly onto an event bus in the detail field and use payload transformations to allow downstream consumers to receive events in the CloudEvents format.

EventBridge simplifies integration with third-party systems using API destinations. Using the new custom content-type headers with input transformers to modify the event structure, you can send structured or binary CloudEvents to integrate with public APIs.

For more serverless learning resources, visit Serverless Land.

Gain insights from historical location data using Amazon Location Service and AWS analytics services

Post Syndicated from Alan Peaty original https://aws.amazon.com/blogs/big-data/gain-insights-from-historical-location-data-using-amazon-location-service-and-aws-analytics-services/

Many organizations around the world rely on the use of physical assets, such as vehicles, to deliver a service to their end-customers. By tracking these assets in real time and storing the results, asset owners can derive valuable insights on how their assets are being used to continuously deliver business improvements and plan for future changes. For example, a delivery company operating a fleet of vehicles may need to ascertain the impact from local policy changes outside of their control, such as the announced expansion of an Ultra-Low Emission Zone (ULEZ). By combining historical vehicle location data with information from other sources, the company can devise empirical approaches for better decision-making. For example, the company’s procurement team can use this information to make decisions about which vehicles to prioritize for replacement before policy changes go into effect.

Developers can use the support in Amazon Location Service for publishing device position updates to Amazon EventBridge to build a near-real-time data pipeline that stores locations of tracked assets in Amazon Simple Storage Service (Amazon S3). Additionally, you can use AWS Lambda to enrich incoming location data with data from other sources, such as an Amazon DynamoDB table containing vehicle maintenance details. Then a data analyst can use the geospatial querying capabilities of Amazon Athena to gain insights, such as the number of days their vehicles have operated in the proposed boundaries of an expanded ULEZ. Because vehicles that do not meet ULEZ emissions standards are subjected to a daily charge to operate within the zone, you can use the location data, along with maintenance data such as age of the vehicle, current mileage, and current emissions standards to estimate the amount the company would have to spend on daily fees.

This post shows how you can use Amazon Location, EventBridge, Lambda, Amazon Data Firehose, and Amazon S3 to build a location-aware data pipeline, and use this data to drive meaningful insights using AWS Glue and Athena.

Overview of solution

This is a fully serverless solution for location-based asset management. The solution consists of the following interfaces:

  • IoT or mobile application – A mobile application or an Internet of Things (IoT) device allows the tracking of a company vehicle while it is in use and transmits its current location securely to the data ingestion layer in AWS. The ingestion approach is not in scope of this post. Instead, a Lambda function in our solution simulates sample vehicle journeys and directly updates Amazon Location tracker objects with randomized locations.
  • Data analytics – Business analysts gather operational insights from multiple data sources, including the location data collected from the vehicles. Data analysts are looking for answers to questions such as, “How long did a given vehicle historically spend inside a proposed zone, and how much would the fees have cost had the policy been in place over the past 12 months?”

The following diagram illustrates the solution architecture.
Architecture diagram

The workflow consists of the following key steps:

  1. The tracking functionality of Amazon Location is used to track the vehicle. Using EventBridge integration, filtered positional updates are published to an EventBridge event bus. This solution uses distance-based filtering to reduce costs and jitter. Distance-based filtering ignores location updates in which devices have moved less than 30 meters (98.4 feet).
  2. Amazon Location device position events arrive on the EventBridge default bus with source: ["aws.geo"] and detail-type: ["Location Device Position Event"]. One rule is created to forward these events to two downstream targets: a Lambda function, and a Firehose delivery stream.
  3. Two different patterns, based on each target, are described in this post to demonstrate different approaches to committing the data to an S3 bucket:
    1. Lambda function – The first approach uses a Lambda function to demonstrate how you can use code in the data pipeline to directly transform the incoming location data. You can modify the Lambda function to fetch additional vehicle information from a separate data store (for example, a DynamoDB table or a Customer Relationship Management system) to enrich the data, before storing the results in an S3 bucket. In this model, the Lambda function is invoked for each incoming event.
    2. Firehose delivery stream – The second approach uses a Firehose delivery stream to buffer and batch the incoming positional updates, before storing them in an S3 bucket without modification. This method uses GZIP compression to optimize storage consumption and query performance. You can also use the data transformation feature of Data Firehose to invoke a Lambda function to perform data transformation in batches.
  4. AWS Glue crawls both S3 bucket paths, populates the AWS Glue database tables based on the inferred schemas, and makes the data available to other analytics applications through the AWS Glue Data Catalog.
  5. Athena is used to run geospatial queries on the location data stored in the S3 buckets. The Data Catalog provides metadata that allows analytics applications using Athena to find, read, and process the location data stored in Amazon S3.
  6. This solution includes a Lambda function that continuously updates the Amazon Location tracker with simulated location data from fictitious journeys. The Lambda function is triggered at regular intervals using a scheduled EventBridge rule.
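To get a feel for the 30-meter threshold mentioned in step 1, the following sketch estimates the distance between two positions with the haversine formula. It is an approximation for illustration only: how Amazon Location computes distance internally is not described here, and the function names and the Earth radius constant are assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (approximation)
MIN_DISTANCE_M = 30         # threshold quoted in step 1


def haversine_m(a, b):
    """Great-circle distance in meters between two [longitude, latitude] points."""
    lon1, lat1, lon2, lat2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def would_pass_filter(previous, current):
    """True if the device moved at least the minimum distance."""
    return haversine_m(previous, current) >= MIN_DISTANCE_M
```

A check like this can help when crafting manual test positions, because updates that are too close together do not produce an event.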

You can test this solution yourself using the AWS Samples GitHub repository. The repository contains the AWS Serverless Application Model (AWS SAM) template and Lambda code required to try out this solution. Refer to the instructions in the README file for steps on how to provision and decommission this solution.

Visual layouts in some screenshots in this post may look different than those on your AWS Management Console.

Data generation

In this section, we discuss the steps to manually or automatically generate journey data.

Manually generate journey data

You can manually update device positions using the AWS Command Line Interface (AWS CLI) command aws location batch-update-device-position. Replace the tracker-name, device-id, Position, and SampleTime values with your own, and make sure that successive updates are more than 30 meters apart so that an event is placed on the default EventBridge event bus:

aws location batch-update-device-position --tracker-name <tracker-name> --updates "[{\"DeviceId\": \"<device-id>\", \"Position\": [<longitude>, <latitude>], \"SampleTime\": \"<YYYY-MM-DDThh:mm:ssZ>\"}]"
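If you prefer Python over the AWS CLI, the same update can be sent with the boto3 Amazon Location client. This is a sketch with hypothetical helper names (build_update and send_updates). The send_updates helper assumes boto3 is installed and credentials are configured.

```python
from datetime import datetime, timezone


def build_update(device_id, longitude, latitude, sample_time=None):
    """Build one entry for the Updates list of BatchUpdateDevicePosition."""
    return {
        "DeviceId": device_id,
        "Position": [longitude, latitude],  # longitude first, then latitude
        "SampleTime": sample_time or datetime.now(timezone.utc),
    }


def send_updates(tracker_name, updates):
    """Send the updates to the tracker (hypothetical helper, needs credentials)."""
    import boto3  # imported lazily so build_update works without boto3
    client = boto3.client("location")
    return client.batch_update_device_position(
        TrackerName=tracker_name, Updates=updates
    )
```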

Automatically generate journey data using the simulator

The provided AWS CloudFormation template deploys an EventBridge scheduled rule and an accompanying Lambda function that simulates tracker updates from vehicles. This rule is enabled by default, and runs at a frequency specified by the SimulationIntervalMinutes CloudFormation parameter. The data generation Lambda function updates the Amazon Location tracker with a randomized position offset from the vehicles’ base locations.

Vehicle names and base locations are stored in the vehicles.json file. A vehicle’s starting position is reset each day, and base locations have been chosen to give them the ability to drift in and out of the ULEZ on a given day to provide a realistic journey simulation.

You can disable the rule temporarily by navigating to the scheduled rule details on the EventBridge console. Alternatively, change the parameter State: ENABLED to State: DISABLED for the scheduled rule resource GenerateDevicePositionsScheduleRule in the template.yml file. Rebuild and re-deploy the AWS SAM template for this change to take effect.

Location data pipeline approaches

The configurations outlined in this section are deployed automatically by the provided AWS SAM template. The information in this section is provided to describe the pertinent parts of the solution.

Amazon Location device position events

Amazon Location sends device position update events to EventBridge in the following format:

{
    "version":"0",
    "id":"<event-id>",
    "detail-type":"Location Device Position Event",
    "source":"aws.geo",
    "account":"<account-number>",
    "time":"<YYYY-MM-DDThh:mm:ssZ>",
    "region":"<region>",
    "resources":[
        "arn:aws:geo:<region>:<account-number>:tracker/<tracker-name>"
    ],
    "detail":{
        "EventType":"UPDATE",
        "TrackerName":"<tracker-name>",
        "DeviceId":"<device-id>",
        "SampleTime":"<YYYY-MM-DDThh:mm:ssZ>",
        "ReceivedTime":"<YYYY-MM-DDThh:mm:ss.sssZ>",
        "Position":[
            <longitude>, 
            <latitude>
        ]
    }
}

You can optionally specify an input transformation to modify the format and contents of the device position event data before it reaches the target.

Data enrichment using Lambda

Data enrichment in this pattern is facilitated through the invocation of a Lambda function. In this example, we call this function ProcessDevicePosition, and use a Python runtime. A custom transformation is applied in the EventBridge target definition to receive the event data in the following format:

{
    "EventType":<EventType>,
    "TrackerName":<TrackerName>,
    "DeviceId":<DeviceId>,
    "SampleTime":<SampleTime>,
    "ReceivedTime":<ReceivedTime>,
    "Position":[<Longitude>,<Latitude>]
}

You could apply additional transformations, such as the refactoring of Latitude and Longitude data into separate key-value pairs if this is required by the downstream business logic processing the events.
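As one way to do this inside the function rather than in the input transformer, the following sketch splits the Position array into separate keys. The function name split_position and the output key names Longitude and Latitude are assumptions for illustration.

```python
def split_position(event):
    """Replace the Position array with separate Longitude and Latitude keys."""
    longitude, latitude = event["Position"]
    flattened = {k: v for k, v in event.items() if k != "Position"}
    flattened["Longitude"] = longitude
    flattened["Latitude"] = latitude
    return flattened
```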

The following code demonstrates the Python application logic that is run by the ProcessDevicePosition Lambda function. Error handling has been skipped in this code snippet for brevity. The full code is available in the GitHub repo.

import json
import os
import uuid
import boto3

# Import environment variables from Lambda function.
bucket_name = os.environ["S3_BUCKET_NAME"]
bucket_prefix = os.environ["S3_BUCKET_LAMBDA_PREFIX"]

s3 = boto3.client("s3")

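def lambda_handler(event, context):
    # Build the object key as <prefix>/<DeviceId>/<SampleTime>-<uuid>.json so that
    # objects are grouped by device and never overwrite one another.
    key = "%s/%s/%s-%s.json" % (bucket_prefix,
                                event["DeviceId"],
                                event["SampleTime"],
                                str(uuid.uuid4()))
    # Serialize the event as compact JSON and store it in the bucket.
    body = json.dumps(event, separators=(",", ":"))
    body_encoded = body.encode("utf-8")
    s3.put_object(Bucket=bucket_name, Key=key, Body=body_encoded)
    return {
        "statusCode": 200,
        "body": "success"
    }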

The preceding code creates an S3 object for each device position event received by EventBridge. The code uses the DeviceId as a prefix to write the objects to the bucket.

You can add additional logic to the preceding Lambda function code to enrich the event data using other sources. The example in the GitHub repo demonstrates enriching the event with data from a DynamoDB vehicle maintenance table.
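The enrichment step could look roughly like the following sketch. It is not the code from the repository: the function names are hypothetical, the table is assumed to use DeviceId as its partition key, and the field names are the maintenance attributes listed later in this post.

```python
ENRICHMENT_FIELDS = ("MeetsEmissionStandards", "Mileage", "PurchaseDate")


def enrich(event, maintenance_item):
    """Return a copy of the event with any available maintenance fields added."""
    enriched = dict(event)
    for field in ENRICHMENT_FIELDS:
        if maintenance_item and field in maintenance_item:
            enriched[field] = maintenance_item[field]
    return enriched


def fetch_maintenance_item(table_name, device_id):
    """Look up a vehicle record (hypothetical helper, needs credentials)."""
    import boto3  # imported lazily so enrich works without boto3
    table = boto3.resource("dynamodb").Table(table_name)
    return table.get_item(Key={"DeviceId": device_id}).get("Item")
```

Note that the boto3 resource interface returns DynamoDB numbers as Decimal values, so they need converting (for example to int or float) before the event is passed to json.dumps.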

In addition to the prerequisite AWS Identity and Access Management (IAM) permissions provided by the AWSLambdaBasicExecutionRole managed policy, the ProcessDevicePosition function requires permission to perform the s3:PutObject action and any other actions required by the data enrichment logic. IAM permissions required by the solution are documented in the template.yml file.

{
    "Version":"2012-10-17",
    "Statement":[
        {
            "Action":[
                "s3:ListBucket"
            ],
            "Resource":[
                "arn:aws:s3:::<S3_BUCKET_NAME>"
            ],
            "Effect":"Allow"
        },
        {
            "Action":[
                "s3:PutObject"
            ],
            "Resource":[
                "arn:aws:s3:::<S3_BUCKET_NAME>/<S3_BUCKET_LAMBDA_PREFIX>/*"
            ],
            "Effect":"Allow"
        }
    ]
}

Data pipeline using Amazon Data Firehose

Complete the following steps to create your Firehose delivery stream:

  1. On the Amazon Data Firehose console, choose Firehose streams in the navigation pane.
  2. Choose Create Firehose stream.
  3. For Source, choose Direct PUT.
  4. For Destination, choose Amazon S3.
  5. For Firehose stream name, enter a name (for this post, ProcessDevicePositionFirehose).
    Create Firehose stream
  6. Configure the destination settings with details about the S3 bucket in which the location data is stored, along with the partitioning strategy:
    1. Use <S3_BUCKET_NAME> and <S3_BUCKET_FIREHOSE_PREFIX> to determine the bucket and object prefixes.
    2. Use DeviceId as an additional prefix to write the objects to the bucket.
  7. Enable Dynamic partitioning and New line delimiter to make sure partitioning is automatic based on DeviceId, and that new line delimiters are added between records in objects that are delivered to Amazon S3.

These are required by AWS Glue to later crawl the data, and for Athena to recognize individual records.
Destination settings for Firehose stream
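To illustrate why the newline delimiter matters, the following sketch reads one delivered object as GZIP-compressed, newline-delimited JSON. The function name read_records is hypothetical, and the object is assumed to have already been downloaded as bytes.

```python
import gzip
import json


def read_records(object_bytes):
    """Decode a GZIP-compressed, newline-delimited JSON object into records."""
    text = gzip.decompress(object_bytes).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line.strip()]
```

Without a delimiter between records, the concatenated JSON documents could not be separated by a simple line split like this one.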

Create an EventBridge rule and attach targets

The EventBridge rule ProcessDevicePosition defines two targets: the ProcessDevicePosition Lambda function, and the ProcessDevicePositionFirehose delivery stream. Complete the following steps to create the rule and attach targets:

  1. On the EventBridge console, create a new rule.
  2. For Name, enter a name (for this post, ProcessDevicePosition).
  3. For Event bus, choose default.
  4. For Rule type, select Rule with an event pattern.
    EventBridge rule detail
  5. For Event source, select AWS events or EventBridge partner events.
    EventBridge event source
  6. For Method, select Use pattern form.
  7. In the Event pattern section, specify AWS services as the source, Amazon Location Service as the specific service, and Location Device Position Event as the event type.
    EventBridge creation method
  8. For Target 1, attach the ProcessDevicePosition Lambda function as a target.
    EventBridge target 1
  9. We use Input transformer to customize the event that is committed to the S3 bucket.
    EventBridge target 1 transformer
  10. Configure Input paths map and Input template to organize the payload into the desired format.
    1. The following code is the input paths map:
      {
          "EventType": "$.detail.EventType",
          "TrackerName": "$.detail.TrackerName",
          "DeviceId": "$.detail.DeviceId",
          "SampleTime": "$.detail.SampleTime",
          "ReceivedTime": "$.detail.ReceivedTime",
          "Longitude": "$.detail.Position[0]",
          "Latitude": "$.detail.Position[1]"
      }

    2. The following code is the input template:
      {
          "EventType":<EventType>,
          "TrackerName":<TrackerName>,
          "DeviceId":<DeviceId>,
          "SampleTime":<SampleTime>,
          "ReceivedTime":<ReceivedTime>,
          "Position":[<Longitude>, <Latitude>]
      }

  11. For Target 2, choose the ProcessDevicePositionFirehose delivery stream as a target.
    EventBridge target 2

This target requires an IAM role that allows one or multiple records to be written to the Firehose delivery stream:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "firehose:PutRecord",
                "firehose:PutRecordBatch"
            ],
            "Resource": [
                "arn:aws:firehose:<region>:<account-id>:deliverystream/<delivery-stream-name>"
            ],
            "Effect": "Allow"
        }
    ]
}
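The console steps above can also be sketched with boto3. The following is an illustration under assumptions, not the template from the repository: the target Id values and function names are hypothetical, and the ARNs are supplied by the caller. The event pattern and the input transformer mirror the values used in this post.

```python
import json

RULE_NAME = "ProcessDevicePosition"
EVENT_PATTERN = {
    "source": ["aws.geo"],
    "detail-type": ["Location Device Position Event"],
}
INPUT_PATHS_MAP = {
    "EventType": "$.detail.EventType",
    "TrackerName": "$.detail.TrackerName",
    "DeviceId": "$.detail.DeviceId",
    "SampleTime": "$.detail.SampleTime",
    "ReceivedTime": "$.detail.ReceivedTime",
    "Longitude": "$.detail.Position[0]",
    "Latitude": "$.detail.Position[1]",
}
INPUT_TEMPLATE = (
    '{"EventType":<EventType>,"TrackerName":<TrackerName>,'
    '"DeviceId":<DeviceId>,"SampleTime":<SampleTime>,'
    '"ReceivedTime":<ReceivedTime>,"Position":[<Longitude>, <Latitude>]}'
)


def build_targets(function_arn, stream_arn, role_arn):
    """Return the Lambda target (with transformer) and the Firehose target."""
    return [
        {
            "Id": "lambda-target",  # hypothetical target Id
            "Arn": function_arn,
            "InputTransformer": {
                "InputPathsMap": INPUT_PATHS_MAP,
                "InputTemplate": INPUT_TEMPLATE,
            },
        },
        # The Firehose target needs a role that EventBridge can assume.
        {"Id": "firehose-target", "Arn": stream_arn, "RoleArn": role_arn},
    ]


def create_rule(function_arn, stream_arn, role_arn):
    """Create the rule and attach both targets (needs credentials)."""
    import boto3  # imported lazily so the builders work without boto3
    events = boto3.client("events")
    events.put_rule(Name=RULE_NAME, EventPattern=json.dumps(EVENT_PATTERN),
                    State="ENABLED")
    return events.put_targets(
        Rule=RULE_NAME, Targets=build_targets(function_arn, stream_arn, role_arn)
    )
```

When the rule is created outside the console, the Lambda function also needs a resource-based policy statement that allows EventBridge to invoke it.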

Crawl and catalog the data using AWS Glue

After sufficient data has been generated, complete the following steps:

  1. On the AWS Glue console, choose Crawlers in the navigation pane.
  2. Select the crawlers that have been created, location-analytics-glue-crawler-lambda and location-analytics-glue-crawler-firehose.
  3. Choose Run.

The crawlers will automatically classify the data into JSON format, group the records into tables and partitions, and commit associated metadata to the AWS Glue Data Catalog.
Crawlers

  4. When the Last run statuses of both crawlers show as Succeeded, confirm that two tables (lambda and firehose) have been created on the Tables page.

The solution partitions the incoming location data based on the deviceid field. Therefore, as long as there are no new devices or schema changes, the crawlers don’t need to run again. However, if new devices are added, or a different field is used for partitioning, the crawlers need to run again.
Tables

You’re now ready to query the tables using Athena.

Query the data using Athena

Athena is a serverless, interactive analytics service built to analyze unstructured, semi-structured, and structured data where it is hosted. If this is your first time using the Athena console, follow the instructions to set up a query result location in Amazon S3. To query the data with Athena, complete the following steps:

  1. On the Athena console, open the query editor.
  2. For Data source, choose AwsDataCatalog.
  3. For Database, choose location-analytics-glue-database.
  4. On the options menu (three vertical dots), choose Preview Table to query the content of both tables.
    Preview table

The query displays 10 sample positional records currently stored in the table. The following screenshot is an example from previewing the firehose table. The firehose table stores raw, unmodified data from the Amazon Location tracker.
Query results
You can now experiment with geospatial queries. The GeoJSON file for the 2021 London ULEZ expansion is part of the repository, and has already been converted into a query compatible with both Athena tables.

  1. Copy and paste the content from the 1-firehose-athena-ulez-2021-create-view.sql file found in the examples/firehose folder into the query editor.

This query uses the ST_Within geospatial function to determine if a recorded position is inside or outside the ULEZ defined by the polygon. A new view called ulezvehicleanalysis_firehose is created with a new column, insidezone, which captures whether the recorded position exists within the zone.

A simple Python utility is provided, which converts the polygon features found in the downloaded GeoJSON file into ST_Polygon strings based on the well-known text format that can be used directly in an Athena query.
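The conversion itself is straightforward. The following is a minimal sketch of the idea and not the utility from the repository. It assumes a GeoJSON geometry of type Polygon, and the function name polygon_to_wkt is hypothetical.

```python
def polygon_to_wkt(geometry):
    """Convert a GeoJSON Polygon geometry to a well-known text POLYGON string."""
    if geometry["type"] != "Polygon":
        raise ValueError("expected a GeoJSON Polygon")
    rings = []
    for ring in geometry["coordinates"]:
        # GeoJSON positions are [longitude, latitude, optional altitude].
        points = ", ".join("%s %s" % (pt[0], pt[1]) for pt in ring)
        rings.append("(" + points + ")")
    return "POLYGON (" + ", ".join(rings) + ")"
```

The resulting string can then be embedded in a query as the argument to a geospatial constructor such as ST_Polygon.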

  1. Choose Preview View on the ulezvehicleanalysis_firehose view to explore its content.
    Preview view

You can now run queries against this view to gain overarching insights.

  1. Copy and paste the content from the 2-firehose-athena-ulez-2021-query-days-in-zone.sql file found in the examples/firehose folder into the query editor.

This query calculates the total number of days on which each vehicle has entered the ULEZ, and what the expected total charges would be. The query has been parameterized using the ? placeholder character. Parameterized queries allow you to rerun the same query with different parameter values.
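If you prefer to run a parameterized query programmatically rather than in the console, the Athena StartQueryExecution API accepts the parameter values separately from the query text. The following is a minimal sketch, assuming a boto3 Athena client is passed in. The query string, the deviceid column, and the S3 output location are hypothetical placeholders, not the contents of the SQL file in the repository.

```python
from typing import Any, Dict

# Hypothetical query with one ? placeholder; the repository's SQL file differs.
QUERY = "SELECT deviceid, ? AS daily_fee FROM ulezvehicleanalysis_firehose LIMIT 10"


def build_request(daily_fee: float, output_location: str) -> Dict[str, Any]:
    """Build the keyword arguments for Athena's start_query_execution call."""
    return {
        "QueryString": QUERY,
        "QueryExecutionContext": {"Database": "location-analytics-glue-database"},
        "ResultConfiguration": {"OutputLocation": output_location},
        # Values are passed as strings and bound to the ? placeholders in order.
        "ExecutionParameters": [str(daily_fee)],
    }


def run_query(athena_client: Any, daily_fee: float, output_location: str) -> str:
    """Start the query and return its execution ID.

    athena_client is expected to be created with boto3.client("athena").
    """
    request = build_request(daily_fee, output_location)
    response = athena_client.start_query_execution(**request)
    return response["QueryExecutionId"]


# Placeholder bucket name; replace it with your own query result location.
request = build_request(12.5, "s3://amzn-s3-demo-bucket/athena-results/")
print(request["ExecutionParameters"])
```

Rerunning with a different fee only changes the ExecutionParameters list, while the query text stays the same.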

  1. Enter the daily fee amount for Parameter 1, then run the query.
    Query editor

The results display each vehicle, the total number of days spent in the proposed ULEZ, and the total charges based on the daily fee you entered.
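The aggregation behind these results can be sketched in a few lines of Python. This is an illustration rather than a translation of the SQL file: it assumes that a vehicle is charged once for each calendar day on which at least one of its recorded positions is inside the zone, and the sample records and fee are invented.

```python
from collections import defaultdict
from datetime import date, datetime
from typing import Dict, Iterable, Set, Tuple

# (device ID, sample time in ISO 8601 format, whether the position is inside the zone)
Record = Tuple[str, str, bool]


def days_and_charges(
    records: Iterable[Record], daily_fee: float
) -> Dict[str, Tuple[int, float]]:
    """Return {device ID: (distinct days in zone, total charge)}."""
    days_in_zone: Dict[str, Set[date]] = defaultdict(set)
    for device_id, sample_time, inside_zone in records:
        if inside_zone:
            # Several positions on the same day count as a single chargeable day.
            days_in_zone[device_id].add(datetime.fromisoformat(sample_time).date())
    return {
        device_id: (len(days), len(days) * daily_fee)
        for device_id, days in days_in_zone.items()
    }


# Invented sample data for illustration only.
sample_records = [
    ("vehicle1", "2024-05-01T08:15:00", True),
    ("vehicle1", "2024-05-01T17:40:00", True),
    ("vehicle1", "2024-05-02T09:00:00", True),
    ("vehicle2", "2024-05-01T10:00:00", False),
    ("vehicle2", "2024-05-03T11:30:00", True),
]

for device_id, (days, charge) in days_and_charges(sample_records, 12.5).items():
    print(device_id, days, charge)
```

Vehicles with no positions inside the zone do not appear in the result at all.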
Query results

You can repeat this exercise using the lambda table. Data in the lambda table is augmented with additional vehicle details present in the vehicle maintenance DynamoDB table at the time it is processed by the Lambda function. The solution supports the following fields:

  • MeetsEmissionStandards (Boolean)
  • Mileage (Number)
  • PurchaseDate (String, in YYYY-MM-DD format)

You can also enrich the new data as it arrives.

  1. On the DynamoDB console, find the vehicle maintenance table under Tables. The table name is provided as output VehicleMaintenanceDynamoTable in the deployed CloudFormation stack.
  2. Choose Explore table items to view the content of the table.
  3. Choose Create item to create a new record for a vehicle.
    Create item
  4. Enter DeviceId (such as vehicle1 as a String), PurchaseDate (such as 2005-10-01 as a String), Mileage (such as 10000 as a Number), and MeetsEmissionStandards (such as False as a Boolean).
  5. Choose Create item to create the record.
    Create item
  6. Duplicate the newly created record with additional entries for other vehicles (such as for vehicle2 or vehicle3), modifying the values of the attributes slightly each time.
  7. After new data has been generated, rerun the location-analytics-glue-crawler-lambda AWS Glue crawler so that the schema update with the new fields is registered.
  8. Copy and paste the content from the 1-lambda-athena-ulez-2021-create-view.sql file found in the examples/lambda folder into the query editor.
  9. Preview the ulezvehicleanalysis_lambda view to confirm that the new columns have been created.
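Steps 3 to 6 above can also be scripted instead of using the console. The following is a minimal sketch that builds the same record in the attribute-value format used by the DynamoDB PutItem API. The helper names are assumptions made for this example, the client is expected to come from boto3, and the table name must be taken from the VehicleMaintenanceDynamoTable stack output.

```python
from datetime import datetime
from typing import Any, Dict


def build_vehicle_item(
    device_id: str, purchase_date: str, mileage: int, meets_emission_standards: bool
) -> Dict[str, Dict[str, Any]]:
    """Build a vehicle record in DynamoDB's low-level attribute-value format."""
    # Raises ValueError if the value does not parse as a year-month-day date.
    datetime.strptime(purchase_date, "%Y-%m-%d")
    return {
        "DeviceId": {"S": device_id},
        "PurchaseDate": {"S": purchase_date},
        "Mileage": {"N": str(mileage)},  # numbers are sent as strings
        "MeetsEmissionStandards": {"BOOL": meets_emission_standards},
    }


def put_vehicle(dynamodb_client: Any, table_name: str, item: Dict[str, Any]) -> None:
    """Write the record; dynamodb_client is expected to be boto3.client("dynamodb")."""
    dynamodb_client.put_item(TableName=table_name, Item=item)


item = build_vehicle_item("vehicle1", "2005-10-01", 10000, False)
print(item)
```

Calling build_vehicle_item again with values for vehicle2 or vehicle3 mirrors the duplication in step 6.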

If errors such as Column 'mileage' cannot be resolved are displayed, either the data enrichment is not taking place or the AWS Glue crawler has not yet detected updates to the schema.

If the Preview Table option returns only results from before you created records in the DynamoDB table, return the query results in descending order of sampletime (for example, order by sampletime desc limit 100;).

Query results

Now we focus on the vehicles that don’t currently meet emission standards, and order them in descending order of mileage per year (calculated as the latest mileage divided by the age of the vehicle in years).

  1. Copy and paste the content from the 2-lambda-athena-ulez-2021-query-days-in-zone.sql file found in the examples/lambda folder into the query editor.
    Query results

In this example, we can see that out of our fleet of vehicles, five have been reported as not meeting emission standards. We can also see the vehicles that have accumulated high mileage per year, and the number of days spent in the proposed ULEZ. The fleet operator may now decide to prioritize these vehicles for replacement. Because location data is enriched with the most up-to-date vehicle maintenance data at the time it is ingested, you can further evolve these queries to run over a defined time window. For example, you could factor in mileage changes within the past year.
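The mileage-per-year ranking described above can be expressed as a short worked example. This sketch is not taken from the SQL file: it assumes the vehicle's age is measured from PurchaseDate to a fixed reference date using 365.25 days per year, and the three vehicles are invented.

```python
from datetime import date
from typing import List, Tuple

# (device ID, latest mileage, purchase date as YYYY-MM-DD, meets emission standards)
Vehicle = Tuple[str, float, str, bool]


def mileage_per_year(mileage: float, purchase_date: str, today: date) -> float:
    """Latest mileage divided by the vehicle's age in years."""
    age_years = (today - date.fromisoformat(purchase_date)).days / 365.25
    if age_years <= 0:
        raise ValueError("purchase date must be before the reference date")
    return mileage / age_years


def rank_non_compliant(vehicles: List[Vehicle], today: date) -> List[Tuple[str, float]]:
    """Keep vehicles that fail emission standards, highest mileage per year first."""
    rated = [
        (device_id, mileage_per_year(mileage, purchased, today))
        for device_id, mileage, purchased, meets_standards in vehicles
        if not meets_standards
    ]
    return sorted(rated, key=lambda pair: pair[1], reverse=True)


# Invented fleet for illustration only.
fleet = [
    ("vehicle1", 10000, "2005-10-01", False),
    ("vehicle2", 60000, "2015-10-01", False),
    ("vehicle3", 30000, "2020-10-01", True),
]

for device_id, rate in rank_non_compliant(fleet, date(2025, 10, 1)):
    print(device_id, round(rate))
```

Restricting the input to records from a chosen time window is one way to evolve the calculation toward mileage changes within the past year.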

Because the data enrichment is dynamic, updates to records in the DynamoDB vehicle maintenance table are reflected in any new data committed to Amazon S3 and, in turn, in the query results.

Clean up

Refer to the instructions in the README file to clean up the resources provisioned for this solution.

Conclusion

This post demonstrated how you can use Amazon Location, EventBridge, Lambda, Amazon Data Firehose, and Amazon S3 to build a location-aware data pipeline, and use the collected device position data to drive analytical insights using AWS Glue and Athena. By tracking these assets in real time and storing the results, companies can derive valuable insights on how effectively their fleets are being utilized and better react to changes in the future. You can now explore extending this sample code with your own device tracking data and analytics requirements.


About the Authors

Alan Peaty is a Senior Partner Solutions Architect at AWS. Alan helps Global Systems Integrators (GSIs) and Global Independent Software Vendors (GISVs) solve complex customer challenges using AWS services. Prior to joining AWS, Alan worked as an architect at systems integrators to translate business requirements into technical solutions. Outside of work, Alan is an IoT enthusiast and a keen runner who loves to hit the muddy trails of the English countryside.

Parag Srivastava is a Solutions Architect at AWS, helping enterprise customers with successful cloud adoption and migration. During his professional career, he has been extensively involved in complex digital transformation projects. He is also passionate about building innovative solutions around geospatial aspects of addresses.