Tag Archives: Amazon DynamoDB

Build priority-based message processing with Amazon MQ and AWS App Runner

Post Syndicated from Aritra Nag original https://aws.amazon.com/blogs/architecture/build-priority-based-message-processing-with-amazon-mq-and-aws-app-runner/

Organizations need message processing systems that can prioritize critical business operations while handling routine tasks efficiently. When handling time-sensitive tasks like rush orders from key customers, critical system alerts, or multi-step business processes, you need to prioritize urgent messages while making sure other routine requests are processed reliably.

In this post, we show you how to build a priority-based message processing system using Amazon MQ for priority queuing, Amazon DynamoDB for data persistence, and AWS App Runner for serverless compute. We demonstrate how to implement application-level delays that high-priority messages can bypass, create real-time UIs with WebSocket connections, and configure dual-layer retry mechanisms for maximum reliability.

This solution addresses three critical challenges in modern data processing systems:

  • Implementing configurable delay processing at the application level
  • Supporting priority-based message routing that respects business requirements
  • Providing real-time feedback to users through WebSocket connections

The use of AWS managed services reduces operational complexity, so teams can focus on business logic rather than infrastructure management. Message handling with priority-based processing makes sure operations receive attention while routine tasks are processed in the background. Users will experience status updates that provide visibility into their requests, while retry mechanisms provide reliability during failures. The infrastructure as code (IaC) approach supports deployments across different environments, from development through production.

Solution overview

The solution consists of several AWS managed services to create a serverless, priority-based message processing system with real-time user feedback. The architecture implements intelligent routing based on three message priority levels, to make sure critical messages receive immediate processing:

  • High-priority path – Messages bypass delays and queue immediately with JMS priority 9
  • Standard-priority path – Messages undergo configured delays before queuing with JMS priority 4
  • Low-priority path – Messages process after all higher priority messages with JMS priority 0

The following diagram illustrates this architecture.

The solution uses the following AWS managed services to deliver a scalable, serverless architecture:

  • AWS App Runner is a fully managed container application service that automatically builds, deploys, and scales containerized applications. It provides automatic scaling based on traffic, built-in load balancing and HTTPS, seamless integration with container registries, and zero infrastructure management overhead.
  • Amazon MQ is a managed message broker service for Apache ActiveMQ that offers priority-based message queuing, automatic failover for high availability, message persistence and durability, and JMS protocol support for enterprise applications.
  • Amazon DynamoDB is a fully managed NoSQL database service providing single-digit millisecond performance at any scale, automatic scaling with on-demand pricing, built-in security and backup capabilities, and global tables for multi-Region deployments.

The system uses JMS priority levels with High=9, Medium=4, and Low=0 for automatic ordering, combined with conditional delay processing based on priority classification. Amazon MQ provides reliable message delivery and persistence with dead-letter queue (DLQ) configuration for failed message handling.

Asynchronous delay processing uses CompletableFuture implementation for non-blocking delays, thread pool management for concurrent processing, graceful error handling with retry mechanisms, and configurable delay periods per message type to optimize resource utilization. For real-time status updates, the solution provides WebSocket connections for bidirectional communication, Amazon DynamoDB Streams for change data capture (CDC), comprehensive status tracking throughout the processing lifecycle, and a React frontend integration for live updates, so users have complete visibility into their message processing status.

The standard priority messaging flow (shown in the following diagram) handles messages with configurable delays using JMS asynchronous processing capabilities. Messages wait for their specified delay period before entering the Amazon MQ queue, where they’re processed.

The high-priority messaging flow (shown in the following diagram) provides an express lane for critical messages. These messages skip the delay mechanism entirely and proceed directly to the queue, providing immediate processing for time-sensitive operations.

To make it even more straightforward to get started, we’ve prepared an example application that you can use to observe the Amazon MQ behavior with varying message volumes. You can find the source code repository, IaC implementation, and instructions to run the sample on GitHub.

In the following sections, we walk you through deploying the complete processing system.

Prerequisites

Make sure you have the following tools, permissions, and knowledge to successfully deploy the priority-based message processing system. You must have an active AWS account with the following configurations:

# JSON
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
"apprunner:CreateService",
"apprunner:UpdateService",
"apprunner:DeleteService"
      ],
      "Resource": "arn:aws:apprunner:*:*:service/reactive-demo-*"
    },
    {
      "Effect": "Allow",
      "Action": [
"mq:SendMessage",
"mq:ReceiveMessage",
"mq:DeleteMessage"
      ],
      "Resource": "arn:aws:mq:*:*:broker/reactive-demo-broker/*"
    },
    {
      "Effect": "Allow",
      "Action": [
"dynamodb:PutItem",
"dynamodb:GetItem",
"dynamodb:UpdateItem",
"dynamodb:Query"
      ],
      "Resource": "arn:aws:dynamodb:*:*:table/reactive-items*"
    }
  ]
}

Install and configure the following development tools on your local machine:

To successfully implement this solution, you should have basic familiarity with the following:

  • Spring Boot applications
  • Message queue concepts
  • WebSocket protocols
  • React development

Configure the infrastructure stack

This step involves creating the core AWS services using the AWS Cloud Development Kit (AWS CDK). This modular approach enables independent stack management and environment-specific configurations.

  1. Create a new AWS CDK project:
# Bash
mkdir priority-processing && cd priority-processing
cdk init app --language python
pip install aws-cdk-lib constructs
  1. Create the infrastructure stack:
# Python
from aws_cdk import (
    Stack,
    aws_dynamodb as dynamodb,
    aws_amazonmq as mq,
    aws_kms as kms,
    Duration,
    RemovalPolicy,
    CfnOutput
)
from constructs import Construct

class MessageProcessingStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
super().__init__(scope, construct_id, **kwargs)

# Create KMS key for encryption
self.kms_key = kms.Key(
    self, "ProcessingKey",
    description="Key for message processing encryption",
    enable_key_rotation=True
)

# DynamoDB table with comprehensive configuration
self.items_table = dynamodb.Table(
    self, "ItemsTable",
    table_name="reactive-items",
    partition_key=dynamodb.Attribute(
name="id",
type=dynamodb.AttributeType.STRING
    ),
    stream=dynamodb.StreamViewType.NEW_AND_OLD_IMAGES,
    billing_mode=dynamodb.BillingMode.ON_DEMAND,
    encryption=dynamodb.TableEncryption.CUSTOMER_MANAGED,
    encryption_key=self.kms_key,
    point_in_time_recovery=True,
    removal_policy=RemovalPolicy.DESTROY
)

# Add Global Secondary Index for status queries
self.items_table.add_global_secondary_index(
    index_name="StatusIndex",
    partition_key=dynamodb.Attribute(
name="status",
type=dynamodb.AttributeType.STRING
    ),
    sort_key=dynamodb.Attribute(
name="createdAt",
type=dynamodb.AttributeType.STRING
    )
)

# Amazon MQ broker configuration
self.mq_broker = mq.CfnBroker(
    self, "MessageBroker",
    broker_name="reactive-demo-broker",
    engine_type="ACTIVEMQ",
    engine_version="5.18",
    host_instance_type="mq.t3.micro",
    deployment_mode="SINGLE_INSTANCE",
    publicly_accessible=False,
    logs=mq.CfnBroker.LogListProperty(
audit=True,
general=True
    ),
    encryption_options=mq.CfnBroker.EncryptionOptionsProperty(
use_aws_owned_key=False,
kms_key_id=self.kms_key.key_id
    ),
    users=[mq.CfnBroker.UserProperty(
username="admin",
password="SecurePassword123!",
console_access=True
    )]
)

# Output values for application configuration
CfnOutput(self, "TableName", 
    value=self.items_table.table_name,
    description="DynamoDB table name")
CfnOutput(self, "MQBrokerEndpoint",
    value=self.mq_broker.attr_amqp_endpoints[0],
    description="Amazon MQ broker endpoint")
  1. Run the following commands to deploy the stack:
# Bash
cdk bootstrap
cdk deploy MessageProcessingStack

You can verify the infrastructure on the AWS Management Console.

Configure the message processing application

In this step, we create the Spring Boot application with priority-based message processing capabilities. First, we configure the application.properties file to incorporate environment variables, including AWS credentials, AWS Regions, and other configuration parameters such as log levels into the application and business logic implementation. Next, we implement the message service using a JMS template with comprehensive error handling, followed by enhancing the JMS configuration with connection pooling for improved performance.

The following code illustrates an example message service implementation:

// Example message service implementation
@Service
public class MessageService {
    @Autowired
    private JmsTemplate jmsTemplate;
    
    public void sendPriorityMessage(Message message) {
jmsTemplate.send(session -> {
    Message jmsMessage = session.createTextMessage(message.getContent());
    jmsMessage.setJMSPriority(message.getPriority());
    return jmsMessage;
});
    }
}

For proper timestamp update implementation, we integrate the DynamoDB SDK service with caching capabilities. Finally, after implementing the REST controller for the API with asynchronous processing support, we can deploy the message processing application. This implementation includes Java code application-level delay processing for demonstration purposes. Although this approach effectively showcases the priority-based message routing capabilities and real-time WebSocket updates in our demo environment, AWS recommends using Amazon MQ delay processing features for production workloads. For production implementations, use Amazon MQ delay and scheduling capabilities instead of application-level delays through features like Amazon MQ delay queues, ActiveMQ scheduling features, and appropriate message Time-to-Live (TTL) configurations.

The following code is an example snippet showcasing the Amazon MQ feature:

// Create connection factory with Amazon MQ endpoint
ActiveMQConnectionFactory factory = new ActiveMQConnectionFactory(brokerUrl);
factory.setUserName("admin");
factory.setPassword("your-password");
try (Connection connection = factory.createConnection();
     Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE)) {
    
    // Create destination and producer
    Destination destination = session.createQueue(queueName);
    MessageProducer producer = session.createProducer(destination);
    
    // Create message
    TextMessage message = session.createTextMessage(messageContent);
    
    // Set native delay using ActiveMQ scheduled delivery
    message.setLongProperty(ScheduledMessage.AMQ_SCHEDULED_DELAY, delayMillis);
    
    // Optionally set priority for delayed message
    message.setJMSPriority(4);
    
    // Send the message - it will be delivered after the specified delay
    producer.send(message);
}

Build and deploy the Spring Boot application to App Runner

In this step, we push the application to Amazon Elastic Container Registry (Amazon ECR) to run it in App Runner:

  1. Build and push the Docker image to Amazon ECR:
# Bash

# Build the Docker image
docker build -t reactive-demo .

# Create ECR repository
aws ecr create-repository --repository-name reactive-demo --region us-east-1

# Get login token and login to ECR
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin $ECR_URI

# Tag and push image
ECR_URI=$(aws ecr describe-repositories --repository-names reactive-demo --query 'repositories[0].repositoryUri' --output text)
docker tag reactive-demo:latest $ECR_URI:latest
docker push $ECR_URI:latest
  1. Create the App Runner service with environment variables for the DynamoDB table and Amazon MQ broker endpoint:
# Python

from aws_cdk import (
    aws_apprunner as apprunner,
    aws_iam as iam
)

class AppRunnerStack(Stack):
    def __init__(self, scope: Construct, id: str, 
 table_name: str, mq_endpoint: str, **kwargs):
super().__init__(scope, id, **kwargs)

# Create IAM role for App Runner
app_runner_role = iam.Role(
    self, "AppRunnerRole",
    assumed_by=iam.ServicePrincipal("tasks.apprunner.amazonaws.com"),
    managed_policies=[
iam.ManagedPolicy.from_aws_managed_policy_name(
    "AmazonDynamoDBFullAccess"
),
iam.ManagedPolicy.from_aws_managed_policy_name(
    "AmazonMQFullAccess"
)
    ]
)

# Create App Runner service
self.service = apprunner.CfnService(
    self, "ReactiveProcessingService",
    service_name="reactive-processing-service",
    source_configuration=apprunner.CfnService.SourceConfigurationProperty(
authentication_configuration=apprunner.CfnService.AuthenticationConfigurationProperty(
    access_role_arn=app_runner_role.role_arn
),
image_repository=apprunner.CfnService.ImageRepositoryProperty(
    image_identifier=f"{ECR_URI}:latest",
    image_configuration=apprunner.CfnService.ImageConfigurationProperty(
port="8080",
runtime_environment_variables=[
    {"name": "DYNAMODB_TABLE_NAME", "value": table_name},
    {"name": "MQ_BROKER_URL", "value": mq_endpoint}
]
    ),
    image_repository_type="ECR"
)
    ),
    health_check_configuration=apprunner.CfnService.HealthCheckConfigurationProperty(
path="/actuator/health",
protocol="HTTP",
interval=10,
timeout=5,
healthy_threshold=1,
unhealthy_threshold=5
    ),
    instance_configuration=apprunner.CfnService.InstanceConfigurationProperty(
cpu="0.5 vCPU",
memory="1 GB"
    )
)

Set up real-time updates

For this step, we implement WebSocket support for real-time status updates using AWS Lambda to process DynamoDB streams and send updates to connected clients using Amazon API Gateway WebSocket connections. You can find the code snippet for this in this link

Deploy the React application to Amazon S3 and Amazon CloudFront

In this step, we create a frontend application to enable the WebSocket connection for seeing the messaging getting updated in the DynamoDB and API Gateway WebSocket connections.

Similar to the above section, here is the AWS cdk code for building the frontend for proceeding towards the validation of the solution

Validate the solution

This section provides comprehensive testing procedures to validate the priority-based message processing system.

Automated testing script

After you have completed the preceding steps, you can initiate a comprehensive testing script to validate priority processing and delay behavior:

# Bash
#!/bin/bash
curl -X POST "$API_URL/api/items" \
  -H "Content-Type: application/json" \
  -d '{
    "title": "High Priority Task",
    "priority": "High",
    "delay": 10
  }'

Validation through the web interface

The following screenshot of the UI illustrates how the queueing mechanism can work with the real-time updates using WebSockets.

The web interface provides validation of the priority-based message processing system. Access the Amazon CloudFront URL to view the following information:

  • Real-time message processing with live status updates
  • Queue statistics showing message distribution by priority
  • Processing timeline demonstrating priority bypass behavior
  • WebSocket connection status indicating real-time connectivity

Amazon CloudWatch dashboards and alarms

AWS recommends creating Amazon CloudWatch dashboards to track your priority-based message processing system’s performance across multiple dimensions. Monitor message processing by priority levels to make sure high-priority messages are processed first and identify any bottlenecks in your priority routing logic. The following screenshot shows an example dashboard.

You can track queue depth and processing times to understand system load and latency patterns, helping you optimize resource allocation and identify when scaling is needed. Observe DynamoDB performance metrics including read/write capacity consumption, throttling events, and latency to make sure your database layer maintains optimal performance under varying loads.

Additionally, implement application-specific custom metrics such as message processing success rates, retry counts, and business-specific KPIs to gain deeper insights into your application’s behavior and make data-driven decisions for continuous improvement.

Security considerations

AWS recommends implementing comprehensive security measures to safeguard your message processing system. Start by implementing least privilege IAM policies that grant only the minimum permissions required for each component to function, making sure services like App Runner can only access the specific DynamoDB tables and Amazon MQ queues they need. Configure your network architecture using a virtual private cloud (VPC) with private subnets for Amazon MQ, isolating your message broker from direct internet access while maintaining connectivity through NAT gateways for necessary outbound connections.

Enable encryption at rest using AWS Key Management Service (AWS KMS) for DynamoDB tables and Amazon MQ data and enforce encryption in transit by configuring SSL/TLS connections for all service communications, particularly for ActiveMQ broker connections. Finally, configure security groups with minimal access rules that explicitly define allowed traffic between components, restricting inbound connections to only the ports and protocols required for your application to function, such as port 61617 for ActiveMQ SSL connections from App Runner instances.

Cost considerations

The following table contains cost estimates based on the US East (N. Virginia) Region. Actual costs might vary based on your Region, usage patterns, and pricing changes.

Service Small (1,000 msg/day) Medium (10,000 msg/day) Large (100,000 msg/day)
Amazon DynamoDB $5–10 $25–50 $200–400
Amazon MQ $15 (t3.micro) $30 (m5.large) $120 (m5.xlarge)
AWS App Runner $20–40 $50–150 $400–800
Amazon API Gateway WebSocket $3–5 $10–25 $50–100
Amazon CloudWatch Logs $5–10 $10–20 $30–50
Data Transfer $5 $10-20 $50-100
Total Estimated Cost $53–95 $135–295 $850–1,570

Troubleshooting

The following are common issues and their solutions when implementing the priority-based message processing system:

  • Messages not processing in priority order:
    • Verify JMS priority is configured correctly: message.setJMSPriority(priority)
    • Check ActiveMQ broker configuration for priority queue support
    • Confirm CLIENT_ACKNOWLEDGE mode is properly configured
    • Review queue consumer concurrency settings
  • WebSocket updates not working:
    • Verify DynamoDB Streams is enabled on the table
    • Check the Lambda function is triggered by stream events
    • Validate API Gateway WebSocket configuration and IAM permissions
    • Test the WebSocket connection using browser developer tools
  • Application scaling issues:
    • Monitor App Runner metrics in CloudWatch
    • Adjust auto scaling configuration based on traffic patterns
    • Consider Amazon MQ broker capacity and upgrade if needed
    • Review DynamoDB capacity settings and enable auto scaling

Clean up

To avoid incurring ongoing AWS charges, delete the resources you created in this walkthrough:

  1. Delete the CDK stacks:
cdk destroy MessageProcessingStack
cdk destroy FrontendStack
  1. Remove the App Runner service:
aws apprunner delete-service --service-arn <your-service-arn>
  1. Delete the ECR repositories and container images.
  2. Remove CloudWatch log groups if not set to auto-delete.
  3. Delete S3 buckets used for frontend hosting.

Next steps

To extend this solution and add additional capabilities, consider the following enhancements:

Conclusion

This solution demonstrates how to build a production-ready priority-based message processing system using AWS managed services. By combining Amazon MQ priority queuing with DynamoDB real-time streams and App Runner serverless compute, you create a resilient architecture that intelligently handles messages based on business priorities.The implementation of application-level delays with priority bypass makes sure critical messages receive immediate attention, and the dual-layer retry mechanism provides maximum reliability. Real-time WebSocket updates keep users informed of processing status, creating a responsive and transparent system.To learn more about the services and patterns used in this solution, explore the following resources:


About the authors

AWS Weekly Roundup: Amazon EC2, Amazon Q Developer, IPv6 updates, and more (September 1, 2025)

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-ec2-amazon-q-developer-ipv6-updates-and-more-september-1-2025/

My LinkedIn feed was absolutely packed this week with pictures from the AWS Heroes Summit event in Seattle. It was heartwarming to see so many familiar faces and new Heroes coming together.

AWS Heroes Summit 2025

For those not familiar with the AWS Heroes program, it’s a global community recognition initiative that honors individuals who make outstanding contributions to the AWS community. These Heroes share their deep AWS knowledge through content creation, speaking at events, organizing community gatherings, and contributing to open-source projects.

The AWS Heroes Summit brings these exceptional community leaders together, providing a unique platform for knowledge exchange, networking, and collaboration. As someone who regularly interacts with Heroes through our AWS initiatives, I always find these summits invaluable – they offer deep technical discussions, early access to AWS roadmaps, and opportunities to provide direct feedback to AWS service teams. The insights and connections made at these events often translate into better resources and guidance for the broader AWS community.

Last week’s launches

In addition to this inspiring community, here are some AWS launches that caught my attention:

  • AWS expands Internet Protocol v6 (IPv6) support to AWS App Runner, AWS Client VPN, and RDS Data API — Three more AWS services now support IPv6 connectivity, helping you meet compliance requirements and removes the need for handling address translation between IPv4 and IPv6. AWS App Runner now supports IPv6-based inbound and outbound traffic on both public and private App Runner service endpoints. AWS Client VPN announced support for remote access to IPv6 workloads, allowing you to establish secure VPN connections to your IPv6-enabled VPC resources. Finally, RDS Data API now supports IPv6, enabling dual-stack configuration (IPv4 and IPv6) connectivity for your Aurora databases.
  • We launched two new instance families this week: the new storage-optimized I8ge and the general-purpose M8i instances —Our I8ge instances, powered by AWS Graviton4 processors, deliver up to 60% better compute performance compared to their Graviton2-based predecessors. These instances feature third-generation AWS Nitro SSDs, providing up to 55% better real-time storage performance per TB and significantly lower I/O latency. With 120 TB of storage and sizes up to 48xlarge (including two metal options), they offer the highest storage density among AWS Graviton-based storage optimized instances. We also launched M8i and M8i-flex instances with custom Intel Xeon 6 processors. These instances deliver up to 15% better price-performance and 2.5x more memory bandwidth than their predecessors. M8i-flex instances are ideal for general-purpose workloads, available from large to 16xlarge. For demanding applications, you can choose from our SAP-certified M8i instances in 13 sizes, including 2 bare metal options and a new 96xlarge size.
  • Amazon EC2 Mac Dedicated hosts now support Host Recovery and Reboot-based host maintenance — you can enable two new capabilities for your EC2 Mac Dedicated Hosts: Host Recovery and Reboot-based Host Maintenance. Host Recovery automatically detects potential hardware issues on Mac Dedicated Hosts and seamlessly migrates Mac instances to a new replacement host, minimizing disruption to workloads. Reboot-based Host Maintenance automatically stops and restarts instances on replacement hosts when scheduled maintenance events occur, eliminating the need for manual intervention during planned maintenance windows.
  • Amazon Q Developer now supports MCP admin control — Administrators have now the ability to enable or disable the MCP functionality for all the Q Developer clients in their organization. When an administrator disables the functionality, users will not be allowed to add any MCP servers, nor will any previously defined servers be initialized.

Other AWS news

Here are some additional projects and blog posts that you might find interesting:

  • Mastering Amazon Q Developer with Rules — I read an interesting article about Amazon Q Developer’s rules feature this weekend that I want to share with you. What caught my attention is how it solves a pain point I often encounter when working with AI assistants – having to repeatedly explain my coding preferences and standards. With rules, you define your preferences once in Markdown files, and Amazon Q Developer automatically follows them for every interaction. I particularly like how transparent the system is, showing which rules it’s following, and how it helps maintain consistency across teams. Since implementing rules in my projects, I’ve seen more consistent code quality, all while reducing the cognitive load of having to repeatedly explain our standards.
  • Strategies for excelling across all four exam domains of the AWS Certified Machine Learning – Specialty certification. The AWS Training & Certification team, where I spent my first three years at AWS, shared how to prepare for the AWS Certified Machine Learning – Specialty certification, whether you’re starting from scratch or building upon existing AWS Certifications. They share the prerequisites and guidance to help you get ready for this certification and demonstrate your expertise in building ML solutions with AWS.
  • As is now our tradition after Prime Day, we shared the impressive metrics showing how AWS services scaled to support one of the world’s largest shopping events. Amazon Prime Day 2025 was the biggest ever, setting records for both sales volume and total items sold during the 4-day event. This year was particularly special as we saw a significant transformation in the Prime Day experience through advancements in our generative AI offerings, with customers using Alexa+, Rufus, and AI Shopping Guides to discover deals and get product information. The numbers are staggering – Amazon DynamoDB handled tens of trillions of API calls while maintaining high availability, delivering single-digit millisecond responses and peaking at 151 million requests per second. Amazon API Gateway processed over 1 trillion internal service requests—a 30 percent increase in requests on average per day compared to Prime Day 2024.

Upcoming AWS events
Check your calendars and sign up for these upcoming AWS events:

  • AWS Summits — Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Toronto (September 4), Los Angeles (September 17), and Bogotá (October 9).
  • AWS re:Invent 2025 — This flagship annual conference is coming to Las Vegas from December 1–5. The event catalog is now available. Mark your calendars for this not to be missed gathering of the AWS community.
  • AWS Community Days — Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Adria (September 5), Baltic (September 10), Aotearoa (September 18), South Africa (September 20), Bolivia (September 20), Portugal (September 27).

Join the AWS Builder Center to learn, build, and connect with builders in the AWS community. Browse here for upcoming in-person and virtual developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— seb

How Karrot built a feature platform on AWS, Part 1: Motivation and feature serving

Post Syndicated from Hyeonho Kim original https://aws.amazon.com/blogs/architecture/how-karrot-built-a-feature-platform-on-aws-part-1-motivation-and-feature-serving/

This post is co-written with Hyeonho Kim, Jinhyeong Seo and Minjae Kwon from Karrot.

Karrot is Korea’s leading local community and a service centered on all possible connections in the neighborhood. Beyond simple flea markets, it strengthens connections between neighbors, local stores, and public institutions, and creates a warm and active neighborhood as its core value.

Karrot uses a recommendation system to provide users with connections that match their interests and neighborhoods, and to provide personalized experiences. In particular, you can check customized content on the home screen of the Karrot application. Personalized content is continuously updated by analyzing the user’s activity patterns without having to set a special interest category. The core of the feed is to provide new and interesting content, and Karrot is constantly working to improve user satisfaction for this purpose. Karrot actively uses a recommendation system to provide personalized and recommended content. In this system, the feature platform plays a key role along with the machine learning (ML) recommendation model. The feature platform acts as a data store that stores and serves data necessary for the ML recommendation model, such as the user’s behavior history and article information.

This two-part series starts by presenting our motivation, our requirements, and the solution architecture, focusing on feature serving. Part 2 covers the process of collecting features in real-time and batch ingestion into an online store, and the technical approaches for stable operation.

Background of the feature platform at Karrot

Karrot recognized the need for a feature platform in early 2021, about 2 years after implementing a recommendation system in their application. At that time, Karrot was achieving significant growth in various metrics through active usage of the recommendation system. By showing personalized feeds to each user beyond chronological feeds, they observed a more than 30% increase in click-through rates and higher user satisfaction. As the recommendation system’s impact continued to grow, the ML team naturally faced the challenge of advancing the system.

In ML-based systems, various high-quality input data (clicks, conversion actions, and so on) is considered a crucial element. These input data are typically called features. At Karrot, data including user behavior logs, action logs, and status values are collectively referred to as user features, and logs related to articles are called article features.

To improve the accuracy of personalized recommendations, various types of features are needed. A system that can efficiently manage these features and quickly deliver them to ML recommendation models is essential. Here, serving means the process of providing real-time data needed when the recommendation system suggests personalized content to users. However, the feature management approach in the existing recommendation system had some limitations, with the following key issues:

  • Dependency on flea market server – Because the initial recommendation system existed as an internal library on the flea market server, the source code of the web application had to be changed whenever the recommendation logic was modified or a feature was added. This reduced the flexibility of deployment and made it difficult to optimize resources.
  • Limited scalability of recommendation logic and features – The initial recommendation system directly depended on the flea market database and only considered flea market articles. This made it impossible to expand to new article types like local community, local jobs, and advertisements, which are managed by different data sources. Additionally, feature-related code was hardcoded, making it difficult to explore, add, or modify features.
  • Lack of feature data source reliability – Although features were retrieved from various repositories such as Amazon Simple Storage Service (Amazon S3), Amazon ElastiCache, and Amazon Aurora, the reliability of data quality was low due to the lack of a consistent schema and collection pipeline. This was a major limitation in securing the latest features and consistency.

The following diagram illustrates the initial recommendation system backend structure.

To solve these problems, we needed a new central system that could efficiently support feature management, real-time ingestion, and serving, and so we started the feature platform project.

Requirements of the feature platform

The following functional requirements were organized by separating the feature platform into an independent service:

  • Record and rapidly serve the top N most recent actions performed by users. Allow parameterization of both the top N value and the lookup period.
  • Support user-specific features such as notification keywords in addition to action features.
  • Process features from various article types beyond just flea market articles.
  • Handle arbitrary data types for all features, including primitive types, lists, sets, and maps.
  • Provide real-time updates for both action features and user characteristic features.
  • Provide flexibility in feature lists, counts, and lookup periods for each request.

To implement these functional requirements, a new platform was necessary. This platform needed three core capabilities: real-time ingestion of various feature types, storage with consistent schema, and quick response to diverse query requests. Although these requirements initially seemed ambiguous, designing a generalized structure enabled efficient configuration of data ingestion pipelines, storage methods, and serving schemas, leading to clearer development objectives.

In addition to functional requirements, the technical requirements included:

  • Serving traffic: 1,500 or more requests per second (RPS)
  • Ingestion traffic: 400 or more writes per second (WPS)
  • Top N values: 30–50
  • Single feature size: Up to 8 KB
  • Total number of features: Over 3 billion or more

At the time, the variety and number of features in use were limited, and the recommendation models were simple, resulting in modest technical requirements. However, considering the rapid growth rate, a significant increase in system requirements was anticipated. Based on this prediction, higher targets were set beyond the initial requirements. As of February 2025, the serving and ingestion traffic has increased by about 90 times compared to the initial requirements, and the total number of features has increased by hundreds of times. The ability to handle this rapid growth was made possible by the highly scalable architecture of the feature platform, which we discuss in the following sections.

Solution overview

The following diagram illustrates the architecture of the feature platform.

The feature platform consists of three main components: feature serving, a stream ingestion pipeline, and a batch ingestion pipeline.

Part 1 of this series will cover feature serving. Feature serving is the core function of receiving client requests and providing the required features. Karrot designed this system with four major components:

  • Server – A server that receives and processes feature serving requests, and is a pod located on Amazon Elastic Kubernetes Service (Amazon EKS)
  • Remote cache – A remote cache layer shared by servers, and uses ElastiCache
  • Database – A persistence layer that stores features, and uses Amazon DynamoDB
  • On-demand feature server – A server that serves features that can’t be stored in the remote cache and database due to compliance issues, or that require real-time calculations every time

From a data store perspective, feature serving should serve high-cardinality features with low latency at scale. Karrot introduced multi-level cache and subdivided serving strategies according to the characteristics of the features:

  • Local cache (tier 1 cache) – An in-memory store located within the server, suitable for cases where the data size is small and is frequently accessed or requires fast response times
  • Remote cache (tier 2 cache) – Suitable for cases where the data size is medium and is frequently accessed
  • Database (tier 3 cache) – Suitable for cases where the data size is large and is not frequently accessed or is less sensitive to response times

Schema design

The feature platform stores multiple features together using the concept of feature groups, such as column families. All feature groups are defined through the feature group schema, called feature group specifications, and each feature group specification defines the name of the feature group, required features, and so on.

Based on this concept, the key design is defined as follows:

  • Partition key: <feature_group_name>#<feature_group_id>
  • Sort key: <feature_group_timestamp> or a string representing null

To illustrate how this works in practice, let’s explore an example of a feature group representing recently clicked flea market articles by user 1234. Consider the following scenario:

  • Feature group name: recent_user_clicked_fleaMarketArticles
  • User ID: 1234
  • Click timestamp: 987654321
  • Features in the feature group:
    • Clicked article ID: a
    • User session ID: 1111

In this example, the keys and feature group are created as follows:

  • Partition key: recent_user_clicked_fleaMarketArticles#1234
  • Sort Key: 987654321
  • Value: {"0": "a", "1": "1111"}

Features defined in the feature group specification maintain a fixed order, using this ordering like an enum when saving the feature group.

Feature serving read/write flow

The feature platform uses a multi-level cache and database for feature serving, as shown in the following diagram.

To illustrate this process, let’s examine how the system retrieves feature groups 1, 2, and 3 from flea market articles. The read flow (solid lines in the preceding diagram) demonstrates data access optimization using a multi-level cache strategy:

  1. When a query request comes in, first check the local cache.
  2. Data not in the local cache is searched in ElastiCache.
  3. Data not in ElastiCache is searched in DynamoDB.
  4. The feature groups found at each stage are collected and returned as the final response.

The write flow (dotted lines in the preceding diagram) consists of the following steps:

  1. Feature groups that have cache misses are stored in each cache level.
  2. Data not found in the local cache but found in the remote cache or database is stored in the upper-level cache.
    1. Data found in ElastiCache is stored in the local cache.
    2. Data found in DynamoDB is stored in both ElastiCache and the local cache.
  3. Cache write operations are performed asynchronously in the background.

This approach presents a strategy to maintain data consistency and improve future access time in the multi-level cache structure. In an ideal situation, serving works well without any problems with just the preceding flow. However, the reality was not like that. The problems experienced included cache misses, consistency, and penetration problems:

  • Cache miss problem – Frequent cache misses slow down the response time and put a burden on the next level cache or database. Karrot uses the Probabilistic Early Expirations (PEE) technique to proactively refresh data that is likely to be retrieved again in the future, thereby maintaining low latency and mitigating cache stampede.
  • Cache consistency problem – If the Time-To-Live (TTL) of a cache is set incorrectly, it can affect recommendation quality or reduce system efficiency. Karrot sets soft and hard TTL separately, and sometimes uses a write-through caching strategy together to synchronize cache and database to alleviate consistency problems. In addition, jitter is added to spread out the TTL deletion time to alleviate the cache stampede of feature groups written at similar times.
  • Cache penetration problem – Continuous queries for non-existent feature groups can lead to DynamoDB queries, resulting in increased costs and response times. The platform resolves this through negative caching, storing information about non-existent feature groups to reduce unnecessary database queries. Additionally, the system monitors the ratio of missing feature groups in DynamoDB, negative cache hit rates, and potential consistency problems.

Future improvements for feature serving

Karrot is considering the following future improvements to their feature serving solution:

  • Large data caching – Recently, the demand for storing large data features has been increasing. This is because as Karrot grows, the number of features also increases. Also, as the demand for embeddings increases along with the rapid growth of large language models (LLMs), the size of data to be stored has increased. Accordingly, we are reviewing more efficient serving by using an embedded database.
  • Efficient use of cache memory – Even if an efficient TTL value is set initially, the efficiency tends to decrease as the user’s usage pattern changes and the model is changed. Also, as more feature groups are defined, monitoring becomes more difficult. It should be straightforward to find the optimal TTL value for the cache based on data. We are considering a method to efficiently use memory while maintaining a high recommendation quality through cache hit rate and feature group loss prevention. Should we cache a feature group that is only retrieved once? What about a feature group that is retrieved twice? The current feature platform attempts caching even if a cache miss occurs only one time. We believe that all feature groups that have cache misses are worth caching. This naturally increases the inefficiency of caching. An advanced policy is needed to determine and cache feature groups that are worth caching based on various data. This will increase the efficiency of cache usage.
  • Multi-level cache optimization – Currently, the feature platform has a multi-level cache structure, and the complexity will increase if an embedded database is added in the future. Therefore, it is necessary to find and set the optimal settings by considering different cache levels. In the future, we will try to maximize efficiency by considering different levels of cache settings.

Conclusion

In this post, we examined how Karrot built their feature platform, focusing on feature serving capabilities. As of February 2025, the platform reliably handles over 100,000 RPS with P99 latency under 30 milliseconds, providing stable recommendation services through a scalable architecture that efficiently manages traffic increases.

Part 2 will explore how features are generated using consistent feature schemas and ingestion pipelines through the feature platform.


About the authors

AWS Weekly Roundup: OpenAI models, Automated Reasoning checks, Amazon EVS, and more (August 11, 2025)

Post Syndicated from Veliswa Boya original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-openai-models-automated-reasoning-checks-amazon-evs-and-more-august-11-2025/

AWS Summits in the northern hemisphere have mostly concluded but the fun and learning hasn’t yet stopped for those of us in other parts of the globe. The community, customers, partners, and colleagues enjoyed a day of learning and networking last week at the AWS Summit Mexico City and the AWS Summit Jakarta.


Last week’s launches
These are the launches from last week that caught my attention:

  • OpenAI open weight models on AWSOpenAI open weight models (gpt-oss-120b and gpt-oss-20b) are now available on AWS. These open weight models excel at coding, scientific analysis, and mathematical reasoning, with performance comparable to leading alternatives.
  • Amazon Elastic VMware Service — Amazon Elastic VMware Service (Amazon EVS), a new AWS service that lets you run VMware Cloud Foundation (VCF) environments directly within your Amazon Virtual Private Cloud (Amazon VPC), is now generally available.
  • Automated Reasoning checks — Automated Reasoning checks, a new Amazon Bedrock Guardrails policy that was previewed during AWS re:Invent, is now generally available. Automated Reasoning checks helps you validate the accuracy of content generated by foundation models (FMs) against a domain knowledge. Read more in Danilo’s post on how this can help prevent factual errors that can be caused by AI hallucinations.
  • Multi-Region application recovery service — In this post, Sébastien writes about the announcement of Amazon Application Recovery Controller (ARC) Region switch, a fully managed, highly available capability that enables organizations to plan, practice, and orchestrate Region switches with confidence, eliminating the uncertainty around cross-Region recovery operations.

Additional updates
I thought these projects, blog posts, and news items were also interesting:

Upcoming AWS events
Keep a look out and be sure to sign up for these upcoming events:

AWS re:Invent 2025 (December 1-5, 2025, Las Vegas) — AWS’s flagship annual conference offering collaborative innovation through peer-to-peer learning, expert-led discussions, and invaluable networking opportunities.

AWS Summits — Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Coming up soon are the summits at São Paulo (August 13) and Johannesburg (August 20).

AWS Community Days — Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Australia (August 15), Adria (September 5), Baltic (September 10), Aotearoa (September 18), and South Africa (September 20).

Join the AWS Builder Center to learn, build, and connect with builders in the AWS community. Browse here for upcoming in-person and virtual developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Veliswa.

Build the highest resilience apps with multi-Region strong consistency in Amazon DynamoDB global tables

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/build-the-highest-resilience-apps-with-multi-region-strong-consistency-in-amazon-dynamodb-global-tables/

While tens of thousands of customers are successfully using Amazon DynamoDB global tables with eventual consistency, we’re seeing emerging needs for even stronger resilience. Many organizations find that the DynamoDB multi-Availability Zone architecture and eventually consistent global tables meet their requirements, but critical applications like payment processing systems and financial services demand more.

For these applications, customers require a zero Recovery Point Objective (RPO) during rare Region-wide events, meaning you can direct your app to read the latest data from any Region. Your multi-Region applications always need to access the same data regardless of location.

Starting today, you can use a new Amazon DynamoDB global tables capability that provides multi-Region strong consistency (MRSC), enabling zero RPO. This capability, first announced as a preview at AWS re:Invent 2024, simplifies building highly resilient global applications.

Here’s how you can enable MRSC starting with an existing empty DynamoDB table:

With MRSC, DynamoDB now offers you the highest level of resilience for your applications. If application processing is interrupted in a Region, you can redirect traffic to another Region containing a MRSC replica and know that you’ll be processing the latest data.

Getting started
Let me walk you through how you can use this new capability.

To start using MRSC, I need to create a global table from an existing DynamoDB table that doesn’t contain any data. I navigate to my existing table, select the Global tables tab and select Create replica.

The MRSC availability architecture requires three AWS Regions. I can configure MRSC with either three full replicas or with two replicas and a witness. A witness contains only replicated change data to provide the required availability without maintaining a full copy of my table data.

The following screenshot shows how I configure MRSC with two replicas and a witness. To configure MRSC with three full replicas instead, I can clear Configure Region 2 as Witness.

If I need to update my existing table programmatically, I can use the Amazon Q CLI to generate the required command with the following prompt:

> Hey Q! Update my existing DynamoDB table called "demo-mrsc" in us-east-1 with multi-Region strong consistency across us-east-2 with witness in us-west-2 Regions

Shortly after, Q CLI will respond back with following command:

> To update your DynamoDB table with multi-Region strong consistency, you'll need to use the update-table command with the appropriate parameters. Here's how you would do it:

aws dynamodb update-table \ 
    --table-name demo-mrsc \ 
    --replica-updates '[{"Create": {"RegionName": "us-east-2"}}]' \ 
    --global-table-witness-updates '[{"Create": {"RegionName": "us-west-2"}}]' \ 
    --multi-region-consistency STRONG \ 
    --region us-east-1

After it’s finished processing, I can check the status of my MRSC global table. I can see I have a witness configured for my DynamoDB global table. A witness reduces costs while still providing the resilience benefits of multi-Region strong consistency.

Then, in my application, I can use ConsistentRead to read data with strong consistency. Here’s a Python example:

import boto3

# Configure the DynamoDB client for your region
dynamodb = boto3.resource('dynamodb', region_name='us-east-2')
table = dynamodb.Table('demo-mrsc')

pk_id = "demo#test123"

# Read with strong consistency across regions
response = table.get_item(
    Key={
        'PK': pk_id
    },
    ConsistentRead=True
)

print(response)

For operations that require the strongest resilience, I can use ConsistentRead=True. For less critical operations where eventual consistency is acceptable, I can omit this parameter to improve performance and reduce costs.

Additional things to know
Here are a couple of things to note:

  • Availability – The Amazon DynamoDB multi-Region strong consistency capability is available in following AWS Regions: US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Osaka, Seoul, Tokyo), and Europe (Frankfurt, Ireland, London, Paris)
  • Pricing – Multi-Region strong consistency pricing follows the existing global tables pricing structure. DynamoDB recently reduced global tables pricing by up to 67 percent, making this highly resilient architecture more affordable than ever. Visit Amazon DynamoDB lowers pricing for on-demand throughput and global tables in the AWS Database Blog to learn more.

Learn more about how you can achieve the highest level of application resilience, enable your applications to be always available and always read the latest data regardless of the Region by visiting Amazon DynamoDB global tables.

Happy building!

Donnie

 

Integrating aggregators and Quick Service Restaurants with AWS serverless architectures

Post Syndicated from Mike Gomez original https://aws.amazon.com/blogs/compute/integrating-aggregators-and-quick-service-restaurants-with-aws-serverless-architectures/

In this post, you learn how to use AWS serverless technologies, such as Amazon EventBridge and AWS Lambda, to build an integration between Quick Service Restaurants (QSRs) and online ordering and food delivery aggregators. These aggregators have taken off as an option to QSRs to expand their consumer base, enabling them with delivery options to help grow their businesses.

QSR overview

QSRs prioritize speedy and convenient service, offering a streamlined menu. To meet evolving consumer expectations, QSRs can use API integrations with third-party aggregators. This technological synergy enables QSRs to expand their capabilities, introducing diverse payment methods and incorporating delivery services. These features have become standard in this restaurant segment.

Behind the scenes, the APIs are used to orchestrate the interaction between the aggregator and the QSR while having a consistent ordering and delivery experience.

QSR business objectives are:

  • Providing consistent ordering and delivery experiences
  • Offering personalized menu items
  • Retaining repeat customers
  • Reducing third-party delivery cancellation due to lack of delivery personalization options

This post starts with a simple architecture and adds components to solve architectural challenges.

Architecture

As a solutions architect, you’ve been approached by a thriving local restaurant business seeking technological solutions to fuel their expansion. Your task is to design an optimal integration architecture that aligns with their technical requirements, streamlines operations, and enhances customer experience.

At the core of this integration is Amazon API Gateway, which accepts the incoming orders from various delivery aggregators. The API Gateway becomes the front door, connecting the QSRs with the end customers for a streamlined and dynamic order processing system.

Driving the backend of this integration are Lambda functions. These functions validate orders and securely communicate with delivery aggregators. Lambda functions can scale dynamically based on-demand, and make sure of optimal resource usage and cost-effectiveness.

Order placement workflow

The following steps outline the serverless integration between API Gateway and Lambda functions, as shown in the following figure:

  • Customers can place orders either through food delivery aggregators or the business’s own ordering system.
  • The order request is sent to API Gateway.

This architecture works for small and simple integrations. To scale this architecture for high traffic, use asynchronous integration to reduce the coupling between API and Lambda function.

Order routing workflow

The following steps outline a serverless integration where API Gateway connects to Lambda functions through Amazon EventBridge as the event routing service, as shown in the following figure:

  1. API Gateway receives the order request.
  2. The API Gateway routes the customer’s order request to an EventBridge bus for processing.

EventBridge routes events (for example order status changes) to Lambda functions, making sure of resiliency during service disruptions. This eliminates manual error handling and keeps QSRs and aggregators synchronized.

EventBridge delivers the following essential capabilities:

  • EventBridge receives events triggered by various actions, such as new orders or menu updates.
  • It routes events to the relevant Lambda functions, initiating the appropriate actions.
  • EventBridge supports event replay, allowing recovery from Lambda deployment issues or function failures. This feature enables business continuity by storing events during service disruptions and automatically resuming processing when the system stabilizes.

To maintain order history and enable fast data retrieval, the system needs a highly performant database. Amazon DynamoDB, a serverless NoSQL database service, meets these requirements by efficiently storing and managing order information and metadata. The order processing Lambda function interacts with DynamoDB to persist order details. This approach enables asynchronous processing of the stored data by other backend processes. The database solution provides the scalability and responsiveness needed to handle growing order volumes while maintaining consistent performance, separating order intake from subsequent processing steps.

Order processing workflow

The following steps outline the order processing workflow, as shown in the following figure:

  • The order processing Lambda function validates the order and updates the DynamoDB database with the new order details.
  • The function publishes error events to EventBridge, enabling downstream processing for error handling and retry logic. These events can trigger more Lambda functions designed to manage specific error scenarios and recovery processes.

EventBridge implementation patterns: single or dual bus approaches

EventBridge offers multiple approaches for event bus topology. Architects can choose to either use a single event bus with distinct event patterns based on order status or implement a multi-bus strategy.

The single-bus approach uses one event bus for all events with routing rule patterns based on order status. For example, rules would match specific statuses (for example “new” or “processed”) to trigger appropriate Lambda functions. Although it is architecturally simple, it needs careful management of the event schema to avoid potential errors. However, a single-bus approach requires careful handling to prevent recursive processing, where messages trigger additional messages in an endless loop.

Alternatively, the multi-bus method, separating order placement and processing across different buses, effectively prevents loops and recursion issues. This approach provides better separation of transactions, albeit with a slightly more complex setup.

EventBridge can directly target external services using the API destination option, eliminating the need for Lambda functions for third party integrations.

Orchestrating order processing

In complex order processing systems for QSRs, managing multiple interdependent Lambda functions can become challenging, potentially leading to intricate code and difficult-to-maintain architectures. To address this, AWS Step Functions can be introduced as an orchestration layer.

Step Functions acts as a central coordinator for the business logic needed in QSR order flows. This service manages the progression of activities in the order processing workflow, thereby efficiently coordinating tasks such as kitchen preparation and delivery logistics. Defining and managing complex workflows allows Step Functions to optimize the overall efficiency of QSR operations, providing a structured and adaptable solution. This orchestration enhances the restaurant’s ability to handle dynamic processing, achieving a smooth and responsive integration with delivery services while streamlining the underlying architecture.

The following steps outline the orchestration of order processing, as shown in the following figure:

  • Order processing trigger respective Lambda function, which updates the order data in the DynamoDB database.
  • The updated order is made available for subsequent Lambda functions that process more business logic being performed by further Lambda functions.

In a multi-bus EventBridge architecture, the process flows are as follows:

  1. The first EventBridge bus receives the initial order event and routes it to a Step Functions workflow.
  2. The Step Functions workflow orchestrates the order processing, coordinating various tasks and checks.
  3. Upon completion, the Step Functions workflow emits an event with the processing results to the second EventBridge bus.
  4. Based on the output from the Step Function workflow, this second bus contains a rule that triggers the Aggregator API as an API destination.

User engagement workflow

When a customer places an order, there must be a way to confirm or notify them when the order is ready. For this purpose, you can use AWS End User Messaging services to push notifications for order completion and new offers to customers.

Analyzing customer data and individual preferences allows Amazon Personalize to be used to present personalized recommendations and promotions.

Amazon Personalize can analyze historical order data to enhance the user experience through personalized recommendations, such as optimal delivery times, preferred menu items, and tailored promotions based on individual ordering patterns.

Conclusion

This post showed how to use AWS serverless services to build a platform for your order processing without worrying about managing underlying infrastructure. The serverless services included were Amazon API Gateway, AWS Lambda, Amazon EventBridge, AWS Step Functions, AWS End User Messaging, and Amazon Personalize.

This post is a brief introduction to event-driven architectures focused on integrations of internal ordering systems with delivery aggregators and third-party ordering platforms. This can help expand the user base, and it has been a key factor in the growth of many QSRs. Making the ordering, take-out, and delivery experience more efficient translates to revenue growth, reduction of order abandonment, as well as increased recurrent customer retention and brand loyalty.

For more serverless learning resources, visit Serverless Land. To find more patterns, go directly to the Serverless Patterns Collection.

How UNiDAYS achieved AWS Region expansion in 3 weeks

Post Syndicated from Sanhawat Taongern original https://aws.amazon.com/blogs/architecture/how-unidays-achieved-aws-region-expansion-in-3-weeks/

UNiDAYS is a fast, free digital platform that provides exclusive student offers and benefits to over 29 million verified members worldwide. With a rapidly growing user base and an increasing number of global partnerships, UNiDAYS recognized the need to enhance its platform’s performance to deliver a seamless consumer experience in geographic regions far from its original base of operations.

In this post, we share how UNiDAYS achieved AWS Region expansion in just 3 weeks using AWS services.

Business challenges

In response to growth opportunities, UNiDAYS faced a pressing business requirement: deliver low-latency responses and provide high availability for users across diverse geographic regions. At the same time, the platform needed to guarantee global data consistency while adhering to tight deadlines—all within just a few weeks. However, the existing monolithic application, although built on Amazon Web Services (AWS), wasn’t optimized for active-active multi-Region deployments.

The challenge was further complicated by the need to extend functionality from this legacy system, which used the AWS global network for improved user experience but fell short of meeting new business requirements. Re-architecting the entire platform to support multi-Region deployments within the given timeframe wasn’t feasible.

Solution overview

UNiDAYS opted to create complementary services tailored to these new requirements, using AWS services for a multi-Region, active-active architecture. The key services used included:

This approach allowed UNiDAYS to meet its latency, availability, and consistency goals while seamlessly integrating with existing infrastructure. The following diagram is the architecture for the solution.

Global delivery and resiliency

To provide the lowest latency and multi-Region resiliency, CloudFront was used with latency-based routing configured in Route 53. This routing directs requests to the Regional Application Load Balancers with the lowest latency, automatically providing resiliency in the event of Regional issues. Security was a key consideration. AWS WAF integration with CloudFront provided application-layer protection at the edge. Additional security measures included:

  • Custom HTTP headers on origin requests, enforced using Application Load Balancer listener rule conditions
  • Prefix lists to restrict access to Application Load Balancers, making sure that traffic originated from the intended CloudFront distributions

Rapid Regional deployment

The core infrastructure is deployed through Terraform, and applications are deployed using custom tooling that wraps AWS CloudFormation. This hybrid approach enabled rapid delivery by using existing patterns without disrupting established workflows. Resources were organized into tiers: platform, global, and Regional. Platform and global resources were deployed one time, and Regional resources were rolled out to each activated Region, streamlining expansion efforts.

One technical challenge involved CloudFormation exports, which are Regional by design. To address this, we implemented a custom CloudFormation macro to enable cross-Region access to exported values, providing consistency across deployments.

Amazon ECS enabled progressive application deployments within each Region, allowing teams to focus on scaling applications rather than managing infrastructure. For cost-efficiency, we used Spot Instances. During testing, container start-up latency was observed due to cross-Region image downloads from Amazon Elastic Container Registry (Amazon ECR). This issue was resolved by enabling private image replication in Amazon ECR so that container images were available locally in each Region. This solution significantly reduced start-up times, improving application responsiveness during deployments and scaling events.

Data consistency and performance

DynamoDB global tables were instrumental in providing eventual data consistency and Regional replication. With DynamoDB handling these aspects, we could focus on application logic.

The result was a substantial reduction in latency at key locations. For example, client-experienced latency in one Region dropped from approximately 200 milliseconds to 50 milliseconds upon deployment, as shown in the following screenshot.

Outcome

Key technical hurdles

We addressed the following technical obstacles while developing the solution:

  • Cross-Region CloudFormation exports – CloudFormation exports are Regional by design. We addressed this by creating a custom CloudFormation macro to read exports across Regions.
  • Container start-up latency – Latency caused by cross-Region image downloads was mitigated by implementing Amazon ECR private image replication. This meant that container images were readily available in each Region, reducing deployment times and improving overall performance.
  • Security assurance – By using CloudFront, AWS WAF, and Application Load Balancer security features, we made sure that traffic and data remained secure.

Why AWS?

UNiDAYS chose AWS due to its comprehensive global infrastructure and robust service offerings, which allowed the platform to:

  • Seamlessly expand compute operations to Regions closer to its user base
  • Take advantage of a full stack of services for reliable, secure, and low-latency content delivery
  • Meet tight delivery deadlines without compromising on performance or security
  • Maintain flexibility where required, with the ability to use more managed services, which allowed a focus on our applications

Conclusion

By adopting a multi-Region, active-active architecture on AWS, UNiDAYS successfully met its business goals within only 3 weeks, rapidly expanding to new Regions while promoting platform resiliency. The solution improved latency by 75% in new Regions (from 200 milliseconds to 50 milliseconds), provided Regional data availability through DynamoDB global tables, and maintained 100% service uptime during resiliency tests, even in cases of Regional connectivity loss. Additionally, deployment velocity increased by over 40%, allowing faster feature releases and improved agility. This architecture not only provides a scalable and resilient platform for current operations but also establishes a strong foundation for future global expansion.

Learn more

Is your organization looking to expand into new Regions while maintaining performance and reliability?

  • Contact AWS experts to explore tailored solutions for your multi-Region strategy.
  • Use AWS Global Infrastructure to optimize your expansion.
  • Share your challenges and successes in the comments—we’d love to hear your insights!


About the Authors

From virtual machine to Kubernetes to serverless: How dacadoo saved 78% on cloud costs and automated operations

Post Syndicated from Andreas Gehrig original https://aws.amazon.com/blogs/architecture/from-virtual-machine-to-kubernetes-to-serverless-how-dacadoo-saved-78-on-cloud-costs-and-automated-operations/

dacadoo is a global Swiss-based technology company that develops solutions for digital health engagement and health risk quantification. Their products include a software-as-a-service (SaaS)-based digital health engagement platform that uses behavioral science, AI, and gamification to help end users improve their health outcomes.

The company embarked on a journey to modernize an API to quantify health and lifestyle data plus a risk engine to calculate mortality and morbidity probabilities based on years of scientific research data.

To transform a virtual machine–based API service into a globally redundant, scalable health score and risk calculation solution dacadoo chose Amazon Web Services (AWS) technology. The service handles highly sensitive health data from a global customer base and must comply with regional regulations.

The result is a cost reduction of 78% and an infrastructure maintenance effort of less than an hour per year , allowing dacadoo to deliver and operate more AWS infrastructure without scaling its site reliability engineering (SRE) team, thanks to a high level of automation and an agile mindset.

In this post, we walk you step-by-step through dacadoo’s journey of embracing managed services, highlighting their architectural decisions as we go.

Background

The solution architecture went through a three-stage journey:

  1. Incubation – Single virtual machine on premises with disaster recovery (DR) in Switzerland
  2. Global and scalable – Multiple global Kubernetes clusters
  3. Operational excellence – Fully serverless and geo-redundant on AWS

Stage 1: Incubation with a virtual machine

After years of scientific research and development, the service was launched, running on a single on-premises virtual machine that used hypervisor technology to provide disaster recovery (DR). However, it had no high availability (HA) capability and it required manual recovery.

The application serving the API requests and the NoSQL database were both running on the same host. Software deployment and operating system maintenance were performed manually using Secure Shell (SSH)—a typical low-automation setup that also included downtime.

The following architecture diagram shows a virtual machine encompassing the monolithic application and its database.

Monolithic architecture

Challenges

A single virtual machine was quick to set up and inexpensive to operate, but it had considerable shortcomings. The health API was only available in Switzerland, infrastructure maintenance was performed manually, and software deployment was handled manually. Additionally, database backups were done using virtual machine snapshots, uptime monitoring only, and testing was conducted on the developer workstation.

Stage 2: Global and scalable with Kubernetes

At that time, dacadoo made a strategic decision to heavily invest in Kubernetes for managing containerized workloads on a global scale. As part of this technology rollout, the health score and risk service were migrated to Kubernetes.

Due to the geographically distributed customer base and low latency requirements, three Kubernetes clusters were deployed, one on each continent. The NoSQL database was hosted in proximity to the workload to reduce service latency and keep the migration effort low.

To reduce the operational maintenance, the NoSQL database was integrated as a SaaS offering, and monitoring was centralized using Datadog.

All cloud infrastructure was provisioned exclusively with Terraform, covering the Kubernetes cluster, NoSQL database , and integration with GitLab and Datadog.

dacadoo containerized the API service and used Gitlab continuous integration and continuous deployment (CI/CD) pipelines to deploy multiple environments and clusters on a global hyperscaler.

In retrospect, this was a typical replatform modernization project from virtual machine to Kubernetes, with a high level of automation and a SaaS-first approach.

The following diagram is the architecture for the container solution with managed NoSQL database.

Containers architecture

Challenges

The service faced several challenges, including increased costs from deploying three regional Kubernetes clusters across three environments, resulting in 27 cluster nodes and additional expenses from managing NoSQL database SaaS instances for each cluster. The complexity of CI/CD pipelines for multi-environment multi-cluster deployments added to the difficulty. Significant operational effort was required to keep infrastructure and Kubernetes components up to date.

Stage 3: Operational excellence with serverless

The Kubernetes-based architecture met the requirements, but some features in the dacadoo API service backlog needed to fit better with the application architecture at the time.

This was the right moment to take a holistic view of the infrastructure and software architecture and refactor the solution according to the latest AWS technologies and best practices, the next frontier for dacadoo’s engineering team.

Solution requirements

Requirements for the solution refactoring were as follows:

  • Keep the functionality of the API unmodified
  • Constrain data processing to a region of choice for compliance with local data protection laws
  • Avoid weekly patch cycles by exclusively using managed serverless services
  • Reduce costs by choosing services with a pay-as-you-go billing model
  • Delegate authentication to a dedicated service
  • Use an established web framework with an extensive ecosystem

Refactoring the apps

The API service has two components: a developer portal and the health score and risk calculations API. The database is only required for API keys, algorithm parameters, quotas, and usage statistics. Health data is processed regionally by the compute layer but not persisted, opening the door for a distributed database: Amazon DynamoDB global tables is the perfect fit for the solution. Writes are distributed to all connected Regions, whereas reads are local, providing low latency for complying with dacadoo service level agreements (SLAs).

The developer portal is a web UI with API documentation and API key management features. AWS Lambda is a great fit because it scales automatically and has a pay-per-request billing model.

The health and risk API uses algorithms implemented in the C programming language for short bursting, compute-intense simulations. These calls are wrapped by a REST API using the Python FastAPI framework. These characteristics make AWS Lambda a great fit.

Serverless architecture

HTTP requests are routed to the Lambda functions using Amazon API Gateway with AWS WAF for protection from malicious requests and attacks. Static assets are served from an Amazon Simple Storage Service (Amazon S3) bucket through API Gateway. The additional features of Amazon CloudFront aren’t required, and Amazon S3 reduces the complexity.

Amazon Route 53 provides a powerful feature known as latency-based routing, which allows it to direct DNS queries to the endpoint that offers the lowest latency for the requester.

This feature provides Regional high availability for API users without data processing location requirements. Alternatively, the user can call specific Regional endpoints to make sure requests are processed in the desired Region.

API authorization is HTTP header-based and is performed in the application with data stored in Amazon DynamoDB.

The following diagram is the architecture for a geo-redundant fully serverless solution.

Serverless architecture

With a dacadoo SRE team proficient in Python, they opted for Pulumi for its advanced features such as programming language flow control constructs, powerful configuration capabilities, and multi-cloud support.

For continuous integration, GitLab CI compiles the algorithm library, tests the FastAPI applications and packages everything. The application deployment is just an update of the AWS Lambda, a simple and reliable workflow.

Summary

The solution evolved from a managed infrastructure setup, where the customer held most of the responsibility, to an AWS managed service architecture.

Infrastructure provisioning evolved from manual, error-prone processes to powerful code-driven workflows in Pulumi. The SRE needed to enhance their software engineering skills to adopt Pulumi, transitioning from configuration-based approaches to designing and maintaining an infrastructure code base using object-oriented Python. This was part of dacadoo’s investment in the SRE team and broader modernization efforts. The serverless architecture enabled a GitOps engineering culture focused on productivity.

The transformation maximized scalability and availability while reducing costs and operational effort:

Virtual machine

  • Scalability: Low
  • Availability: Best effort
  • Infrastructure costs: Low
  • Maintenance effort: High

Kubernetes

  • Scalability: High
  • Availability: 99.95%
  • Infrastructure costs: High
  • Maintenance effort: Medium

Serverless

  • Scalability: Very high
  • Availability: 99.999% (with failover to another AWS Region)
  • Infrastructure costs: Low
  • Maintenance effort: Very low

The global redundancy elevates availability to an impressive 99.999% while keeping the costs low.

Conclusion

Migrating from a virtual machine to Kubernetes and ultimately to AWS Lambda demonstrates the progression of cloud engineering toward enhanced efficiency and scalability.

Each step in this journey reduced the complexity of managing resources while increasing flexibility and automation. Transitioning dacadoo’s API service to a fully serverless, geo-redundant architecture not only advanced the platform but also upskilled engineers, maintained a lean SRE team, and kept infrastructure costs low. Get started with your own AWS serverless solution.


About the Authors

Build an enterprise API management solution using Amazon API Gateway

Post Syndicated from Roger Zhang original https://aws.amazon.com/blogs/architecture/build-an-enterprise-api-management-solution-using-amazon-api-gateway/

Enterprises face many challenges when they build and manage application programming interfaces (APIs). These challenges include security controls, version management, traffic control, and usage analytics. As digital businesses expand, a mature API management (APIM) solution is crucial for ensuring scalability, security, and operational efficiency.

This blog post shows how you can use Amazon API Gateway—along with AWS Lambda, Amazon DynamoDB, and other AWS services—to create a comprehensive and customizable APIM solution. This solution addresses the complex requirements of large enterprises managing APIs at scale.

Core features of APIM

API Management (APIM) centralizes the management and publishing of APIs for the entire enterprise, acting as a hub between clients, applications, and administrators on one side, and internal services, external systems, and large language models (LLMs) on the other, as shown in the following figure.

APIM capabilities

The key features of APIM include:

  • Security and governance
    • Authentication, authorization, rate limiting, and security policy enforcement.
    • Helps ensure APIs meet organizational or industry standards.
  • Monitoring and logging
    • Provides monitoring, alarms, and logging to track API performance and troubleshoot issues quickly.
  • Customization and transformation
    • Offers protocol and field transformations, plus orchestration and aggregation.
    • Makes it easier to integrate with different systems and meet various client needs.
  • API lifecycle management
    • Publishing, rollback, version control, and documentation.
    • Streamlines development and maintenance throughout the API lifecycle.
  • Developer and business tools
    • Portals for developers, business owners, and administrators to manage documentation, billing, and analytics.
  • Integration with LLMs
    • Specialized adapters, proxy configurations, and switching to integrate AI models seamlessly.
  • Flexible deployment options
    • Canary releases, pipeline automation, and other advanced release strategies.
    • Helps ensure stable, controlled API updates.

Unified management of multiple API gateways

API Gateway enforces resource limits of 300 resources per gateway, with a hard limit of 600. For enterprises that require more resources, managing multiple gateways individually can be time-consuming and error prone. APIM simplifies this by integrating API Gateway, Lambda, and DynamoDB; creating a centralized platform for managing APIs across multiple gateways. This integration streamlines the process, making it easier to scale and maintain APIs.

API lifecycle management

Managing API versions, publishing updates, and maintaining documentation often requires separate tools and manual processes, leading to inefficiencies. APIM centralizes these tasks in one portal, offering version control, publishing workflows, and rollback options. This streamlines the API lifecycle, ensuring consistency and reducing the chances for errors.

Enhanced security

Enterprises often need to implement different authentication strategies for various clients. These configurations typically require custom Lambda logic and database lookups, adding complexity and cost. APIM introduces configurable security policies that allow client-specific authentication without the need for additional custom code, reducing both complexity and operational overhead.

Customization and transformation

Enterprises frequently handle diverse client requests that involve different formats and protocols. Traditional API management approaches might struggle to support such variations. APIM allows for seamless protocol and field transformations, enabling integrations that meet a wide range of client requirements without additional development effort.

Developer portal

Developers need clear documentation, easy testing environments, and efficient API key management to work effectively. Traditional systems often lack these features, slowing down adoption. APIM provides a developer portal that consolidates API documentation, offers sandbox environments for testing, and simplifies API key management, reducing onboarding time and improving the developer experience.

Logging and monitoring

Log management is key to maintaining API performance, diagnosing issues, and gaining insights into usage. APIM uses API Gateway custom access logging, allowing teams to define logs based on business needs; whether creating separate CloudWatch metrics for each API path or exporting data to external platforms like ELK or Grafana.

Architecture overview

The APIM architecture, shown in the following figure, includes a management state (represented by numbers) and a runtime state (represented by letters). Both parts use a serverless paradigm.

APIM Architecture

Management state

The management state includes the following elements:

  1. Administrator portal access: Administrators access the APIM solution through a secured web portal.
  2. API Requests to APIM Lambda: Requests from the administrator’s API go through API Gateway, which then invokes the APIM Lambda function. This function handles logic related to configuration changes and other administrative actions.

In the following example, we show you how the APIM Lambda function dynamically applies different middleware based on the route configuration. This approach allows for flexible handling of authentication, client access restrictions, and request/response transformations. Here’s a quick breakdown of the key elements:

// If the route requires OIDC (OpenID Connect) authentication,
// add the OIDC authentication middleware to the route.
if route.Auth == "OIDC" {
    r.Use(middleware.OidcAuthenticator)
}
// If the route configuration specifies a list of allowed clients
// and the list is not empty, add a middleware to restrict access
// to only the specified clients.
if route.Allow.Clients != nil && len(route.Allow.Clients) != 0 {
    r.Use(middleware.AllowClients(route.Allow.Clients, cfg.Clients))
}
// Remove specific headers injected by the API Gateway
// to reduce exposure of internal details to downstream systems.
r.Use(middleware.RemoveGatewayHeaders)

// Add additional middleware for handling outbound logic.
// This could include retries, logging, or other outbound-specific functionality.
r.Use(outboundMiddlewares)
// Dynamically constructs and applies a chain of middlewares 
// based on the outbound configuration associated with the current request.
func outboundMiddlewares(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        // Retrieve the outbound configuration from the request context.
        outbound, _ := r.Context().Value(selectedOutboundContext).(config.Outbound)

        // Initialize a slice to store the middlewares to be applied.
        middlewares := []func(http.Handler) http.Handler{}

        // Middleware to rewrite the HTTP request based on the outbound configuration.
        middlewares = append(middlewares, middleware.ProxyRequestRewrite(&outbound))

        // Add a middleware for mapping request data if specified in the outbound configuration.
        if len(outbound.Convert.Request) != 0 {
            middlewares = append(middlewares, middleware.RequestDataMapping(outbound.Convert.Request))
        }

        // Middleware to log the outbound response for monitoring or debugging purposes.
        middlewares = append(middlewares, middleware.OutboundResponseLog)

        // Add a middleware for mapping response data if specified in the outbound configuration.
        if len(outbound.Convert.Response) != 0 {
            middlewares = append(middlewares, middleware.ResponseDataMapping(outbound.Convert.Response))
        }

        // Add a middleware for modifying the response if a modification function is defined.
        if outbound.ModifyResponse != "" {
            f, ok := system.MODIFY[outbound.ModifyResponse]
            if ok {
                middlewares = append(middlewares, f())
            }
        }

        // Chain the constructed middlewares together and apply them to the request.
        chain := chi.Chain(middlewares...)
        chain.Handler(next).ServeHTTP(w, r)
    })
}

By using a middleware chain, you can customize how each request and response is processed on a per-route basis. This architecture not only keeps your code organized but also makes the API Gateway-integrated Lambda function far more adaptable to changing requirements. You can add or remove configurations from APIM portal as new use cases emerge—such as data transformations, custom logging, or additional security checks—without rewriting core logic.

  1. Configuration management: Administrators set up server-side and client-side settings, such as API Gateway parameters, authentication requirements, transformations, and more.
  2. Persistence:  DynamoDB stores these configurations, providing persistent data storage and auditing capabilities.
  3. Asynchronous resource provisioning: After administrators save configurations and release them from the APIM portal, APIM creates or updates AWS resources—such as API Gateway, Lambda functions, and AWS Identity and Access Management (IAM). Lambda runs these updates in the background, so administrators can continue working uninterrupted.

Runtime state

The runtime state includes the following elements:

A. Client request: Clients send requests to the APIM endpoint.

B. Routing to the correct gateway: APIM uses the URI prefix in the API mappings associated with custom domain names to route requests to the appropriate API gateway, as shown in the following figure. Each mapping defines a specific API, stage, and an optional path. When a request arrives, APIM checks the path and directs the request to the correct stage and API if it matches. Unmatched requests default to the mapping with no path defined.

C. APIM core processing: A Lambda function (APIM CORE) uses DynamoDB configurations to handle authentication, authorization, protocol conversion, field transformation, and routing.

D. Downstream service call: APIM forwards each request to the configured internal or external endpoint.

E. Logging and monitoring: API Gateway access logs and custom logs track requests in detail.

F. Alarm: Metrics and alarms detect anomalies and notify stakeholders. Use Amazon CloudWatch or self-hosted solutions such as ELK to enable real-time monitoring and alerting.

api-mapping

Conclusion

In this post, we’ve demonstrated how to build an enterprise API management (APIM) solution using Amazon API Gateway, AWS Lambda, Amazon DynamoDB, and other AWS services. We’ve also shown how APIM centralizes critical features—such as version management, security policies, and request/response transformations—to accommodate large-scale enterprise requirements.

You can use the APIM portal to store and manage configurations in DynamoDB, dynamically applying these settings to multiple API gateways without rewriting code. This approach ensures consistent governance across diverse client types and business scenarios, helping to keep APIs both secure and flexible.

Finally, you’ve seen how the APIM architecture unifies the management state and runtime state, streamlines administrative tasks, and provides end-to-end monitoring and alerting. By adopting these best practices, your enterprise can establish a robust, scalable, and secure API management foundation, all within a serverless paradigm.


About the Authors

Top Architecture Blog Posts of 2024

Post Syndicated from Andrea Courtright original https://aws.amazon.com/blogs/architecture/top-architecture-blog-posts-of-2024/

Well, it’s been another historic year! We’ve watched in awe as the use of real-world generative AI has changed the tech landscape, and while we at the Architecture Blog happily participated, we also made every effort to stay true to our channel’s original scope, and your readership this last year has proven that decision was the right one.

AI/ML carries itself in the top posts this year, but we’re also happy to see that foundational topics like resiliency and cost optimization are still of great interest to our audience.

(By the way, if you were hoping for more AI/ML content, head on over to our sister channel, the AWS Machine Learning Blog!).

Without further ado, here are our top posts from 2024!

#10 Deploy Stable Diffusion ComfyUI on AWS elastically and efficiently

This post helps you get started using ComfyUI, and was so successful that we followed it up later in the year with How to build custom nodes workflow with ComfyUI on EKS!

Architecture for deploying stable diffusion on ComfyUI

Figure 1. Architecture for deploying stable diffusion on ComfyUI

#9 Let’s Architect! Designing Well-Architected systems

In keeping with Let’s Architect! series, we have our first of three favorites for the year. This set of resources helps you apply Well-Architected standards in practice.

Let's Architect

Figure 2. Let’s Architect

#8 Let’s Architect! Learn About Machine Learning on AWS

As I said, Let’s Architect! has a winning series, and they’ve got a finger on the pulse of the tech world. This post about machine learning showcases some of the most exciting things happening at AWS.

Let's Architect

Figure 3. Let’s Architect

If you’re more interested in generative AI, you can also take a look at another post from 2024: Let’s Architect! GenAI

#7 Creating an organizational multi-Region failover strategy

Preparedness is another common theme in this year’s favorites. Michael, John, and Saurabh are well-versed in multi-Region architecture, and they’re here to share some strategies to contain failure impact.

When the application experiences an impairment using S3 resources in the primary Region, it fails over to use an S3 bucket in the secondary Region.

Figure 4. When the application experiences an impairment using S3 resources in the primary Region, it fails over to use an S3 bucket in the secondary Region.

#6 Building a three-tier architecture on a budget

Let’s talk cost optimization. This post about a three-tier architecture that relies on the AWS Free Tier is a must-read for anyone looking for tips to help them avoid unnecessary costs (and that’s everyone).

Example of a three-tier architecture on AWS

Figure 5. Example of a three-tier architecture on AWS

#5 Announcing updates to the AWS Well-Architected Framework guidance

As usual, Haleh & team are pros at making sure the Well-Architected Framework is current and relevant. Take a look at the enhanced and expanded guidance in all six pillars.

Well-Architected logo

Figure 6. Well-Architected logo

#4 Let’s Architect! Serverless developer experience in AWS

One more winning post from Luca, Federica, Vittorio, and Zamira! This collection of developer resources includes new ideas in AWS Lambda, Amazon Q Developer, and Amazon DynamoDB.

Let's Architect

Figure 7. Let’s Architect

#3 London Stock Exchange Group uses chaos engineering on AWS to improve resilience

This post from April 1 was not an April Fool’s joke! See how LSEG designed failure scenarios to test their resilience and observability.

Chaos engineering pattern for hybrid architecture (3-tier application)

Figure 8. Chaos engineering pattern for hybrid architecture (3-tier application)

#2 Achieving Frugal Architecture using the AWS Well-Architected Framework Guidance

Frugality AND Well-Architected? What a winning combo! This post, inspired by the 2023 re:Invent keynote, outlines the seven laws of Frugal Architecture.

Well-Architected logo

Figure 9. Well-Architected logo

#1 How an insurance company implements disaster recovery of 3-tier applications

And finally, our number one post of the year! Amit and Luiz showcase a customer solution with real-world applications that builds on the guidelines of other posts in this list! Well done!

The Pilot Light scenario for a 3-tier application that has application servers and a database deployed in two Regions

Figure 10. The Pilot Light scenario for a 3-tier application that has application servers and a database deployed in two Regions

Thank you!

As always, thanks to our contributors for their dedication and desire to share, and to you, our readers! We would be nothing with you. Literally.

For other top post lists, see our Top 10 and Top 5 posts from previous years.

AWS Weekly Roundup: New Asia Pacific Region, DynamoDB updates, Amazon Q developer, and more (January 13, 2025)

Post Syndicated from Betty Zheng (郑予彬) original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-new-asia-pacific-region-dynamodb-updates-amazon-q-developer-and-more-january-13-2025/

As we move into the second week of 2025, China is celebrating Laba Festival (腊八节), a traditional holiday, which marks the beginning of Chinese New Year preparations. On this day, Chinese people prepare Laba congee, a special porridge combining various grains, dried fruits, and nuts. This

nutritious mixture symbolizes harmony, prosperity, and good fortune — with each ingredient representing the diversity and abundance of life. This traditional practice dates back to when Buddha achieved enlightenment after consuming rice porridge, making it a symbol of both material and spiritual nourishment. The festival, occurring on the eighth day of the twelfth lunar month, marks the countdown to Spring Festival, China’s most significant traditional holiday celebrating family reunion and renewal.

As our global tech community grows, such cultural celebrations remind us of the importance of inclusive innovation and shared progress.

Last week’s launches

Let’s take a look at what Amazon Web Services (AWS) launched in this week.

New AWS Asia Pacific (Thailand) Region– AWS has expanded its global infrastructure with the launch of the new Asia Pacific (Thailand) AWS Region, featuring three Availability Zones. With this addition, customers in Thailand and throughout Southeast Asia can serve customers with reduced latency while maintaining data residency within Thailand. The newly launched Region supports the complete range of AWS services and strengthens our presence in the rapidly growing ASEAN market.

New AWS Direct Connect location in Bangkok – Following the launch of our Thailand Region, we’ve established a new AWS Direct Connect location in Bangkok and expanded our existing infrastructure. This addition provides customers in Thailand with improved connectivity options and reduced network latency when accessing AWS services.

Database and analytics

Configurable point-in-time recovery periods for Amazon DynamoDBAmazon DynamoDB now enables customizable point-in-time recovery (PITR) periods, which means customers can specify recovery durations ranging from 1 to 35 days on a per-table basis. This enhancement enables organizations to meet precise compliance requirements while maximizing cost-efficiency. The feature is now available across all AWS Regions, including AWS GovCloud (US West) and China Regions. This flexibility in data recovery periods empowers customers to align their backup policies precisely with their business requirements and regulatory obligations.

Amazon MSK Connect APIs with AWS PrivateLinkAmazon Managed Streaming for Apache Kafka Connect (Amazon MSK Connect) APIs now support AWS PrivateLink, giving customers access to MSK Connect APIs through private endpoints within their virtual private cloud (VPC). This enhancement provides increased security and reduced data exposure by keeping traffic within the AWS network.

Generative AI and machine learning

Amazon Q Developer in SageMaker Code EditorAmazon Q Developer is now integrated into the Amazon SageMaker Code Editor integrated development environment (IDE), enhancing the developer’s experience with AI-powered code assistance. Intelligent code suggestions, documentation assistance, and contextual recommendations are now directly available within the SageMaker development environment.

Management and governance

AWS Systems Manager Automation in AWS ChatbotAWS Chatbot now offers 20 additional AWS Systems Manager Automation runbook recommendations, expanding its capabilities for automated operations management. These new recommendations help customers streamline their operational tasks and implement best practices more efficiently through chat-based interactions.

AWS Transit Gateway cost analysis enhancement – We’ve introduced new capabilities for analyzing Transit Gateway data processing charges using cost allocation tags. This feature provides improved visibility and control over networking costs, enabling organizations to track and optimize AWS Transit Gateway usage efficiently. The enhanced cost analysis tools deliver detailed insights into network traffic patterns and associated costs.

Other AWS news and highlights

2024’s most popular DevOps blog posts – The retrospective blog post “The most visited DevOps and Developer Productivity blog posts in 2024” has reached the top one position on this week’s AWS most popular articles chart. This compilation presents the most influential DevOps content from 2024, offering insights into trending topics and best practices. The collection examines key developments in continuous integration and continuous development (CI/CD), infrastructure as code (IaC), and automation practices.

New security course for generative AIAWS Skill Builder has released a new course focusing on securing generative AI applications on AWS. This comprehensive training teaches professionals to implement security best practices for artificial intelligence and machine learning (AI/ML) workloads, addressing data protection, model security, and compliance requirements. The course meets the growing demand for specialized security knowledge in the rapidly evolving field of generative AI.

Amazon Connect Contact Lens free trials – We’re introducing free trials for first-time users of Amazon Connect Contact Lens conversational analytics and performance evaluations. New customers can process up to 100,000 voice minutes monthly at no cost for 2 months, and first-time performance evaluation users receive a 30-day free trial starting with their first evaluation. With this initiative, customers can experience Contact Lens capabilities in their environment without additional costs. The free trials are available across all AWS Regions where Contact Lens is supported.

For a full list of AWS announcements, be sure to keep an eye on the What’s New with AWS page.

Whether you’re a developer, architect, business leader, or you’re starting your cloud journey – and regardless of what 2024 brought your way – 2025 presents new opportunities for everyone.

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Betty

Let’s Architect! Serverless developer experience in AWS

Post Syndicated from Luca Mezzalira original https://aws.amazon.com/blogs/architecture/lets-architect-serverless-developer-experience-in-aws/

Are you a developer approaching serverless for the first time, or even an experienced one looking for a better way to accelerate your feedback loop from code to production? This collection of resources is perfect for you!

There are plenty of developer goodies available on AWS to streamline your code creation and achieve a faster flow in your development lifecycle. Let us share a few examples with you.

What if I told you that you could have an assistant to create your tests? Or that you could review the schema of DynamoDB tables without logging into the AWS Console? Get ready to discover some game-changing tools and techniques that will revolutionize your serverless development process.

And if you want to know more, check out the AWS developer center for more content dedicated to your developer experience on AWS.

Enjoy the journey!

Introducing an enhanced local IDE experience for AWS Lambda developers

We’re excited to announce significant enhancements to the AWS Toolkit, designed to streamline the AWS Lambda development experience. These new features bring the power of Lambda directly to your local development environment, allowing you to work more efficiently within your preferred IDE.

With this update, you can now create, test, and debug Lambda functions locally with unprecedented ease. The toolkit supports local invocation of Lambda functions, enabling real-time testing and debugging without cloud deployment. We’ve also incorporated intelligent code completion and inline documentation for AWS SDK calls, reducing errors and accelerating your coding process.

These improvements offer substantial benefits: faster iteration cycles, deeper insights into Lambda function behavior, and the ability to deliver high-quality serverless applications more rapidly. Whether you’re new to serverless or an experienced Lambda developer, this enhanced local development experience provides a more intuitive and productive environment for building cloud-native solutions.

AWS Toolkit offers the possibility to retrieve real-time the logs of your AWS Lambda functions directly inside your IDE

Figure 1. AWS Toolkit offers the possibility to retrieve real-time the logs of your AWS Lambda functions directly inside your IDE

Take me to this blog

Test Driven Development with Amazon Q Developer

Amazon Q for developers is a versatile AI-powered assistant designed to enhance various aspects of the software development lifecycle. This innovative tool can help streamline numerous tasks, from writing code and documentation to generating unit tests, effectively reducing the time spent on common development activities. By embracing Amazon Q Developer, developers can boost their productivity and focus more on creative problem-solving, with capabilities like test generation serving as just one example of how it can accelerate the development process and improve code quality.

In this example, you will discover how Amazon Q Developer can help out to embrace test-driven development (TDD) in your projects.

Amazon Q developer in action! As you can see you can choose the right recommendation for your code

Figure 2. Amazon Q Developer in action! As you can see you can choose the right recommendation for your code

Take me to this blog

Stop guesstimating the Lambda functions memory size

Optimizing Lambda function performance is crucial for both cost efficiency and user experience, yet many developers still rely on guesswork when setting memory allocations. This approach often leads to suboptimal configurations, resulting in either wasted resources or underperforming functions. Here is where AWS Lambda Power Tuning comes in. By automatically testing your Lambda function with various memory configurations, you can identify the optimal balance between performance and cost. This data-driven approach ensures your functions run at peak efficiency, potentially reducing costs and improving response times. Moreover, as your application evolves, regular power tuning can help you adapt to changing requirements and usage patterns.

The output of running Lambda Power Tuning with your code is a diagram that shows you the best memory size based on your goals. Either optimized for cost or response time or you can choose a more balanced approach

Figure 3. The output of running Lambda Power Tuning with your code is a diagram that shows you the best memory size based on your goals. Either optimized for cost or response time or you can choose a more balanced approach

Take me to this tool

NoSQL Workbench for Amazon DynamoDB

Developers working with Amazon DynamoDB have a powerful ally in their local development toolkit: NoSQL Workbench for Amazon DynamoDB. This intuitive, graphical tool changes the way you interact with DynamoDB tables, offering a fast and efficient feedback loop right on your laptop. With NoSQL Workbench, you can visually design, create, and modify your DynamoDB table structures without the need to constantly access the AWS Console. The tool’s data modeler allows you to experiment with different schemas, ensuring optimal design before deployment. Need to populate your tables for testing? NoSQL Workbench has you covered with its data visualization and manipulation features, enabling quick data insertion and querying. Moreover, its ability to generate sample data and visualize query results in real-time accelerates the development and debugging process.

Visualizing single table design helps you to understand how to structure your serverless applications

Figure 4. Visualizing single table design helps you to understand how to structure your serverless applications

Take me to the documentation

Instrument observability for Lambda functions with Powertools

AWS Lambda Powertools is your go-to open source project when you want to instrument observability and beyond for AWS Lambda functions. Available for multiple programming languages including Python, Node.js, Java, and .NET, Powertools empowers developers to build production-ready Lambda functions with ease. At its core, it provides comprehensive observability features, enabling structured logging, creating custom metrics, and implementing distributed tracing with minimal overhead. But Powertools doesn’t stop there – it also includes utilities for parameter store and secrets management, making it simpler to handle configuration and sensitive data. The suite offers idempotency helpers to ensure reliable execution of your functions, even in the face of retries or duplicates. With its event handler functions, Powertools streamlines the processing of various AWS events, reducing boilerplate code and potential errors. By adopting Powertools, developers can significantly reduce the time spent on implementing best practices, allowing them to focus on building business logic while ensuring their Lambda functions are performant, secure, and easily maintainable.

Powertools for Python goes over and beyond just observability as you can see by the list on the left of this screenshot

Figure 5. Powertools for Python goes over and beyond just observability as you can see by the list on the left of this screenshot

Take me to this tool

AWS Serverless developer experience workshop

The AWS Serverless Developer Experience workshop is an hands-on guide that brings together all the cutting-edge tools and techniques we’ve discussed, offering developers a holistic approach to building serverless applications. This free, self-paced workshop is designed to elevate your serverless development skills, regardless of your experience level. It covers a wide range of topics, from implementing best practices with AWS Lambda Powertools, to optimizing your functions using AWS Lambda Power Tuning. The workshop also delves into CI/CD practices, showing you how to automate your deployment pipeline for faster, more reliable releases.

The serverless developer experience architecture you will work on during the workshop

Figure 6. The serverless developer experience architecture you will work on during the workshop

Take me to the workshop

See you next time!

Thanks for reading! This is the last post of the year, thank you so much for being with us for the 3rd year in a row. To revisit any of our previous posts or explore the entire series, visit the Let’s Architect! page.

Introducing new Event Source Mapping (ESM) metrics for AWS Lambda

Post Syndicated from Chris McPeek original https://aws.amazon.com/blogs/compute/introducing-new-event-source-mapping-esm-metrics-for-aws-lambda/

This post is written by Tarun Rai Madan, Principal Product Manager –  Serverless, and Rajesh Kumar Pandey, Principal Software Engineer, Serverless

Today, AWS is announcing new opt-in Amazon CloudWatch metrics for AWS Lambda Event Source Mappings that subscribe to Amazon Simple Queue Service (Amazon SQS), Amazon Kinesis, and Amazon DynamoDB event sources. These metrics include PolledEventCount, InvokedEventCount, FilteredOutEventCount, FailedInvokeEventCount, DeletedEventCount, DroppedEventCount, and OnFailureDestinationDeliveredEventCount. The new metrics enable customers to monitor the processing state of events read by Event Source Mappings (ESMs), and helps them diagnose processing issues.

Previously, customers found it challenging to monitor the processing state of events read by an ESM. An ESM is a resource that polls events from an event source and invokes a Lambda function. With the new metrics for ESMs, you can count events by their processing state, which includes events that were polled, invoked, filtered out, deleted, dropped, failed, or sent to on-failure destination.

Overview

Customers building modern event-driven applications use services like SQS, Kinesis, and DynamoDB as fundamental building blocks for developing decoupled architectures, and use a Lambda function as a consumer to benefit from its simplicity, auto-scaling and cost effectiveness. To subscribe to an event source, customers configure a Lambda Event Source Mapping (ESM). An ESM is a fully-managed Lambda resource that runs an event poller which polls, processes (e.g., filters and batches), and delivers the events to a Lambda function. Due to the processing that happens on an ESM, for example, filtering, batching, and delivery to on-failure destinations, events can end up in varying terminal states. As a result, some polled events may not invoke a Lambda function. Previously, the count of polled, filtered, invoked, deleted or dropped events was not visible to customers. This made it challenging for customers to diagnose processing issues with their ESM, resulting from faulty permissions, misconfiguration, or function errors.

What’s new

With today’s announcement, customers can opt-in to CloudWatch metrics to monitor the processing state of events that are read by an ESM configured with SQS, Kinesis and DynamoDB as event sources.

PolledEventCount metric counts the number of events read by an ESM from the event source.

InvokedEventCount metric counts the number of events that invoked your Lambda function. For an event that experiences function errors, this metric may increase the count multiple times for the same polled event, due to retries.

FilteredOutEventCount metric counts the number of events filtered out by your ESM, based on the Filter Criteria defined by you.

FailedInvokeEventCount metric counts the number of events that attempted to invoke a Lambda function, but encountered partial or complete failure.

DeletedEventCount metric counts the events that have been deleted from the SQS queue by Lambda upon successful processing.

DroppedEventCount metric counts the number of events dropped due to event expiry or exhaustion of retry attempts, for Kinesis and DynamoDB ESMs configured with MaximumRecordAgeInSeconds or MaximumRetryAttempts.

OnFailureDestinationDeliveredEventCount metric counts the events sent to an on-failure destination upon reaching the MaximumRecordAgeInSeconds or MaximumRetryAttempts, for ESMs configured with DestinationConfig.

How to use the new ESM metrics

Once an ESM is created and reaches enabled state, it continuously polls the event source for new events. You can monitor the PolledEventCount metric to catch issues with polling due to misconfigured or deleted event source, misconfigured or deleted Lambda function execution role, incorrect permissions, or throttles from the event source. This metric typically increases when there is an increase in traffic in the event source. You can observe the InvokedEventCount metric to catch issues with the Lambda function, and whether the events are properly invoking your Lambda function. In case of Lambda function errors, InvokedEventCount could be more than PolledEventCount due to retries. This metric would also increase when there is an increase in events processed by an ESM. For ESMs that have filter criteria configured, you can monitor the FilteredOutEventCount to count events that were not sent to a Lambda function because they were filtered out per the defined filter criteria.

You can monitor the FailedInvokeEventCount metric to observe the number of events that failed processing when Lambda service tried to invoke your Lambda function. Invocations can fail due to network configuration issues, incorrect permissions, or a deleted Lambda function, version, or alias. If your event source mapping has partial batch responses enabled, this metric includes any event with a non-empty BatchItemFailures in the response. If all events in a batch are successfully processed by your Lambda function, Lambda service emits a 0 value for this metric. You can use the DeletedEventCount metric to ensure that processed events have been successfully deleted from your SQS queue after being processed by the Lambda function. You can use the DroppedEventCount metric to identify issues with message backlogs or misconfigured event expiry rules. You can use the OnFailureDestinationDeliveredEventCount metric to monitor issues such as failed events caused by Lambda function invocation errors.

The classification for available Lambda ESM metrics by event source is presented below:

CloudWatch metric SQS DynamoDB Kinesis Data Stream
PolledEventCount
InvokedEventCount
FilteredOutEventCount
FailedInvokeEventCount
DeletedEventCount
DroppedEventCount
OnFailureDestinationDeliveredEventCount

Activating and testing the new ESM metrics

You can enable the new ESM metrics using AWS Lambda Console, AWS Command Line Interface (CLI), Lambda ESM API, AWS SDK, AWS CloudFormation, and AWS Serverless Application Model (SAM). The metrics will be published under the AWS/Lambda namespace and EventSourceMappingUUID dimension in the CloudWatch console. To learn more, see CloudWatch metrics for Lambda.

Using AWS CLI

To turn on the new metrics using AWS CLI, use the –metrics-config parameter.

aws lambda create-event-source-mapping \
    --region <region-name> \
    --function-name <function-name> \
    --event-source-arn <event-source-arn> \
    --metrics-config '{"Metrics": ["EventCount"]}'

Using AWS Lambda Console

To turn on the new metrics using AWS Lambda Console, click on “Enable metrics” while adding the trigger for your function.

Enabling ESM metrics in AWS Console.

Figure 1: Enabling ESM metrics in AWS Console

A typical scenario where the new ESM metrics can help with better observability is an ESM that uses event filtering. To test the ESM metrics, you can deploy a sample Lambda application with Kinesis as an event source using this serverless pattern, which uses event filtering with a certain criteria to control which events are sent to Lambda. Use this pattern for both the example scenarios; please follow the setup guidelines for this pattern and continue with testing for the scenarios. Running this sample project in your account may incur charges. See AWS Lambda pricing and Amazon Kinesis pricing.

Configuring Lambda function with Kinesis event source.

Figure 2: Configuring Lambda function with Kinesis event source

Example scenario 1: ESM metrics with event filtering configured

The following diagram shows the results for the test scenario with Kinesis ESM, where the total polled events, filtered events, invoked events, and failed events are represented by PolledEventCount, FilteredOutEventCount, InvokedEventCount and FailedInvokeEventCount.

Image of ESM metrics for scenario 1.

Figure 3: ESM metrics for scenario 1

Example Scenario 2: ESM metrics with event filtering and On-Failure Destination configured

Another common scenario is where you want to have visibility around the number of events delivered to Lambda function, events filtered, and additionally, the count of events routed to on-failure destination upon failure. To test this scenario, follow a setup similar to the one in scenario 1. Create or update the ESM with an on-failure destination, and set MaximumRetryCount to 1, as shown below.

aws lambda update-event-source-mapping \
    --uuid <event-source-mapping-uuid> \
    --maximum-retry-attempts 1 \
    --filter-criteria '{"Filters": [{"Pattern": "{\"data\": { \"tire_pressure\": [ { \"numeric\": [ \"<\", 32 ] } ] } }"}]}' \
    --destination-config '{"OnFailure": {"Destination": "<your_SQS_queue_ARN>"}}' \
    --function-name <lambda-function-name>

Publish a sample payload which matches the FilterCriteria defined above. Also generate sample data with different “tire_pressure” < 32 to match the event and invoke the Lambda function.

Sample Data:

{
    "time": "2021-11-09 13:32:04",
    "fleet_id": "fleet-452",
    "vehicle_id": "a42bb15c-43eb-11ec-81d3-0242ac130003",
    "lat": 47.616226213162406,
    "lon": -122.33989110734133,
    "speed": 43,
    "odometer": 43519,
    "tire_pressure": [41, 40, 31, 41],
    "weather_temp": 76,
    "weather_pressure": 1013,
    "weather_humidity": 66,
    "weather_wind_speed": 8,
    "weather_wind_dir": "ne"
}

Once you have published these records to the stream, you should be able to see the CloudWatch metrics under AWS/Lambda namespace with the EventSourceMappingUUID dimension, as shown below. Note that if an event experiences a function error, InvokedEventCount may increase multiple times for the same polled event due to automatic retries.

Image of ESM metrics for scenario 2.

Figure 4: ESM metrics for scenario 2

Available Now

The new ESM metrics are generally available in all commercial regions that Lambda service is available in. Support is also available through AWS Lambda partners like Datadog, Elastic, and Lumigo. The Lambda service sends these new metrics to CloudWatch at no additional cost to you. However, charges apply for CloudWatch metrics at standard CloudWatch metrics pricing for these opt-in metrics, in addition to your AWS Lambda pricing and event source pricing.

Conclusion

With these new CloudWatch metrics, you can gain visibility into the processing state of your events that are polled by Lambda Event Source Mapping (ESM) for queue-based or stream-based applications. The blog explains the new metrics PolledEventCount, InvokedEventCount, FilteredOutEventCount, FailedInvokeEventCount, DeletedEventCount, DroppedEventCount, and OnFailureDestinationDeliveredEventCount, and how to use them to troubleshoot event processing issues for Lambda functions. These metrics help you track the invocation requests sent to Lambda via an ESM, monitor any delays or issues in processing, and take corrective actions if required. To learn more about these metrics, visit Lambda developer guide.

For more serverless learning resources, visit Serverless Land.

Build a Two-Way Email-to-SMS Service with Amazon SES and Amazon End User Messaging

Post Syndicated from Cheng Wang original https://aws.amazon.com/blogs/messaging-and-targeting/build-a-two-way-email-to-sms-service-with-amazon-ses-and-amazon-end-user-messaging/

Introduction

Businesses and organizations today struggle to effectively communicate with their customers, employees, or other stakeholders across the diverse range of digital channels they now use. One common problem arises when the requirement to exchange information quickly and reliably extends beyond traditional email. This issue challenges organizations where recipients lack immediate access to email. This applies to field workers, remote teams, or customers who prefer to communicate via text messages. It is vital to bridge this gap between email and SMS communication for timely updates, urgent notifications, and seamless collaboration. However, separate management of these disparate channels independently proves cumbersome and leads to inefficiencies.

To address this challenge, one approach is to leverage Amazon Simple Email Service (SES) and Amazon End User Messaging services to create a robust, scalable, and cost-effective messaging system. This system seamlessly bridges the gap between email and SMS, enhances the reach and delivery of your messages and streamlines your communication workflows. Ultimately, this approach delivers a superior experience for your audience, ensuring that critical information reaches recipients through their preferred channels in a timely and efficient manner.

This blog post will delve into the step-by-step process of building a solution that enables both Email-to-SMS and SMS-to-Email communication. This solution allows you to send SMS messages using email and receive any SMS replies on the same email address you used to send the original message. Furthermore, you can continue the conversation by replying to the email you receive in response. By the end, you’ll have the knowledge and tools necessary to revolutionize your communication strategy and deliver a superior experience to your audience.

Here are some of the use cases for this solution:

  • Real estate agents can use this solution to send listing updates to clients via SMS, and then receive client inquiries and responses as emails.
  • Customer service team can leverage the Email to SMS functionality to proactively reach out to customers with important notifications. Customers are able to respond directly via SMS.
  • Retailers can use this solution to send order confirmations, shipping updates, and promotional offers to customers via SMS. Customers are able to respond with inquiries or feedback that are then received as emails.
  • Medical practices and hospitals can leverage the Email to SMS functionality to quickly notify patients of appointment reminders, prescription refills, or other time-sensitive information. Patients can then respond via SMS, which gets converted to an email that the healthcare staff can access.

Solution Overview

The following figure shows the high level architecture for this solution.

Figure 1: Two-Way Email-To-SMS architecture

  1. Email Users send an email to the email address formatted as “mobile-number@verified-domain”. Amazon SES email receiving receives the email and triggers a receipt rule.
  2. The email is published to Amazon Simple Notification Service (SNS) topic (EmailToSMS) based on the receipt rule action, which triggers an AWS Lambda function (ConvertEmailToSMS). The ConvertEmailToSMS Lambda function performs the following actions:
    1. Receives the event from SNS and constructs a text message using the email body content.
    2. The constructed message is then sent to the “mobile-number” in the destination email address using the “SendTextMessage” API from AWS End User Messaging SMS. This is achieved by using a phone number in AWS End User Messaging SMS as the origination identity.
    3. The SMS message ID and source email address are stored as items in the Amazon DynamoDB table (MessageTrackingTable). This tracks email addresses for replies from SMS users.
  3. Mobile Users receive the SMS, and they have the option to reply to the phone number with two-way SMS messaging
  4. AWS End User Messaging receives the incoming SMS message from the Mobile Users. It then publishes this message to a SNS topic (SMSToEmail) for two-way SMS integration, which triggers a Lambda function (ConvertSMSToEmail). The ConvertSMSToEmail Lambda function performs the following actions:
    1. Retrieves the item from “MessageTrackingTable” using “previousPublishedMessageId” (SMS message ID) from the SNS event, and locate the corresponding email address.
    2. Sends the SMS message body to the Email Users using SES. This step uses “mobile-number@verified-domain” as the source email address, and the email address retrieved from the previous step as the destination.
  5. Email Users receive the email, and they have the option to reply to the email to continue the conversation. Mobile Users will receive the latest reply from Email Users.

Walkthrough

This section will dive into the step-by step process for the deployment. There are 4 steps to deploy this solution:

  1. Configure SES verified identity for email receiving and sending.
  2. Deploy the CloudFormation stack for the Email to SMS functionality.
  3. Deploy the CloudFormation stack for the SMS to Email functionality.
  4. Set up two-way SMS messaging in AWS End User Messaging SMS.

Note: the Lambda code for this solution is developed based on phone numbers and long code as the supported origination identity in Australia. You need to adjust the Lambda code (“format_phone_number” function) accordingly for this to work in your country.

Refer to this GitHub repository for the solution source code.

Prerequisites

Prerequisites for this walkthrough:

  1. Administrator-level access to an AWS account
  2. A domain or subdomain that you own to create SES verified identity
  3. An origination identity that supports two-way messaging, following Choosing an origination identity for two-way messaging use cases. Simulator phone numbers are available if you are in the US
  4. A mobile phone to send and receive SMS
  5. An email address to send and receive emails

Step 1: Configure SES Verified Identity

Follow the steps outlined in Creating a domain identity to create a verified identity for your domain in your AWS account. Confirm your domain identity is in the “Verified” status before proceeding to the next step:

Figure 2: Verified Identity

Step 2: Deploy Email to SMS functionality

The following steps create a CloudFormation stack to deploy the required components for Email to SMS functionality:

  1. Sign in to your AWS account.
  2. Download the email-to-sms.yaml for creating the stack.
  3. Navigate to the AWS CloudFormation page.
  4. Choose Create stack, and then choose With new resources (standard).
  5. Choose Upload a template file and upload the CloudFormation template that you downloaded earlier: email-to-sms.yaml. Then choose Next.
  6. Enter the stack name Email-To-SMS.
  7. Enter the following values for the parameters:
    • RuleName: The name of your SES Rule Set and receipt rule.
    • Recipient1: Domain name used for recipient condition in the SES Rule Set.
    • Recipient2: Domain name used for recipient condition in the SES Rule Set if you need additional recipients.
    • PhoneNumberId: Phone number ID of the phone number to send SMS messages.
  8. Choose Next, and then optionally enter tags and choose Submit. Wait for the stack creation to finish.

Now you have the required components to convert email to text, and sending it as SMS to a phone number using AWS End User Messaging SMS.

Note: if required, modify the following code in email-to-sms.yaml to format your phone numbers accordingly:

def format_phone_number(email_address):

    # Extract the local part of the email address (before @)

    local_part = email_address.split('@')[0]   

    # Remove the leading '0' and add '+61' for phone number (Australia)

    if local_part.startswith('0'):

        formatted_number = '+61' + local_part[1:]

    return formatted_number

Step 3: Deploy SMS to Email functionality

The following steps create a CloudFormation stack to deploy the required components for SMS to Email functionality:

  1. Sign in to your AWS account.
  2. Download the sms-to-email.yaml for creating the stack.
  3. Navigate to the AWS CloudFormation page.
  4. Choose Create stack, and then choose With new resources (standard).
  5. Choose Upload a template file and upload the CloudFormation template that you downloaded earlier: sms-to-email.yaml. Then choose Next.
  6. Enter the stack name SMS-To-Email.
  7. Enter the following values for the parameters:
    • EmailDomain: The email domain to use for the SMS-to-Email function
  8. Choose Next, and then optionally enter tags and choose Submit. Wait for the stack creation to finish.

Note: if required, modify the following code in sms-to-email.yaml to format your phone numbers accordingly:

def format_phone_number(phone_number):

    # Replace the '+61' with '0' from the phone number (Australia)

    formatted_number = f"0{mobile_number[3:]}"

    return formatted_number

Step 4: Set up Two-Way Messaging in AWS End User Messaging SMS

Follow the steps 1 – 5 outlined in Set up two-way SMS messaging for a phone number in AWS End User Messaging SMS.

For step 6:

  • For Destination type, choose Amazon SNS.
  • Choose Existing SNS standard topic.
  • For Incoming messages destination, choose the SNS topic created from Step 3 (default topic name is SMSToEmailTopic).
  • For Two-way channel role, choose Use SNS topic policies.
  • Choose Save changes.

This allows your origination identity (phone number) to receive incoming messages, which is then published to an SNS topic and converted into emails by Lambda.

Testing

To test the solution, send an email with the destination address of “mobile-number@verified-domain”. You should receive a SMS on your mobile with the following information:

Note: AWS End User Messaging SMS has character limit for SMS messages depending on the type of characters the message contains. This solution takes the first 160 characters of the email body by default, you can adjust the EmailToSMS Lambda function as required.

Reply directly to the SMS, you should receive an email at the same email address that sent the original email, with the following information:

  • Subject: Re:mobile-number
  • Body: SMS message content
  • Source email address: mobile-number@verified-domain

If you are not receiving the email or SMS, check the Lambda CloudWatch logs for troubleshooting.

Cleaning up

To remove unneeded resources after testing the solution, follow these steps:

  1. In the CloudFormation console, delete the Email-To-SMS stack
  2. In the CloudFormation console, delete the SMS-To-Email stack
  3. If applicable, in Amazon SES, delete the verified identities
  4. If applicable, in AWS End User Messaging, release the unused phone numbers

Additional Consideration

Conclusion

This blog has explored how organizations can leverage AWS services to build a flexible, two-way communication solution bridging the gap between email and SMS channels. By integrating Amazon SES and Amazon End User Messaging, businesses can reach their audience through multiple channels, ensuring timely and effective delivery of critical messages.

The detailed guide provided the knowledge to create a scalable, cost-effective system tailored to evolving communication needs – whether facilitating email-to-SMS or SMS-to-email exchanges. This unified approach to email and SMS capabilities empowers companies to address the common challenge of managing disparate communication platforms, streamlining workflows and enhancing responsiveness.

If you run into issues or want to submit a feature request, use the New issue button under the issues tab in the GitHub repository.

Get started with Amazon DynamoDB zero-ETL integration with Amazon Redshift

Post Syndicated from Ekta Ahuja original https://aws.amazon.com/blogs/big-data/get-started-with-amazon-dynamodb-zero-etl-integration-with-amazon-redshift/

We’re excited to announce the general availability (GA) of Amazon DynamoDB zero-ETL integration with Amazon Redshift, which enables you to run high-performance analytics on your DynamoDB data in Amazon Redshift with little to no impact on production workloads running on DynamoDB. As data is written into a DynamoDB table, it’s seamlessly made available in Amazon Redshift, eliminating the need to build and maintain complex data pipelines.

Zero-ETL integrations facilitate point-to-point data movement without the need to create and manage data pipelines. You can create zero-ETL integration on an Amazon Redshift Serverless workgroup or Amazon Redshift provisioned cluster using RA3 instance types. You can then run enhanced analysis on this DynamoDB data with the rich capabilities of Amazon Redshift, such as high-performance SQL, built-in machine learning (ML) and Spark integrations, materialized views (MV) with automatic and incremental refresh, data sharing, and the ability to join data across multiple data stores and data lakes.

The DynamoDB zero-ETL integration with Amazon Redshift has helped our customers simplify their extract, transform, and load (ETL) pipelines. The following is a testimony from Keith McDuffee, Director of DevOps at Verisk Analytics, a customer who used zero-ETL integration with DynamoDB in place of their homegrown solution and benefitted from the seamless replication that it provided:

“We have dashboards built on top of our transactional data in Amazon Redshift. Earlier, we used our homegrown solution to move data from DynamoDB to Amazon Redshift, but those jobs would often time out and lead to a lot of operational burden and missed insights on Amazon Redshift. Using the DynamoDB zero-ETL integration with Amazon Redshift, we no longer run into such issues and the integration seamlessly and continuously replicates data to Amazon Redshift.”

In this post, we showcase how an ecommerce application can use this zero-ETL integration to analyze the distribution of customers by attributes such as location and customer signup date. You can also use the integration for retention and churn analysis by calculating retention rates by comparing the number of active profiles over different time periods.

Solution overview

The zero-ETL integration provides end-to-end fully managed process that allows data to be seamlessly moved from DynamoDB tables to Amazon Redshift without the need for manual ETL processes, ensuring efficient and incremental updates in Amazon Redshift environment. It leverages DynamoDB exports to incrementally replicate data changes from DynamoDB to Amazon Redshift every 15-30 minutes. The initial data load is a full load, which may take longer depending on the data volume. This integration also enables replicating data from multiple DynamoDB tables into a single Amazon Redshift provisioned cluster or serverless workgroup, providing a holistic view of data across various applications.

This replication is done with little to no performance or availability impact to your DynamoDB tables and without consuming DynamoDB read capacity units (RCUs). Your applications will continue to use DynamoDB while data from those tables will be seamlessly replicated to Amazon Redshift for analytics workloads such as reporting and dashboards.

The following diagram illustrates this architecture.

In the following sections, we show how to get started with DynamoDB zero-ETL integration with Amazon Redshift. This general availability release supports creating and managing the zero-ETL integrations using the AWS Command Line Interface (AWS CLI), AWS SDKs, API, and AWS Management Console. In this post, we demonstrate using the console.

Prerequisites

Complete the following prerequisite steps:

  1. Enable point-in-time recovery (PITR) on the DynamoDB table.
  2. Enable case sensitivity for the target Redshift data warehouse.
  3. Attach the resource-based policies to both DynamoDB and Amazon Redshift as mentioned in here.
  4. Make sure the AWS Identity and Access Management (IAM) user or role creating the integration has an identity-based policy that authorizes actions listed in here.

Create the DynamoDB zero-ETL integration

You can create the integration either on the DynamoDB console or Amazon Redshift console. The following steps use the Amazon Redshift console.

  1. On the Amazon Redshift console, choose Zero-ETL integrations in the navigation pane.
  2. Choose Create DynamoDB integration.

If you choose to create the integration on the DynamoDB console, choose Integrations in the navigation pane and then choose Create integration and Amazon Redshift.

  1. For Integration name, enter a name (for example, ddb-rs-customerprofiles-zetl-integration).
  2. Choose Next.
  1. Choose Browse DynamoDB tables and choose the table that will be the source for this integration.
  2. Choose Next.

You can only choose one table. If you need data from multiple tables in a single Redshift cluster, you need to create a separate integration for each table.

If you don’t have PITR enabled on the source DynamoDB table, an error will pop up while choosing the source. In this case, you can select Fix it for me for DynamoDB to enable the PITR on your source table. Review the changes and choose Continue.

  1. Choose Next.
  1. Choose your target Redshift data warehouse. If it’s in the same account, you can browse and choose the target. If the target resides in a different account, you can provide the Amazon Resource Name (ARN) of the target Redshift cluster.

If you get an error about the resource policy, select Fix it for me for Amazon Redshift to fix policies as part of this creation process. Alternatively, you can add resource policies for Amazon Redshift manually prior to creating zero-ETL integration. Review the changes and choose Reboot and continue.

  1. Choose Next and complete your integration.

The zero-ETL integration creation should show the status Creating. Wait for the status to change to Active.

Create a Redshift database from the integration

Complete the following steps to create a Redshift database:

  1. On the Amazon Redshift console, navigate to the recently created zero-ETL integration.
  2. Choose Create database from integration.

  1. For Destination database name, enter a name (for example, ddb_rs_customerprofiles_zetl_db).
  2. Choose Create database.

After you create the database, the database state should change from Creating to Active. This will start the replication of data in the source DynamoDB tables to the target Redshift tables, which will be created under the public schema of the destination database (ddb_rs_customerprofiles_zetl_db).

Now you can query your data in Amazon Redshift using the integration with DynamoDB.

Understanding your data

Data exported from DynamoDB to Amazon Redshift is stored in the Redshift database that you created from your zero-ETL integration (ddb_rs_customerprofiles_zetl_db). A single table of the same name as the DynamoDB source table is created and is under the default (public) Redshift schema. DynamoDB only enforces schemas for the primary key attributes (partition key and optionally sort key). Because of this, your DynamoDB table structure is replicated to Amazon Redshift in three columns: partition key, sort key, and a SUPER data type column named value that contains all the attributes. The data in this value column is in DynamoDB JSON format. For information about the data format, see DynamoDB table export output format.

The DynamoDB partition key is used as the Redshift table distribution key, and the combination of the DynamoDB partition and sort keys are used as the Redshift table sort keys. Amazon Redshift also allows changing the sort keys on the zero-ETL integration replicated tables using the ALTER SORT KEY command.

The DynamoDB data in Amazon Redshift is read-only data. After the data is available in the Amazon Redshift table, you can query the value column as a SUPER data type using PartiQL SQL or create and query materialized views on the table, which are incrementally refreshed automatically.

For more information about the SUPER data type, see Semistructured data in Amazon Redshift.

Query the data

To validate the ingested records, you can use the Amazon Redshift Query Editor to query the target table in Amazon Redshift using PartiQL SQL. For example, you can use the following query to select email and unnest the data in the value column to the retrieve the customer name and address:

select email, 
       value.custname."S"::text custname, 
       value.address."S"::text custaddress, 
       value 
from "ddb_rs_customerprofiles_zetl_db".public."customerprofiles"

To demonstrate the replication of incremental changes in action, we make the following updates to the source DynamoDB table:

  1. Add two new items in the DynamoDB table:
    ##Incremental changes
    ##add 2 items
    
    aws dynamodb put-item --table-name customerprofiles --item  '{ "email": { "S": "[email protected]" }, "custname": { "S": "Sarah Wilson" }, "username": { "S": "swilson789" }, "phone": { "S": "555-012-3456" }, "address": { "S": "789 Oak St, Chicago, IL 60601" }, "custcreatedt": { "S": "2023-04-01T09:00:00Z" }, "custupddt": { "S": "2023-04-01T09:00:00Z" }, "status": { "S": "active" } }'
    
    aws dynamodb put-item --table-name customerprofiles --item  '{ "email": { "S": "[email protected]" }, "custname": { "S": "Michael Taylor" }, "username": { "S": "mtaylor123" }, "phone": { "S": "555-246-8024" }, "address": { "S": "246 Maple Ave, Los Angeles, CA 90001" }, "custcreatedt": { "S": "2022-11-01T08:00:00Z" }, "custupddt": { "S": "2022-11-01T08:00:00Z" }, "status": { "S": "active" } }'

  2. Update the address for one of the items in the DynamoDB table:
    ##update an item
    aws dynamodb update-item --table-name customerprofiles --key '{"email": {"S": "[email protected]"}}' --update-expression "SET address = :a" --expression-attribute-values '{":a":{"S":"124 Main St, Somewhereville USA "}}' 

  3. Delete the item where email is [email protected]:
    # # delete an item
    
    aws dynamodb delete-item --table-name customerprofiles --key '{"email": {"S": "[email protected]"}}' 

With these changes, the DynamoDB table customerprofiles has four items (three existing, two new, and one delete), as shown in the following screenshot.

Next, you can go to the query editor to validate these changes. At this point, you can expect incremental changes to reflect in the Redshift table (four records in table).

Create materialized views on zero-ETL replicated tables

Common analytics use cases generally involve aggregating data across multiple source tables using complex queries to generate reports and dashboards for downstream applications. Customers usually create late binding views to meet such use cases, which aren’t always optimized to meet the stringent query SLAs due to the long underlying query runtimes. Another option is to create a table that stores the data across multiple source tables, which brings the challenge of incrementally updating and refreshing data based on the changes in the source table.

To serve such use cases and get around the challenges associated with traditional options, you can create materialized views on top of zero-ETL replicated tables in Amazon Redshift, which can get automatically refreshed incrementally as the underlying data changes. Materialized views are also convenient for storing frequently accessed data by unnesting and shredding data stored in the SUPER column value by the zero-ETL integration.

For example, we can use the following query to create a materialized view on the customerprofiles table to analyze customer data:

CREATE MATERIALIZED VIEW dev.public.customer_mv
AUTO REFRESH YES
AS
SELECT value."custname"."S"::varchar(30) as cust_name, value."username"."S"::varchar(100) as user_name, value."email"."S"::varchar(60) as cust_email, value."address"."S"::varchar(100) as cust_addres, value."phone"."S"::varchar(100) as cust_phone_nbr, value."status"."S"::varchar(10) as cust_status,
value."custcreatedt"."S"::varchar(10) as cust_create_dt, value."custupddt"."S"::varchar(10) as cust_update_dt FROM "ddb_rs_customerprofiles_zetl_db"."public"."customerprofiles"
group by 1,2,3,4,5,6,7,8;

This view is set to AUTO REFRESH, which means it will be automatically and incrementally refreshed when the new data arrives in the underlying source table customerprofiles.

Now let’s say you want to understand the distribution of customers across different status categories. You can query the materialized view customer_mv created from the zero-ETL DynamoDB table as follows:

-- Customer count by status
select cust_status,count(distinct user_name) cust_status_count
from dev.public.customer_mv
group by 1;

Next, let’s say you want to compare the number of active customer profiles over different time periods. You can run the following query on customer_mv to get that data:

-- Customer active count by date
select cust_create_dt,count(distinct user_name) cust_count
from dev.public.customer_mv
where cust_status ='active'
group by 1;

Let’s try to make a few incremental changes, which involves two new items and one delete on the source DynamoDB table using following AWS CLI commands.

aws dynamodb put-item --table-name customerprofiles --item  '{ "email": { "S": "[email protected]" }, "custname": { "S": "Robert Davis" }, "username": { "S": "rdavis789" }, "phone": { "S": "555-012-3456" }, "address": { "S": "789 Pine St, Seattle, WA 98101" }, "custcreatedt": { "S": "2022-07-01T14:00:00Z" }, "custupddt": { "S": "2023-04-01T11:30:00Z" }, "status": { "S": "inactive" } }'

aws dynamodb put-item --table-name customerprofiles --item '{ "email": { "S": "[email protected]" }, "custname": { "S": "William Jones" }, "username": { "S": "wjones456" }, "phone": { "S": "555-789-0123" }, "address": { "S": "456 Elm St, Atlanta, GA 30301" }, "custcreatedt": { "S": "2022-09-15T12:30:00Z" }, "custupddt": { "S": "2022-09-15T12:30:00Z" }, "status": { "S": "active" } }'

aws dynamodb delete-item --table-name customerprofiles --key '{"email": {"S": "[email protected]"}}'

Validate the incremental refresh of the materialized view

To monitor the history of materialized view refreshes, you can use the SYS_MV_REFRESH_HISTORY system view. As you can see in the following output, the materialized view customer_mv was incrementally refreshed.

Now let’s query the materialized view created from the zero-ETL table. You can see two new records. The changes were propagated into the materialized view with an incremental refresh.

Monitor the zero-ETL integration

There are several options to obtain metrics on the performance and status of the DynamoDB zero-ETL integration with Amazon Redshift.

On the Amazon Redshift console, choose Zero-ETL integrations in the navigation pane. You can choose the zero-ETL integration you want and display Amazon CloudWatch metrics related to the integration. These metrics are also directly available in CloudWatch.

For each integration, there are two tabs with information available:

  • Integration metrics – Shows metrics such as the lag (in minutes) and data transferred (in KBps)
  • Table statistics – Shows details about tables replicated from DynamoDB to Amazon Redshift such as status, last updated time, table row count, and table size

After inserting, deleting, and updating rows in the source DynamoDB table, the Table statistics section displays the details, as shown in the following screenshot.

In addition to the CloudWatch metrics, you can query the following system views, which provide information about the integrations:

Pricing

AWS does not charge an additional fee for the zero-ETL integration. You pay for existing DynamoDB and Amazon Redshift resources used to create and process the change data created as part of a zero-ETL integration. These include DynamoDB PITR, DynamoDB exports for the initial and ongoing data changes to your DynamoDB data, additional Amazon Redshift storage for storing replicated data, and Amazon Redshift compute on the target. For pricing on DynamoDB PITR and DynamoDB exports, see Amazon DynamoDB pricing. For pricing on Redshift clusters, see Amazon Redshift pricing.

Clean up

When you delete a zero-ETL integration, your data isn’t deleted from the DynamoDB table or Redshift, but data changes happening after that point of time aren’t sent to Amazon Redshift.

To delete a zero-ETL integration, complete the following steps:

  1. On the Amazon Redshift console, choose Zero-ETL integrations in the navigation pane.
  2. Select the zero-ETL integration that you want to delete and on the Actions menu, choose Delete.
  1. To confirm the deletion, enter confirm and choose Delete.

Conclusion

In this post, we explained how you can set up the zero-ETL integration from DynamoDB to Amazon Redshift to derive holistic insights across many applications, break data silos in your organization, and gain significant cost savings and operational efficiencies.

To learn more about zero-ETL integration, refer to documentation.


About the authors

Ekta Ahuja is an Amazon Redshift Specialist Solutions Architect at AWS. She is passionate about helping customers build scalable and robust data and analytics solutions. Before AWS, she worked in several different data engineering and analytics roles. Outside of work, she enjoys landscape photography, traveling, and board games.

Raghu Kuppala is an Analytics Specialist Solutions Architect experienced working in the databases, data warehousing, and analytics space. Outside of work, he enjoys trying different cuisines and spending time with his family and friends.

Veerendra Nayak is a Principal Database Solutions Architect based in the Bay Area, California. He works with customers to share best practices on database migrations, resiliency, and integrating operational data with analytics and AI services.

Amazon Aurora PostgreSQL and Amazon DynamoDB zero-ETL integrations with Amazon Redshift now generally available

Post Syndicated from Esra Kayabali original https://aws.amazon.com/blogs/aws/amazon-aurora-postgresql-and-amazon-dynamodb-zero-etl-integrations-with-amazon-redshift-now-generally-available/

Today, I am excited to announce the general availability of Amazon Aurora PostgreSQL-Compatible Edition and Amazon DynamoDB zero-ETL integrations with Amazon Redshift. Zero-ETL integration seamlessly makes transactional or operational data available in Amazon Redshift, removing the need to build and manage complex data pipelines that perform extract, transform, and load (ETL) operations. It automates the replication of source data to Amazon Redshift, simultaneously updating source data for you to use in Amazon Redshift for analytics and machine learning (ML) capabilities to derive timely insights and respond effectively to critical, time-sensitive events.

Using these new zero-ETL integrations, you can run unified analytics on your data from different applications without having to build and manage different data pipelines to write data from multiple relational and non-relational data sources into a single data warehouse. In this post, I provide two step-by-step walkthroughs on how to get started with both Amazon Aurora PostgreSQL and Amazon DynamoDB zero-ETL integrations with Amazon Redshift.

To create a zero-ETL integration, you specify a source and Amazon Redshift as the target. The integration replicates data from the source to the target data warehouse, making it available in Amazon Redshift seamlessly, and monitors the pipeline’s health.

Let’s explore how these new integrations work. In this post, you will learn how to create zero-ETL integrations to replicate data from different source databases (Aurora PostgreSQL and DynamoDB) to the same Amazon Redshift cluster. You will also learn how to select multiple tables or databases from Aurora PostgreSQL source databases to replicate data to the same Amazon Redshift cluster. You will observe how zero-ETL integrations provide flexibility without the operational burden of building and managing multiple ETL pipelines.

Getting started with Aurora PostgreSQL zero-ETL integration with Amazon Redshift
Before creating a database, I create a custom cluster parameter group because Aurora PostgreSQL zero-ETL integration with Amazon Redshift requires specific values for the Aurora DB cluster parameters. In the Amazon RDS console, I go to Parameter groups in the navigation pane. I choose Create parameter group.

I enter custom-pg-aurora-postgres-zero-etl for Parameter group name and Description. I choose Aurora PostgreSQL for Engine type and aurora-postgresql16 for Parameter group family (zero-ETL integration works with PostgreSQL 16.4 or above versions). Finally, I choose DB Cluster Parameter Group for Type and choose Create.

Next, I edit the newly created cluster parameter group by choosing it on the Parameter groups page. I choose Actions and then choose Edit. I set the following cluster parameter settings:

  • rds.logical_replication=1
  • aurora.enhanced_logical_replication=1
  • aurora.logical_replication_backup=0
  • aurora.logical_replication_globaldb=0

I choose Save Changes.

Next, I create an Aurora PostgreSQL database. When creating the database, you can set the configurations according to your needs. Remember to choose Aurora PostgreSQL (compatible with PostgreSQL 16.4 or above) from Available versions and the custom cluster parameter group (custom-pg-aurora-postgres-zero-etl in this case) for DB cluster parameter group in the Additional configuration section.

After the database becomes available, I connect to the Aurora PostgreSQL cluster, create a database named books, create a table named book_catalog in the default schema for this database and insert sample data to use with zero-ETL integration.

To get started with zero-ETL integration, I use an existing Amazon Redshift data warehouse. To create and manage Amazon Redshift resources, visit the Amazon Redshift Getting Started Guide.

In the Amazon RDS console, I go to the Zero-ETL integrations tab in the navigation pane and choose Create zero-ETL integration. I enter postgres-redshift-zero-etl for Integration identifier and Amazon Aurora zero-ETL integration with Amazon Redshift for Integration description. I choose Next.

On the next page, I choose Browse RDS databases to select the source database. For the Data filtering options, I use database.schema.table pattern. I include my table called book_catalog in Aurora PostgreSQL books database. The * in filters will replicate all book_catalog tables in all schemas within books database. I choose Include as filter type and enter books.*.book_catalog into the Filter expression field. I choose Next.

On the next page, I choose Browse Redshift data warehouses and select the existing Amazon Redshift data warehouse as the target. I must specify authorized principals and integration source on the target to enable Amazon Aurora to replicate into the data warehouse and enable case sensitivity. Amazon RDS can complete these steps for me during setup, or I can configure them manually in Amazon Redshift. For this demo, I choose Fix it for me and choose Next.

After the case sensitivity parameter and the resource policy for data warehouse are fixed, I choose Next on the next Add tags and encryption page. After I review the configuration, I choose Create zero-ETL integration.

After the integration succeeded, I choose the integration name to check the details.

Now, I need to create a database from integration to finish setting up. I go to the Amazon Redshift console, choose Zero-ETL integrations in the navigation pane and select the Aurora PostgreSQL integration I just created. I choose Create database from integration.

I choose books as Source named database and I enter zeroetl_aurorapg as the Destination database name. I choose Create database.

After the database is created, I return to the Aurora PostgreSQL integration page. On this page, I choose Query data to connect to the Amazon Redshift data warehouse to observe if the data is replicated. When I run a select query in the zeroetl_aurorapg database, I see that the data in book_catalog table is replicated to Amazon Redshift successfully.

As I said in the beginning, you can select multiple tables or databases from the Aurora PostgreSQL source database to replicate the data to the same Amazon Redshift cluster. To add another database to the same zero-ETL integration, all I have to do is to add another filter to the Data filtering options in the form of database.schema.table, replacing the database part with the database name I want to replicate. For this demo, I will select multiple tables to be replicated to the same data warehouse. I create another table named publisher in the Aurora PostgreSQL cluster and insert sample data to it.

I edit the Data filtering options to include publisher table for replication. To do this, I go to the postgres-redshift-zero-etl details page and choose Modify. I append books.*.publisher using comma in the Filter expression field. I choose Continue. I review the changes and choose Save changes. I observe that the Filtered data tables section on the integration details page has now 2 tables included for replication.

When I switch to the Amazon Redshift Query editor and refresh the tables, I can see that the new publisher table and its records are replicated to the data warehouse.

Now that I completed the Aurora PostgreSQL zero-ETL integration with Amazon Redshift, let’s create a DynamoDB zero-ETL integration with the same data warehouse.

Getting started with DynamoDB zero-ETL integration with Amazon Redshift
In this part, I proceed to create an Amazon DynamoDB zero-ETL integration using an existing Amazon DynamoDB table named Book_Catalog. The table has 2 items in it:

I go to the Amazon Redshift console and choose Zero-ETL integrations in the navigation pane. Then, I choose the arrow next to the Create zero-ETL integration and choose Create DynamoDB integration. I enter dynamodb-redshift-zero-etl for Integration name and Amazon DynamoDB zero-ETL integration with Amazon Redshift for Description. I choose Next.

On the next page, I choose Browse DynamoDB tables and select the Book_Catalog table. I must specify a resource policy with authorized principals and integration sources, and enable point-in-time recovery (PITR) on the source table before I create an integration. Amazon DynamoDB can do it for me, or I can change the configuration manually. I choose Fix it for me to automatically apply the required resource policies for the integration and enable PITR on the DynamoDB table. I choose Next.

Then, I choose my existing Amazon Redshift Serverless data warehouse as the target and choose Next.

I choose Next again in the Add tags and encryption page and choose Create DynamoDB integration in the Review and create page.

Now, I need to create a database from integration to finish setting up just like I did with Aurora PostgreSQL zero-ETL integration. In the Amazon Redshift console, I choose the DynamoDB integration and I choose Create database from integration. In the popup screen, I enter zeroetl_dynamodb as the Destination database name and choose Create database.

After the database is created, I go to the Amazon Redshift Zero-ETL integrations page and choose the DynamoDB integration I created. On this page, I choose Query data to connect to the Amazon Redshift data warehouse to observe if the data from DynamoDB Book_Catalog table is replicated. When I run a select query in the zeroetl_dynamodb database, I see that the data is replicated to Amazon Redshift successfully. Note that the data from DynamoDB is replicated in SUPER datatype column and can be accessed using PartiQL sql.

I insert another entry to the DynamoDB Book_Catalog table.

When I switch to the Amazon Redshift Query editor and refresh the select query, I can see that the new record is replicated to the data warehouse.

Zero-ETL integrations between Aurora PostgreSQL and DynamoDB with Amazon Redshift help you unify data from multiple database clusters and unlock insights in your data warehouse. Amazon Redshift allows cross-database queries and materialized views based off the multiple tables, giving you the opportunity to consolidate and simplify your analytics assets, improve operational efficiency, and optimize cost. You no longer have to worry about setting up and managing complex ETL pipelines.

Now available
Aurora PostgreSQL zero-ETL integration with Amazon Redshift is now available in US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Hong Kong), Asia Pacific (Mumbai), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm) AWS Regions.

Amazon DynamoDB zero-ETL integration with Amazon Redshift is now available in all commercial, China and GovCloud AWS Regions.

For pricing information, visit the Amazon Aurora and Amazon DynamoDB pricing pages.

To get started with this feature, visit Working with Aurora zero-ETL integrations with Amazon Redshift and Amazon Redshift Zero-ETL integrations documentation.

— Esra

Differentiate generative AI applications with your data using AWS analytics and managed databases

Post Syndicated from Diego Colombatto original https://aws.amazon.com/blogs/big-data/differentiate-generative-ai-applications-with-your-data-using-aws-analytics-and-managed-databases/

While the potential of generative artificial intelligence (AI) is increasingly under evaluation, organizations are at different stages in defining their generative AI vision. In many organizations, the focus is on large language models (LLMs), and foundation models (FMs) more broadly. This is just the tip of the iceberg, because what enables you to obtain differential value from generative AI is your data.

Generative AI applications are still applications, so you need the following:

  • Operational databases to support the user experience for interaction steps outside of invoking generative AI models
  • Data lakes to store your domain-specific data, and analytics to explore them and understand how to use them in generative AI
  • Data integrations and pipelines to manage (sourcing, transforming, enriching, and validating, among others) and render data usable with generative AI
  • Governance to manage aspects such as data quality, privacy and compliance to applicable privacy laws, and security and access controls

LLMs and other FMs are trained on a generally available collective body of knowledge. If you use them as is, they’re going to provide generic answers with no differential value for your company. However, if you use generative AI with your domain-specific data, it can provide a valuable perspective for your business and enable you to build differentiated generative AI applications and products that will stand out from others. In essence, you have to enrich the generative AI models with your differentiated data.

On the importance of company data for generative AI, McKinsey stated that “If your data isn’t ready for generative AI, your business isn’t ready for generative AI.”

In this post, we present a framework to implement generative AI applications enriched and differentiated with your data. We also share a reusable, modular, and extendible asset to quickly get started with adopting the framework and implementing your generative AI application. This asset is designed to augment catalog search engine capabilities with generative AI, improving the end-user experience.

You can extend the solution in directions such as the business intelligence (BI) domain with customer 360 use cases, and the risk and compliance domain with transaction monitoring and fraud detection use cases.

Solution overview

There are three key data elements (or context elements) you can use to differentiate the generative AI responses:

  • Behavioral context – How do you want the LLM to behave? Which persona should the FM impersonate? We call this behavioral context. You can provide these instructions to the model through prompt templates.
  • Situational context – Is the user request part of an ongoing conversation? Do you have any conversation history and states? We call this situational context. Also, who is the user? What do you know about user and their request? This data is derived from your purpose-built data stores and previous interactions.
  • Semantic context – Is there any meaningfully relevant data that would help the FMs generate the response? We call this semantic context. This is typically obtained from vector stores and searches. For example, if you’re using a search engine to find products in a product catalog, you could store product details, encoded into vectors, into a vector store. This will enable you to run different kinds of searches.

Using these three context elements together is more likely to provide a coherent, accurate answer than relying purely on a generally available FM.

There are different approaches to design this type of solution; one method is to use generative AI with up-to-date, context-specific data by supplementing the in-context learning pattern using Retrieval Augmented Generation (RAG) derived data, as shown in the following figure. A second approach is to use your fine-tuned or custom-built generative AI model with up-to-date, context-specific data.

The framework used in this post enables you to build a solution with or without fine-tuned FMs and using all three context elements, or a subset of these context elements, using the first approach. The following figure illustrates the functional architecture.

Technical architecture

When implementing an architecture like that illustrated in the previous section, there are some key aspects to consider. The primary aspect is that, when the application receives the user input, it should process it and provide a response to the user as quickly as possible, with minimal response latency. This part of the application should also use data stores that can handle the throughput in terms of concurrent end-users and their activity. This means predominantly using transactional and operational databases.

Depending on the goals of your use case, you might store prompt templates separately in Amazon Simple Storage Service (Amazon S3) or in a database, if you want to apply different prompts for different usage conditions. Alternatively, you might treat them as code and use source code control to manage their evolution over time.

NoSQL databases like Amazon DynamoDB, Amazon DocumentDB (with MongoDB compatibility), and Amazon MemoryDB can provide low read latencies and are well suited to handle your conversation state and history (situational context). The document and key value data models allow you the flexibility to adjust the schema of the conversation state over time.

User profiles or other user information (situational context) can come from a variety of database sources. You can store that data in relational databases like Amazon Aurora, NoSQL databases, or graph databases like Amazon Neptune.

The semantic context originates from vector data stores or machine learning (ML) search services. Amazon Aurora PostgreSQL-Compatible Edition with pgvector and Amazon OpenSearch Service are great options if you want to interact with vectors directly. Amazon Kendra, our ML-based search engine, is a great fit if you want the benefits of semantic search without explicitly maintaining vectors yourself or tuning the similarity algorithms to be used.

Amazon Bedrock is a fully managed service that makes high-performing FMs from leading AI startups and Amazon available through a unified API. You can choose from a wide range of FMs to find the model that is best suited for your use case. Amazon Bedrock also offers a broad set of capabilities to build generative AI applications with security, privacy, and responsible AI. Amazon Bedrock provides integrations with both Aurora and OpenSearch Service, so you don’t have to explicitly query the vector data store yourself.

The following figure summarizes the AWS services available to support the solution framework described so far.

Catalog search use case

We present a use case showing how to augment the search capabilities of an existing search engine for product catalogs, such as ecommerce portals, using generative AI and customer data.

Each customer will have their own requirements, so we adopt the framework presented in the previous sections and show an implementation of the framework for the catalog search use case. You can use this framework for both catalog search use cases and as a foundation to be extended based on your requirements.

One additional benefit about this catalog search implementation is that it’s pluggable to existing ecommerce portals, search engines, and recommender systems, so you don’t have to redesign or rebuild your processes and tools; this solution will augment what you currently have with limited changes required.

The solution architecture and workflow is shown in the following figure.

The workflow consists of the following steps:

  1. The end-user browses the product catalog and submits a search, in natual language, using the web interface of the frontend catalog application (not shown). The catalog frontend application sends the user search to the generative AI application. Application logic is currently implemented as a container, but it can be deployed with AWS Lambda as required.
  2. The generative AI application connects to Amazon Bedrock to convert the user search into embeddings.
  3. The application connects with OpenSearch Service to search and retrieve relevant search results (using an OpenSearch index containing products). The application also connects to another OpenSearch index to get user reviews for products listed in the search results. In terms of searches, different options are possible, such as k-NN, hybrid search, or sparse neural search. For this post, we use k-NN search. At this stage, before creating the final prompt for the LLM, the application can perform an additional step to retrieve situational context from operational databases, such as customer profiles, user preferences, and other personalization information.
  4. The application gets prompt templates from an S3 data lake and creates the engineered prompt.
  5. The application sends the prompt to Amazon Bedrock and retrieves the LLM output.
  6. The user interaction is stored in a data lake for downstream usage and BI analysis.
  7. The Amazon Bedrock output retrieved in Step 5 is sent to the catalog application frontend, which shows results on the web UI to the end-user.
  8. DynamoDB stores the product list used to display products in the ecommerce product catalog. DynamoDB zero-ETL integration with OpenSearch Service is used to replicate product keys into OpenSearch.

Security considerations

Security and compliance are key concerns for any business. When adopting the solution described in this post, you should always factor in the Security Pillar best practices from the AWS Well-Architecture Framework.

There are different security categories to consider and different AWS Security services you can use in each security category. The following are some examples relevant for the architecture shown in this post:

  • Data protection – You can use AWS Key Management Service (AWS KMS) to manage keys and encrypt data based on the data classification policies defined. You can also use AWS Secrets Manager to manage, retrieve, and rotate database credentials, API keys, and other secrets throughout their lifecycles.
  • Identity and access management – You can use AWS Identity and Access Management (IAM) to specify who or what can access services and resources in AWS, centrally manage fine-grained permissions, and analyze access to refine permissions across AWS.
  • Detection and response – You can use AWS CloudTrail to track and provide detailed audit trails of user and system actions to support audits and demonstrate compliance. Additionally, you can use Amazon CloudWatch to observe and monitor resources and applications.
  • Network security – You can use AWS Firewall Manager to centrally configure and manage firewall rules across your accounts and AWS network security services, such as AWS WAF, AWS Network Firewall, and others.

Conclusion

In this post, we discussed the importance of using customer data to differentiate generative AI usage in applications. We presented a reference framework (including a functional architecture and a technical architecture) to implement a generative AI application using customer data and an in-context learning pattern with RAG-provided data. We then presented an example of how to apply this framework to design a generative AI application using customer data to augment search capabilities and personalize the search results of an ecommerce product catalog.

Contact AWS to get more information on how to implement this framework for your use case. We’re also happy to share the technical asset presented in this post to help you get started building generative AI applications with your data for your specific use case.


About the Authors

Diego Colombatto is a Senior Partner Solutions Architect at AWS. He brings more than 15 years of experience in designing and delivering Digital Transformation projects for enterprises. At AWS, Diego works with partners and customers advising how to leverage AWS technologies to translate business needs into solutions.

Angel Conde Manjon is a Sr. EMEA Data & AI PSA, based in Madrid. He has previously worked on research related to Data Analytics and Artificial Intelligence in diverse European research projects. In his current role, Angel helps partners develop businesses centered on Data and AI.

Tiziano Curci is a Manager, EMEA Data & AI PDS at AWS. He leads a team that works with AWS Partners (G/SI and ISV), to leverage the most comprehensive set of capabilities spanning databases, analytics and machine learning, to help customers unlock the through power of data through an end-to-end data strategy.

How Wesfarmers Health implemented upstream event buffering using Amazon SQS FIFO

Post Syndicated from Robbie Cooray original https://aws.amazon.com/blogs/architecture/how-wesfarmers-health-implemented-upstream-event-buffering-using-amazon-sqs-fifo/

Customers of all sizes and industries use Software-as-a-Service (SaaS) applications to host their workloads. Most SaaS solutions take care of maintenance and upgrades of the application for you, and get you up and running in a relatively short timeframe. Why spend time, money, and your precious resources to build and maintain applications when this could be offloaded?

However, working with SaaS solutions can introduce new requirements for integration. This blog post shows you how Wesfarmers Health was able to introduce an upstream architecture using serverless technologies in order to work with integration constraints.

At the end of the post, you will see the final architecture and a sample repository for you to download and adjust for your use case.

Let’s get started!

Consent capture problem

Wesfarmers Health used a SaaS solution to capture consent. When capturing consent for a user, order guarantee and delivery semantics become important. Failure to correctly capture consent choice can lead to downstream systems making non-compliant decisions. This can end up in penalties, financial or otherwise, and might even lead to brand reputation damage.

In Wesfarmers’ case, the integration options did not support a queue with order guarantee nor exactly-once processing. This meant that, with enough load and chance, a user’s preference might be captured incorrectly. Let’s look at two scenarios where this could happen.

In both of these scenarios, the user makes a choice, and quickly changes their mind. These are considered two discreet events:

  1. Event 1 – User confirms “yes.”
  2. Event 2 – User then quickly changes their mind to confirm “no.”

Scenario 1: Incorrect order

In this scenario, two events end up in a queue with no order guarantee. Event 2 might be processed before Event 1, so although the user provided a “no,” the system has now captured a “yes.” This is now considered a non-compliant consent capture.

Animation showing messages processed in the wrong order

Figure 1. Animation showing messages processed in the wrong order

Scenario 2 – events processed multiple times

In this scenario, perhaps due to the load, Event 1 was transmitted twice, once before and once after Event 2, due to at least once processing. In this scenario, the user’s record could be updated three times, first with Event 1 with “yes,” then Event 2 with “no,” then again with retransmitted Event 1 with “yes,” which ultimately ends up with a “yes,” also considered a non-compliant consent capture.

Animation showing messages processed multiple times

Figure 2. Animation showing messages processed multiple times

How did Amazon SQS and Amazon DynamoDB help with order?

With Amazon Amazon Simple Queue Service (Amazon SQS), queues come in two flavors: standard and first-in-first-out (FIFO). Standard queues provide best effort ordering and at-least once processing with high throughput, whereas FIFO delivers order and processes exactly once with relatively low throughput, as shown in Figure 3.

Animation showing FIFO queue processing in the correct order

Figure 3. Animation showing FIFO queue processing in the correct order

In Wesfarmers Health’s scenario with relatively few events per user, it made sense to deploy a FIFO queue to deliver messages in the order they arrived and also have them delivered once for each event (see more details on quotas at Amazon SQS FIFO queue quotas).

Wesfarmers Health also employed the use of message group IDs to parallelize all users using a unique userID. This means that they can guarantee order and exactly-once processing at the user level, while processing all users in parallel, as shown in Figure 4.

Animation showing a FIFO queue partitioned per user, in the correct order per user

Figure 4. Animation showing a FIFO queue partitioned per user, in the correct order per user

The buffer implementation

Wesfarmers Health also opted to buffer messages for the same user in order to minimize race conditions. This was achieved by employing an Amazon DynamoDB table to capture the timestamp of the last message that was processed. For this, Wesfarmers Health designed the DynamoDB table shown in Figure 5.

Example DynamoDB schema with messageGroupId based on user, and TTL

Figure 5. Example DynamoDB schema with messageGroupId based on user, and TTL

The messageGroupId value corresponds to a unique identifier for a user. The time-to-live (TTL) value serves dual functions. First, the TTL is the value of the Unix timestamp for the last time a message from a specific user was processed, plus the desired message buffer interval (for example, 60 seconds). It also serves a secondary function of allowing DynamoDB to remove obsolete entries to minimize table size, thus improving cost for certain DynamoDB operations.

In between the Amazon SQS FIFO queue and the Amazon DynamoDB table sits an AWS Lambda function that listens to all events and transmits to the downstream SaaS solution. The main responsibility of this Lambda function is to check the DynamoDB table for the last processed timestamp for the user before processing the event. If, by chance, a user event for the user was already processed within the buffer interval, then that event is sent back to the queue with a visibility timeout that matches the interval, so that the user events for that user is not processed until the buffer interval is passed.

Amazon DynamoDB table and AWS Lambda function introducing the buffer

Figure 6. Amazon DynamoDB table and AWS Lambda function introducing the buffer

Final architecture

Figure 7 shows the high-level architecture diagram that powers this integration. When users send their consent events, it is sent to the SQS FIFO queue first. The AWS Lambda function determines, based on the timestamp stored in the DynamoDB table, whether to process it or delay the message. Once the outcome is determined, the function passes through the event downstream.

Final architecture diagram

Figure 7. Final architecture diagram

Why serverless services were used

The Wesfarmers Health Digital Innovations team is strategically aligned towards a serverless first approach where appropriate. This team builds, maintains, and owns these solutions end-to-end. Using serverless technologies, the team gets to focus on delivering business outcomes while leaving the undifferentiated heavy lifting of managing infrastructure to AWS.

In this specific scenario, the number of requests for consent is sporadic. With serverless technologies, you pay as you go. This is a great use case for workloads that have requests fluctuate throughout the day, providing the customer a great option to be cost efficient.

The team at Wesfarmers Health has been on the serverless journey for a while, and are quite mature in developing and managing these workloads in a production setting using best practices mentioned above and employing the AWS Well Architected Framework to guide their solutions.

Conclusion

SaaS solutions are a great mechanism to move fast and reduce the undifferentiated heavy lifting of building and maintaining solutions. However, integrations play a crucial part as to how these solutions work with your existing ecosystem.

Using AWS services, you can build these integration patterns that is fit for purpose, for your unique requirements.

AWS Serverless Patterns is a great place to get started to see what other patterns exist for your use case.

Next steps

Check out the repository hosted on AWS Patterns that sets up this architecture. You can review, modify, and extend it for your own use case.

How AWS powered Prime Day 2024 for record-breaking sales

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/how-aws-powered-prime-day-2024-for-record-breaking-sales/

The last Amazon Prime Day 2024 (July 17-18) was Amazon’s biggest Prime Day shopping event ever, with record sales and more items sold during the two-day event than any previous Prime Day event. Prime members shopped for millions of deals and saved billions across more than 35 categories globally.

I live in South Korea, but luckily I was staying in Seattle to attend the AWS Heroes Summit during Prime Day 2024. I signed up for a Prime membership and used Rufus, my new AI-powered conversational shopping assistant, to search for items quickly and easily. Prime members in the U.S. like me chose to consolidate their deliveries on millions of orders during Prime Day, saving an estimated 10 million trips. This consolidation results in lower carbon emissions on average.

We know from Jeff’s annual blog post that AWS runs the Amazon website and mobile app that makes these short-term, large scale global events feasible. (check out his 2016, 2017, 2019, 2020, 2021, 2022, and 2023 posts for a look back). Today I want to share top numbers from AWS that made my amazing shopping experience possible.

Prime Day 2024 – all the numbers
Here are some of the most interesting and/or mind-blowing metrics:

Amazon EC2 – Since many of Amazon.com services such as Rufus and Search use AWS artificial intelligence (AI) chips under the hood, Amazon deployed a cluster of over 80,000 Inferentia and Trainium chips for Prime Day. During Prime Day 2024, Amazon used over 250K AWS Graviton chips to power more than 5,800 distinct Amazon.com services (double that of 2023).

Amazon EBS – In support of Prime Day, Amazon provisioned 264 PiB of Amazon EBS storage in 2024, a 62 percent increase compared to 2023. When compared to the day before Prime Day 2024, Amazon.com performance on Amazon EBS jumped by 5.6 trillion read/write I/O operations during the event, or an increase of 64 percent compared to Prime Day 2023. Also, when compared to the day before Prime Day 2024, Amazon.com transferred an incremental 444 petabytes of data during the event, or an increase of 81 percent compared to Prime Day 2023.

Amazon Aurora – On Prime Day, 6,311 database instances running the PostgreSQL-compatible and MySQL-compatible editions of Amazon Aurora processed more than 376 billion transactions, stored 2,978 terabytes of data, and transferred 913 terabytes of data.

Amazon DynamoDB – DynamoDB powers multiple high-traffic Amazon properties and systems including Alexa, the Amazon.com sites, and all Amazon fulfillment centers. Over the course of Prime Day, these sources made tens of trillions of calls to the DynamoDB API. DynamoDB maintained high availability while delivering single-digit millisecond responses and peaking at 146 million requests per second.

Amazon ElastiCache – ElastiCache served more than quadrillion requests on a single day with a peak of over 1 trillion requests per minute.

Amazon QuickSight – Over the course of Prime Day 2024, one Amazon QuickSight dashboard used by Prime Day teams saw 107K unique hits, 1300+ unique visitors, and delivered over 1.6M queries.

Amazon SageMaker – SageMaker processed more than 145B inference requests during Prime Day.

Amazon Simple Email Service (Amazon SES) – SES sent 30 percent more emails for Amazon.com during Prime Day 2024 vs 2023, delivering 99.23 percent of those emails to customers.

Amazon GuardDuty – During Prime Day 2024, Amazon GuardDuty monitored nearly 6 trillion log events per hour, a 31.9% increase from the previous year’s Prime Day.

AWS CloudTrail – CloudTrail processed over 976 billion events in support of Prime Day 2024.

Amazon CloudFront – CloudFront handled a peak load of over 500 million HTTP requests per minute, for a total of over 1.3 trillion HTTP requests during Prime Day 2024, a 30 percent increase in total requests compared to Prime Day 2023.

Prepare to Scale
As Jeff noted in every year, rigorous preparation is key to the success of Prime Day and our other large-scale events. For example, 733 AWS Fault Injection Service experiments were run to test resilience and ensure Amazon.com remains highly available on Prime Day.

If you are preparing for a similar business-critical events, product launches, and migrations, I strongly recommend that you take advantage of newly-branded AWS Countdown, a support program designed for your project lifecycle to assess operational readiness, identify and mitigate risks, and plan capacity, using proven playbooks developed by AWS experts. For example, with additional help from AWS Countdown, Legal Zoom successfully migrated 450 servers with minimal issues and continues to leverage AWS Countdown Premium to streamline and expedite the launch of SaaS applications.

We look forward to seeing what other records will be broken next year!

Channy & Jeff;

Tenant portability: Move tenants across tiers in a SaaS application

Post Syndicated from Aman Lal original https://aws.amazon.com/blogs/architecture/tenant-portability-move-tenants-across-tiers-in-a-saas-application/

In today’s fast-paced software as a service (SaaS) landscape, tenant portability is a critical capability for SaaS providers seeking to stay competitive. By enabling seamless movement between tiers, tenant portability allows businesses to adapt to changing needs. However, manual orchestration of portability requests can be a significant bottleneck, hindering scalability and requiring substantial resources. As tenant volumes and portability requests grow, this approach becomes increasingly unsustainable, making it essential to implement a more efficient solution.

This blog post delves into the significance of tenant portability and outlines the essential steps for its implementation, with a focus on seamless integration into the SaaS serverless reference architecture. The following diagram illustrates the tier change process, highlighting the roles of tenants and admins, as well as the impact on new and existing services in the architecture. The subsequent sections will provide a detailed walkthrough of the sequence of events shown in this diagram.

Incorporating tenant portability within a SaaS serverless reference architecture

Figure 1. Incorporating tenant portability within a SaaS serverless reference architecture

Why do we need tenant portability?

  • Flexibility: Tier upgrades or downgrades initiated by the tenant help align with evolving customer demand, preferences, budget, and business strategies. These tier changes generally alter the service contract between the tenant and the SaaS provider.
  • Quality of service: Generally initiated by the SaaS admin in response to a security breach or when the tenant is reaching service limits, these incidents might require tenant migration to maintain service level agreements (SLAs).

High-level portability flow

Tenant portability is generally achieved through a well-orchestrated process that ensures seamless tier transitions. This process comprises of the following steps:

High-level tenant portability flow

Figure 2. High-level tenant portability flow

  1. Port identity stores: Evaluate the need for migrating the tenant’s identity store to the target tier. In scenarios where the existing identity store is incompatible with the target tier, you’ll need to provision a new destination identity store and administrative users.
  2. Update tenant configuration: SaaS applications store tenant configuration details such as tenant identifier and tier that are required for operation.
  3. Resource management: Initiate deployment pipelines to provision resources in the target tier and update infrastructure-tenant mapping tables.
  4. Data migration: Migrate tenant data from the old tier to the newly provisioned target tier infrastructure.
  5. Cutover: Redirect tenant traffic to the new infrastructure, enabling zero-downtime utilization of updated resources.

Consideration walkthrough

We’ll now delve into each step of the portability workflow, highlighting key considerations for a successful implementation.

1. Port identity stores

The key consideration for porting identity is migrating user identities while maintaining a consistent end-user experience, without requiring password resets or changes to user IDs.

Create a new identity store and associated application client that the frontend can use; after that, we’ll need a mechanism to migrate users. In the reference architecture using Amazon Cognito, a silo refers to each tenant having its own user pool, while a pool refers to multiple tenants sharing a user pool through user groups.

To ensure a smooth migration process, it’s important to communicate with users and provide them with options to avoid password resets. One approach is to notify users to log in before a deadline to avoid password resets. Employ just-in-time migration, enabling password retention during login for uninterrupted user experience with existing passwords.

However, this requires waiting for all users to migrate, potentially leading to a prolonged migration window. As a complementary measure, after the deadline, the remaining users can be migrated by using bulk import, which enforces password resets. This ensures a consistent migration within a defined timeframe, albeit inconveniencing some users.

2. Update tenant configuration

SaaS providers rely on metadata stores to maintain all tenant-related configuration. Updates to tenant metadata should be completed carefully during the porting process. When you update the tenant configuration for the new tier, two key aspects must be considered:

  • Retain tenant IDs throughout the porting process to ensure smooth integration of tenant logging, metrics, and cost allocation post-migration, providing a continuous record of events.
  • Establish new API keys and a throttling mechanism tailored to the new tier to accommodate higher usage limits for the tenants.

To handle this, a new tenant portability service can be introduced in the SaaS reference architecture. This service assigns a different AWS API Gateway usage plan to the tenant based on the requested tier change, and orchestrates calls to other downstream services. Subsequently, the existing tenant management service will need an extension to handle tenant metadata updates (tier, user-pool-id, app-client-id) based on the incoming porting request.

3. Resource management

Successful portability hinges on two crucial aspects during infrastructure provisioning:

  • Ensure tenant isolation constructs are respected in the porting process through mechanisms to prevent cross-tenant access. Either role-based access control (RBAC) or attribute-based-access control (ABAC) can be used to ensure this. ABAC isolation is generally easier to manage during porting if the tenant identifier is preserved, as in the previous step.
  • Ensure instrumentation and metric collection are set up correctly in the new tier. Recreate identical metric filters to ensure monitoring visibility for SaaS operations.

To handle infrastructure provisioning and deprovisioning in the reference architecture, extend the tenant provisioning service:

  • Update the tenant-stack mapping table to record migrated tenant stack details.
  • Initiate infrastructure provisioning or destruction pipelines as needed (for example, to run destruction pipelines after the data migration and user cutover steps).

Finally, ensure new resources comply with required compliance standards by applying relevant security configurations and deploying a compliant version of the application.

By addressing these aspects, SaaS providers can ensure a seamless transition while maintaining tenant isolation and operational continuity.

4. Data migration

The data migration strategy is heavily influenced by architectural decisions such as the storage engine and isolation approach. Minimizing user downtime during migration requires a focus on accelerating the migration process, maintaining service availability, and setting up a replication channel for incremental updates. Additionally, it’s crucial to address schema changes made by tenants in a silo model to ensure data integrity and avoid data loss when transitioning to a pool model.

Extending the reference architecture, a new data porting service can be introduced to enable Amazon DynamoDB data migration between different tiers. DynamoDB partition migration can be accomplished through multiple approaches, including AWS Glue, custom scripts, or duplicating DynamoDB tables and bulk-deleting partitions. We recommend a hybrid approach to achieve zero-downtime migration. This solution applies only when the DynamoDB schema remains consistent across tiers. If the schema has changed, a custom solution is required for data migration.

5. Cutover

The cutover phase involves redirecting users to the new infrastructure, disabling continuous data replication, and ensuring that compliance requirements are met. This includes running tests or obtaining audits/certifications, especially when moving to high-sensitivity silos. After a successful cutover, cleanup activities are necessary, including removing temporary infrastructure and deleting historical tenant data from the previous tier. However, before deleting data, ensure that audit trails are preserved and compliant with regulatory requirements, and that data deletion aligns with organizational policies.

Conclusion

In conclusion, portability is a vital feature for multi-tenant SaaS. It allows tenants to move data and configurations between tiers effortlessly and can be incorporated in reference architecture as above. Key considerations include maintaining consistent identities, staying compliant, reducing downtime and automating the process.