All posts by Scott Wainner

Modernized Database Queuing using Amazon SQS and AWS Services

2021-12-17 Scott Wainner

Post Syndicated from Scott Wainner original https://aws.amazon.com/blogs/architecture/modernized-database-queuing-using-amazon-sqs-and-aws-services/

A queuing system is composed of producers and consumers. A producer enqueues messages (writes messages to a database) and a consumer dequeues messages (reads messages from the database). Business applications requiring asynchronous communications often use the relational database management system (RDBMS) as the default message storage mechanism. But increased message volume, complexity, and size compete with the inherent functionality of the database. The RDBMS becomes a bottleneck for message delivery, while also impacting other traditional enterprise uses of the database.

In this blog, we will show how you can mitigate the RDBMS performance constraints by using Amazon Simple Queue Service (Amazon SQS), while retaining the intrinsic value of the stored relational data.

Problems with legacy queuing methods

Commercial databases such as Oracle offer Advanced Queuing (AQ) mechanisms, while SQL Server supports Service Broker for queuing. The database acts as a message queue system when incoming messages are captured along with metadata. A message stored in a database is often processed multiple times using a sequence of message extraction, transformation, and loading (ETL). The message is then routed for distribution to a set of recipients based on logic that is often also stored in the database.

The repetitive manipulation of messages and iterative attempts at distributing pending messages may create a backlog that interferes with the primary function of the database. This backpressure can propagate to other systems that are trying to store and retrieve data from the database and cause a performance issue (see Figure 1).

Figure 1. A relational database serving as a message queue.

There are several scenarios where the database can become a bottleneck for message processing:

Message metadata. Messages consist of the payload (the content of the message) and metadata that describes the attributes of the message. The metadata often includes routing instructions, message disposition, message state, and payload attributes.

The message metadata may require iterative transformation during the message processing. This creates an inefficient sequence of read, transform, and write processes. This is especially inefficient if the message attributes undergo multiple transformations that must be reflected in the metadata. The iterative read/write process of metadata consumes the database IOPS, and forces the database to scale vertically (add more CPU and more memory).
A new paradigm emerges when message management processes exist outside of the database. Here, the metadata is manipulated without interacting with the database, except to write the final message disposition. Application logic can be applied through functions such as AWS Lambda to transform the message metadata.
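
As a rough illustration of transforming metadata outside the database, the following is a minimal sketch of an AWS Lambda handler fed by an Amazon SQS event source. The message layout (a JSON body with `metadata` and `payload` keys), the routing rule, and the state name `ROUTED` are assumptions made for this example and are not taken from the original architecture.

```python
import json


def transform_metadata(metadata):
    """Return a copy of the metadata with a (hypothetical) routing decision applied."""
    updated = dict(metadata)
    # Hypothetical rule: large payloads take a "bulk" route, everything else "standard".
    updated["route"] = "bulk" if updated.get("payload_bytes", 0) > 256_000 else "standard"
    updated["state"] = "ROUTED"
    return updated


def handler(event, context):
    """Lambda entry point; each SQS record carries the message as a JSON string in 'body'."""
    results = []
    for record in event["Records"]:
        message = json.loads(record["body"])
        message["metadata"] = transform_metadata(message["metadata"])
        results.append(message)
    # Only the final disposition would be written back to the database (not shown here).
    return results
```

The metadata can pass through any number of such transformations without a single database read or write in between.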

Message large object (LOB). A message may contain a large binary object that must be stored in the payload.

Storing large binary objects in the RDBMS is expensive. Manipulating them consumes the throughput of the database with iterative read/write operations. If the LOB must be transformed, then it becomes wasteful to store the object in the database.
An alternative approach offers a more efficient message processing sequence. The large object is stored external to the database in universally addressable object storage, such as Amazon Simple Storage Service (Amazon S3). There is only a pointer to the object that is stored in the database. Smaller elements of the message can be read from or written to the database, while large objects can be manipulated more efficiently in object storage resources.
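
The pointer approach might look like the sketch below. It assumes `s3_client` is an object with a `put_object` method, such as the one returned by `boto3.client("s3")`. The key scheme and the shape of the pointer record are hypothetical.

```python
import hashlib


def store_large_object(s3_client, bucket, payload):
    """Upload the payload to object storage and return a small pointer record."""
    # Content-addressed key (an assumption) so identical payloads map to the same object.
    key = "messages/" + hashlib.sha256(payload).hexdigest()
    s3_client.put_object(Bucket=bucket, Key=key, Body=payload)
    # Only this pointer would be stored in the database row for the message.
    return {"bucket": bucket, "key": key, "size_bytes": len(payload)}
```

The database row then holds a few dozen bytes regardless of how large the binary object is.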

Message fan-out. A message can be loaded into the database and analyzed for routing, where the same message must be distributed to multiple recipients.

Messages that require multiple recipients may require a copy of the message replicated for each recipient. The replication creates multiple writes and reads from the database, which is inefficient.
A new method captures only the routing logic and target recipients in the database. The message replication then occurs outside of the database in distributed messaging systems, such as Amazon Simple Notification Service (Amazon SNS).
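
A minimal sketch of this fan-out, assuming `sns_client` behaves like `boto3.client("sns")`. The attribute name `recipient_group` is a hypothetical example of routing information that would be looked up in the database and attached to the message.

```python
import json


def fan_out(sns_client, topic_arn, message, recipient_group):
    """Publish one message; SNS delivers a copy to every matching subscription."""
    return sns_client.publish(
        TopicArn=topic_arn,
        Message=json.dumps(message),
        # Subscriptions can filter on message attributes, so replication
        # happens in the messaging service rather than in database tables.
        MessageAttributes={
            "recipient_group": {"DataType": "String", "StringValue": recipient_group}
        },
    )
```

The producer writes the message once; the per-recipient copies never touch the database.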

Message queuing. Messages are often kept in the database until they are successfully processed for delivery. If a message is read from the database and determined to be undeliverable, then the message is kept there until a later attempt is successful.

An inoperable message delivery process can create backpressure on the database where iterative message reads are processed for the same message with unsuccessful delivery. This creates a feedback loop causing even more unsuccessful work for the database.
Try a message queuing system such as Amazon MQ or Amazon SQS, which offloads the message queuing from the database. These services offer efficient message retry mechanisms, and reduce iterative reads from the database.
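
One way to express such a retry policy in Amazon SQS is a visibility timeout combined with a dead-letter queue. The sketch below only builds the `Attributes` argument for a `create_queue` call; the default values chosen here are assumptions for illustration.

```python
import json


def queue_attributes(dead_letter_queue_arn, max_receive_count=5, visibility_timeout_s=30):
    """Build the Attributes argument for an SQS create_queue call."""
    return {
        # A received message becomes visible again for retry if it is not
        # deleted within this many seconds. SQS attribute values are strings.
        "VisibilityTimeout": str(visibility_timeout_s),
        # Once a message has been received more than max_receive_count times
        # without being deleted, SQS moves it to the dead-letter queue.
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": dead_letter_queue_arn,
            "maxReceiveCount": max_receive_count,
        }),
    }
```

With boto3 this could be passed as `sqs.create_queue(QueueName=..., Attributes=queue_attributes(dlq_arn))`. Undeliverable messages then stop cycling after a bounded number of attempts instead of being re-read from the database indefinitely.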

Sequenced message delivery. Messages may require ordered delivery where the delivery sequence is crucial for maintaining application integrity.

The application may capture the message order within database tables, but the sorting function still consumes processing capabilities. The order sequence must be sorted and maintained for each attempted message delivery.
Message order can be maintained outside of the database using a queue system, such as Amazon SQS, with first-in/first-out (FIFO) delivery.
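
As a sketch of ordered delivery, the helper below builds the keyword arguments for `send_message` on an SQS FIFO queue. The `sequence_id` parameter and the deduplication ID format are assumptions for this example.

```python
import json


def fifo_send_args(queue_url, message, group_id, sequence_id):
    """Build keyword arguments for sqs.send_message on a FIFO queue."""
    return {
        "QueueUrl": queue_url,  # FIFO queue names end in ".fifo"
        "MessageBody": json.dumps(message),
        # Messages that share a group ID are delivered in the order they were sent.
        "MessageGroupId": group_id,
        # A resend with the same deduplication ID inside the deduplication
        # window is not enqueued a second time.
        "MessageDeduplicationId": f"{group_id}-{sequence_id}",
    }
```

Usage with boto3 would be `sqs.send_message(**fifo_send_args(url, msg, "order-42", 1))`. Using one group ID per business entity preserves order within that entity while letting unrelated groups be processed in parallel.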

Message scheduling. Messages may also be queued with a scheduled delivery attribute. These messages require an event-driven architecture that initiates message delivery at the scheduled time.

The database often uses trigger mechanisms to initiate message delivery. Message delivery may require a synchronized point in time for delivery (many messages at once), which can cause a spike in work at the scheduled interval. This impacts the database performance with artificially induced peak load intervals.
Event signals can be generated in systems such as Amazon EventBridge, which can coordinate the transmission of messages.
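
A scheduled signal could be registered roughly as follows. The sketch assumes `events_client` behaves like `boto3.client("events")` and that `target_arn` identifies a function or queue that starts the delivery. The rule name and target ID are hypothetical.

```python
def schedule_delivery(events_client, rule_name, every_minutes, target_arn):
    """Create a rate-based EventBridge rule and attach one delivery target."""
    # Rate expressions use the singular unit for a value of 1.
    unit = "minute" if every_minutes == 1 else "minutes"
    expression = f"rate({every_minutes} {unit})"
    events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=expression,
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "deliver-scheduled-messages", "Arn": target_arn}],
    )
    return expression
```

The schedule lives in the event service, so no database trigger has to fire at the delivery time.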

Message disposition. Each message maintains a message disposition state that describes the delivery state.

The database is often used as a logging system for message transmission status. The message metadata is updated with the disposition of the message, while the message remains in the database as an artifact.
An optimized technique is available using Amazon CloudWatch as a record of message disposition.
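
One simple way to record disposition, sketched below, is to emit a structured log line from the function that handles the message. Inside AWS Lambda, standard output is captured by Amazon CloudWatch Logs. The field names and state values are assumptions for this example.

```python
import json
import time


def log_disposition(message_id, state, detail=""):
    """Emit one JSON log line describing the delivery state of a message."""
    entry = {
        "message_id": message_id,
        "state": state,  # for example DELIVERED, RETRYING, FAILED (hypothetical names)
        "detail": detail,
        "timestamp": int(time.time()),
    }
    line = json.dumps(entry)
    print(line)  # in AWS Lambda, stdout is collected by CloudWatch Logs
    return line
```

Because each line is JSON, the delivery history of a message can be queried from the logs rather than from rows retained in the database.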

Modernized queuing architecture

Decoupling message queuing from the database improves database availability and enables greater message queue scalability. It also provides a more cost-effective use of the database, and mitigates backpressure created when database performance is constrained by message management.

The modernized architecture uses loosely coupled services, such as Amazon S3, AWS Lambda, Amazon MQ, Amazon SQS, Amazon SNS, Amazon EventBridge, and Amazon CloudWatch. This loosely coupled architecture lets each of the functional components scale vertically and horizontally, independently of the other functions required for message queue management.

Figure 2 depicts a message queuing architecture that uses Amazon SQS for message queuing and AWS Lambda for message routing, transformation, and disposition management. An RDBMS is still leveraged to retain metadata profiles, routing logic, and message disposition. The ETL processes are handled by AWS Lambda, while large objects are stored in Amazon S3. Finally, message fan-out distribution is handled by Amazon SNS, and the queue state is monitored and managed by Amazon CloudWatch and Amazon EventBridge.

Figure 2. Modernized queuing architecture using Amazon SQS

Conclusion

In this blog, we show how queuing functionality can be migrated from the RDBMS while minimizing changes to the business application. The RDBMS continues to play a central role in sourcing the message metadata, running routing logic, and storing message disposition. However, AWS services such as Amazon SQS offload queue management tasks related to the messages. AWS Lambda performs message transformation, queues the message, and transmits the message with massive scale, fault-tolerance, and efficient message distribution.

Read more about the diverse capabilities of AWS messaging services:

By using AWS services, the RDBMS is no longer a performance bottleneck in your business applications. This improves scalability, and provides resilient, fault-tolerant, and efficient message delivery.

Read our blog on modernization of common database functions:

Migrating a Database Workflow to Modernized AWS Workflow Services

Migrating a Database Workflow to Modernized AWS Workflow Services

2021-12-11 Scott Wainner

Post Syndicated from Scott Wainner original https://aws.amazon.com/blogs/architecture/migrating-a-database-workflow-to-modernized-aws-workflow-services/

The relational database is a critical resource in application architecture. Enterprise organizations often use relational database management systems (RDBMS) to provide embedded workflow state management. But this can present problems, such as inefficient use of data storage and compute resources, performance issues, and decreased agility. Add to this the responsibility of managing workflow states through custom triggers and job-based algorithms, which further exacerbates the performance constraints of the database. The complexity of modern workflows, frequency of runtime, and external dependencies encourage us to seek alternatives to using these database mechanisms.

This blog describes how to use modernized workflow methods that will mitigate database scalability constraints. We’ll show how transitioning your workflow state management from a legacy database workflow to AWS services enables new capabilities with scale.

A workflow system is composed of an ordered set of tasks. Jobs are submitted to the workflow where tasks are initiated in the proper sequence to achieve consistent results. Each task is defined with a task input criterion, task action, task output, and task disposition (see Figure 1).

Figure 1. Task with input criteria, an action, task output, and task disposition

Embedded Workflow

Figure 2 depicts the database serving as the workflow state manager where an external entity submits a job for execution into the database workflow. This can be challenging, as the embedded workflow definition requires the use of well-defined database primitives. In addition, any external tasks require tight coupling with database primitives that constrains workflow agility.

Figure 2. Embedded database workflow mechanisms with internal and external task entities

Externalized workflow

A paradigm change is made with use of a modernized workflow management system, where the workflow state exists external to the relational database. A workflow management system is essentially a modernized database specifically designed to manage the workflow state (depicted in Figure 3).

Figure 3. External task manager extracting workflow state, job data, performing the task, and re-inserting the job data back into the database

AWS offers two workflow state management services: Amazon Simple Workflow Service (Amazon SWF) and AWS Step Functions. The workflow definition and workflow state are no longer stored in a relational database; these workflow attributes are incorporated into the AWS service. The AWS services are highly scalable, enable flexible workflow definition, and integrate tasks from many other systems, including relational databases. These capabilities vastly expand the types of tasks available in a workflow. Migrating the workflow management to an AWS service reduces demand placed upon the database. In this way, the database’s primary value of representing structured and relational data is preserved. AWS Step Functions offers a well-defined set of task primitives for the workflow designer. The designer can still incorporate tasks that leverage the inherent relational database capabilities.

Pull and push workflow models

First, we must differentiate between Amazon SWF and AWS Step Functions to determine which service is optimal for your workflow. Amazon SWF uses an HTTPS API pull model where external Workers and Deciders execute Tasks and assert the Next-Step, respectively. The workflow state is captured in the Amazon SWF history table. This table tracks the state of jobs and tasks so a common reference exists for all the candidate Workers and Deciders.

Amazon SWF does require development of external entities that make the appropriate API calls into Amazon SWF. It inherently supports external tasks that require human intervention. This workflow can tolerate long lead times for task execution. The Amazon SWF pull model is represented in Figure 4.
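
To make the pull model concrete, here is a minimal sketch of one Worker iteration. It assumes `swf_client` behaves like `boto3.client("swf")`. The `do_work` callback and the domain and task list names are hypothetical.

```python
def run_worker_once(swf_client, domain, task_list, do_work):
    """Poll for one activity task, run it, and report the outcome.

    Returns True if a task was processed, False if the poll returned no work.
    """
    task = swf_client.poll_for_activity_task(domain=domain, taskList={"name": task_list})
    token = task.get("taskToken")
    if not token:
        return False  # the long poll ended without a task being assigned
    try:
        result = do_work(task.get("input", ""))
        swf_client.respond_activity_task_completed(taskToken=token, result=result)
    except Exception as exc:
        # Report the failure so a Decider can choose the next step.
        swf_client.respond_activity_task_failed(taskToken=token, reason=str(exc)[:256])
    return True
```

A Worker process would call this in a loop. Because the Worker initiates every exchange, it can run anywhere that can reach the Amazon SWF API, and a task can wait as long as needed before the response is sent.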

Figure 4. ‘Pull model’ for workflow definition when using Amazon SWF

In contrast, AWS Step Functions uses a push model, shown in Figure 5, that initiates workflow tasks and integrates seamlessly with other AWS services. AWS Step Functions may also incorporate mechanisms that enable long-running tasks that require human intervention. AWS Step Functions provides the workflow state management, requires minimal coding, and provides traceability of all transactions.
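
In the push model the workflow is declared rather than coded. The sketch below builds a small Amazon States Language definition as JSON. The two state names and the function ARNs passed in are hypothetical, and the retry count is an arbitrary choice for illustration.

```python
import json


def build_state_machine(transform_fn_arn, store_fn_arn):
    """Return an Amazon States Language definition for a two-task workflow."""
    definition = {
        "Comment": "Hypothetical two-task workflow",
        "StartAt": "TransformJob",
        "States": {
            "TransformJob": {
                "Type": "Task",
                "Resource": transform_fn_arn,  # Step Functions invokes this task
                "Retry": [{"ErrorEquals": ["States.TaskFailed"], "MaxAttempts": 3}],
                "Next": "StoreResult",
            },
            "StoreResult": {
                "Type": "Task",
                "Resource": store_fn_arn,
                "End": True,
            },
        },
    }
    return json.dumps(definition, indent=2)
```

The resulting string could be supplied as the `definition` when creating a state machine. Sequencing and retries are handled by the service, so the tasks themselves contain no workflow logic.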

Figure 5. ‘Push model’ for workflow definition when using AWS Step Functions

Workflow optimizations

An external workflow manager, such as AWS Step Functions or Amazon SWF, can effectively handle long-running tasks, computationally complex processes, or large media files. AWS workflow managers support asynchronous call-back mechanisms to track task completion. The state of the workflow is intrinsically captured in the service, and the logging of state transitions is automatically captured. Computationally expensive tasks are addressed by invoking high-performance computational resources.

Finally, the AWS workflow manager also improves the handling of large data objects. Previously, jobs would transfer large data objects (images, videos, or audio) into a database’s embedded workflow manager. But this impacts the throughput capacity and consumes database storage.

In the new paradigm, large data objects are no longer transferred to the workflow as jobs, but as job pointers. These are transferred to the workflow whenever tasks must reference external object storage systems. The sequence of state transitions can be traced through CloudWatch Events. This provides verification of workflow completion, diagnostics of task execution (start, duration, and stop), and metrics on the number of jobs entering the various workflows.
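
Submitting a job as a pointer might look like the following sketch, which assumes `sfn_client` behaves like `boto3.client("stepfunctions")`. The input layout (`job_id` plus an `object` with `bucket` and `key`) is an assumption for this example.

```python
import json


def submit_job(sfn_client, state_machine_arn, job_id, bucket, key):
    """Start a workflow execution whose input is a pointer, not the object itself."""
    pointer = {"job_id": job_id, "object": {"bucket": bucket, "key": key}}
    return sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        name=job_id,  # execution names identify the job in the execution history
        input=json.dumps(pointer),
    )
```

Each task reads the object from storage using the pointer and passes a pointer to its result to the next state.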

Large data objects are best captured in more cost-effective object storage solutions such as Amazon Simple Storage Service (Amazon S3). Data records may be conveyed via a variety of NoSQL storage and messaging mechanisms including:

Amazon DynamoDB: Scalable and fast data retrieval of key-value task datasets
Amazon Simple Notification Service (SNS): Scalable distribution mechanism for tasks
Amazon Simple Queue Service (SQS): Asynchronous processing of data for tasks

The workflow manager stores pointer references so tasks can directly access these data objects and perform transformation on the data. It provides pointers to the results without transferring the data objects to the workflow. Transferring pointers in the workflow as opposed to transferring large data objects significantly improves the performance, reduces costs, and dramatically improves scalability. You may continue to use the RDBMS for the storage of structured data and use its SQL capabilities with structured tables, joins, and stored procedures. AWS Step Functions enables indirect integration with relational databases using tools such as the following:

AWS Lambda: Short-lived execution of custom code to handle tasks
AWS Glue: Data integration enabling combination and preparation of data including SQL

AWS Step Functions can be coupled with AWS Lambda, a serverless compute capability. Lambda code can manipulate the job data and incorporate many other AWS services. AWS Lambda can also interact with any relational database including Amazon Relational Database Service (RDS) or Amazon Aurora as the executor of a task.
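
The sketch below illustrates a task body that writes a job's status to a relational table using plain SQL. To keep it self-contained it uses Python's built-in `sqlite3` as a stand-in for the relational database. This is purely an assumption for illustration; a real task would connect to Amazon RDS or Amazon Aurora with the appropriate driver. The `job_status` table and the event fields are hypothetical.

```python
import sqlite3


def open_database():
    """Create an in-memory stand-in for the relational database (sqlite3, for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE job_status (job_id TEXT PRIMARY KEY, status TEXT)")
    return conn


def record_job_result(conn, event):
    """Task body: store the job's status; 'event' is the state input passed to the task."""
    job_id = event["job_id"]
    status = event.get("status", "COMPLETED")
    with conn:  # commits on success, rolls back if the statement raises
        conn.execute(
            "INSERT OR REPLACE INTO job_status (job_id, status) VALUES (?, ?)",
            (job_id, status),
        )
    return {"job_id": job_id, "status": status}
```

The workflow state stays in the workflow service; the database receives only the structured result it is well suited to store.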

The modernized architecture shown in Figure 6 offers more flexibility in creating new workflows that can evolve with your business requirements.

Figure 6. Using Step Functions as workflow state manager

Summary

Several key advantages are highlighted with this modernized architecture using either Amazon SWF or AWS Step Functions:

You can manage multiple versions of a workflow. Backwards compatibility is maintained as capability expands. Previous business requirements that use metadata interpretation on job submission are preserved.
Tasks leverage loose coupling of external systems. This provides far more data processing and data manipulation capabilities in a workflow.
Upgrades can happen independently. A loosely coupled system enables independent upgrade capabilities of the workflow or the external system executing the task.
Automatic scaling. Serverless architecture scales automatically with the growth in job submissions.
Managed services. AWS provides highly resilient and fault-tolerant managed services.
Recovery. Instance recovery mechanisms can manage workflow state machines.

The modernized workflow using Amazon SWF or AWS Step Functions offers many key advantages. It enables application agility to adapt to changing business requirements. By using a managed service, the enterprise architect can focus on the workflow requirements and task actions, rather than building out a workflow management system. Finally, critical intellectual property developed in the RDBMS system can be preserved as tasks in the modernized workflow using AWS services.

Further reading:
