Tag Archives: Amazon Simple Queue Service (SQS)

ICYMI: Serverless pre:Invent 2019

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/icymi-serverless-preinvent-2019/

With Contributions from Chris Munns – Sr Manager – Developer Advocacy – AWS Serverless

The last two weeks have been a frenzy of AWS service and feature launches, building up to AWS re:Invent 2019. As there has been a lot announced we thought we’d ship an ICYMI post summarizing the serverless service specific features that have been announced. We’ve also dropped in some service announcements from services that are commonly used in serverless application architectures or development.

AWS re:Invent

AWS re:Invent 2019

We also want you to know that we’ll be talking about many of these features (as well as those coming) in sessions at re:Invent.

Here’s what’s new!

AWS Lambda

On September 3, AWS Lambda started rolling out a major improvement to how AWS Lambda functions work with your Amazon VPC networks. This change brings both scale and performance improvements, and addresses several of the limitations of the previous networking model with VPCs.

On November 25, Lambda announced that the rollout of this new capability has completed in 6 additional regions including US East (Virginia) and US West (Oregon).

New VPC to VPC NAT for Lambda functions

New VPC to VPC NAT for Lambda functions

On November 18, Lambda announced three new runtime updates. Lambda now supports Node.js 12, Java 11, and Python 3.8. Each of these new runtimes has new language features and benefits so be sure to check out the linked release posts. These new runtimes are all based on an Amazon Linux 2 execution environment.

Lambda has released a number of controls for both stream and async based invocations:

  • For Lambda functions consuming events from Amazon Kinesis Data Streams or Amazon DynamoDB Streams, it’s now possible to limit the retry count, limit the age of records being retried, configure a failure destination, or split a batch to isolate a problem record. These capabilities will help you deal with potential “poison pill” records that would previously cause streams to pause in processing.
  • For asynchronous Lambda invocations, you can now set the maximum event age and retry attempts on the event. If either configured condition is met, the event can be routed to a dead letter queue (DLQ), Lambda destination, or it can be discarded.

In addition to the above controls, Lambda Destinations is a new feature that allows developers to designate an asynchronous target for Lambda function invocation results. You can set one destination for a success, and another for a failure. This unlocks really useful patterns for distributed event-based applications and can reduce code to send function results to a destination manually.

Lambda Destinations

Lambda Destinations

Lambda also now supports setting a Parallelization Factor, which allows you to set multiple Lambda invocations per shard for Amazon Kinesis Data Streams and Amazon DynamoDB Streams. This allows for faster processing without the need to increase your shard count, while still guaranteeing the order of records processed.

Lambda Parallelization Factor diagram

Lambda Parallelization Factor diagram

Lambda now supports Amazon SQS FIFO queues as an event source. FIFO queues guarantee the order of record processing compared to standard queues which are unordered. FIFO queues support messaging batching via a MessageGroupID attribute which allows for parallel Lambda consumers of a single FIFO queue. This allows for high throughput of record processing by Lambda.

Lambda now supports Environment Variables in AWS China (Beijing) Region, operated by Sinnet and the AWS China (Ningxia) Region, operated by NWCD.

Lastly, you can now view percentile statistics for the duration metric of your Lambda functions. Percentile statistics tell you the relative standing of a value in a dataset, and are useful when applied to metrics that exhibit large variances. They can help you understand the distribution of a metric, spot outliers, and find hard-to-spot situations that create a poor customer experience for a subset of your users.

AWS SAM CLI

AWS SAM CLI deploy command

AWS SAM CLI deploy command

The SAM CLI team simplified the bucket management and deployment process in the SAM CLI. You no longer need to manage a bucket for deployment artifacts – SAM CLI handles this for you. The deployment process has also been streamlined from multiple flagged commands to a single command, sam deploy.

AWS Step Functions

One of the powerful features of Step Functions is its ability to integrate directly with AWS services without you needing to write complicated application code. Step Functions has expanded its integration with Amazon SageMaker to simplify machine learning workflows, and added a new integration with Amazon EMR, making it faster to build and easier to monitor EMR big data processing workflows.

Step Functions step with EMR

Step Functions step with EMR

Step Functions now provides the ability to track state transition usage by integrating with AWS Budgets, allowing you to monitor and react to usage and spending trends on your AWS accounts.

You can now view CloudWatch Metrics for Step Functions at a one-minute frequency. This makes it easier to set up detailed monitoring for your workflows. You can use one-minute metrics to set up CloudWatch Alarms based on your Step Functions API usage, Lambda functions, service integrations, and execution details.

AWS Step Functions now supports higher throughput workflows, making it easier to coordinate applications with high event rates.

In US East (N. Virginia), US West (Oregon), and EU (Ireland), throughput has increased from 1,000 state transitions per second to 1,500 state transitions per second with bucket capacity of 5,000 state transitions. The default start rate for state machine executions has also increased from 200 per second to 300 per second, with bucket capacity of up to 1,300 starts in these regions.

In all other regions, throughput has increased from 400 state transitions per second to 500 state transitions per second with bucket capacity of 800 state transitions. The default start rate for AWS Step Functions state machine executions has also increased from 25 per second to 150 per second, with bucket capacity of up to 800 state machine executions.

Amazon SNS

Amazon SNS now supports the use of dead letter queues (DLQ) to help capture unhandled events. By enabling a DLQ, you can catch events that are not processed and re-submit them or analyze to locate processing issues.

Amazon CloudWatch

CloudWatch announced Amazon CloudWatch ServiceLens to provide a “single pane of glass” to observe health, performance, and availability of your application.

CloudWatch ServiceLens

CloudWatch ServiceLens

CloudWatch also announced a preview of a capability called Synthetics. CloudWatch Synthetics allows you to test your application endpoints and URLs using configurable scripts that mimic what a real customer would do. This enables the outside-in view of your customers’ experiences, and your service’s availability from their point of view.

On November 18, CloudWatch launched Embedded Metric Format to help you ingest complex high-cardinality application data in the form of logs and easily generate actionable metrics from them. You can publish these metrics from your Lambda function by using the PutLogEvents API or for Node.js or Python based applications using an open source library.

Lastly, CloudWatch announced a preview of Contributor Insights, a capability to identify who or what is impacting your system or application performance by identifying outliers or patterns in log data.

AWS X-Ray

X-Ray announced trace maps, which enable you to map the end to end path of a single request. Identifiers will show issues and how they affect other services in the request’s path. These can help you to identify and isolate service points that are causing degradation or failures.

X-Ray also announced support for Amazon CloudWatch Synthetics, currently in preview. X-Ray supports tracing canary scripts throughout the application providing metrics on performance or application issues.

X-Ray Service map with CloudWatch Synthetics

X-Ray Service map with CloudWatch Synthetics

Amazon DynamoDB

DynamoDB announced support for customer managed customer master keys (CMKs) to encrypt data in DynamoDB. This allows customers to bring your own key (BYOK) giving you full control over how you encrypt and manage the security of your DynamoDB data.

It is now possible to add global replicas to existing DynamoDB tables to provide enhanced availability across the globe.

Currently under preview, is another new DynamoDB capability to identify frequently accessed keys and database traffic trends. With this you can now more easily identify “hot keys” and understand usage of your DynamoDB tables.

CloudWatch Contributor Insights for DynamoDB

CloudWatch Contributor Insights for DynamoDB

Last but far from least for DynamoDB, is adaptive capacity, a feature which helps you handle imbalanced workloads by isolating frequently accessed items automatically and shifting data across partitions to rebalance them. This helps both reduce cost by enabling you to provision throughput for a more balanced out workload vs. over provisioning for uneven data access patterns.

AWS Serverless Application Repository

The AWS Serverless Application Repository (SAR) now offers Verified Author badges. These badges enable consumers to quickly and reliably know who you are. The badge will appear next to your name in the SAR and will deep-link to your GitHub profile.

SAR Verified developer badges

SAR Verified developer badges

AWS Code Services

AWS CodeCommit launched the ability for you to enforce rule workflows for pull requests, making it easier to ensure that code has pass through specific rule requirements. You can now create an approval rule specifically for a pull request, or create approval rule templates to be applied to all future pull requests in a repository.

AWS CodeBuild added beta support for test reporting. With test reporting, you can now view the detailed results, trends, and history for tests executed on CodeBuild for any framework that supports the JUnit XML or Cucumber JSON test format.

CodeBuild test trends

CodeBuild test trends

AWS Amplify and AWS AppSync

Instead of trying to summarize all the awesome things that our peers over in the Amplify and AppSync teams have done recently we’ll instead link you to their own recent summary: “A round up of the recent pre-re:Invent 2019 AWS Amplify Launches”.

AWS AppSync

AWS AppSync

Still looking for more?

We only covered a small bit of all the awesome new things that were recently announced. Keep your eyes peeled for more exciting announcements next week during re:Invent and for a future ICYMI Serverless Q4 roundup. We’ll also be kicking off a fresh series of Tech Talks in 2020 with new content helping to dive deeper on everything new coming out of AWS for serverless application developers.

Running Cost-effective queue workers with Amazon SQS and Amazon EC2 Spot Instances

Post Syndicated from peven original https://aws.amazon.com/blogs/compute/running-cost-effective-queue-workers-with-amazon-sqs-and-amazon-ec2-spot-instances/

This post is contributed by Ran Sheinberg | Sr. Solutions Architect, EC2 Spot & Chad Schmutzer | Principal Developer Advocate, EC2 Spot | Twitter: @schmutze

Introduction

Amazon Simple Queue Service (SQS) is used by customers to run decoupled workloads in the AWS Cloud as a best practice, in order to increase their applications’ resilience. You can use a worker tier to do background processing of images, audio, documents and so on, as well as offload long-running processes from the web tier. This blog post covers the benefits of pairing Amazon SQS and Spot Instances to maximize cost savings in the worker tier, and a customer success story.

Solution Overview

Amazon SQS is a fully managed message queuing service that enables customers to decouple and scale microservices, distributed systems, and serverless applications. It is a common best practice to use Amazon SQS with decoupled applications. Amazon SQS increases applications resilience by decoupling the direct communication between the frontend application and the worker tier that does data processing. If a worker node fails, the jobs that were running on that node return to the Amazon SQS queue for a different node to pick up.

Both the frontend and worker tier can run on Spot Instances, which offer spare compute capacity at steep discounts compared to On-Demand Instances. Spot Instances optimize your costs on the AWS Cloud and scale your application’s throughput up to 10 times for the same budget. Spot Instances can be interrupted with two minutes of notification when EC2 needs the capacity back. You can use Spot Instances for various fault-tolerant and flexible applications. These can include analytics, containerized workloads, high performance computing (HPC), stateless web servers, rendering, CI/CD, and queue worker nodes—which is the focus of this post.

Worker tiers of a decoupled application are typically fault-tolerant. So, it is a prime candidate for running on interruptible capacity. Amazon SQS running on Spot Instances allows for more robust, cost-optimized applications.

By using EC2 Auto Scaling groups with multiple instance types that you configured as suitable for your application (for example, m4.xlarge, m5.xlarge, c5.xlarge, and c4.xlarge, in multiple Availability Zones), you can spread the worker tier’s compute capacity across many Spot capacity pools (a combination of instance type and Availability Zone). This increases the chance of achieving the scale that’s required for the worker tier to ingest messages from the queue, and of keeping that scale when Spot Instance interruptions occur, while selecting the lowest-priced Spot Instances in each availability zone.

You can also choose the capacity-optimized allocation strategy for the Spot Instances in your Auto Scaling group. This strategy automatically selects instances that have a lower chance of interruption, which decreases the chances of restarting jobs due to Spot interruptions. When Spot Instances are interrupted, your Auto Scaling group automatically replenishes the capacity from a different Spot capacity pool in order to achieve your desired capacity. Read the blog post “Introducing the capacity-optimized allocation strategy for Amazon EC2 Spot Instances” for more details on how to choose the suitable allocation strategy.

We focus on three main points in this blog:

  1. Best practices for using Spot Instances with Amazon SQS
  2. A customer example that uses these components
  3. Example solution that can help you get you started quickly

Application of Amazon SQS with Spot Instances

Amazon SQS eliminates the complexity of managing and operating message-oriented middleware. Using Amazon SQS, you can send, store, and receive messages between software components at any volume, without losing messages or requiring other services to be available. Amazon SQS is a fully managed service which allows you to set up a queue in seconds. It also allows you to use your preferred SDK to start writing and reading to and from the queue within minutes.

In the following example, we describe an AWS architecture that brings together the Amazon SQS queue and an EC2 Auto Scaling group running Spot Instances. The architecture is used for decoupling the worker tier from the web tier by using Amazon SQS. The example uses the Protect feature (which we will explain later in this post) to ensure that an instance currently processing a job does not get terminated by the Auto Scaling group when it detects that a scale-in activity is required due to a Dynamic Scaling Policy.Architecture diagram for using Amazon SQS with Spot Instances and Auto Scaling groups

AWS reference architecture used for decoupling the worker tier from the web tier by using Amazon SQS

Customer Example: How Trax Retail uses Auto Scaling groups with Spot Instances in their Amazon SQS application

Trax decided to run its queue worker tier exclusively on Spot Instances due to the fault-tolerant nature of its architecture and for cost-optimization purposes. The company digitizes the physical world of retail using Computer Vision. Their ‘Trax Factory’ transforms individual shelf into data and insights about retail store conditions.

Built using asynchronous event-driven architecture, Trax Factory is a cluster of microservices in which the completion of one service triggers the activation of another service. The worker tier uses Auto Scaling groups with dynamic scaling policies to increase and decrease the number of worker nodes in the worker tier.

You can create a Dynamic Scaling Policy by doing the following:

  1. Observe a Amazon CloudWatch metric. Watch the metric for the current number of messages in the Amazon SQS queue (ApproximateNumberOfMessagesVisible).
  2. Create a CloudWatch alarm. This alarm should be based on that metric you created in the prior step.
  3. Use your CloudWatch alarm in a Dynamic Scaling Policy. Use this policy increase and decrease the number of EC2 Instances in the Auto Scaling group.

In Trax’s case, due to the high variability of the number of messages in the queue, they opted to enhance this approach in order to minimize the time it takes to scale, by building a service that would call the SQS API and find the current number of messages in the queue more frequently, instead of waiting for the 5 minute metric refresh interval in CloudWatch.

Trax ensures that its applications are always scaled to meet the demand by leveraging the inherent elasticity of Amazon EC2 instances. This elasticity ensures that end users are never affected and/or service-level agreements (SLA) are never violated.

With a Dynamic Scaling Policy, the Auto Scaling group can detect when the number of messages in the queue has decreased, so that it can initiate a scale-in activity. The Auto Scaling group uses its configured termination policy for selecting the instances to be terminated. However, this policy poses the risk that the Auto Scaling group might select an instance for termination while that instance is currently processing an image. That instance’s work would be lost (although the image would eventually be processed by reappearing in the queue and getting picked up by another worker node).

To decrease this risk, you can use Auto Scaling groups instance protection. This means that every time an instance fetches a job from the queue, it also sends an API call to EC2 to protect itself from scale-in. The Auto Scaling group does not select the protected, working instance for termination until the instance finishes processing the job and calls the API to remove the protection.

Handling Spot Instance interruptions

This instance-protection solution ensures that no work is lost during scale-in activities. However, protecting from scale-in does not work when an instance is marked for termination due to Spot Instance interruptions. These interruptions occur when there’s increased demand for On-Demand Instances in the same capacity pool (a combination of an instance type in an Availability Zone).

Applications can minimize the impact of a Spot Instance interruption. To do so, an application catches the two-minute interruption notification (available in the instance’s metadata), and instructs itself to stop fetching jobs from the queue. If there’s an image still being processed when the two minutes expire and the instance is terminated, the application does not delete the message from the queue after finishing the process. Instead, the message simply becomes visible again for another instance to pick up and process after the Amazon SQS visibility timeout expires.

Alternatively, you can release any ongoing job back to the queue upon receiving a Spot Instance interruption notification by setting the visibility timeout of the specific message to 0. This timeout potentially decreases the total time it takes to process the message.

Testing the solution

If you’re not currently using Spot Instances in your queue worker tier, we suggest testing the approach described in this post.

For that purpose, we built a simple solution to demonstrate the capabilities mentioned in this post, using an AWS CloudFormation template. The stack includes an Amazon Simple Storage Service (S3) bucket with a CloudWatch trigger to push notifications to an SQS queue after an image is uploaded to the Amazon S3 bucket. Once the message is in the queue, it is picked up by the application running on the EC2 instances in the Auto Scaling group. Then, the image is converted to PDF, and the instance is protected from scale-in for as long as it has an active processing job.

To see the solution in action, deploy the CloudFormation template. Then upload an image to the Amazon S3 bucket. In the Auto Scaling Groups console, check the instance protection status on the Instances tab. The protection status is shown in the following screenshot.

instance protection status in console

You can also see the application logs using CloudWatch Logs:

/usr/local/bin/convert-worker.sh: Found 1 messages in https://sqs.us-east-1.amazonaws.com/123456789012/qtest-sqsQueue-1CL0NYLMX64OB

/usr/local/bin/convert-worker.sh: Found work to convert. Details: INPUT=Capture1.PNG, FNAME=capture1, FEXT=png

/usr/local/bin/convert-worker.sh: Running: aws autoscaling set-instance-protection --instance-ids i-0a184c5ae289b2990 --auto-scaling-group-name qtest-autoScalingGroup-QTGZX5N70POL --protected-from-scale-in

/usr/local/bin/convert-worker.sh: Convert done. Copying to S3 and cleaning up

/usr/local/bin/convert-worker.sh: Running: aws s3 cp /tmp/capture1.pdf s3://qtest-s3bucket-18fdpm2j17wxx

/usr/local/bin/convert-worker.sh: Running: aws sqs --output=json delete-message --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/qtest-sqsQueue-1CL0NYLMX64OB --receipt-handle

/usr/local/bin/convert-worker.sh: Running: aws autoscaling set-instance-protection --instance-ids i-0a184c5ae289b2990 --auto-scaling-group-name qtest-autoScalingGroup-QTGZX5N70POL --no-protected-from-scale-in

Conclusion

This post helps you architect fault tolerant worker tiers in a cost optimized way. If your queue worker tiers are fault tolerant and use the built-in Amazon SQS features, you can increase your application’s resilience and take advantage of Spot Instances to save up to 90% on compute costs.

In this post, we emphasized several best practices to help get you started saving money using Amazon SQS and Spot Instances. The main best practices are:

  • Diversifying your Spot Instances using Auto Scaling groups, and selecting the right Spot allocation strategy
  • Protecting instances from scale-in activities while they process jobs
  • Using the Spot interruption notification so that the application stop polling the queue for new jobs before the instance is terminated

We hope you found this post useful. If you’re not using Spot Instances in your queue worker tier, we suggest testing the approach described here. Finally, we would like to thank the Trax team for sharing its architecture and best practices. If you want to learn more, watch the “This is my architecture” video featuring Trax and their solution.

We’d love your feedback—please comment and let me know what you think.


About the authors

 

Ran Sheinberg is a specialist solutions architect for EC2 Spot Instances with Amazon Web Services. He works with AWS customers on cost optimizing their compute spend by utilizing Spot Instances across different types of workloads: stateless web applications, queue workers, containerized workloads, analytics, HPC and others.

 

 

 

 

As a Principal Developer Advocate for EC2 Spot at AWS, Chad’s job is to make sure our customers are saving at scale by using EC2 Spot Instances to take advantage of the most cost-effective way to purchase compute capacity. Follow him on Twitter here! @schmutze

 

New AWS Lambda controls for stream processing and asynchronous invocations

Post Syndicated from Benjamin Smith original https://aws.amazon.com/blogs/compute/new-aws-lambda-controls-for-stream-processing-and-asynchronous-invocations/

Today AWS Lambda is introducing new controls for asynchronous and stream processing invocations. These new features allow you to customize responses to Lambda function errors and build more resilient event-driven and stream-processing applications.

Stream processing function invocations

When processing data from event sources such as Amazon Kinesis Data Streams, and Amazon DynamoDB Streams, Lambda reads records in batches via shards. A shard is a uniquely identified sequence of data records. Your function is then invoked to process records from the batch “in order.” If an error is returned, Lambda retries the batch until processing succeeds or the data expires. This retry behavior is desirable in many cases, but not all:

  1. Until the issue is resolved, no data in the shard is processed. A single malformed record can prevent processing on an entire shard since “in order” guarantee ensures that failed batches are retried until the data record expires. This is often referred to as a “poison pill” event in that it prevents the rest of the system from processing data.
  2. In some cases, it may not be helpful to retry if subsequent invocations will also fail.
  3. If the function is not idempotent, retrying might produce unexpected side effects, and may result in duplicate outputs (for example, multiple database entries or business transactions).
Stream processing function invocationsFigure 1: Default stream invocation retry logs.

The new Lambda controls for stream processing invocations

With the new customizable controls, users are able to control how function errors and retries impact stream processing function invocations. A new event source-mapping configuration subresource (shown below), allows developers to clearly specify which controls will apply specifically to invocations.

DestinationConfig: {
    OnFailure: {
       Destination: SNS/SQS arn (String)
    }
}
Figure 2: Stream processing destination configuration.
{
    "MaximumRetryAttempts": integer,
    "BisectBatchOnFunctionError" : boolean,
    "MaximumRecordAgeInSeconds" : integer
}
Figure 3: Stream processing failure configuration.

MaximumRetryAttempts

Set a maximum number of retry attempts for batches before they can be skipped to unblock processing and prevent duplicate outputs.

Minimum: 0 | maximum: 10,000 | default: 10,000

MaximumRecordAgeInSeconds

Define a record maximum age in seconds, with expired records skipped to allow processing to continue. Data records that do not get successfully processed within the defined age of submission are skipped

Minimum: 60 | default/maximum: 604,800

BisectBatchOnFunctionError

This gives the option to recursively split the failed batch and retry on a smaller subset of records, eventually isolating the metadata causing the error.

Default: false

On-failure Destination

When either MaximumRetryAttempts or MaximumRecordAgeInSeconds reaches the specified value, a record will be skipped. If ‘Destination’ is set, metadata about the skipped records can be sent to a target ARN (i.e. Amazon SQS or Amazon SNS). If no target is configured, then the record will be dropped.

Getting started with error handling for stream processing invocations

Here you will see how to use the new controls to create a customized error handling configuration for stream processing invocations. You will set the maximum retry attempts to 1, the maximum record age to 60s, and send the metadata of skipped record to an Amazon Simple Queue Service (SQS) queue. First create a Kinesis Stream to put records into, and create an SQS queue to hold the metadata of retry exhausted or expired data records.

  1. Go to the Lambda console and choose Create function.
  2. Ensure that Author from scratch is selected, and enter a function a name such as “customStreamErrorExample.”
  3. Select Runtime as Node.js.12.x.
  4. Choose Create function.
    To enable Kinesis as an event trigger and to send metadata of a failed invocation to the SQS standard queue, you must give the Lambda execution role the required permissions.
  5. Scroll down to the Execution Role section and select the view link below the Existing role drop-down.
    Existing role selection
  6. Choose Add inline policy > JSON, then paste the following into the text box replacing {yourSQSarn} with the ARN of the SQS queue you created earlier and replacing {yourKinesisarn} with the ARN of the stream you created earlier. Choose Review policy.
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": [
                    "sqs:SendMessage",
                    "kinesis:GetShardIterator",
                    "kinesis:GetRecords",
                    "kinesis:DescribeStream"
                ],
                "Resource": [
                    "{yourSQSarn}",
                    "{yourKinesisarn}"
                ]
            },
            {
                "Sid": "VisualEditor1",
                "Effect": "Allow",
                "Action": "kinesis:ListStreams",
                "Resource": "*"
            }
        ]
    }
    
  7. Give the policy a name such as CustomSQSKinesis and choose Create policy.
  8. Navigate back to your Lambda function and choose Add trigger.
  9. Select Kinesis from the drop-down. Select your stream in the Kinesis stream dropdown.
  10. To see the new control options, choose the drop-down arrow on the Additional settings section.
  11. Paste the ARN of the previously created SQS queue (refer to this link to find the ARN).Additional settings
  12. Set the Maximum retry attempts to 1. Set the Maximum record age to 60. Choose Add.
    The Kinesis trigger has now been configured and the Lambda function has the required permissions to write to SQS and CloudWatch Logs.Success messageKinesis trigger added
  13. Choose the Lambda icon in the Designer section. Scroll down to the Function code section and paste the following code:
    exports.handler = async (event) => {
        // TODO implement
        console.log(event)
        const response = {
            statusCode: 200,
            body: event,dfh
        };
        return response;
    };

    This code has a syntax error in order to force a failure response.

    Next, you will trigger the Lambda function by adding a record to the Kinesis Stream via the AWS CLI. See how to install the AWS CLI if you have not previously done so.

  14. In your preferred terminal run the following command, replacing {YourStreamName} with the name of the stream you have created.
    aws kinesis put-record --stream-name {YourStreamName} --partition-key 123 --data testdata

    You should see a response similar to this (your sequence number will be different):
    Terminal response

  15. In the AWS Lambda console, choose Monitoring > View logs in CloudWatch and select the most recent log group.CloudWatch Logs
  16. As you can see the Lambda function failed as expected, but this time with only a single retry attempt. This is because the max retry attempt value was set to 1 and the record is younger than the max record age that was set to 60.
  17. In the AWS Management Console navigate to your SQS queue, Services > SQS.
  18. Select your queue and choose Queue Actions > View/Delete Messages > Start Polling for Messages.View/delete messages

Here you will see the metadata of your failed invocation. It has successfully been sent to the SQS destination for further investigation or processing.

Asynchronous function invocations

When a function is asynchronously invoked, Lambda sends the event to a queue before it is processed by your function. Invocations that result in an exception in the function code are retried twice with a delay of one minute before the first retry and two minutes before the second. Some invocations may never run due to a throttle or service fault: these are retried with exponential backoff until the invocation is six hours old.

Default asynchronous invocation retry logs

This retry behaviour shown above, works well in situations where every invocation must run at least once. However, in situations with a large volume of errors subsequent invocations are increasingly delayed as the backlog increases, as each new error places another retry event back into the event queue. If it were possible to skip the event without retrying, it would eliminate this delay and continue processing new events.

In order to avoid this default auto-retry policy, there are several widely used approaches:

  • Function error handling with AWS Step Functions
  • Using the event handler for error routing logic.
  • Third party monitoring services for exception alerts

These workarounds require extra resources, code, or custom logic and add latency to your applications. Retries caused by system failures due to timeouts, lack of memory, early exits, etc. could still occur.

New Lambda controls for asynchronous event invocations

New asynchronous event invoke controls mean that functions can now skip the processing of certain events and discard unwanted or backlogged requests from the asynchronous event queue without the need for “workarounds.” The controls will be accessible from the console and from a new Event Invoke Config subresource, listed in detail below:

{
    "MaximumEventAgeInSeconds": integer,
    "MaximumRetryAttempts": integer
}
Figure 4: Asynchronous event invoke configuration.

MaximumRetryAttempts

Set a maximum number of retry attempts for events before they can be skipped to unblock processing and prevent duplicate outputs.

Minimum: 0 | maximum: 2 | default: 2

MaximumEventAgeInSeconds

Define an event maximum age in seconds, with expired events skipped to allow processing to continue. Events that do not get successfully processed within defined age of submission are written to the function’s configured Dead Letter Queue and/or On-failure Destinations for asynchronous invocations. If none is configured, the event is discarded.

Minimum: 60 | maximum: 21600 (6 hours)

These controls can also be configured from the AWS Lambda console, in the “Failure handling configuration” section shown below:

Edit failure handling configurations

When the same Lambda function is run with maximum retry attempts set to 0, the Amazon CloudWatch Logs show that the function is only invoked a single time, and not retried after the initial error.

CloudWatch Logs

Conclusion

The addition of these new Lambda controls allows you to handle function failures and retries appropriately for both asynchronous and stream processing. This eliminates the need for custom-built logic, alerts, and added overhead.

Developers understand the needs of their own applications. These new features allow them to decide how to react to errors on a per-function basis. Whether that means real-time stream processing, non-blocking event queues, or more retries, it depends on the application’s requirements. With the new controls in place, the power is in your hands.

You can get started with the new controls for stream processing and asynchronous invocations via the AWS Management Console, AWS CLI, AWS SAM, AWS CloudFormation, or AWS SDK for Lambda.

Introducing AWS Lambda Destinations

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/introducing-aws-lambda-destinations/

Today we’re announcing AWS Lambda Destinations for asynchronous invocations. This is a feature that provides visibility into Lambda function invocations and routes the execution results to AWS services, simplifying event-driven applications and reducing code complexity.

Asynchronous invocations

When a function is invoked asynchronously, Lambda sends the event to an internal queue. A separate process reads events from the queue and executes your Lambda function. When the event is added to the queue, Lambda previously only returned a 2xx status code to confirm that the queue has received this event. There was no additional information to confirm whether the event had been processed successfully.

A common event-driven microservices architectural pattern is to use a queue or message bus for communication. This helps with resilience and scalability. Lambda asynchronous invocations can put an event or message on Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), or Amazon EventBridge for further processing. Previously, you needed to write the SQS/SNS/EventBridge handling code within your Lambda function and manage retries and failures yourself.

With Destinations, you can route asynchronous function results as an execution record to a destination resource without writing additional code. An execution record contains details about the request and response in JSON format including version, timestamp, request context, request payload, response context, and response payload. For each execution status such as Success or Failure you can choose one of four destinations: another Lambda function, SNS, SQS, or EventBridge. Lambda can also be configured to route different execution results to different destinations.

Asynchronous Function Execution Result

Success

When a function is invoked successfully, Lambda routes the record to the destination resource for every successful invocation. You can use this to monitor the health of your serverless applications via execution status or build workflows based on the invocation result.

You no longer need to chain long-running Lambda functions together synchronously. Previously you needed to complete the entire workflow within the Lambda 15-minute function timeout, pay for idle time, and wait for a response. Destinations allows you to return a Success response to the calling function and then handle the remaining chaining functions asynchronously.

Failure

Alongside today’s announcement of Maximum Event Age and Maximum Retry Attempt for asynchronous invocations, Destinations gives you the ability to handle the Failure of function invocations along with their Success. When a function invocation fails, such as when retries are exhausted or the event age has been exceeded (hitting its TTL), Destinations routes the record to the destination resource for every failed invocation for further investigation or processing.

Dead Letter Queues (DLQ) have been available since 2016 and are a great way to handle asynchronous failure situations. Destinations provide more useful capabilities by passing additional function execution information, including code exception stack traces, to more destination services.

Destinations and DLQs can be used together and at the same time although Destinations should be considered a more preferred solution. If you already have DLQs set up, existing functionality does not change and Destinations does not replace existing DLQ configurations. If both Destinations and DLQ are used for Failure notifications, function invoke errors are sent to both DLQ and Destinations targets.

How to configure Destinations

Adding Destinations is a straightforward process. This walkthrough uses the AWS Management Console but you can also use the AWS CLI, AWS SAM, AWS CloudFormation, or language-specific SDKs for Lambda.

  1. Open the Lambda console Functions page. Choose an existing Lambda function, or create a new one. In this example, I create a new Lambda function. Choose Create Function.
  2. Enter a Function name, select Node.js 12.x for Runtime, and Choose or create an execution role. Ensure that your Lambda function execution role includes access to the destination resource.
    Basic information
  3. Choose Create function.
  4. Within the Function code pane, paste the following Lambda function code. The code generates a function execution result of either Success or Failure depending on a JSON input ("Success": true or "Success": false).
    // Lambda Destinations tester, Success returns a json blob, Failure throws an error
    
    exports.handler = function(event, context, callback) {
        var event_received_at = new Date().toISOString();
        console.log('Event received at: ' + event_received_at);
        console.log('Received event:', JSON.stringify(event, null, 2));
    
        if (event.Success) {
            console.log("Success");
            context.callbackWaitsForEmptyEventLoop = false;
            callback(null);
        } else {
            console.log("Failure");
            context.callbackWaitsForEmptyEventLoop = false;
            callback(new Error("Failure from event, Success = false, I am failing!"), 'Destination Function Error Thrown');
        }
    };
    
  5. Choose Save.
  6. To configure Destinations, within the Designer pane, choose Add destination.
    Designer pane
  7. Select the Source as Asynchronous invocation. Select the Condition as On failure or On success, depending on your use case. In this example, I select On Success.
  8. Enter the Amazon Resource Name (ARN) for the Destination SQS queue, SNS topic, Lambda function, or EventBridge event bus. In this example, I use the ARN of an SNS topic I have already configured.
    Add destination
  9. Choose Save. The Destination is added to SNS for On Success.
    Designer
  10. Add another Destination for Failure to Lambda. Within the Designer pane, choose Add destination.
    Add destination
  11. Select the Source as Asynchronous invocation, the Condition as On failure and Enter a Destination Lambda function ARN, then choose Save.
    Enter a Destination Lambda function ARN, and choose Save
  12. The Destination is added to Lambda for On Failure.
    7. The Destination has been added to Lambda for On Failure.

Success testing

To test invoking the asynchronous Lambda function to generate a Success result, use the AWS CLI:

aws lambda invoke --function-name event-destinations --invocation-type Event --payload '{ "Success": true }' response.json

The Lambda function is invoked successfully with a response "StatusCode": 202.

And an SNS notification email is received, showing the invocation details with "condition":"Success" and the requestPayload.

{
	"version": "1.0",
	"timestamp": "2019-11-24T23:08:25.651Z",
	"requestContext": {
		"requestId": "c2a6f2ae-7dbb-4d22-8782-d0485c9877e2",
		"functionArn": "arn:aws:lambda:sa-east-1:123456789123:function:event-destinations:$LATEST",
		"condition": "Success",
		"approximateInvokeCount": 1
	},
	"requestPayload": {
		"Success": true
	},
	"responseContext": {
		"statusCode": 200,
		"executedVersion": "$LATEST"
	},
	"responsePayload": null
}

Failure testing

The Lambda function can be set to Failure by throwing an exception within the code. To test invoking the asynchronous Lambda function to generate a Failure result, use the AWS CLI:

aws lambda invoke --function-name event-destinations --invocation-type Event --payload '{ "Success": false }' response.json

The Lambda function is executed and reports a successful invoke on the Lambda processing queue. If Lambda is not able to add the event to the queue, the error message appears in the command output.

However, due to the exception error within the code, the function invocation will fail. Destinations then routes the invoke failure to the configured destination Lambda function. You can see the failed function invocation information in the Amazon CloudWatch Logs for the Destination function including "condition": "RetriesExhausted", along with the requestPayload, errorMessage, and stackTrace.

2019-11-24T21:52:47.855Z	d123456-c0dd-4871-a123-a356cb1b3ba6	EVENT
{
    "version": "1.0",
    "timestamp": "2019-11-24T21:52:47.333Z",
    "requestContext": {
        "requestId": "8ea123e4-1db7-4aca-ad10-d9ca1234c1fd",
        "functionArn": "arn:aws:lambda:sa-east-1:123456678912:function:event-destinations:$LATEST",
        "condition": "RetriesExhausted",
        "approximateInvokeCount": 3
    },
    "requestPayload": {
        "Success": false
    },
    "responseContext": {
        "statusCode": 200,
        "executedVersion": "$LATEST",
        "functionError": "Handled"
    },
    "responsePayload": {
        "errorMessage": "Failure from event, Success = false, I am failing!",
        "errorType": "Error",
        "stackTrace": [ "exports.handler (/var/task/index.js:18:18)" ]
    }
}

Destination-specific JSON format

  • For SNS/SQS, the JSON object is passed as the Message to the destination.
  • For Lambda, the JSON is passed as the payload to the function. The destination function cannot be the same as the source function. For example, if LambdaA has a Destination configuration attached for Success, LambdaA is not a valid destination ARN. This prevents recursive functions.
  • For EventBridge, the JSON is passed as the Detail in the PutEvents call. The source is lambda, and detail type is either Lambda Function Invocation Result - Success or Lambda Function Invocation Result – Failure. The resource fields contain the function and destination ARNs.

AWS CloudFormation configuration

Destinations CloudFormation configuration is created via the following YAML.

Resources: 
  EventInvokeConfig:
    Type: AWS::Lambda::EventInvokeConfig
    Properties:
        FunctionName: “YourLambdaFunctionWithEventInvokeConfig”
        Qualifier: "$LATEST"
        MaximumEventAgeInSeconds: 600
        MaximumRetryAttempts: 0
        DestinationConfig:
            OnSuccess:
                Destination: “arn:aws:sns:us-east-1:123456789012:YourSNSTopicOnSuccess”
            OnFailure:
                Destination: “arn:aws:lambda:us-east-1:123456789012:function:YourLambdaFunctionOnFailure”

Conclusion

AWS Lambda Destinations gives you more visibility and control of function execution results. This helps you build better event-driven applications, reducing code, and using Lambda’s native failure handling controls.

There are no additional costs for enabling Lambda Destinations. However, calls made to destination target services may be charged.

To learn more, see Lambda Destinations in the AWS Lambda Developer Guide.

Application integration patterns for microservices: Fan-out strategies

Post Syndicated from Rachel Richardson original https://aws.amazon.com/blogs/compute/application-integration-patterns-for-microservices-fan-out-strategies/

This post is courtesy of Dirk Fröhner

The first blog in this series introduced asynchronous messaging for building loosely coupled systems that can scale, operate, and evolve individually. It considered messaging as a communications model for microservices architectures. This post covers concrete architectural considerations, focusing on the messaging architecture.

Wild Rydes

Wild Rydes is a fictional technology start-up. You may have heard of it – it disrupts individual transportation by replacing traditional taxis with unicorns. We use the Wild Rydes storyline in several hands-on AWS workshops. It illustrates concepts such as serverless development, event-driven design, API management, and messaging in microservices.

This blog post explores the decision-making process in building the Wild Rydes workshop, with a goal of helping you apply these concepts to your applications.

In the workshop, a customer requests a ‘unicorn’ ride using the Wild Rydes customer application. Registered unicorn drivers can use the application to manage their rides. Unicorn drivers submit a ride completion message after they have successfully delivered a customer to their destination.

Wild Rydes app image

Submit a ride completion

API exposed by the unicorn management service

At Wild Rydes, end-user clients are implemented as mobile applications and communicate via REST APIs (also known as hypermedia APIs) with the backend services.

For this use case, the application interacts with the API exposed by the unicorn management service. It uses the submit-ride-completion resource that it discovered from the API’s home document to send the relevant details of a ride to the backend. In response, the backend persists these details, creates a new completed-ride resource. This returns the respective status code, the location, and a representation of the new resource to the client. The API details are shown below.

Request from client to submit the details of a completed ride:

POST /<submit-ride-completion-resource-path> HTTP/1.1
Content-Type: application/json;charset=UTF-8
...

{
    "from": "...",
    "to": "...",
    "duration": "...",
    "distance": "...",
    "customer": "...",
    "fare": "..."
}

Response from the unicorn management service:

HTTP/1.1 201 Created
Date: Sat, 31 Aug 2019 12:00:00 GMT
Location: <url-of-newly-created-completed-ride-resource>
Content-Location: <url-of-newly-created-completed-ride-resource>
Content-Type: application/json;charset=UTF-8
...

{
    "links": {
        "self": {
            "href": "https://..."
        }
    },
    <completed-ride-resource-representation-properties>
}

Schematic architecture for the use case

The schematic architecture for the use case is shown in diagram 1 below:

Diagram 1: Mutliple microservices need information about ride completion

Diagram 1: Multiple microservices need information about ride completion

There are other microservices in Wild Rydes that are also interested in a new completed ride. The examples from the diagram are:

  • Customer notification service: customers should receive a notification in the app about their latest completed ride.
  • Customer accounting service: After all, Wild Rydes is a business, so this service is responsible for collecting the fare from the customer.
  • Customer loyalty service: Everybody wants to collect miles and would like to receive benefits for being a loyal customer.
  • Data lake ingestion service: Wild Rydes is a data-driven company and they want to ingest all data generated from any process into their data lake for arbitrary analytics.
  • Extraordinary rides service: This special service is interested in rides with fares or distances above certain thresholds for preparing insights for business managers.

Based on this scenario, let’s review the integration options.

Integration options

Integration via database

The unicorn management service stores the details of a completed ride in a database. It could share the database with the other services directly, but that creates tight coupling. Sharing the database also restricts your flexibility to scale and evolve your services.

Integration via REST APIs

What about using REST APIs for the integration? The HTTP-based implementation of the REST architectural style uses the distributed architecture concepts of the web. However, what does this mean for the implementation?

Diagram 2: Using REST APIs to communicate to microservices

Diagram 2: Using REST APIs to communicate to microservices

As shown in diagram 2 above:

  • Effectively, all interested services on the right-hand side would have to expose an API resource. These would be called by the unicorn management service for each newly completed ride.
  • To enable elasticity behind a single resource URL, you may need a load balancer in front of each interested service.
  • The unicorn management service would have to know about all these interested services and their respective APIs. Hopefully, each service uses a streamlined API resource.
  • Lastly, the unicorn management service must store, retry, and track all request attempts in case an interested service is not available. This ensures durability so we don’t lose any of these notifications.

One approach is to manage a recipient list in the unicorn management service. This adds additional complexity to the unicorn management service and coupling on both sides. Although there are self-registration and discovery approaches, managing a recipient list is not the core use case of the unicorn management service.

Diagram 3: Using a separate service to manage the fan-out to other services

Diagram 3: Using a separate service to manage the fan-out to other services

A better approach would be to externalize the recipient list into a separate Request Distribution Service, as diagram 3 shows. This decouples both sides, but binds each side to the new service. Still, the unicorn management service is still responsible for the delivery of the ride data to all the recipients. Again, this heavy lifting is not the main task of this service.

Diagram 4: Filtering information for extraordinary rides

Diagram 4: Filtering information for extraordinary rides

In diagram 4, the information filtering for the Extraordinary Rides Service is self-managed. This means that there is code on one side to either not send or to discard irrelevant ride data.

For this use case, integration via REST APIs potentially adds coupling to the services. And it adds heavy lifting to the services that is beyond their actual domain.

Integration via messaging

A third option could use messaging for the integration.

Publish-subscribe pattern

Both Amazon SNS and Amazon EventBridge can be used to implement the publish-subscribe pattern.  In this use case, we recommend Amazon SNS, which scales to support high throughput and fan-out applications. Amazon EventBridge includes direct integrations with software as a service (SaaS) applications and other AWS services. It’s ideal for publish-subscribe use cases involving these types of integrations.

Diagram 5: Using Amazon SNS to implement a publish-subscribe pattern

Diagram 5: Using Amazon SNS to implement a publish-subscribe pattern

Diagram 5 shows an SNS topic called Ride Completion Topic. The unicorn management service can now send the details about a completed ride into that topic. All interested services on the right-hand side can subscribe to this topic.

Using a message topic to publish the details of a completed ride frees us from managing the recipient list, as well as making ensuring reliable delivery of the messages. It also decouples both sides as much as possible. Services on the right-hand side can autonomously subscribe to the topic. The Unicorn Management Service does not know anything about the topic’s subscribers.

Message filter pattern

Looking at the Extraordinary Rides Service, the message filter functionality of Amazon SNS can autonomously and individually discard irrelevant messages. The Extraordinary Rides Service can specify the threshold values for the fare and distance.

Diagram 6: Filtering extraordinary rides using Amazon SNS

Diagram 6: Filtering extraordinary rides using Amazon SNS

Topic-queue-chaining pattern

Consider the publish-subscribe channel between the Unicorn Management Service, and the subscribing services on the right-hand side.

One of the consuming services may go offline for maintenance. Or the code that processes messages from the ride completion topic could run into an exception. These are two examples where a subscriber service could potentially miss topic messages.

A good pattern to apply here is topic-queue-chaining. That means that you add a queue, in our case an SQS queue, between the ride completion topic and each of the subscriber services. As messages are buffered persistently in an SQS queue, it prevents lost messages if a subscriber process run into problems for many hours or days.

Diagram 7: Chaining topics and queues to buffer messages persistently

Diagram 7: Chaining topics and queues to buffer messages persistently

Queues as buffering load balancers

An SQS queue in front of each subscriber service also acts as a buffering load balancer.

Since every message is delivered to one of potentially many consumer processes, you can scale out the subscriber services, and the message load is distributed over the available consumer processes.

As messages are buffered in the queue, they are preserved during a scaling event, such as when you must wait until an additional consumer process becomes operational.

Lastly, these queue characteristics help flatten peak loads for your consumer processes, buffering messages until consumers are available. This allows you to process messages at a pace decoupled from the message source.

Conclusion

The Wild Rydes example shows how messaging can provide decoupling and greater flexibility for your microservices landscape.

In contrast to REST APIs, a messaging system takes care of message delivery outside of your service code. Using a publish-subscribe channel provides simple fan-out capability. And message filters allow for selective message reception without the effort of implementing that logic into your code.

With topic-queue-chaining pattern, you can add queue characteristics to a fan-out scenario so that you can easily scale out on the consumer side, and flatten peak loads.

For a deeper dive into queues and topics and how to use them in your microservices architecture, please use the following resources:

  1. AWS whitepaper: Implementing Microservices on AWS
  2. AWS blog: Implementing enterprise integration patterns with AWS messaging services: point-to-point channels
  3. AWS blog: Implementing enterprise integration patterns with AWS messaging services: publish-subscribe channels
  4. AWS blog: Building Scalable Applications and Microservices: Adding Messaging to Your Toolbox

Understanding asynchronous messaging for microservices

Post Syndicated from Rachel Richardson original https://aws.amazon.com/blogs/compute/understanding-asynchronous-messaging-for-microservices/

This post is courtesy of Dirk Fröhner

One of the implications of applying the microservices architectural style is that much communication between components happens over the network. After all, your microservices landscape is a distributed system. To achieve the promises of microservices, such as being able to individually scale, operate, and evolve each service, this communication must happen in a loosely coupled and reliable manner.

A common way to loosely couple services is to expose an API following the REST architectural style. REST APIs are based on the architecture of the web and provide loose coupling between communicating parties. REST APIs offer a great way to decouple interfaces from concrete implementations, and to advise clients about what they can do next, by the use of links and link relations.

While REST APIs are common and useful in microservices design, REST APIs tend to be designed with synchronous communications, where a response is required. A request coming from an end-user client can trigger a complex communications path within your services landscape, which can effectively add coupling between the services at runtime. After all, this is why there are mitigation patterns like circuit-breaker in the first place. REST APIs can also add some heavy lifting to your infrastructure that we will discuss further below.

Asynchronous messaging

If loose-coupling is important, especially in a system that requires high resilience and has unpredictable scale, another option is asynchronous messaging.

Asynchronous messaging is a fundamental approach for integrating independent systems, or building up a set of loosely coupled systems that can operate, scale, and evolve independently and flexibly. As our colleague Tim Bray said, “If your application is cloud-native, or large-scale, or distributed, and doesn’t include a messaging component, that’s probably a bug.” In this blog post, we will outline some fundamental benefits of asynchronous messaging for the communications between microservices.

For a refresher on the fundamental messaging patterns and their implementations with Amazon SQS, Amazon SNS, and Amazon MQ, please read our previous blog posts

For a summary of the semantics of queues and topics:

  • A queue is like a buffer. You can put messages into a queue, and you can retrieve messages from a queue. Message queues operate so that any given message is only consumed by one receiver, although multiple receivers can be connected to the queue.
  • A topic is like a broadcasting station. You can publish messages to a topic, and anyone interested in these messages can subscribe to the topic. In this model, any message published to a topic is immediately received by all of the subscribers of the topic (unless you have applied the message filter pattern).

Use-case

Consider a typical scenario illustrated in the diagram below. An end-user client (EUC) addresses an API resource of one of our services, through Amazon API Gateway in this example. From there, the request can potentially follow a path across the microservices landscape to get completely processed.

To provide the final result, there will be potentially cascading subsequent requests sent between other microservices. This example illustrates the complexity involved in processing a single end user request.

End User Client accessing a service using an API

Diagram 1: End-User Client accessing a service using an API

End-user clients (EUCs) often communicate with services via REST APIs in a synchronous manner. However, the communication can also be designed using an asynchronous approach. For instance, if an EUC submits a request that takes some time to process, the respective API resource can respond with HTTP status 202 Accepted, and a link to a resource that provides the current processing status. Downstream, the communication between the service that receives that request, and other services that are involved in processing the request, can happen asynchronously using messaging services.

There are situations where a communications model using asynchronous messaging can make your life easier than using REST APIs.

Infrastructure complexity

Start with looking at the infrastructure complexity for the backends of your services. Depending on your implementation paradigm, you have to include different components in your infrastructure that you don’t have to deal with when using messaging.

Imagine your services each expose a REST API. Typically, this means you add a load balancer in front of your compute layer, and your backend implementation includes an HTTP server. It is usually a good idea to decouple your services APIs from their concrete implementations, so you could also consider adding Amazon API Gateway in front of your load balancer.

For a serverless approach, you don’t need to worry about load balancing and scaling out infrastructure. Amazon API Gateway with AWS Lambda integration provides a fully managed solution for removing complexities around infrastructure management.

Using Amazon SQS as a cloud-native messaging service for queues, you don’t employ any of the above mentioned components. As described in a prior post, an SQS queue can act as a load balancer in itself. The consumers, or target services, don’t need an HTTP server, but simply ask a queue for available messages. If you use AWS Lambda for your consumers, this process is even simpler, as the Lambda functions are automatically invoked when messages appear in an SQS queue. See Using AWS Lambda with Amazon SQS to learn more.

The same applies to Serverless architectures implementing a publish/subscribe pattern. Lambda function executions can be directly triggered by SNS messages. Without AWS Lambda, you need load balancers and web servers in your backend implementations to receive SNS notifications, as those are injected via web hooks into your services. SNS also provides the fan-out functionality that you would otherwise have to build using an intermediary component to implement a recipient list of subscribers.

Reliability, resilience

For synchronous systems, if a service crashes while it processes the payload of an API request, the information is lost. A good way to prevent this on a microservice is to explicitly persist an incoming request immediately after receiving it. Then process and reprocess, until the request is finally marked as resolved.

This approach requires additional work, and it requires the microservice to not crash while persisting an incoming API request. The microservice sending a request must also resend if the target service doesn’t acknowledge receipt. For example, it doesn’t respond with a successful HTTP status code, or the connection drops.

When sending messages to a queue, this additional work is addressed by the messaging infrastructure. A message will remain in a queue unless a consumer explicitly states that processing is finished by acknowledging the message reception. As long as message reception is not acknowledged by a consumer, it will stay in the queue. Messages can be retained in an SQS queue for a maximum of 14 days.

Scale out latency

Under increased load, your services must scale out to process the requests. You must then consider scale-out latency, which may be managed for you with serverless implementations. It takes a few moments from when an Auto Scaling group triggers the launch of additional instances until these are ready to operate. Also launching new container tasks takes time. When your scaling threshold is not optimal and the scaling event occurs late, your available resources may be unable to serve all incoming requests. These requests may be lost or answered with HTTP status code 5xx.

Using message queues that buffer messages during a scaling event help prevent this. Even in use cases where the EUC is waiting for an immediate response, this is the more reliable architecture. If your infrastructure needs time to scale out and you are not able to process all requests in time, the requests are persisted.

When messaging is your only choice

What happens when your services must respond to peak loads at scale?

For many applications, the scale-out latency, including load balancer pre-warming, will eventually become too large to handle steeply ascending loads fast enough. With a serverless architecture, exposing your Lambda functions with API Gateway can handle steeply ascending loads. But you must still consider downstream systems, which may be easily overwhelmed.

In these scenarios, where rapid scaling without overwhelming downstream systems is important, messaging may be your best choice. Message queues help protect your downstream services by buffering incoming payloads for consumption at the pace of the consuming service. This helps not only for the communications between microservices, but also when peak loads flood your client-facing API. Often, the most important goal is to accept an incoming request, while the actual processing of that request can happen later. You decouple these steps from each other by using queues.

Serverless messaging systems like Amazon SQS and Amazon SNS can respond quickly to support high scale. These are often the best solution when scale is unpredictable.  While the instance-based messaging system, Amazon MQ, provides compatibility with open standards, it requires manual scaling for large workloads, unlike serverless messaging services.

Conclusion

We hope you got some inspiration to also employ asynchronous messaging for your microservices communications architecture. In blog XYZ we provide concrete examples of these patterns. For more information, feel free to consume the following resources:

  1. AWS whitepaper: Implementing Microservices on AWS
  2. AWS blog: Implementing enterprise integration patterns with AWS messaging services: point-to-point channels
  3. AWS blog: Implementing enterprise integration patterns with AWS messaging services: publish-subscribe channels
  4. AWS blog: Building Scalable Applications and Microservices: Adding Messaging to Your Toolbox

Read the next blog in the series,  Application Integration Patterns for Microservices: Fan-out Strategies.

New for AWS Lambda – SQS FIFO as an event source

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/new-for-aws-lambda-sqs-fifo-as-an-event-source/

AWS Lambda first announced support for Amazon SQS standard queues as an event source in April 2018. This allows builders to develop serverless applications using queues to directly invoke Lambda functions. Today, we have expanded this feature to include SQS FIFO queues. This makes it easier to create serverless applications using queues where the order of messages is important.

Ordered events and operations are critical for some types of event-driven application. For example, in stock market trading applications, the order of events for a trade is essential to determine the time, price, and parties in a transaction. Similarly, in applications managing pricing or inventory, the order of events is key for maintaining the data integrity of the system.

Using SQS FIFO with Lambda is straight-forward, with only minor differences in configuration compared with standard SQS queues. This blog post explains how serverless developers can get started using SQS FIFO queues in their Lambda functions.

How it works

SQS FIFO queues are similar to standard queues but use two additional attributes in the SendMessage API. These attributes are also returned in the event payload to the consuming Lambda function:

  • MessageGroupID: this required field enables multiple message groups within a single queue. If you do not need this functionality, provide the same MessageGroupId value for all messages. Messages within a single group are processed in a FIFO fashion.
  • MessageDeduplicationID: this token is used to deduplicate messages within a 5-minute interval. If two messages with the same MessageDuplicationID are sent, both messages are accepted successfully but only one is delivered. Learn more about the best practices for using this attribute.

You can submit a message into an SQS FIFO queue using the AWS Management Console, AWS CLI, AWS SAM, or AWS SDK. This SDK example uses Node.js to submit a single message:

const AWS = require('aws-sdk')
AWS.config.update({region: 'us-west-2'})
const sqs = new AWS.SQS({apiVersion: '2012-11-05'})

const send = async (groupId, messageId) => {
  return await sqs.sendMessage({
    MessageGroupId: `group-${groupId}`,
    MessageDeduplicationId: `m-${groupId}-${messageId}`,
    MessageBody: `${messageId}`,
    QueueUrl: "https://sqs.us-west-2.amazonaws.com/123456789012/myTestQueue.fifo"
  }).promise()
}

const main = async () => {
  await send ('A', '1')
}
main()

When the Lambda function is triggered by the SQS FIFO queue, these two attributes are delivered as part of the event payload:

SQS FIFO payload

When configured this way, the SQS FIFO queue sends as many messages as possible to the consuming Lambda function, subject to the Batch size configured in the trigger. The Batch size determines the number of items to read from the queue, up to a maximum of 10 items, though a single batch is smaller if the queue has fewer items.

For example, if you submit six messages to a queue with a Batch size of 10 using two separate MessageGroupId values, the consuming Lambda function would receive all six items:

 

SQS FIFO example #1

As all the messages are processed in a single batch, only one Lambda function is invoked. A more complex example shows how the Lambda service uses concurrency to process a busy SQS FIFO queue more efficiently.

SQS FIFO example #2

In this example, there are three message groups, and a Lambda function with an SQS FIFO trigger. The Batch size for the Lambda trigger is 5, and there are 53 messages in total:

  • Message Group A consists of 18 messages, A1-A18
  • Message Group B consists of 12 messages, B1-B12
  • Message Group C consists of 23 messages, C1-C23

There are three rules governing the order of messages leaving a FIFO queue, to help understand the processing behavior:

  1. Return the oldest message where no other message with the same MessageGroupId is in flight.
  2. Return as many messages with the same MessageGroupId as possible.
  3. If a message batch is still not full, go back to the first rule. As a result, it’s possible for a single batch to contain messages from multiple MessageGroupIds.

In this case, the SQS FIFO queue receives these messages, and orders the items within each message group. In processing, the following events occur:

  1. Consumer #1 receives 5 messages in batch #1 (C1-C5).
  2. Consumer #1 receives 5 messages in batch #2 (B1-B5).
  3. As there are messages in flight for group B, and there are items in the queue from other groups ready for processing, Lambda scales up. Consumer #2 now receives 5 messages in batch #2 (C6-C10).
  4. Consumer #3 receives batch #3 (A1-A5).
  5. Each Lambda instance will continue receiving batches of 5 messages until the queue is empty.

Lambda automatically scales out horizontally to consume the messages in the queue. It will try to consume the queue as quickly and efficiently as possible by maximizing concurrency within the bounds of each service. As queue traffic fluctuates, the Lambda service scales the polling operations based on the number of inflight messages.

In SQS FIFO queues, using more than one MessageGroupId enables Lambda to scale up and process more items in the queue using a greater concurrency limit. Total concurrency is equal to or less than the number of unique MessageGroupIds in the SQS FIFO queue. Learn more about AWS Lambda Function Scaling and how it applies to your event source.

Amazon SQS FIFO queues ensure that the order of processing follows the message order within a message group. However, it does not guarantee only once delivery when used as a Lambda trigger. If only once delivery is important in your serverless application, it’s recommended to make your function idempotent. You could achieve this by tracking a unique attribute of the message using a scalable, low-latency control database like Amazon DynamoDB.

Triggering Lambda from SQS

1. Navigate to the SQS console and choose Create New Queue.

2. Names for FIFO queues must end in .fifo. For Queue Name, enter myTestQueue.fifo and select FIFO Queue. Choose Quick-Create Queue.

Create New Queue

3. From the Lambda console, choose Create function.

4. Enter mySQStest for Function name, and leave the Node.js 12.x runtime selected. Open the Choose or create an execution role panel and select Create a new role from AWS policy templates.

5. Enter mySQStest for Role name and select Amazon SQS poller permissions from the Policy templates drop-down. Choose Create function.

Create Lambda function

6. Once the function is created, you can configure the trigger. Choose Add trigger.

Add trigger

7. In the Select a trigger drop-down, choose SQS. In the SQS queue drop-down, select the FIFO queue created earlier. Leave Batch size as the default value, then choose Add.

Add trigger dialog

8. After a few seconds, the SQS trigger is ready.

SQS FIFO trigger ready

The Lambda function is now invoked when new messages are available in the SQS FIFO queue.

Conclusion

Now, you can process messages from Amazon SQS FIFO queues with Lambda, helping bring scalable serverless processing to applications needing first-in, first-out message processing.

You can get started with SQS FIFO queue as a Lambda event source via AWS Management Console, AWS CLI, AWS SAM, AWS CloudFormation, or AWS SDK for Lambda. It is available in all AWS Regions where AWS Lambda is available. You pay only for the SQS API operations performed by the Lambda service on your behalf, and the Lambda requests and duration used to process your messages.

For more information on where AWS Lambda is available, see the AWS Region Table. To learn more, see Using AWS Lambda with Amazon SQS FIFO in the AWS Lambda Developer Guide.

 

 

Designing durable serverless apps with DLQs for Amazon SNS, Amazon SQS, AWS Lambda

Post Syndicated from Rachel Richardson original https://aws.amazon.com/blogs/compute/designing-durable-serverless-apps-with-dlqs-for-amazon-sns-amazon-sqs-aws-lambda/

This post is courtesy of Otavio Ferreira, Sr Manager, SNS.

In a postal system, a dead-letter office is a facility for processing undeliverable mail. In pub/sub messaging, a dead-letter queue (DLQ) is a queue to which messages published to a topic can be sent, in case those messages cannot be delivered to a subscribed endpoint.

Amazon SNS supports DLQs, making your applications more resilient and durable upon delivery failure modes.

Understanding message delivery failures and retries

The delivery of a message fails when it’s not possible for Amazon SNS to access the subscribed endpoint. There are two reasons why this might happen:

  • Client errors, where the client is SNS (the message sender).
  • Server errors, where the server is the system that hosts the subscription endpoint (the message receiver), such as Amazon SQS or AWS Lambda.

Client errors

Client errors happen when SNS has stale subscription metadata. One common cause of client errors is when you (the endpoint owner) delete the endpoint. For example, you might delete the SQS queue that is subscribed to your SNS topic, without also deleting the SNS subscription corresponding to the queue. Another common cause is when you change the resource policy attached to your endpoint in a way that prevents SNS from delivering messages to that endpoint.

These errors are considered client errors because the client has attempted the delivery of a message to a destination that, from the client’s perspective, is no longer accessible. SNS does not retry the delivery of messages that failed as the result of client errors.

Server errors

Server errors happen when the system that powers the subscribed endpoint is unavailable, or when it returns an exception response indicating that it failed to process a valid request from SNS.

When server errors occur, SNS retries the failed deliveries according to a backoff function, which can be either linear or exponential. When a server error occurs for an AWS managed endpoint, backed by either SQS or Lambda, then SNS retries the delivery for up to 100,015 times, over 23 days.

Server errors can also happen with customer managed endpoints, namely HTTP, SMS, email, and mobile push endpoints. SNS also retries the delivery for these types of endpoints. HTTP endpoints support customer-defined retry policies, while SNS sets an internal delivery retry policy for SMS, email, and mobile push endpoints to 50 times, over 6 hours.

Delivery retries

SNS may receive a client error, or continue to receive a server error for a message beyond the number of retries defined by the corresponding retry policy. In that event, SNS discards the message. Setting a DLQ to your SNS subscription enables you to keep this message, regardless of the type of error, either client or server. DLQs give you more control over messages that cannot be delivered.

For more information on the delivery retry policy for each delivery protocol supported by SNS, see Amazon SNS Message Delivery Retry.

Using DLQs for AWS services

SNS, SQS, and Lambda support DLQs, addressing different failure modes. All DLQs are regular queues powered by SQS.

In SNS, DLQs store the messages that failed to be delivered to subscribed endpoints. For more information, see Amazon SNS Dead-Letter Queues.

In SQS, DLQs store the messages that failed to be processed by your consumer application. This failure mode can happen when producers and consumers fail to interpret aspects of the protocol that they use to communicate. In that case, the consumer receives the message from the queue, but fails to process it, as the message doesn’t have the structure or content that the consumer expects. The consumer can’t delete the message from the queue either. After exhausting the receive count in the redrive policy, SQS can sideline the message to the DLQ. For more information, see Amazon SQS Dead-Letter Queues.

In Lambda, DLQs store the messages that resulted in failed asynchronous executions of your Lambda function. An execution can result in an error for several reasons. Your code might raise an exception, time out, or run out of memory. The runtime executing your code might encounter an error and stop. Your function might hit its concurrency limit and be throttled. Regardless of the error type, when the error occurs, your code might have run completely, partially, or not at all. By default, Lambda retries an asynchronous execution twice. After exhausting the retries, Lambda can sideline the message to the DQL. For more information, see AWS Lambda Dead-Letter Queues.

When you have a fan-out architecture, with SQS queues and Lambda functions subscribed to an SNS topic, we recommend that you set DLQs to your SNS subscriptions, and to your destination queues and functions as well. This approach gives your application resilience against message delivery failures, message processing failures, and function execution failures too.

Applying DLQs in a use case

Here’s how everything comes together. The following diagram shows a serverless backend architecture that supports a car rental application. This is a durable serverless architecture based on DLQs for SNS, SQS, and Lambda.

Dead Letter Queue - DLQ SNS use case with architecture diagram

When a customer places an order to rent a car, the application sends that request to an API, which is powered by Amazon API Gateway. The REST API is backed by an SNS topic named Rental-Orders, and deployed onto an Amazon VPC subnet. The topic then fans out that order to the following two subscribed endpoints, for parallel processing:

  • An SQS queue, named Rental-Fulfilment, which feeds the integration with an internal fulfilment system hosted on Amazon EC2.
  • A Lambda function, named Rental-Billing, which processes and loads the customer order into a third-party billing system, also hosted on Amazon EC2.

To increase the durability of this serverless backend API, the following DLQs have been set up:

  • Two SNS DLQs, namely Rental-Fulfilment-Fanout-DLQ and Rental-Billing-Fanout-DLQ, which store the order in case either the subscribed SQS queue or Lambda function ever becomes unreachable.
  • An SQS DLQ, named Rental-Fulfilment-DLQ, which stores the order when the fulfilment system fails to process the order.
  • A Lambda DLQ, named Rental-Billing-DLQ, which stores the order when the function fails to process and load the order into the billing system.

When the DLQ captures the message, you can inspect the message for troubleshooting purposes. After you address the error at hand, you can poll the DLQ to retry the processing of the message.

Setting up DLQs for subscriptions, queues, and functions can be done using the AWS Management Console, SDK, CLI, API, or AWS CloudFormation. You can use the SDK, CLI, and API for polling the DLQs as well.

Configuring DLQs for subscriptions

You can attach a DLQ to an SNS subscription by setting the subscription’s RedrivePolicy parameter. The policy is a JSON object that refers to the DLQ ARN. The ARN must point to an SQS queue in the same AWS account as that of the SNS subscription. Also, both the DLQ and the subscription must be in the same AWS Region.

Here’s how you can configure one of the SNS DLQs applied in the car rental application example, presented earlier.

The following JSON object is a CloudFormation template that subscribes the SQS queue Rental-Fulfilment to the SNS topic Rental-Orders. The template also sets a RedrivePolicy that targets Rental-Fulfilment-Fanout-DLQ as a DLQ.

Lastly, the template sets a FilterPolicy value. It makes SNS deliver a message to the subscribed queue only if the published message carries an attribute named order-status with value set to either confirmed or canceled. As Amazon SNS Message Filtering happens before message delivery, messages that are filtered out aren’t sent to that subscription’s DLQ.

Internally, the CloudFormation template uses the SNS Subscribe API action for deploying the subscription and setting both policies, all part of the same API request.

{  
   "Resources": {
      "mySubscription": {
         "Type" : "AWS::SNS::Subscription",
         "Properties" : {
            "Protocol": "sqs",
            "Endpoint": "arn:aws:sqs:us-east-1:123456789012:Rental-Fulfilment",
            "TopicArn": "arn:aws:sns:us-east-1:123456789012:Rental-Orders",
            "RedrivePolicy": {
               "deadLetterTargetArn": 
                  "arn:aws:sqs:us-east-1:123456789012:Rental-Fulfilment-Fanout-DLQ"
            },
            "FilterPolicy": { 
               "order-status": [ "confirmed", "canceled" ]
            }
         }
      }
   }
}

Maybe the SNS topic and subscription are already deployed. In that case, you can use the SNS SetSubscriptionAttributes API action to set the RedrivePolicy, as shown by the following code examples, based on the AWS CLI and the AWS SDK for Java.

$ aws sns set-subscription-attributes 
   --region us-east-1
   --subscription-arn arn:aws:sns:us-east-1:123456789012:Rental-Orders:44019880-ffa0-4067-9cb4-b974443bcck2
   --attribute-name RedrivePolicy 
   --attribute-value '{"deadLetterTargetArn":"arn:aws:sqs:us-east-1:123456789012:Rental-Fulfilment-Fanout-DLQ"}'
AmazonSNS sns = AmazonSNSClientBuilder.defaultClient();

String subscriptionArn = "arn:aws:sns:us-east-1:123456789012:Rental-Orders:44019880-ffa0-4067-9cb4-b974443bcck2";

String redrivePolicy = "{\"deadLetterTargetArn\":\"arn:aws:sqs:us-east-1:123456789012:Rental-Fulfilment-Fanout-DLQ\"}";

SetSubscriptionAttributesRequest request = new SetSubscriptionAttributesRequest(
  subscriptionArn, 
  "RedrivePolicy", 
  redrivePolicy
);

sns.setSubscriptionAttributes(request);

Monitoring DLQs

You can use Amazon CloudWatch metrics and alarms to monitor the DLQs associated with your SNS subscriptions. In the car rental example, you can monitor the DLQs to be notified when the API failed to distribute any car rental order to the fulfillment or billing systems.

As regular SQS queues, the DLQs in SNS emit a number of metrics to CloudWatch, in 5-minute data points, such as NumberOfMessagesSent, NumberOfMessagesReceived and NumberOfMessagesDeleted. You can use these SQS metrics to be notified upon activity in your DLQs in SNS, so you may trigger a message recovery protocol.

You might have a case where you expect the DLQ to be always empty. In that case, create an CloudWatch alarm on NumberOfMessagesSent, set the alarm threshold to zero, and provide a separate SNS topic to be notified when the alarm goes off. The SNS topic, in its turn, can delivery your alarm notification to any endpoint type that you choose, such as email address, phone number, or mobile pager app.

Additionally, SNS itself provides its own set of metrics that are relevant to DLQs. Specifically, SNS metrics include the following:

  • NumberOfNotificationsRedrivenToDlq – Used when sending the message to the DLQ succeeds.
  • NumberOfNotificationsFailedToRedriveToDlq – Used when sending the message to the DLQ fails. This can happen because the DLQ either doesn’t exist anymore or doesn’t have the required access permissions to allow SNS to send messages to it. For more information about setting up the required access policy, see Giving Permissions for Amazon SNS to Send Messages to Amazon SQS.

Debugging with DLQs

Use CloudWatch Logs to see the exceptions that caused your SNS deliveries to fail and your messages to be sidelined to DLQs. In the car rental example, you can inspect the rental orders in the DLQs, as well as the logs associated with these queues. Then you can understand why those orders failed to be fanned out to the fulfilment or billing systems.

SNS can log both successful and failed deliveries in CloudWatch. You can enable Amazon SNS Delivery Status Logging by setting three SNS topic attributes, which are delivery protocol-specific. As an example, for SNS deliveries to SQS queues, you must set the following topic attributes: SQSSuccessFeedbackRoleArn,  SQSFailureFeedbackRoleArn, and SQSSuccessFeedbackSampleRate.

The following JSON object represents a successful SNS delivery in an CloudWatch Logs entry. The status code logged is 200 (SUCCESS). The attribute RedrivePolicy shows that the SNS subscription in question had its DLQ set.

{
  "notification": {
    "messageMD5Sum": "7bb3327ac55e49485bad42e159ca4d4b",
    "messageId": "e8c2bb09-235c-5f5d-b583-efd8df0f7d74",
    "topicArn": "arn:aws:sns:us-east-1:123456789012:Rental-Orders",
    "timestamp": "2019-10-04 05:13:55.876"
  },
  "delivery": {
    "deliveryId": "6adf232e-fb12-5062-a564-27ff3741051f",
    "redrivePolicy": "{\"deadLetterTargetArn\": \"arn:aws:sqs:us-east-1:123456789012:Rental-Fulfilment-Fanout-DLQ\"}",
    "destination": "arn:aws:sqs:us-east-1:123456789012:Rental-Fulfilment",
    "providerResponse": "{\"sqsRequestId\":\"b2608a46-ccc4-51cc-003d-de972097debc\",\"sqsMessageId\":\"05fecd22-60a1-4d7d-bb79-026d49700b5a\"}",
    "dwellTimeMs": 58,
    "attempts": 1,
    "statusCode": 200
  },
  "status": "SUCCESS"
}

The following JSON object represents a failed SNS delivery in CloudWatch Logs. In the following code example, the subscribed queue doesn’t exist. As a client error, the status code logged is 400 (FAILURE). Again, the RedrivePolicy attribute refers to a DLQ.

{
  "notification": {
    "messageMD5Sum": "81c395cbd350da6bedfe3b24db9517b0",
    "messageId": "9959db9d-25c8-57a6-9439-8e5be8f71a1f",
    "topicArn": "arn:aws:sns:us-east-1:123456789012:Rental-Orders",
    "timestamp": "2019-10-04 05:16:51.116"
  },
  "delivery": {
    "deliveryId": "be743821-4c2c-5acc-a586-6cf0807f6fb1",
    "redrivePolicy": "{\"deadLetterTargetArn\": \"arn:aws:sqs:us-east-1:123456789012:Rental-Fulfilment-Fanout-DLQ\"}",
    "destination": "arn:aws:sqs:us-east-1:123456789012:Rental-Fulfilment",
    "providerResponse": "{\"ErrorCode\":\"AWS.SimpleQueueService.NonExistentQueue\", \"ErrorMessage\":\"The specified queue does not exist or you do not have access to it.\",\"sqsRequestId\":\"Unrecoverable\"}",
    "dwellTimeMs": 53,
    "attempts": 1,
    "statusCode": 400
  },
  "status": "FAILURE"
}

When the message delivery fails and there is a DLQ attached to the subscription, the message is sent to the DLQ and an additional entry is logged in CloudWatch. This new entry is specific to the delivery to the DLQ and refers to the DLQ ARN as the destination, as shown in the following JSON object.

{
  "notification": {
    "messageMD5Sum": "81c395cbd350da6bedfe3b24db9517b0",
    "messageId": "8959db9d-25c8-57a6-9439-8e5be8f71a1f",
    "topicArn": "arn:aws:sns:us-east-1:123456789012:Rental-Orders",
    "timestamp": "2019-10-04 05:16:52.876"
  },
  "delivery": {
    "deliveryId": "a877c79f-a3ee-5105-9bbd-92596eae0232",
    "destination":"arn:aws:sqs:us-east-1:123456789012:Rental-Fulfilment-Fanout-DLQ",
    "providerResponse": "{\"sqsRequestId\":\"8cef1af5-e86a-519e-ad36-4f33252aa5ec\",\"sqsMessageId\":\"2b742c5c-0750-4ec5-a717-b95897adda8e\"}",
    "dwellTimeMs": 51,
    "attempts": 1,
    "statusCode": 200
  },
  "status": "SUCCESS"
}

By analyzing Amazon CloudWatch Logs entries, you can understand why an SNS message was moved to a DLQ, and then take the required set of steps to recover the message. When you enable delivery status logging in SNS, you can configure the sample rate in which deliveries are logged, from 0% to 100%.

Encrypting DLQs

When your SNS subscription targets an SQS encrypted queue, then you probably want your DLQ to be an SQS encrypted queue as well. This configuration provides consistency in the form that your messages are encrypted at rest.

To follow this security recommendation, give the CMK you used to encrypt your DLQ a key policy that grants the SNS service principal access to AWS KMS API actions. For example, see the following sample key policy:

{
    "Sid": "GrantSnsAccessToKms",
    "Effect": "Allow",
    "Principal": { "Service": "sns.amazonaws.com" },
    "Action": [ "kms:Decrypt", "kms:GenerateDataKey*" ],
    "Resource": "*"
}

If you have an SNS encrypted topic, but a subscription in this topic points to a DLQ that isn’t an SQS encrypted queue, then messages sidelined to the DLQ aren’t encrypted at rest.

For more information, see Enabling Server-Side Encryption (SSE) for an Amazon SNS Topic with an Amazon SQS Encrypted Queue Subscribed.

Summary

DLQs for SNS, SQS, and Lambda increase the resiliency and durability of your applications. These DLQs address different failure modes, and can be used together.

  • SNS DLQs store messages that failed to be delivered to subscribed endpoints.
  • SQS DLQs store messages that the consumer system failed to process.
  • Lambda DLQs store the messages that resulted in failed asynchronous executions of your functions.

Setting up DLQs for subscriptions, queues, and functions can be done using the AWS Management Console, SDK, CLI, API, or CloudFormation. DLQs are available in all AWS Regions. Start today by running the tutorials:

Things to Consider When You Build REST APIs with Amazon API Gateway

Post Syndicated from George Mao original https://aws.amazon.com/blogs/architecture/things-to-consider-when-you-build-rest-apis-with-amazon-api-gateway/

A few weeks ago, we kicked off this series with a discussion on REST vs GraphQL APIs. This post will dive deeper into the things an API architect or developer should consider when building REST APIs with Amazon API Gateway.

Request Rate (a.k.a. “TPS”)

Request rate is the first thing you should consider when designing REST APIs. By default, API Gateway allows for up to 10,000 requests per second. You should use the built in Amazon CloudWatch metrics to review how your API is being used. The Count metric in particular can help you review the total number of API requests in a given period.

It’s important to understand the actual request rate that your architecture is capable of supporting. For example, consider this architecture:

REST API 1

This API accepts GET requests to retrieve a user’s cart by using a Lambda function to perform SQL queries against a relational database managed in RDS.  If you receive a large burst of traffic, both API Gateway and Lambda will scale in response to the traffic. However, relational databases typically have limited memory/cpu capacity and will quickly exhaust the total number of connections.

As an API architect, you should design your APIs to protect your down stream applications.  You can start by defining API Keys and requiring your clients to deliver a key with incoming requests. This lets you track each application or client who is consuming your API.  This also lets you create Usage Plans and throttle your clients according to the plan you define.  For example, you if you know your architecture is capable of of sustaining 200 requests per second, you should define a Usage plan that sets a rate of 200 RPS and optionally configure a quota to allow a certain number of requests by day, week, or month.

Additionally, API Gateway lets you define throttling settings for the whole stage or per method. If you know that a GET operation is less resource intensive than a POST operation you can override the stage settings and set different throttling settings for each resource.

Integrations and Design patterns

The example above describes a synchronous, tightly coupled architecture where the request must wait for a response from the backend integration (RDS in this case). This results in system scaling characteristics that are the lowest common denominator of all components. Instead, you should look for opportunities to design an asynchronous, loosely coupled architecture. A decoupled architecture separates the data ingestion from the data processing and allows you to scale each system separately. Consider this new architecture:

REST API 2

This architecture enables ingestion of orders directly into a highly scalable and durable data store such as Amazon Simple Queue Service (SQS).  Your backend can process these orders at any speed that is suitable for your business requirements and system ability.  Most importantly,  the health of the backend processing system does not impact your ability to continue accepting orders.

Security

Security with API Gateway falls into three major buckets, and I’ll outline them below. Remember, you should enable all three options to combine multiple layers of security.

Option 1 (Application Firewall)

You can enable AWS Web Application Firewall (WAF) for your entire API. WAF will inspect all incoming requests and block requests that fail your inspection rules. For example, WAF can inspect requests for SQL Injection, Cross Site Scripting, or whitelisted IP addresses.

Option 2 (Resource Policy)

You can apply a Resource Policy that protects your entire API. This is an IAM policy that is applied to your API and you can use this to white/black list client IP ranges or allow AWS accounts and AWS principals to access your API.

Option 3 (AuthZ)

  1. IAM:This AuthZ option requires clients to sign requests with the AWS v4 signing process. The associated IAM role or user must have permissions to perform the execute-api:Invoke action against the API.
  2. Cognito: This AuthZ option requires clients to login into Cognito and then pass the returned ID or Access JWT token in the Authentication header.
  3. Lambda Auth: This AuthZ option is the most flexible and lets you execute a Lambda function to perform any custom auth strategy needed. A common use case for this is OpenID Connect.

A Couple of Tips

Tip #1: Use Stage variables to avoid hard coding your backend Lambda and HTTP integrations. For example, you probably have multiple stages such as “QA” and “PROD” or “V1” and “V2.” You can define the same variable in each stage and specify different values. For example, you might an API that executes a Lambda function. In each stage, define the same variable called functionArn. You can reference this variable as your Lambda ARN during your integration configuration using this notation: ${stageVariables.functionArn}. API Gateway will inject the corresponding value for the stage dynamically at runtime, allowing you to execute different Lambda functions by stage.

Tip #2: Use Path and Query variables to inject dynamic values into your HTTP integrations. For example, your cart API may define a userId Path variable that is used to lookup a user’s cart: /cart/profile/{userId}. You can inject this variable directly into your backend HTTP integration URL settings like this: http://myapi.someds.com/cart/profile/{userId}

Summary

This post covered strategies you should use to ensure your REST API architectures are scalable and easy to maintain.  I hope you’ve enjoyed this post and our next post will cover GraphQL API architectures with AWS AppSync.

About the Author

George MaoGeorge Mao is a Specialist Solutions Architect at Amazon Web Services, focused on the Serverless platform. George is responsible for helping customers design and operate Serverless applications using services like Lambda, API Gateway, Cognito, and DynamoDB. He is a regular speaker at AWS Summits, re:Invent, and various tech events. George is a software engineer and enjoys contributing to open source projects, delivering technical presentations at technology events, and working with customers to design their applications in the Cloud. George holds a Bachelor of Computer Science and Masters of IT from Virginia Tech.

Simple Two-way Messaging using the Amazon SQS Temporary Queue Client

Post Syndicated from Rachel Richardson original https://aws.amazon.com/blogs/compute/simple-two-way-messaging-using-the-amazon-sqs-temporary-queue-client/

This post is contributed by Robin Salkeld, Sr. Software Development Engineer

Amazon SQS is a fully managed message queuing service that makes it easy to decouple and scale microservices, distributed systems, and serverless applications. Asynchronous workflows have always been the primary use case for SQS. Using queues ensures one component can keep running smoothly without losing data when another component is unavailable or slow.

We were surprised, then, to discover that many customers use SQS in synchronous workflows. For example, many applications use queues to communicate between frontends and backends when processing a login request from a user.

Why would anyone use SQS for this? The service stores messages for up to 14 days with high durability, but messages in a synchronous workflow often must be processed within a few minutes, or even seconds. Why not just set up an HTTPS endpoint?

The more we talked to customers, the more we understood. Here’s what we learned:

  • Creating a queue is often easier and faster than creating an HTTPS endpoint and the infrastructure necessary to ensure the endpoint’s scalability.
  • Queues are safe by default because they are locked down to the AWS account that created them. In addition, any DDoS attempt on your service is absorbed by SQS instead of loading down your own servers.
  • There is generally no need to create firewall rules for the communication between microservices if they use queues.
  • Although SQS provides durable storage (which isn’t necessary for short-lived messages), it is still a cost-effective solution for this use case. This is especially true when you consider all the messaging broker maintenance that is done for you.

However, setting up efficient two-way communication through one-way queues requires some non-trivial client-side code. In our previous two-part post series on implementing enterprise integration patterns with AWS messaging services, Point-to-point channels and Publish-subscribe channels, we discussed the Request-Response Messaging Pattern. In this pattern, each requester creates a temporary destination to receive each response message.

The simplest approach is to create a new queue for each response, but this is like building a road just so a single car can drive on it before tearing it down. Technically, this can work (and SQS can create and delete queues quickly), but we can definitely make it faster and cheaper.

To better support short-lived, lightweight messaging destinations, we are pleased to present the Amazon SQS Temporary Queue Client. This client makes it easy to create and delete many temporary messaging destinations without inflating your AWS bill.

Virtual queues

The key concept behind the client is the virtual queue. Virtual queues let you multiplex many low-traffic queues onto a single SQS queue. Creating a virtual queue only instantiates a local buffer to hold messages for consumers as they arrive; there is no API call to SQS and no costs associated with creating a virtual queue.

The Temporary Queue Client includes the AmazonSQSVirtualQueuesClient class for creating and managing virtual queues. This class implements the AmazonSQS interface and adds support for attributes related to virtual queues. You can create a virtual queue using this client by calling the CreateQueue API action and including the HostQueueURL queue attribute. This attribute specifies the existing SQS queue on which to host the virtual queue. The queue URL for a virtual queue is in the form <host queue URL>#<virtual queue name>. For example:

https://sqs.us-east-2.amazonaws.com/123456789012/MyQueue#MyVirtualQueueName

When you call the SendMessage or SendMessageBatch API actions on AmazonSQSVirtualQueuesClient with a virtual queue URL, the client first extracts the virtual queue name. It then attaches this name as an additional message attribute to each message, and sends the messages to the host queue. When you call the ReceiveMessage API action on a virtual queue, the calling thread waits for messages to appear in the in-memory buffer for the virtual queue. Meanwhile, a background thread polls the host queue and dispatches messages to these buffers according to the additional message attribute.

This mechanism is similar to how the AmazonSQSBufferedAsyncClient prefetches messages, and the benefits are similar. A single call to SQS can provide messages for up to 10 virtual queues, reducing the API calls that you pay for by up to a factor of ten. Deleting a virtual queue simply removes the client-side resources used to implement them, again without making API calls to SQS.

The diagram below illustrates the end-to-end process for sending messages through virtual queues:

Sending messages through virtual queues

Virtual queues are similar to virtual machines. Just as a virtual machine provides the same experience as a physical machine, a virtual queue divides the resources of a single SQS queue into smaller logical queues. This is ideal for temporary queues, since they frequently only receive a handful of messages in their lifetime. Virtual queues are currently implemented entirely within the Temporary Queue Client, but additional support and optimizations might be added to SQS itself in the future.

In most cases, you don’t have to manage virtual queues yourself. The library also includes the AmazonSQSTemporaryQueuesClient class. This class automatically creates virtual queues when the CreateQueue API action is called and creates host queues on demand for all queues with the same queue attributes. To optimize existing application code that creates and deletes queues, you can use this class as a drop-in replacement implementation of the AmazonSQS interface.

The client also includes the AmazonSQSRequester and AmazonSQSResponder interfaces, which enable two-way communication through SQS queues. The following is an example of an RPC implementation for a web application’s login process.

/**
 * This class handles a user's login request on the client side.
 */
public class LoginClient {

    // The SQS queue to send the requests to.
    private final String requestQueueUrl;

    // The AmazonSQSRequester creates a temporary queue for each response.
    private final AmazonSQSRequester sqsRequester = AmazonSQSRequesterClientBuilder.defaultClient();

    private final LoginClient(String requestQueueUrl) {
        this.requestQueueUrl = requestQueueUrl;
    }

    /**
     * Send a login request to the server.
     */
    public String login(String body) throws TimeoutException {
        SendMessageRequest request = new SendMessageRequest()
                .withMessageBody(body)
                .withQueueUrl(requestQueueUrl);

        // This:
        //  - creates a temporary queue
        //  - attaches its URL as an attribute on the message
        //  - sends the message
        //  - receives the response from the temporary queue
        //  - deletes the temporary queue
        //  - returns the response
        //
        // If something goes wrong and the server's response never shows up, this method throws a
        // TimeoutException.
        Message response = sqsRequester.sendMessageAndGetResponse(request, 20, TimeUnit.SECONDS);
        
        return response.getBody();
    }
}

/**
 * This class processes users' login requests on the server side.
 */
public class LoginServer {

    // The SQS queue to poll for login requests.
    // Assume that on construction a thread is created to poll this queue and call
    // handleLoginRequest() below for each message.
    private final String requestQueueUrl;

    // The AmazonSQSResponder sends responses to the correct response destination.
    private final AmazonSQSResponder sqsResponder = AmazonSQSResponderClientBuilder.defaultClient();

    private final AmazonSQS(String requestQueueUrl) {
        this.requestQueueUrl = requestQueueUrl;
    }

    /**
     * Handle a login request sent from the client above.
     */
    public void handleLoginRequest(Message message) {
        // Assume doLogin does the actual work, and returns a serialized result
        String response = doLogin(message.getBody());

        // This extracts the URL of the temporary queue from the message attribute and sends the
        // response to that queue.
        sqsResponder.sendResponseMessage(MessageContent.fromMessage(message), new MessageContent(response));  
    }
}

Cleaning up

As with any other AWS SDK client, your code should call the shutdown() method when the temporary queue client is no longer needed. The AmazonSQSRequester interface also provides a shutdown() method, which shuts down its internal temporary queue client. This ensures that the in-memory resources needed for any virtual queues are reclaimed, and that the host queue that the client automatically created is also deleted automatically.

However, in the world of distributed systems things are a little more complex. Processes can run out of memory and crash, and hosts can reboot suddenly and unexpectedly. There are even cases where you don’t have the opportunity to run custom code on shutdown.

The Temporary Queue Client client addresses this issue as well. For each host queue with recent API calls, the client periodically uses the TagQueue API action to attach a fresh tag value that indicates the queue is still being used. The tagging process serves as a heartbeat to keep the queue alive. According to a configurable time period (by default, 5 minutes), a background thread uses the ListQueues API action to obtain the URLs of all queues with the configured prefix. Then, it deletes each queue that has not been tagged recently. The mechanism is similar to how the Amazon DynamoDB Lock Client expires stale lock leases.

If you use the AmazonSQSTemporaryQueuesClient directly, you can customize how long queues must be idle before they is deleted by configuring the IdleQueueRetentionPeriodSeconds queue attribute. The client supports setting this attribute on both host queues and virtual queues. For virtual queues, setting this attribute ensures that the in-memory resources do not become a memory leak in the presence of application bugs.

Any API call to a queue marks it as non-idle, including ReceiveMessage calls that don’t return any messages. The only reason to increase the retention period attribute is to give the client more time when it can’t send heartbeats—for example, due to garbage collection pauses or networking issues.

But what if you want to use this client in a fleet of a thousand EC2 instances? Won’t every client spend a lot of time checking every queue for idleness? Doesn’t that imply duplicate work that increases as the fleet is scaled up?

We thought of this too. The Temporary Queue Client creates a shared queue for all clients using the same queue prefix, and uses this queue as a work queue for the distributed task. Instead of every client calling the ListQueues API action every five minutes, a new seed message (which triggers the sweeping process) is sent to this queue every five minutes.

When one of the clients receives this message, it calls the ListQueues API action and sends each queue URL in the result as another kind of message to the same shared work queue. The work of actually checking each queue for idleness is distributed roughly evenly to the active clients, ensuring scalability. There is even a mechanism that works around the fact that the ListQueues API action currently only returns no more than 1,000 queue URLs at time.

Summary

We are excited about how the new Amazon SQS Temporary Queue Client makes more messaging patterns easier and cheaper for you. Download the code from GitHub, have a look at Temporary Queues in the Amazon SQS Developer Guide, try out the client, and let us know what you think!

Building Simpler Genomics Workflows on AWS Step Functions

Post Syndicated from Christie Gifrin original https://aws.amazon.com/blogs/compute/building-simpler-genomics-workflows-on-aws-step-functions/

This post is courtesy of Ryan Ulaszek, AWS Genomics Partner Solutions Architect and Aaron Friedman, AWS Healthcare and Life Sciences Partner Solutions Architect

In 2017, we published a four part blog series on how to build a genomics workflow on AWS. In part 1, we introduced a general architecture highlighting three common layers: job, batch and workflow.  In part 2, we described building the job layer with Docker and Amazon Elastic Container Registry (Amazon ECR).  In part 3, we tackled the batch layer and built a batch engine using AWS Batch.  In part 4, we built out the workflow layer using AWS Step Functions and AWS Lambda.

Since then, we’ve worked with many AWS customers and APN partners to implement this solution in genomics as well as in other workloads-of-interest. Today, we wanted to highlight a new feature in Step Functions that simplifies how customers and partners can build high-throughput genomics workflows on AWS.

Step Functions now supports native integration with AWS Batch, which simplifies how you can create an AWS Batch state that submits an asynchronous job and waits for that job to finish.

Before, you needed to build a state machine building block that submitted a job to AWS Batch, and then polled and checked its execution. Now, you can just submit the job to AWS Batch using the new AWS Batch task type.  Step Functions waits to proceed until the job is completed. This reduces the complexity of your state machine and makes it easier to build a genomics workflow with asynchronous AWS Batch steps.

The new integrations include support for the following API actions:

  • AWS Batch SubmitJob
  • Amazon SNS Publish
  • Amazon SQS SendMessage
  • Amazon ECS RunTask
  • AWS Fargate RunTask
  • Amazon DynamoDB
    • PutItem
    • GetItem
    • UpdateItem
    • DeleteItem
  • Amazon SageMaker
    • CreateTrainingJob
    • CreateTransformJob
  • AWS Glue
    • StartJobRun

You can also pass parameters to the service API.  To use the new integrations, the role that you assume when running a state machine needs to have the appropriate permissions.  For more information, see the AWS Step Functions Developer Guide.

Using a job status poller

In our 2017 post series, we created a job poller “pattern” with two separate Lambda functions. When the job finishes, the state machine proceeds to the next step and operates according to the necessary business logic.  This is a useful pattern to manage asynchronous jobs when a direct integration is unavailable.

The steps in this building block state machine are as follows:

  1. A job is submitted through a Lambda function.
  2. The state machine queries the AWS Batch API for the job status in another Lambda function.
  3. The job status is checked to see if the job has completed.  If the job status equals SUCCESS, the final job status is logged. If the job status equals FAILED, the execution of the state machine ends. In all other cases, wait 30 seconds and go back to Step 2.

Both of the Submit Job and Get Job Lambda functions are available as example Lambda functions in the console.  The job status poller is available in the Step Functions console as a sample project.

Here is the JSON representing this state machine.

{
  "Comment": "A simple example that submits a job to AWS Batch",
  "StartAt": "SubmitJob",
  "States": {
    "SubmitJob": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:<account-id>::function:batchSubmitJob",
      "Next": "GetJobStatus"
    },
    "GetJobStatus": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:<account-id>:function:batchGetJobStatus",
      "Next": "CheckJobStatus",
      "InputPath": "$",
      "ResultPath": "$.status"
    },
    "CheckJobStatus": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.status",
          "StringEquals": "FAILED",
          "End": true
        },
        {
          "Variable": "$.status",
          "StringEquals": "SUCCEEDED",
          "Next": "GetFinalJobStatus"
        }
      ],
      "Default": "Wait30Seconds"
    },
    "Wait30Seconds": {
      "Type": "Wait",
      "Seconds": 30,
      "Next": "GetJobStatus"
    },
    "GetFinalJobStatus": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:<account-id>:function:batchGetJobStatus",
      "End": true
    }
  }
}

With Step Functions Service Integrations

With Step Functions service integrations, it is now simpler to submit and wait for an AWS Batch job, or any other supported service.

The following code block is the JSON representing the new state machine for an asynchronous batch job. If you are familiar with the AWS Batch SubmitJob API action, you may notice that the parameters are consistent with what you would see in that API call. You can also use the optional AWS Batch parameters in addition to JobDefinition, JobName, and JobQueue.

{
 "StartAt": "RunBatchJob",
 "States": {
     "RunIsaacJob":{
     "Type":"Task",
     "Resource":"arn:aws:states:::batch:submitJob.sync",
     "Parameters":{
        "JobDefinition":"Isaac",
        "JobName.$":"$.isaac.JobName",
        "JobQueue":"HighPriority",
        "Parameters.$": "$.isaac"
     },
     "TimeoutSeconds": 900,
     "HeartbeatSeconds": 60,
     "Next":"Parallel",
     "InputPath":"$",
     "ResultPath":"$.status",
     "Retry" : [
        {
          "ErrorEquals": [ "States.Timeout" ],
          "IntervalSeconds": 3,
          "MaxAttempts": 2,
          "BackoffRate": 1.5
        }
     ]
  }
}

Here is an example of the workflow input JSON.  Pass all of the container parameters that were being constructed in the submit job Lambda function.

{
  "isaac": {
    "WorkingDir": "/scratch",
    "JobName": "isaac-1",
    "FastQ1S3Path": "s3://aws-batch-genomics-resources/fastq/SRR1919605_1.fastq.gz",
    "BAMS3FolderPath": "s3://aws-batch-genomics-resources/fastq/SRR1919605_2.fastq.gz",
    "FastQ2S3Path": "s3://bccn-genome-data/fastq/NIST7035_R2_trimmed.fastq.gz",
    "ReferenceS3Path": "s3://aws-batch-genomics-resources/reference/isaac/"
  }
}

When you deploy the job definition, add the command attribute that was previously being constructed in the Lambda function launching the AWS Batch job.

IsaacJobDefinition:
    Type: AWS::Batch::JobDefinition
    Properties:
      JobDefinitionName: "Isaac"
      Type: container
      RetryStrategy:
        Attempts: 1
      Parameters:
        BAMS3FolderPath: !Sub "s3://${JobResultsBucket}/NA12878_states_1/bam"
        FastQ1S3Path: "s3://aws-batch-genomics-resources/fastq/SRR1919605_1.fastq.gz"
        FastQ2S3Path: "s3://aws-batch-genomics-resources/fastq/SRR1919605_2.fastq.gz"
        ReferenceS3Path: "s3://aws-batch-genomics-resources/reference/isaac/"
        WorkingDir: "/scratch"
      ContainerProperties:
        Image: "rulaszek/isaac"
        Vcpus: 32
        Memory: 80000
        JobRoleArn:
          Fn::ImportValue: !Sub "${RoleStackName}:ECSTaskRole"
        Command:
          - "--bam_s3_folder_path"
          - "Ref::BAMS3FolderPath"
          - "--fastq1_s3_path"
          - "Ref::FastQ1S3Path"
          - "--fastq2_s3_path"
          - "Ref::FastQ2S3Path"
          - "--reference_s3_path"
          - "Ref::ReferenceS3Path"
          - "--working_dir"
          - "Ref::WorkingDir"
        MountPoints:
          - ContainerPath: "/scratch"
            ReadOnly: false
            SourceVolume: docker_scratch
        Volumes:
          - Name: docker_scratch
            Host:
              SourcePath: "/docker_scratch"

The key-value parameters passed into the workflow are mapped using Parameters.$ to the values in the job definition using the keys.  Value substitutions do take place. The Docker run looks like the following:

docker run <isaac_container_uri> --bam_s3_folder_path s3://batch-genomics-pipeline-jobresultsbucket-1kzdu216m2b0k/NA12878_states_3/bam
                                 --fastq1_s3_path s3://aws-batch-genomics-resources/fastq/SRR1919605_1.fastq.gz
                                 --fastq2_s3_path s3://aws-batch-genomics-resources/fastq/SRR1919605_2.fastq.gz 
                                 --reference_s3_path s3://aws-batch-genomics-resources/reference/isaac/ 
                                 --working_dir /scratch

Genomics workflow: Before and after

Overall, connectors dramatically simplify your genomics workflow.  The following workflow is a simple genomics secondary analysis pipeline, which we highlighted in our original post series.

The first step aligns the sample against a reference genome.  When alignment is complete, variant calling and QA metrics are calculated in two parallel steps.  When variant calling is complete, variant annotation is performed.  Before, our genomics workflow looked like this:

Now it looks like this:

Here is the new workflow JSON:

{
   "Comment":"A simple genomics secondary-analysis workflow",
   "StartAt":"RunIsaacJob",
   "States":{
      "RunIsaacJob":{
         "Type":"Task",
         "Resource":"arn:aws:states:::batch:submitJob.sync",
         "Parameters":{
            "JobDefinition":"Isaac",
            "JobName.$":"$.isaac.JobName",
            "JobQueue":"HighPriority",
            "Parameters.$": "$.isaac"
         },
         "TimeoutSeconds": 900,
         "HeartbeatSeconds": 60,
         "Next":"Parallel",
         "InputPath":"$",
         "ResultPath":"$.status",
         "Retry" : [
            {
              "ErrorEquals": [ "States.Timeout" ],
              "IntervalSeconds": 3,
              "MaxAttempts": 2,
              "BackoffRate": 1.5
            }
         ]
      },
      "Parallel":{
         "Type":"Parallel",
         "Next":"FinalState",
         "Branches":[
            {
               "StartAt":"RunStrelkaJob",
               "States":{
                  "RunStrelkaJob":{
                     "Type":"Task",
                     "Resource":"arn:aws:states:::batch:submitJob.sync",
                     "Parameters":{
                        "JobDefinition":"Strelka",
                        "JobName.$":"$.strelka.JobName",
                        "JobQueue":"HighPriority",
                        "Parameters.$": "$.strelka"
                     },
                     "TimeoutSeconds": 900,
                     "HeartbeatSeconds": 60,
                     "Next":"RunSnpEffJob",
                     "InputPath":"$",
                     "ResultPath":"$.status",
                     "Retry" : [
                        {
                          "ErrorEquals": [ "States.Timeout" ],
                          "IntervalSeconds": 3,
                          "MaxAttempts": 2,
                          "BackoffRate": 1.5
                        }
                     ]
                  },
                  "RunSnpEffJob":{
                     "Type":"Task",
                     "Resource":"arn:aws:states:::batch:submitJob.sync",
                     "Parameters":{
                        "JobDefinition":"SNPEff",
                        "JobName.$":"$.snpeff.JobName",
                        "JobQueue":"HighPriority",
                        "Parameters.$": "$.snpeff"
                     },
                     "TimeoutSeconds": 900,
                     "HeartbeatSeconds": 60,
                     "Retry" : [
                        {
                          "ErrorEquals": [ "States.Timeout" ],
                          "IntervalSeconds": 3,
                          "MaxAttempts": 2,
                          "BackoffRate": 1.5
                        }
                     ],
                     "End":true
                  }
               }
            },
            {
               "StartAt":"RunSamtoolsStatsJob",
               "States":{
                  "RunSamtoolsStatsJob":{
                     "Type":"Task",
                     "Resource":"arn:aws:states:::batch:submitJob.sync",
                     "Parameters":{
                        "JobDefinition":"SamtoolsStats",
                        "JobName.$":"$.samtools.JobName",
                        "JobQueue":"HighPriority",
                        "Parameters.$": "$.samtools"
                     },
                     "TimeoutSeconds": 900,
                     "HeartbeatSeconds": 60,
                     "End":true,
                     "Retry" : [
                        {
                          "ErrorEquals": [ "States.Timeout" ],
                          "IntervalSeconds": 3,
                          "MaxAttempts": 2,
                          "BackoffRate": 1.5
                        }
                     ]
                  }
               }
            }
         ]
      },
      "FinalState":{
         "Type":"Pass",
         "End":true
      }
   }
}

Here is the new Amazon CloudFormation template for deploying the AWS Batch job definitions for each tool:

AWSTemplateFormatVersion: 2010-09-09

Description: Batch job definitions for batch genomics

Parameters:
  RoleStackName:
    Description: "Stack that deploys roles for genomic workflow"
    Type: String
  VPCStackName:
    Description: "Stack that deploys vps for genomic workflow"
    Type: String
  JobResultsBucket:
    Description: "Bucket that holds workflow job results"
    Type: String

Resources:
  IsaacJobDefinition:
    Type: AWS::Batch::JobDefinition
    Properties:
      JobDefinitionName: "Isaac"
      Type: container
      RetryStrategy:
        Attempts: 1
      Parameters:
        BAMS3FolderPath: !Sub "s3://${JobResultsBucket}/NA12878_states_1/bam"
        FastQ1S3Path: "s3://aws-batch-genomics-resources/fastq/SRR1919605_1.fastq.gz"
        FastQ2S3Path: "s3://aws-batch-genomics-resources/fastq/SRR1919605_2.fastq.gz"
        ReferenceS3Path: "s3://aws-batch-genomics-resources/reference/isaac/"
        WorkingDir: "/scratch"
      ContainerProperties:
        Image: "rulaszek/isaac"
        Vcpus: 32
        Memory: 80000
        JobRoleArn:
          Fn::ImportValue: !Sub "${RoleStackName}:ECSTaskRole"
        Command:
          - "--bam_s3_folder_path"
          - "Ref::BAMS3FolderPath"
          - "--fastq1_s3_path"
          - "Ref::FastQ1S3Path"
          - "--fastq2_s3_path"
          - "Ref::FastQ2S3Path"
          - "--reference_s3_path"
          - "Ref::ReferenceS3Path"
          - "--working_dir"
          - "Ref::WorkingDir"
        MountPoints:
          - ContainerPath: "/scratch"
            ReadOnly: false
            SourceVolume: docker_scratch
        Volumes:
          - Name: docker_scratch
            Host:
              SourcePath: "/docker_scratch"

  StrelkaJobDefinition:
    Type: AWS::Batch::JobDefinition
    Properties:
      JobDefinitionName: "Strelka"
      Type: container
      RetryStrategy:
        Attempts: 1
      Parameters:
        BAMS3Path: !Sub "s3://${JobResultsBucket}/NA12878_states_1/bam/sorted.bam"
        BAIS3Path: !Sub "s3://${JobResultsBucket}/NA12878_states_1/bam/sorted.bam.bai"
        ReferenceS3Path: "s3://aws-batch-genomics-resources/reference/hg38.fa"
        ReferenceIndexS3Path: "s3://aws-batch-genomics-resources/reference/hg38.fa.fai"
        VCFS3Path: !Sub "s3://${JobResultsBucket}/NA12878_states_1/vcf"
        WorkingDir: "/scratch"
      ContainerProperties:
        Image: "rulaszek/strelka"
        Vcpus: 32
        Memory: 32000
        JobRoleArn:
          Fn::ImportValue: !Sub "${RoleStackName}:ECSTaskRole"
        Command:
          - "--bam_s3_path"
          - "Ref::BAMS3Path"
          - "--bai_s3_path"
          - "Ref::BAIS3Path"
          - "--reference_s3_path"
          - "Ref::ReferenceS3Path"
          - "--reference_index_s3_path"
          - "Ref::ReferenceIndexS3Path"
          - "--vcf_s3_path"
          - "Ref::VCFS3Path"
          - "--working_dir"
          - "Ref::WorkingDir"
        MountPoints:
          - ContainerPath: "/scratch"
            ReadOnly: false
            SourceVolume: docker_scratch
        Volumes:
          - Name: docker_scratch
            Host:
              SourcePath: "/docker_scratch"

  SnpEffJobDefinition:
    Type: AWS::Batch::JobDefinition
    Properties:
      JobDefinitionName: "SNPEff"
      Type: container
      RetryStrategy:
        Attempts: 1
      Parameters:
        VCFS3Path: !Sub "s3://${JobResultsBucket}/NA12878_states_1/vcf/variants/genome.vcf.gz"
        AnnotatedVCFS3Path: !Sub "s3://${JobResultsBucket}/NA12878_states_1/vcf/genome.anno.vcf"
        CommandArgs: " -t hg38 "
        WorkingDir: "/scratch"
      ContainerProperties:
        Image: "rulaszek/snpeff"
        Vcpus: 4
        Memory: 10000
        JobRoleArn:
          Fn::ImportValue: !Sub "${RoleStackName}:ECSTaskRole"
        Command:
          - "--annotated_vcf_s3_path"
          - "Ref::AnnotatedVCFS3Path"
          - "--vcf_s3_path"
          - "Ref::VCFS3Path"
          - "--cmd_args"
          - "Ref::CommandArgs"
          - "--working_dir"
          - "Ref::WorkingDir"
        MountPoints:
          - ContainerPath: "/scratch"
            ReadOnly: false
            SourceVolume: docker_scratch
        Volumes:
          - Name: docker_scratch
            Host:
              SourcePath: "/docker_scratch"

  SamtoolsStatsJobDefinition:
    Type: AWS::Batch::JobDefinition
    Properties:
      JobDefinitionName: "SamtoolsStats"
      Type: container
      RetryStrategy:
        Attempts: 1
      Parameters:
        ReferenceS3Path: "s3://aws-batch-genomics-resources/reference/hg38.fa"
        BAMS3Path: !Sub "s3://${JobResultsBucket}/NA12878_states_1/bam/sorted.bam"
        BAMStatsS3Path: !Sub "s3://${JobResultsBucket}/NA12878_states_1/bam/sorted.bam.stats"
        WorkingDir: "/scratch"
      ContainerProperties:
        Image: "rulaszek/samtools-stats"
        Vcpus: 4
        Memory: 10000
        JobRoleArn:
          Fn::ImportValue: !Sub "${RoleStackName}:ECSTaskRole"
        Command:
          - "--bam_s3_path"
          - "Ref::BAMS3Path"
          - "--bam_stats_s3_path"
          - "Ref::BAMStatsS3Path"
          - "--reference_s3_path"
          - "Ref::ReferenceS3Path"
          - "--working_dir"
          - "Ref::WorkingDir"
        MountPoints:
          - ContainerPath: "/scratch"
            ReadOnly: false
            SourceVolume: docker_scratch
        Volumes:
          - Name: docker_scratch
            Host:
              SourcePath: "/docker_scratch"

Here is the new CloudFormation script that deploys the new workflow:

AWSTemplateFormatVersion: 2010-09-09

Description: State Machine for batch benomics

Parameters:
  RoleStackName:
    Description: "Stack that deploys roles for genomic workflow"
    Type: String
  VPCStackName:
    Description: "Stack that deploys vps for genomic workflow"
    Type: String

Resources:
  # S3
  GenomicWorkflow:
    Type: AWS::StepFunctions::StateMachine
    Properties:
      RoleArn:
        Fn::ImportValue: !Sub "${RoleStackName}:StatesExecutionRole"
      DefinitionString: !Sub |-
        {
           "Comment":"A simple example that submits a job to AWS Batch",
           "StartAt":"RunIsaacJob",
           "States":{
              "RunIsaacJob":{
                 "Type":"Task",
                 "Resource":"arn:aws:states:::batch:submitJob.sync",
                 "Parameters":{
                    "JobDefinition":"Isaac",
                    "JobName.$":"$.isaac.JobName",
                    "JobQueue":"HighPriority",
                    "Parameters.$": "$.isaac"
                 },
                 "TimeoutSeconds": 900,
                 "HeartbeatSeconds": 60,
                 "Next":"Parallel",
                 "InputPath":"$",
                 "ResultPath":"$.status",
                 "Retry" : [
                    {
                      "ErrorEquals": [ "States.Timeout" ],
                      "IntervalSeconds": 3,
                      "MaxAttempts": 2,
                      "BackoffRate": 1.5
                    }
                 ]
              },
              "Parallel":{
                 "Type":"Parallel",
                 "Next":"FinalState",
                 "Branches":[
                    {
                       "StartAt":"RunStrelkaJob",
                       "States":{
                          "RunStrelkaJob":{
                             "Type":"Task",
                             "Resource":"arn:aws:states:::batch:submitJob.sync",
                             "Parameters":{
                                "JobDefinition":"Strelka",
                                "JobName.$":"$.strelka.JobName",
                                "JobQueue":"HighPriority",
                                "Parameters.$": "$.strelka"
                             },
                             "TimeoutSeconds": 900,
                             "HeartbeatSeconds": 60,
                             "Next":"RunSnpEffJob",
                             "InputPath":"$",
                             "ResultPath":"$.status",
                             "Retry" : [
                                {
                                  "ErrorEquals": [ "States.Timeout" ],
                                  "IntervalSeconds": 3,
                                  "MaxAttempts": 2,
                                  "BackoffRate": 1.5
                                }
                             ]
                          },
                          "RunSnpEffJob":{
                             "Type":"Task",
                             "Resource":"arn:aws:states:::batch:submitJob.sync",
                             "Parameters":{
                                "JobDefinition":"SNPEff",
                                "JobName.$":"$.snpeff.JobName",
                                "JobQueue":"HighPriority",
                                "Parameters.$": "$.snpeff"
                             },
                             "TimeoutSeconds": 900,
                             "HeartbeatSeconds": 60,
                             "Retry" : [
                                {
                                  "ErrorEquals": [ "States.Timeout" ],
                                  "IntervalSeconds": 3,
                                  "MaxAttempts": 2,
                                  "BackoffRate": 1.5
                                }
                             ],
                             "End":true
                          }
                       }
                    },
                    {
                       "StartAt":"RunSamtoolsStatsJob",
                       "States":{
                          "RunSamtoolsStatsJob":{
                             "Type":"Task",
                             "Resource":"arn:aws:states:::batch:submitJob.sync",
                             "Parameters":{
                                "JobDefinition":"SamtoolsStats",
                                "JobName.$":"$.samtools.JobName",
                                "JobQueue":"HighPriority",
                                "Parameters.$": "$.samtools"
                             },
                             "TimeoutSeconds": 900,
                             "HeartbeatSeconds": 60,
                             "End":true,
                             "Retry" : [
                                {
                                  "ErrorEquals": [ "States.Timeout" ],
                                  "IntervalSeconds": 3,
                                  "MaxAttempts": 2,
                                  "BackoffRate": 1.5
                                }
                             ]
                          }
                       }
                    }
                 ]
              },
              "FinalState":{
                 "Type":"Pass",
                 "End":true
              }
           }
        }

Outputs:
  GenomicsWorkflowArn:
    Description: GenomicWorkflow ARN
    Value: !Ref GenomicWorkflow
  StackName:
    Description: StackName
    Value: !Sub ${AWS::StackName}

Conclusion

AWS Step Functions service integrations are a great way to simplify creating complex workflows with asynchronous steps. While we highlighted the use case with AWS Batch today, there are many other ways that healthcare and life sciences customers can use this new feature, such as with message processing.

For more information about how AWS can enable your genomics workloads, be sure to check out the AWS Genomics page.

We’ve updated the open-source project to take advantage of the new AWS Batch integration in Step Functions.  You can find the changes aws-batch-genomics/tree/v2.0.0 folder.

Original posts in this four-part series:

Happy coding!

Implementing enterprise integration patterns with AWS messaging services: point-to-point channels

Post Syndicated from Rachel Richardson original https://aws.amazon.com/blogs/compute/implementing-enterprise-integration-patterns-with-aws-messaging-services-point-to-point-channels/

This post is courtesy of Christian Mueller, Sr. Solutions Architect, AWS and Dirk Fröhner, Sr. Solutions Architect, AWS

At AWS, we see our customers increasingly moving toward managed services to reduce the time and money that they spend managing infrastructure. This also applies to the messaging domain, where AWS provides a collection of managed services.

Asynchronous messaging is a fundamental approach for integrating independent systems or building up a set of loosely coupled systems that can scale and evolve independently and flexibly. The well-known collection of enterprise integration patterns (EIPs) provides a “technology-independent vocabulary” to “design and document integration solutions.” This blog is the first of two that describes how you can implement the core EIPs using AWS messaging services. Let’s first look at the relevant AWS messaging services.

When organizations migrate their traditional messaging and existing applications to the cloud gradually, they usually want to do it without rewriting their code. Amazon MQ is a managed message broker service for Apache ActiveMQ that makes it easy to set up and operate message brokers in the cloud. It supports industry-standard APIs and protocols such as JMS, AMQP, and MQTT, so you can switch from any standards-based message broker to Amazon MQ without rewriting the messaging code in your applications. Amazon MQ is recommended if you’re using messaging with existing applications and want to move your messaging to the cloud without rewriting existing code.

However, if you build new applications for the cloud, we recommend that you consider using cloud-native messaging services such as Amazon SQS and Amazon SNS. These serverless, fully managed message queue and topic services scale to meet your demands and provide simple, easy-to-use APIs. You can use Amazon SQS and Amazon SNS to decouple and scale microservices, distributed systems, and serverless applications and improve overall reliability.

This blog looks at the first part of some fundamental integration patterns. We describe the patterns and apply them to these AWS messaging services. This will help you apply the right pattern to your use case and architect for scale in a secure and cost-efficient manner. For all variants, we employ both traditional and cloud-native messaging services: Amazon MQ for the former and Amazon SQS and Amazon SNS for the latter.

Integration Patterns

Let’s start with some fundamental integration patterns.

Message exchange patterns

First, we inspect the two major message exchange patterns: one-way and request-response.

One-way messaging

Applying one-way messaging, a message producer (sender) sends out a message to a messaging channel and doesn’t expect or want a response from whatever process (receiver) consumed the message. Examples of one-way messaging include a data transfer and a notification about an event that happened.

Request-response messaging

With request-response messaging, a message producer (requester) sends out a message: for example, a command to instruct the responder to execute something. The requester expects a response from each message consumer (responder) who received that message, likely to know what the result of all executions was. To know where to send the response message to, the request message contains a return address that the responder uses. To make sure that the requester can assign an incoming response to a request, the requester adds a correlation identifier to the request, which the responders echo in their responses.

Messaging channels: point-to-point

Next, we look at the point-to-point messaging channel, one of the most important patterns for messaging channels. We will continue our consideration with publish-subscribe in our second post.

A point-to-point channel is usually implemented by message queues. Message queues operate so that any given message is only consumed by one receiver, although multiple receivers can be connected to the queue. The queue ensures once-only consumption. Messages are usually buffered in queues so that they’re available for consumption for a certain amount of time, even if no receiver is currently connected.

Point-to-point channels are often used for loosely coupled message transmission, though there are two other common uses. First, it can support horizontal scaling of message processing on the receiver side. Depending on the message load in the channel, the number of receiver processes can be elastically adjusted to cope with the load as needed. The queue acts as a buffering load balancer. Second, it can flatten peak loads of messages and prevent your receivers from being flooded when you can’t scale out fast enough or you don’t want additional scaling.

Integration scenarios

In this section, we apply these fundamental patterns to AWS messaging services. The code examples are written in Java, but only by author preference. You can implement the same integration scenarios in C++, .NET, Node.js, Python, Ruby, Go, and other programming languages that AWS provides an SDK and an Apache Active MQ client library is available for.

Point-to-point channels: one-way messaging

The diagrams in the following subsections show the principle of one-way messaging for point-to-point channels, using Amazon MQ queues and Amazon SQS queues. The sender produces a message and sends it into a queue, and the receiver consumes the message from the queue for processing. For traditional messaging (that is, Amazon MQ), the senders and consumers can use protocols such as JMS or AMQP. For cloud-native messaging, they can use the Amazon SQS API.

Traditional messaging

To follow this example, open the Amazon MQ console and create a broker. In the following diagram we see the above explained components for the traditional messaging scenario: A sender sends messages into an Amazon MQ queue, a receiver consumes messages from that queue.

Point to point traditional messaging

In the following code example, sender and receiver are using the Apache Active MQ client library and the standard Java messaging service (JMS) API to send and receive messages to and from an Amazon MQ queue. You can run the code on every Amazon compute service, your on-premises data center, or your personal computer. For simplicity, the code launches sender and receiver in the same Java virtual machine (JVM).

public class PointToPointOneWayTraditional {

    public static void main(String... args) throws Exception {
        ActiveMQSslConnectionFactory connFact = new ActiveMQSslConnectionFactory("failover:(ssl://<broker-1>.amazonaws.com:61617,ssl://<broker-2>.amazonaws.com:61617)");
        connFact.setConnectResponseTimeout(10000);
        Connection conn = connFact.createConnection("user", "password");
        conn.setClientID("PointToPointOneWayTraditional");
        conn.start();

        new Thread(new Receiver(conn.createSession(false, Session.CLIENT_ACKNOWLEDGE), "Queue.PointToPoint.OneWay.Traditional")).start();
        new Thread(new Sender(conn.createSession(false, Session.CLIENT_ACKNOWLEDGE), "Queue.PointToPoint.OneWay.Traditional")).start();
    }

    public static class Sender implements Runnable {

        private Session session;
        private String destination;

        public Sender(Session session, String destination) {
            this.session = session;
            this.destination = destination;
        }

        public void run() {
            try {
                MessageProducer messageProducer = session.createProducer(session.createQueue(destination));
                long counter = 0;

                while (true) {
                    TextMessage message = session.createTextMessage("Message " + ++counter);
                    message.setJMSMessageID(UUID.randomUUID().toString());
                    messageProducer.send(message);
                }
            } catch (JMSException e) {
                throw new RuntimeException(e);
            }
        }
    }

    public static class Receiver implements Runnable, MessageListener {

        private Session session;
        private String destination;

        public Receiver(Session session, String destination) {
            this.session = session;
            this.destination = destination;
        }

        public void run() {
            try {
                MessageConsumer consumer = session.createConsumer(session.createQueue(destination));
                consumer.setMessageListener(this);
            } catch (JMSException e) {
                throw new RuntimeException(e);
            }
        }

        public void onMessage(Message message) {
            try {
                System.out.println(String.format("received message '%s' with message id '%s'", ((TextMessage) message).getText(), message.getJMSMessageID()));
                message.acknowledge();
            } catch (JMSException e) {
                throw new RuntimeException(e);
            }
        }
    }
}

Cloud-native messaging

To follow this example, open the Amazon SQS console and create a standard SQS queue, using the queue name P2POneWayCloudNative.  In the following diagram we see the above explained components for the cloud-native messaging scenario: A sender sends messages into an Amazon SQS queue, a receiver consumes messages from that queue.

Point to point cloud-native messaging

 

In the sample code below, the example sender is using the AWS SDK for Java to send messages to an Amazon SQS queue, running in an endless loop. You can run the code on every Amazon compute service, your on-premises data center, or your personal computer.

public class PointToPointOneWayCloudNative {

    public static void main(String... args) throws Exception {
        final AmazonSQS sqs = AmazonSQSClientBuilder.standard().build();

        new Thread(new Sender(sqs, "https://sqs.<region>.amazonaws.com/<account-number>/P2POneWayCloudNative")).start();
    }

    public static class Sender implements Runnable {

        private AmazonSQS sqs;
        private String destination;

        public Sender(AmazonSQS sqs, String destination) {
            this.sqs = sqs;
            this.destination = destination;
        }

        public void run() {
            long counter = 0;

            while (true) {
                sqs.sendMessage(
                    new SendMessageRequest()
                        .withQueueUrl(destination)
                        .withMessageBody("Message " + ++counter)
                        .addMessageAttributesEntry("MessageID", new MessageAttributeValue().withDataType("String").withStringValue(UUID.randomUUID().toString())));
            }
        }
    }
}

We implement the receiver below in a serverless manner as an AWS Lambda function, using Amazon SQS as the event source. The name of the SQS queue is configured outside the function’s code, which is why it doesn’t appear in this code example.

public class Receiver implements RequestHandler<SQSEvent, Void> {

    @Override
    public Void handleRequest(SQSEvent request, Context context) {
        for (SQSEvent.SQSMessage message: request.getRecords()) {
            System.out.println(String.format("received message '%s' with message id '%s'", message.getBody(), message.getMessageAttributes().get("MessageID").getStringValue()));
        }

        return null;
    }
}

If this approach is new to you, you can find more details in AWS Lambda Adds Amazon Simple Queue Service to Supported Event Sources. Using Lambda comes with a number of benefits. For example, you don’t have to manage the compute environment for the receiver, and you can use an event (or push) model instead of having to poll for new messages.

Point-to-point channels: request-response messaging

In addition to the one-way scenario, we have a return channel option. We would now call the involved processes rather than the requester and responder. The requester sends a message into the request queue, and the responder sends the response into the response queue. Remember that the requester enriches the message with a return address (the name of the response queue) so that the responder knows where to send the response to. The requester also sends a correlation ID that the responder copies into the response message so that the requester can match the incoming response with a request.

Traditional messaging

In this example, we reuse the Amazon MQ broker that we set up earlier. In the following diagram we see the above explained components for the traditional messaging scenario, using an Amazon MQ queue each for the request messages and for the response messages.

Point to point request response traditional messaging

Using Amazon MQ, we don’t have to create queues explicitly because they’re implicitly created as needed when we start sending messages to them. This example is similar to the point-to-point one-way traditional example.

public class PointToPointRequestResponseTraditional {

    public static void main(String... args) throws Exception {
        ActiveMQSslConnectionFactory connFact = new ActiveMQSslConnectionFactory("failover:(ssl://<broker-1>.amazonaws.com:61617,ssl://<broker-2>.amazonaws.com:61617)");
        connFact.setConnectResponseTimeout(10000);
        Connection conn = connFact.createConnection("user", "password");
        conn.setClientID("PointToPointRequestResponseTraditional");
        conn.start();

        new Thread(new Responder(conn.createSession(false, Session.CLIENT_ACKNOWLEDGE), "Queue.PointToPoint.RequestResponse.Traditional")).start();
        new Thread(new Requester(conn.createSession(false, Session.CLIENT_ACKNOWLEDGE), "Queue.PointToPoint.RequestResponse.Traditional")).start();
    }

    public static class Requester implements Runnable {

        private Session session;
        private String destination;

        public Requester(Session session, String destination) {
            this.session = session;
            this.destination = destination;
        }

        public void run() {
            MessageProducer messageProducer = null;
            try {
                messageProducer = session.createProducer(session.createQueue(destination));
                long counter = 0;

                while (true) {
                    TemporaryQueue replyTo = session.createTemporaryQueue();
                    String correlationId = UUID.randomUUID().toString();
                    TextMessage message = session.createTextMessage("Message " + ++counter);
                    message.setJMSMessageID(UUID.randomUUID().toString());
                    message.setJMSCorrelationID(correlationId);
                    message.setJMSReplyTo(replyTo);
                    messageProducer.send(message);

                    MessageConsumer consumer = session.createConsumer(replyTo, "JMSCorrelationID='" + correlationId + "'");
                    try {
                        Message receivedMessage = consumer.receive(5000);
                        System.out.println(String.format("received message '%s' with message id '%s'", ((TextMessage) receivedMessage).getText(), receivedMessage.getJMSMessageID()));
                        receivedMessage.acknowledge();
                    } finally {
                        if (consumer != null) {
                            consumer.close();
                        }
                    }
                }
            } catch (JMSException e) {
                throw new RuntimeException(e);
            }
        }
    }

    public static class Responder implements Runnable, MessageListener {

        private Session session;
        private String destination;

        public Responder(Session session, String destination) {
            this.session = session;
            this.destination = destination;
        }

        public void run() {
            try {
                MessageConsumer consumer = session.createConsumer(session.createQueue(destination));
                consumer.setMessageListener(this);
            } catch (JMSException e) {
                throw new RuntimeException(e);
            }
        }

        public void onMessage(Message message) {
            try {
                String correlationId = message.getJMSCorrelationID();
                Destination replyTo = message.getJMSReplyTo();

                TextMessage responseMessage = session.createTextMessage(((TextMessage) message).getText() + " with CorrelationID " + correlationId);
                responseMessage.setJMSMessageID(UUID.randomUUID().toString());
                responseMessage.setJMSCorrelationID(correlationId);

                MessageProducer messageProducer = session.createProducer(replyTo);
                try {
                    messageProducer.send(responseMessage);

                    message.acknowledge();
                } finally {
                    if (messageProducer != null) {
                        messageProducer.close();
                    }
                }
            } catch (JMSException e) {
                throw new RuntimeException(e);
            }
        }
    }
}

Cloud-native messaging

Open the Amazon SQS console and create two standard SQS queues using the queue names P2PReqRespCloudNative and P2PReqRespCloudNative-Resp. In the following diagram we see the above explained components for the cloud-native scenario, using an Amazon SQS queue each for the request messages and for the response messages.

Point to point request response cloud native messaging

The following example requester is almost identical to the point-to-point one-way cloud-native example sender. It also provides a reply-to address and a correlation ID.

public class PointToPointRequestResponseCloudNative {

    public static void main(String... args) throws Exception {
        final AmazonSQS sqs = AmazonSQSClientBuilder.standard().build();

        new Thread(new Requester(sqs, "https://sqs.<region>.amazonaws.com/<account-number>/P2PReqRespCloudNative", "https://sqs.<region>.amazonaws.com/<account-number>/P2PReqRespCloudNative-Resp")).start();
    }

    public static class Requester implements Runnable {

        private AmazonSQS sqs;
        private String destination;
        private String replyDestination;
        private Map<String, SendMessageRequest> inflightMessages = new ConcurrentHashMap<>();

        public Requester(AmazonSQS sqs, String destination, String replyDestination) {
            this.sqs = sqs;
            this.destination = destination;
            this.replyDestination = replyDestination;
        }

        public void run() {
            long counter = 0;

            while (true) {
                String correlationId = UUID.randomUUID().toString();
                SendMessageRequest request = new SendMessageRequest()
                    .withQueueUrl(destination)
                    .withMessageBody("Message " + ++counter)
                    .addMessageAttributesEntry("CorrelationID", new MessageAttributeValue().withDataType("String").withStringValue(correlationId))
                    .addMessageAttributesEntry("ReplyTo", new MessageAttributeValue().withDataType("String").withStringValue(replyDestination));
                sqs.sendMessage(request);

                inflightMessages.put(correlationId, request);

                ReceiveMessageResult receiveMessageResult = sqs.receiveMessage(
                    new ReceiveMessageRequest()
                        .withQueueUrl(replyDestination)
                        .withMessageAttributeNames("CorrelationID")
                        .withMaxNumberOfMessages(5)
                        .withWaitTimeSeconds(2));

                for (Message receivedMessage : receiveMessageResult.getMessages()) {
                    System.out.println(String.format("received message '%s' with message id '%s'", receivedMessage.getBody(), receivedMessage.getMessageId()));

                    String receivedCorrelationId = receivedMessage.getMessageAttributes().get("CorrelationID").getStringValue();
                    SendMessageRequest originalRequest = inflightMessages.remove(receivedCorrelationId);
                    System.out.println(String.format("Corresponding request message '%s'", originalRequest.getMessageBody()));

                    sqs.deleteMessage(
                        new DeleteMessageRequest()
                            .withQueueUrl(replyDestination)
                            .withReceiptHandle(receivedMessage.getReceiptHandle()));
                }
            }
        }
    }
}

The following example responder is almost identical to the point-to-point one-way cloud-native example receiver. It also creates a message and sends it back to the reply-to address provided in the received message.

public class Responder implements RequestHandler<SQSEvent, Void> {

    private final AmazonSQS sqs = AmazonSQSClientBuilder.standard().build();

    @Override
    public Void handleRequest(SQSEvent request, Context context) {
        for (SQSEvent.SQSMessage message: request.getRecords()) {
            System.out.println(String.format("received message '%s' with message id '%s'", message.getBody(), message.getMessageId()));
            String correlationId = message.getMessageAttributes().get("CorrelationID").getStringValue();
            String replyTo = message.getMessageAttributes().get("ReplyTo").getStringValue();

            System.out.println(String.format("sending message with correlation id '%s' to '%s'", correlationId, replyTo));
            sqs.sendMessage(
                new SendMessageRequest()
                    .withQueueUrl(replyTo)
                    .withMessageBody(message.getBody() + " with CorrelationID " + correlationId)
                    .addMessageAttributesEntry("CorrelationID", new MessageAttributeValue().withDataType("String").withStringValue(correlationId)));
        }

        return null;
    }
}

Go build!

We look forward to hearing about what you build and will continue innovating our services on your behalf.

Additional resources

What’s next?

We have introduced the first fundamental EIPs and shown how you can apply them to the AWS messaging services. If you are keen to dive deeper, continue reading with the second part of this series, where we will cover publish-subscribe messaging.

Read Part 2: Publish-Subscribe Messaging

Implementing enterprise integration patterns with AWS messaging services: publish-subscribe channels

Post Syndicated from Rachel Richardson original https://aws.amazon.com/blogs/compute/implementing-enterprise-integration-patterns-with-aws-messaging-services-publish-subscribe-channels/

This post is courtesy of Christian Mueller, Sr. Solutions Architect, AWS and Dirk Fröhner, Sr. Solutions Architect, AWS

In this blog, we look at the second part of some fundamental enterprise integration patterns and how you can implement them with AWS messaging services. If you missed the first part, we encourage you to start there.

Read Part 1: Point-to-Point Messaging

Integration patterns

Messaging channels: publish-subscribe

As mentioned in the first blog, we continue with the second major messaging channel pattern: publish-subscribe.

A publish-subscribe channel is usually implemented using message topics. In this model, any message published to a topic is immediately received by all of the subscribers of the topic (unless you have applied the message filter pattern). However, if there is no subscriber, messages are usually discarded. The durable subscriber pattern describes an exception where messages are kept for a while in case the subscriber is offline. Publish-subscribe is used when multiple parties are interested in certain messages. Sometimes, this pattern is also referred to as fan-out.

Let’s apply this pattern to the different AWS messaging services and get our hands dirty. To follow our examples, sign in to your AWS account (or create an account as described in How do I create and activate a new Amazon Web Services account?).

Integration scenarios

Publish-subscribe channels: one-way messaging

Publish-subscribe one-way patterns are often involved in notification style use cases, where the publisher sends out an event and doesn’t care who is interested in this event. For example, Amazon CloudWatch Events publishes state changes in the environment, and you can subscribe and act accordingly.

The diagrams in the following subsections show the principles of one-way messaging for publish-subscribe channels, using both Amazon MQ and Amazon SNS topics. A publisher produces a message and sends it into a topic, and subscribers consume the message from the topic for processing.

For traditional messaging, senders and consumers can use API protocols such JMS or AMQP. For cloud-native messaging, they can use the Amazon SNS API.

Traditional messaging

In this example, we reuse the Amazon MQ broker we set up in part one of this blog. As we can see in the following diagram, messages as published into an Amazon MQ topic and multiple subscribers can consume messages from it.

Publish Subscribe One Way Traditional Messaging

This example is similar to the point-to-point one-way traditional example using the Apache Active MQ client library, but we use topics instead of queues, as shown in the following code.

public class PublishSubscribeOneWayTraditional {

    public static void main(String... args) throws Exception {
        ActiveMQSslConnectionFactory connFact = new ActiveMQSslConnectionFactory("failover:(ssl://<broker-1>.amazonaws.com:61617,ssl://<broker-2>.amazonaws.com:61617)");
        connFact.setConnectResponseTimeout(10000);
        Connection conn = connFact.createConnection("user", "password");
        conn.setClientID("PubSubOneWayTraditional");
        conn.start();

        new Thread(new Subscriber(conn.createSession(false, Session.CLIENT_ACKNOWLEDGE), "Topic.PubSub.OneWay.Traditional")).start();
        new Thread(new Publisher(conn.createSession(false, Session.CLIENT_ACKNOWLEDGE), "Topic.PubSub.OneWay.Traditional")).start();
    }

    public static class Publisher implements Runnable {

        private Session session;
        private String destination;

        public Sender(Session session, String destination) {
            this.session = session;
            this.destination = destination;
        }

        public void run() {
            try {
                MessageProducer messageProducer = session.createProducer(session.createTopic(destination));
                long counter = 0;

                while (true) {
                    TextMessage message = session.createTextMessage("Message " + ++counter);
                    message.setJMSMessageID(UUID.randomUUID().toString());
                    messageProducer.send(message);
                }
            } catch (JMSException e) {
                throw new RuntimeException(e);
            }
        }
    }

    public static class Subscriber implements Runnable, MessageListener {

        private Session session;
        private String destination;

        public Receiver(Session session, String destination) {
            this.session = session;
            this.destination = destination;
        }

        public void run() {
            try {
                MessageConsumer consumer = session.createDurableSubscriber(session.createTopic(destination), "subscriber-1");
                consumer.setMessageListener(this);
            } catch (JMSException e) {
                throw new RuntimeException(e);
            }
        }

        public void onMessage(Message message) {
            try {
                System.out.println(String.format("received message '%s' with message id '%s'", ((TextMessage) message).getText(), message.getJMSMessageID()));
                message.acknowledge();
            } catch (JMSException e) {
                throw new RuntimeException(e);
            }
        }
    }
}

Cloud-native messaging

To follow a similar example using Amazon SNS, open the Amazon SNS console and create an Amazon SNS topic named PubSubOneWayCloudNative. The below diagram illustrates that a publisher sends messages into an Amazon SNS topic which are consumed by subscribers of this topic.

Publish Subscribe One Way Cloud Native Messaging

We use the AWS SDK for Java to send messages to our Amazon SNS topic, running in an endless loop. You can run the following code on every Amazon compute service, your on-premises data center, or your personal computer.

public class PublishSubscribeOneWayCloudNative {

    public static void main(String... args) throws Exception {
        final AmazonSNS sns = AmazonSNSClientBuilder.standard().build();

        new Thread(new Publisher(sns, "arn:aws:sns:<region>:<account-number>:PubSubOneWayCloudNative")).start();
    }

    public static class Publisher implements Runnable {

        private AmazonSNS sns;
        private String destination;

        public Sender(AmazonSNS sns, String destination) {
            this.sns = sns;
            this.destination = destination;
        }

        public void run() {
            long counter = 0;

            while (true) {
                sns.publish(
                    new PublishRequest()
                        .withTargetArn(destination)
                        .withSubject("PubSubOneWayCloudNative sample")
                        .withMessage("Message " + ++counter)
                        .addMessageAttributesEntry("MessageID", new MessageAttributeValue().withDataType("String").withStringValue(UUID.randomUUID().toString())));
            }
        }
    }
}

The subscriber is implemented as an AWS Lambda function, using Amazon SNS as the event source. For more information on how to set this up, see Using Amazon SNS for System-to-System Messaging with a Lambda Function as a Subscriber.

public class Subscriber implements RequestHandler<SNSEvent, Void> {

    @Override
    public Void handleRequest(SNSEvent request, Context context) {
        for (SNSEvent.SNSRecord record: request.getRecords()) {
            SNS sns = record.getSNS();

            System.out.println(String.format("received message '%s' with message id '%s'", sns.getMessage(), sns.getMessageAttributes().get("MessageID").getValue()));
        }

        return null;
    }
}

Publish-subscribe channels: request-response messaging

Publish-subscribe request-response patterns are beneficial in use cases where it’s important to communicate with multiple services that do their work in parallel, but all their responses need to be aggregated afterward. One example is an order service, which needs to enrich the order message with data from multiple backend services.

The diagrams in the following subsections show the principles of request-response messaging for publish-subscribe channels, using both Amazon MQ and Amazon SNS topics. A publisher produces a message and sends it into a topic, and subscribers consume the message from the topic for processing.

Although we use a publish-subscribe channel for the request messages, we would usually use a point-to-point channel for the response messages. This assumes that the requester application or at least a dedicated application is the one entity that works on processing all the responses.

Traditional messaging

As we can see in the following diagram, a Amazon MQ topic is used to send out all the request messages, while all the response messages are sent into an Amazon MQ queue.

Publish Subscribe Request Response Traditional Messaging

In our code sample below, we use two responders.

public class PublishSubscribeRequestResponseTraditional {

    public static void main(String... args) throws Exception {
        ActiveMQSslConnectionFactory connFact = new ActiveMQSslConnectionFactory("failover:(ssl://<broker-1>.amazonaws.com:61617,ssl://<broker-2>.amazonaws.com:61617)");
        connFact.setConnectResponseTimeout(10000);
        Connection conn = connFact.createConnection("user", "password");
        conn.setClientID("PubSubReqRespTraditional");
        conn.start();

        new Thread(new Responder(conn.createSession(false, Session.CLIENT_ACKNOWLEDGE), "Topic.PubSub.ReqResp.Traditional", "subscriber-1")).start();
        new Thread(new Responder(conn.createSession(false, Session.CLIENT_ACKNOWLEDGE), "Topic.PubSub.ReqResp.Traditional", "subscriber-2")).start();
        new Thread(new Requester(conn.createSession(false, Session.CLIENT_ACKNOWLEDGE), "Topic.PubSub.ReqResp.Traditional")).start();
    }

    public static class Requester implements Runnable {

        private Session session;
        private String destination;

        public Requester(Session session, String destination) {
            this.session = session;
            this.destination = destination;
        }

        public void run() {
            MessageProducer messageProducer = null;
            try {
                messageProducer = session.createProducer(session.createTopic(destination));
                long counter = 0;

                while (true) {
                    TemporaryQueue replyTo = session.createTemporaryQueue();
                    String correlationId = UUID.randomUUID().toString();
                    TextMessage message = session.createTextMessage("Message " + ++counter);
                    message.setJMSMessageID(UUID.randomUUID().toString());
                    message.setJMSCorrelationID(correlationId);
                    message.setJMSReplyTo(replyTo);
                    messageProducer.send(message);

                    MessageConsumer consumer = session.createConsumer(replyTo, "JMSCorrelationID='" + correlationId + "'");
                    try {
                        Message receivedMessage1 = consumer.receive(5000);
                        Message receivedMessage2 = consumer.receive(5000);
                        System.out.println(String.format("received 2 messages '%s' and '%s'", ((TextMessage) receivedMessage1).getText(), ((TextMessage) receivedMessage2).getText()));
                        receivedMessage2.acknowledge();
                    } finally {
                        if (consumer != null) {
                            consumer.close();
                        }
                    }
                }
            } catch (JMSException e) {
                throw new RuntimeException(e);
            }
        }
    }

    public static class Responder implements Runnable, MessageListener {

        private Session session;
        private String destination;
        private String name;

        public Responder(Session session, String destination, String name) {
            this.session = session;
            this.destination = destination;
            this.name = name;
        }

        public void run() {
            try {
                MessageConsumer consumer = session.createDurableSubscriber(session.createTopic(destination), name);
                consumer.setMessageListener(this);
            } catch (JMSException e) {
                throw new RuntimeException(e);
            }
        }

        public void onMessage(Message message) {
            try {
                String correlationId = message.getJMSCorrelationID();
                Destination replyTo = message.getJMSReplyTo();

                TextMessage responseMessage = session.createTextMessage(((TextMessage) message).getText() + " from responder " + name);
                responseMessage.setJMSMessageID(UUID.randomUUID().toString());
                responseMessage.setJMSCorrelationID(correlationId);

                MessageProducer messageProducer = session.createProducer(replyTo);
                try {
                    messageProducer.send(responseMessage);

                    message.acknowledge();
                } finally {
                    if (messageProducer != null) {
                        messageProducer.close();
                    }
                }
            } catch (JMSException e) {
                throw new RuntimeException(e);
            }
        }
    }
}

Cloud-native messaging

To implement a similar pattern with Amazon SNS, open the Amazon SNS console and create a new SNS topic named PubSubReqRespCloudNative. Then open the Amazon SQS console and create a standard SQS queue named PubSubReqRespCloudNative-Resp. The following diagram illustrates that we now use an Amazon SNS topic for request messages and an Amazon SQS queue for response messages.

Publish Subscribe Request Response Cloud Native Messaging

This example requester is almost identical to the publish-subscribe one-way cloud-native example sender. The requester also specifies a reply-to address and a correlation ID as message attributes. This way, responders know where to send the responses to, and the receiver of the responses can assign them accordingly.

public class PublishSubscribeReqRespCloudNative {

    public static void main(String... args) throws Exception {
        final AmazonSNS sns = AmazonSNSClientBuilder.standard().build();
        final AmazonSQS sqs = AmazonSQSClientBuilder.standard().build();

        new Thread(new Requester(sns, sqs, "arn:aws:sns:<region>:<account-number>:PubSubReqRespCloudNative", "https://sqs.<region>.amazonaws.com/<account-number>/PubSubReqRespCloudNative-Resp")).start();
    }

    public static class Requester implements Runnable {

        private AmazonSNS sns;
        private AmazonSQS sqs;
        private String destination;
        private String replyDestination;
        private Map<String, PublishRequest> inflightMessages = new ConcurrentHashMap<>();

        public Requester(AmazonSNS sns, AmazonSQS sqs, String destination, String replyDestination) {
            this.sns = sns;
            this.sqs = sqs;
            this.destination = destination;
            this.replyDestination = replyDestination;
        }

        public void run() {
            long counter = 0;

            while (true) {
                String correlationId = UUID.randomUUID().toString();
                PublishRequest request = new PublishRequest()
                    .withTopicArn(destination)
                    .withMessage("Message " + ++counter)
                    .addMessageAttributesEntry("CorrelationID", new MessageAttributeValue().withDataType("String").withStringValue(correlationId))
                    .addMessageAttributesEntry("ReplyTo", new MessageAttributeValue().withDataType("String").withStringValue(replyDestination));
                sns.publish(request);

                inflightMessages.put(correlationId, request);

                ReceiveMessageResult receiveMessageResult = sqs.receiveMessage(
                    new ReceiveMessageRequest()
                        .withQueueUrl(replyDestination)
                        .withMessageAttributeNames("CorrelationID")
                        .withMaxNumberOfMessages(5)
                        .withWaitTimeSeconds(2));

                for (Message receivedMessage : receiveMessageResult.getMessages()) {
                    System.out.println(String.format("received message '%s' with message id '%s'", receivedMessage.getBody(), receivedMessage.getMessageId()));

                    String receivedCorrelationId = receivedMessage.getMessageAttributes().get("CorrelationID").getStringValue();
                    PublishRequest originalRequest = inflightMessages.remove(receivedCorrelationId);
                    System.out.println(String.format("Corresponding request message '%s'", originalRequest.getMessage()));

                    sqs.deleteMessage(
                        new DeleteMessageRequest()
                            .withQueueUrl(replyDestination)
                            .withReceiptHandle(receivedMessage.getReceiptHandle()));
                }
            }
        }
    }
}

This example responder is almost identical to the publish-subscribe one-way cloud-native example receiver. It also creates a message, enriches it with the correlation ID, and sends it back to the reply-to address provided in the received message.

public class Responder implements RequestHandler<SNSEvent, Void> {

    private final AmazonSQS sqs = AmazonSQSClientBuilder.standard().build();

    @Override
    public Void handleRequest(SNSEvent request, Context context) {
        for (SNSEvent.SNSRecord record: request.getRecords()) {
            System.out.println(String.format("received record '%s' with message id '%s'", record.getSNS().getMessage(), record.getSNS().getMessageId()));
            String correlationId = record.getSNS().getMessageAttributes().get("CorrelationID").getValue();
            String replyTo = record.getSNS().getMessageAttributes().get("ReplyTo").getValue();

            System.out.println(String.format("sending message with correlation id '%s' to '%s'", correlationId, replyTo));
            sqs.sendMessage(
                new SendMessageRequest()
                    .withQueueUrl(replyTo)
                    .withMessageBody(record.getSNS().getMessage() + " with CorrelationID " + correlationId)
                    .addMessageAttributesEntry("CorrelationID", new MessageAttributeValue().withDataType("String").withStringValue(correlationId)));
        }

        return null;
    }
}

Go Build!

We look forward to hearing about what you build and will continue innovating our services on your behalf.

Additional Resources

ICYMI: Serverless Q2 2018

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/icymi-serverless-q2-2018/

The better-late-than-never edition!

Welcome to the second edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share all of the most recent product launches, feature enhancements, blog posts, webinars, Twitch live streams, and other interesting things that you might have missed!

The second quarter of 2018 flew by so fast that we didn’t get a chance to get out this post! We’re playing catch up, and making sure that the Q3 post launches a bit sooner.

Missed our Q1 ICYMI? Catch up on everything you missed.

So, what might you have missed this past quarter? Here’s the recap….

AWS AppSync

In April, AWS AppSync went generally available (GA)!

AWS AppSync provides capabilities to build real-time, collaborative mobile and web applications. It uses GraphQL, an open standard query language that makes it easy to request data from the cloud. When AWS AppSync went GA, several features also launched. These included better in-console testing with mock data, Amazon CloudWatch support, AWS CloudFormation support, and console log access.

AWS Amplify then also launched support for AWS AppSync to make it even easier for developers to build JavaScript-based applications that can integrate with several AWS services via its simplified GraphQL interface. Click here for the documentation.

AppSync expanded to more Regions and added OIDC support in May.

New features

AWS Lambda made Node.js v8.10 available. 8.10 brings some significant improvements in supporting async/await calls that simplify the traditional callback style common in Node.js applications. Developers can also see performance improvements and lower memory consumption.

In June, the long-awaited support for Amazon SQS as a trigger for Lambda launched! With this launch, customers can easily create Lambda functions that directly consume from SQS queues without needing to manage scheduling for the invocations to poll a queue. Today, SQS is one of the most popular AWS services. It’s used by hundreds of thousands of customers at massive scales as one of the fundamental building blocks of many applications.

AWS Lambda gained support for AWS Config. With AWS Config, you can track changes to the Lambda function, runtime environments, tags, handler name, code size, memory allocation, timeout settings, and concurrency settings. You can also record changes to Lambda IAM execution roles, subnets, and security group associations. Even more fun, you can use AWS Lambda functions in AWS Config Rules to check if your Lambda functions conform to certain standards as decided by you. Inception!

Amazon API Gateway announced the availability of private API endpoints! With private API endpoints, you can now create APIs that are completely inside your own virtual private clouds (VPCs). You can use awesome API Gateway features such as Lambda custom authorizers and Amazon Cognito integration. Back your APIs with Lambda, containers running in Amazon ECS, ECS supporting AWS Fargate, and Amazon EKS, as well as on Amazon EC2.

Amazon API Gateway also launched two really useful features; support for Resource Polices for APIs and Cross-Account AWS Lambda Authorizers and Integrations. Both features offer capabilities to help developers secure their APIs whether they are public or private.

AWS SAM went open source and the AWS SAM Local tool has now been relaunched as AWS SAM CLI! As part of the relaunch, AWS SAM CLI has gained numerous capabilities, such as helping you start a brand new serverless project and better template validation. With version 0.4.0, released in June, we added Python 3.6 support. You can now perform new project creation, local development and testing, and then packaging and deployment of serverless applications for all actively supported Lambda languages.

AWS Step Functions expanded into more Regions, increased default limits, became HIPPA eligible, and is also now available in AWS GovCloud (US).

AWS [email protected] added support for Node.js v8.10.

Serverless posts

April:

May:

June:

Webinars

Here are the three webinars we delivered in Q2. We hold several Serverless webinars throughout the year, so look out for them in the Serverless section of the AWS Online Tech Talks page:

Twitch

We’ve been so busy livestreaming on Twitch that you are most certainly missing out if you aren’t following along!

Here are links to all of the Serverless Twitch sessions that we’ve done.

Keep an eye on AWS on Twitch for more Serverless videos and on the Join us on Twitch AWS page for information about upcoming broadcasts and recent live streams.

Worthwhile reading

Serverless: Changing the Face of Business Economics, A Venture Capital and Startup Perspective
In partnership with three prominent venture capitalists—Greylock Partners, Madrona Venture Group, and Accel—AWS released a whitepaper on the business benefits to serverless. Check it out to hear about opportunities for companies in the space and how several have seen significant benefits from a serverless approach.

Serverless Streaming Architectures and Best Practices
Streaming workloads are some of the biggest workloads for AWS Lambda. Customers of all shapes and sizes are using streaming workloads for near real-time processing of data from services such as Amazon Kinesis Streams. In this whitepaper, we explore three stream-processing patterns using a serverless approach. For each pattern, we describe how it applies to a real-world use case, the best practices and considerations for implementation, and cost estimates. Each pattern also includes a template that enables you to quickly and easily deploy these patterns in your AWS accounts.

In other news

AWS re:Invent 2018 is coming! From November 26—30 in Las Vegas, Nevada, join tens of thousands of AWS customers to learn, share ideas, and see exciting keynote announcements. The agenda for Serverless talks is just starting to show up now and there are always lots of opportunities to hear about serverless applications and technologies from fellow AWS customers, AWS product teams, solutions architects, evangelists, and more.

Register for AWS re:Invent now!

What did we do at AWS re:Invent 2017? Check out our recap here: Serverless @ re:Invent 2017

Attend a Serverless event!

“ServerlessDays are a family of events around the world focused on fostering a community around serverless technologies.” —https://serverlessdays.io/

The events are run by local volunteers as vendor-agnostic events with a focus on community, accessibility, and local representation. Dozens of cities around the world have folks interested in these events, with more popping up regularly.

Find a ServerlessDays event happening near you. Come ready to learn and connect with other developers, architects, hobbyists, and practitioners. AWS has members from our team at every event to connect with and share ideas and content. Maybe, just maybe, we’ll even hand out cool swag!

AWS Serverless Apps for Social Good Hackathon

Our AWS Serverless Apps for Social Good hackathon invites you to publish serverless applications for popular use cases. Your app can use Alexa skills, machine learning, media processing, monitoring, data transformation, notification services, location services, IoT, and more.

We’re looking for apps that can be used as standalone assets or as inputs that can be combined with other applications to add to the open-source serverless ecosystem. This supports the work being done by developers and nonprofit organizations around the world.

Winners will be awarded cash prizes and the opportunity to direct donations to the nonprofit partner of their choice.

Still looking for more?

The AWS Serverless landing page has lots of information. The resources page contains case studies, webinars, whitepapers, customer stories, reference architectures, and even more Getting Started tutorials. Check it out!

AWS Lambda Adds Amazon Simple Queue Service to Supported Event Sources

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/aws-lambda-adds-amazon-simple-queue-service-to-supported-event-sources/

We can now use Amazon Simple Queue Service (SQS) to trigger AWS Lambda functions! This is a stellar update with some key functionality that I’ve personally been looking forward to for more than 4 years. I know our customers are excited to take it for a spin so feel free to skip to the walk through section below if you don’t want a trip down memory lane.

SQS was the first service we ever launched with AWS back in 2004, 14 years ago. For some perspective, the largest commercial hard drives in 2004 were around 60GB, PHP 5 came out, Facebook had just launched, the TV show Friends ended, GMail was brand new, and I was still in high school. Looking back, I can see some of the tenets that make AWS what it is today were present even very early on in the development of SQS: fully managed, network accessible, pay-as-you-go, and no minimum commitments. Today, SQS is one of our most popular services used by hundreds of thousands of customers at absolutely massive scales as one of the fundamental building blocks of many applications.

AWS Lambda, by comparison, is a relative new kid on the block having been released at AWS re:Invent in 2014 (I was in the crowd that day!). Lambda is a compute service that lets you run code without provisioning or managing servers and it launched the serverless revolution back in 2014. It has seen immediate adoption across a wide array of use-cases from web and mobile backends to IT policy engines to data processing pipelines. Today, Lambda supports Node.js, Java, Go, C#, and Python runtimes letting customers minimize changes to existing codebases and giving them flexibility to build new ones. Over the past 4 years we’ve added a large number of features and event sources for Lambda making it easier for customers to just get things done. By adding support for SQS to Lambda we’re removing a lot of the undifferentiated heavy lifting of running a polling service or creating an SQS to SNS mapping.

Let’s take a look at how this all works.

Triggering Lambda from SQS

First, I’ll need an existing SQS standard queue or I’ll need to create one. I’ll go over to the AWS Management Console and open up SQS to create a new queue. Let’s give it a fun name. At the moment the Lambda triggers only work with standard queues and not FIFO queues.

Now that I have a queue I want to create a Lambda function to process it. I can navigate to the Lambda console and create a simple new function that just prints out the message body with some Python code like this:

def lambda_handler(event, context):
    for record in event['Records']:
        print(record['body'])

Next I need to add the trigger to the Lambda function, which I can do from the console by selecting the SQS trigger on the left side of the console. I can select which queue I want to use to invoke the Lambda and the maximum number of records a single Lambda will process (up to 10, based on the SQS ReceiveMessage API).

Lambda will automatically scale out horizontally to invoke one Lambda function for each new message in my queue. For example, if I had 100 messages in my queue and enabled the trigger then Lambda would execute 100 invocations of my function to process the messages concurrently. As the queue traffic fluctuates the Lambda service will scale the polling operations up and down based on the number of inflight messages. I’ve covered this behavior in more detail in the additional info section at the bottom of this post. If I have my batch size set to 10 then I would invoke 10 new Lambda functions. In order to control this behavior I can increase or decrease the concurrent execution limit for my function.

For each batch of messages processed if the function returns successfully then those messages will be removed from the queue. If the function errors out or times out then the messages will return to the queue after the visibility timeout set on the queue. Just as a quick note here, our Lambda function timeout has to be lower than the queue’s visibility timeout in order to create the event mapping from SQS to Lambda.

After adding the trigger, I can make any other changes I want to the function and save it. If we hop back over to the SQS console we can see the trigger is registered. I can configure and edit the trigger from the SQS console as well.

I’ll use the AWS CLI to enqueue a simple message and test the functionality:

aws sqs send-message --queue-url https://sqs.us-east-2.amazonaws.com/054060359478/SQS_TO_LAMBDA_IS_FINALLY_HERE_YAY --message-body "hello, world"

My Lambda receives the message and executes the code printing the message payload into my Amazon CloudWatch logs.

Of course all of this works with AWS SAM out of the box.

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Example of processing messages on an SQS queue with Lambda
Resources:
  MySQSQueueFunction:
    Type: AWS::Serverless::Function
    Properties:
      Runtime: python3.6
      CodeUri: src/
      Events:
        MySQSEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt MySqsQueue.Arn
            BatchSize: 10

  MySqsQueue:
    Type: AWS::SQS::Queue

Additional Information

First of all, there are no additional charges for this feature, but because the Lambda service is continuously long-polling the SQS queue the account will be charged for those API calls at the standard SQS pricing rates.

So, a quick deep dive on concurrency and automatic scaling here – just keep in mind that this behavior could change. The automatic scaling behavior of Lambda is designed to keep polling costs low when a queue is empty while simultaneously letting us scale up to high throughput when the queue is being used heavily. When an SQS event source mapping is initially created and enabled, or when messages first appear after a period with no traffic, then the Lambda service will begin polling the SQS queue using five parallel long-polling connections. The Lambda service monitors the number of inflight messages, and when it detects that this number is trending up, it will increase the polling frequency by 20 ReceiveMessage requests per minute and the function concurrency by 60 calls per minute. As long as the queue remains busy it will continue to scale until it hits the function concurrency limits. As the number of inflight messages trends down Lambda will reduce the polling frequency by 10 ReceiveMessage requests per minute and decrease the concurrency used to invoke our function by 30 calls per-minute.

The documentation is up to date with more info than what’s contained in this post. You can find an example SQS event payload there as well. You can find more details from the SQS side in their documentation as well.

As always we’re excited to hear feedback about this feature either on Twitter or in the comments below.

Randall

Solving Complex Ordering Challenges with Amazon SQS FIFO Queues

Post Syndicated from Christie Gifrin original https://aws.amazon.com/blogs/compute/solving-complex-ordering-challenges-with-amazon-sqs-fifo-queues/

Contributed by Shea Lutton, AWS Cloud Infrastructure Architect

Amazon Simple Queue Service (Amazon SQS) is a fully managed queuing service that helps decouple applications, distributed systems, and microservices to increase fault tolerance. SQS queues come in two distinct types:

  • Standard SQS queues are able to scale to enormous throughput with at-least-once delivery.
  • FIFO queues are designed to guarantee that messages are processed exactly once in the exact order that they are received and have a default rate of 300 transactions per second.

As customers explore SQS FIFO queues, they often have questions about how the behavior works when messages arrive and are consumed. This post walks through some common situations to identify the exact behavior that you can expect. It also covers the behavior of message groups in depth and explains why message groups are key to understanding how FIFO queues work.

The simple case

Suppose that you run a major auction platform where people buy and sell a wide range of products. Your platform requires that transactions from buyers and sellers get processed in exactly the order received. Here’s how a FIFO queue helps you keep all your transactions in one straight flow.

A seller currently is holding an auction for a laptop, and three different bids are received for the same price. Ties are awarded to the first bidder at that price so it is important to track which arrived first. Your auction platform receives the three bids and sends them to a FIFO queue before they are processed.

Now observe how messages leave the queue. When your consumer asks for a batch of up to 10 messages, SQS starts filling the batch with the oldest message (bid A1). It keeps filling until either the batch is full or the queue is empty. In this case, the batch contains the three messages and the queue is now empty. After a batch has left the queue, SQS considers that batch of messages to be “in-flight” until the consumer either deletes them or the batch’s visibility timer expires.

 

When you have a single consumer, this is easy to envision. The consumer gets a batch of messages (now in-flight), does its processing, and deletes the messages. That consumer is then ready to ask for the next batch of messages.

The critical thing to keep in mind is that SQS won’t release the next batch of messages until the first batch has been deleted. By adding more messages to the queue, you can see more interesting behaviors. Imagine that a burst of 11 bids is sent to your FIFO queue, with two bids for Auction A arriving last.

The FIFO queue now has at least two batches of messages in it. When your single consumer requests the first batch of 10 messages, it receives a batch starting with B1 and ending with A1. Later, after the first batch has been deleted, the consumer can get the second batch of messages containing the final A2 message from the queue.

Adding complexity with multiple message groups

A new challenge arises. Your auction platform is getting busier and your dev team added a number of new features. The combination of increased messages and extra processing time for the new features means that a single consumer is too slow. The solution is to scale to have more consumers and process messages in parallel.

To work in parallel, your team realized that only the messages related to a single auction must be kept in order. All transactions for Auction A need to be kept in order and so do all transactions for Auction B. But the two auctions are independent and it does not matter which auctions transactions are processed first.

FIFO can handle that case with a feature called message groups. Each transaction related to Auction A is placed by your producer into message group A, and so on. In the diagram below, Auction A and Auction B each received three bid transactions, with bid B1 arriving first. The FIFO queue always keeps transactions within a message group in the order in which they arrived.

How is this any different than earlier examples? The consumer now gets the messages ordered by message groups, all the B group messages followed by all the A group messages. Multiple message groups create the possibility of using multiple consumers, which I explain in a moment. If FIFO can’t fill up a batch of messages with a single message group, FIFO can place more than one message group in a batch of messages. But whenever possible, the queue gives you a full batch of messages from the same group.

The order of messages leaving a FIFO queue is governed by three rules:

  1. Return the oldest message where no other message in the same message group is currently in-flight.
  2. Return as many messages from the same message group as possible.
  3. If a message batch is still not full, go back to rule 1.

To see this behavior, add a second consumer and insert many more messages into the queue. For simplicity, the delete message action has been omitted in these diagrams but it is assumed that all messages in a batch are processed successfully by the consumer and the batch is properly deleted immediately after.

In this example, there are 11 Group A and 11 Group B transactions arriving in interleaved order and a second consumer has been added. Consumer 1 asks for a group of 10 messages and receives 10 Group A messages. Consumer 2 then asks for 10 messages but SQS knows that Group A is in flight, so it releases 10 Group B messages. The two consumers are now processing two batches of messages in parallel, speeding up throughput and then deleting their batches. When Consumer 1 requests the next batch of messages, it receives the remaining two messages, one from Group A and one from Group B.

Consider this nuanced detail from the example above. What would happen if Consumer 1 was on a faster server and processed its first batch of messages before Consumer 2 could mark its messages for deletion? See if you can predict the behavior before looking at the answer.

If Consumer 2 has not deleted its Group B messages yet when Consumer 1 asks for the next batch, then the FIFO queue considers Group B to still be in flight. It does not release any more Group B messages. Consumer 1 gets only the remaining Group A message. Later, after Consumer 2 has deleted its first batch, the remaining Group B message is released.

Conclusion

I hope this post answered your questions about how Amazon SQS FIFO queues work and why message groups are helpful. If you’re interested in exploring SQS FIFO queues further, here are a few ideas to get you started:

Message Filtering Operators for Numeric Matching, Prefix Matching, and Blacklisting in Amazon SNS

Post Syndicated from Christie Gifrin original https://aws.amazon.com/blogs/compute/message-filtering-operators-for-numeric-matching-prefix-matching-and-blacklisting-in-amazon-sns/

This blog was contributed by Otavio Ferreira, Software Development Manager for Amazon SNS

Message filtering simplifies the overall pub/sub messaging architecture by offloading message filtering logic from subscribers, as well as message routing logic from publishers. The initial launch of message filtering provided a basic operator that was based on exact string comparison. For more information, see Simplify Your Pub/Sub Messaging with Amazon SNS Message Filtering.

Today, AWS is announcing an additional set of filtering operators that bring even more power and flexibility to your pub/sub messaging use cases.

Message filtering operators

Amazon SNS now supports both numeric and string matching. Specifically, string matching operators allow for exact, prefix, and “anything-but” comparisons, while numeric matching operators allow for exact and range comparisons, as outlined below. Numeric matching operators work for values between -10e9 and +10e9 inclusive, with five digits of accuracy right of the decimal point.

  • Exact matching on string values (Whitelisting): Subscription filter policy   {"sport": ["rugby"]} matches message attribute {"sport": "rugby"} only.
  • Anything-but matching on string values (Blacklisting): Subscription filter policy {"sport": [{"anything-but": "rugby"}]} matches message attributes such as {"sport": "baseball"} and {"sport": "basketball"} and {"sport": "football"} but not {"sport": "rugby"}
  • Prefix matching on string values: Subscription filter policy {"sport": [{"prefix": "bas"}]} matches message attributes such as {"sport": "baseball"} and {"sport": "basketball"}
  • Exact matching on numeric values: Subscription filter policy {"balance": [{"numeric": ["=", 301.5]}]} matches message attributes {"balance": 301.500} and {"balance": 3.015e2}
  • Range matching on numeric values: Subscription filter policy {"balance": [{"numeric": ["<", 0]}]} matches negative numbers only, and {"balance": [{"numeric": [">", 0, "<=", 150]}]} matches any positive number up to 150.

As usual, you may apply the “AND” logic by appending multiple keys in the subscription filter policy, and the “OR” logic by appending multiple values for the same key, as follows:

  • AND logic: Subscription filter policy {"sport": ["rugby"], "language": ["English"]} matches only messages that carry both attributes {"sport": "rugby"} and {"language": "English"}
  • OR logic: Subscription filter policy {"sport": ["rugby", "football"]} matches messages that carry either the attribute {"sport": "rugby"} or {"sport": "football"}

Message filtering operators in action

Here’s how this new set of filtering operators works. The following example is based on a pharmaceutical company that develops, produces, and markets a variety of prescription drugs, with research labs located in Asia Pacific and Europe. The company built an internal procurement system to manage the purchasing of lab supplies (for example, chemicals and utensils), office supplies (for example, paper, folders, and markers) and tech supplies (for example, laptops, monitors, and printers) from global suppliers.

This distributed system is composed of the four following subsystems:

  • A requisition system that presents the catalog of products from suppliers, and takes orders from buyers
  • An approval system for orders targeted to Asia Pacific labs
  • Another approval system for orders targeted to European labs
  • A fulfillment system that integrates with shipping partners

As shown in the following diagram, the company leverages AWS messaging services to integrate these distributed systems.

  • Firstly, an SNS topic named “Orders” was created to take all orders placed by buyers on the requisition system.
  • Secondly, two Amazon SQS queues, named “Lab-Orders-AP” and “Lab-Orders-EU” (for Asia Pacific and Europe respectively), were created to backlog orders that are up for review on the approval systems.
  • Lastly, an SQS queue named “Common-Orders” was created to backlog orders that aren’t related to lab supplies, which can already be picked up by shipping partners on the fulfillment system.

The company also uses AWS Lambda functions to automatically process lab supply orders that don’t require approval or which are invalid.

In this example, because different types of orders have been published to the SNS topic, the subscribing endpoints have had to set advanced filter policies on their SNS subscriptions, to have SNS automatically filter out orders they can’t deal with.

As depicted in the above diagram, the following five filter policies have been created:

  • The SNS subscription that points to the SQS queue “Lab-Orders-AP” sets a filter policy that matches lab supply orders, with a total value greater than $1,000, and that target Asia Pacific labs only. These more expensive transactions require an approver to review orders placed by buyers.
  • The SNS subscription that points to the SQS queue “Lab-Orders-EU” sets a filter policy that matches lab supply orders, also with a total value greater than $1,000, but that target European labs instead.
  • The SNS subscription that points to the Lambda function “Lab-Preapproved” sets a filter policy that only matches lab supply orders that aren’t as expensive, up to $1,000, regardless of their target lab location. These orders simply don’t require approval and can be automatically processed.
  • The SNS subscription that points to the Lambda function “Lab-Cancelled” sets a filter policy that only matches lab supply orders with total value of $0 (zero), regardless of their target lab location. These orders carry no actual items, obviously need neither approval nor fulfillment, and as such can be automatically canceled.
  • The SNS subscription that points to the SQS queue “Common-Orders” sets a filter policy that blacklists lab supply orders. Hence, this policy matches only office and tech supply orders, which have a more streamlined fulfillment process, and require no approval, regardless of price or target location.

After the company finished building this advanced pub/sub architecture, they were then able to launch their internal procurement system and allow buyers to begin placing orders. The diagram above shows six example orders published to the SNS topic. Each order contains message attributes that describe the order, and cause them to be filtered in a different manner, as follows:

  • Message #1 is a lab supply order, with a total value of $15,700 and targeting a research lab in Singapore. Because the value is greater than $1,000, and the location “Asia-Pacific-Southeast” matches the prefix “Asia-Pacific-“, this message matches the first SNS subscription and is delivered to SQS queue “Lab-Orders-AP”.
  • Message #2 is a lab supply order, with a total value of $1,833 and targeting a research lab in Ireland. Because the value is greater than $1,000, and the location “Europe-West” matches the prefix “Europe-“, this message matches the second SNS subscription and is delivered to SQS queue “Lab-Orders-EU”.
  • Message #3 is a lab supply order, with a total value of $415. Because the value is greater than $0 and less than $1,000, this message matches the third SNS subscription and is delivered to Lambda function “Lab-Preapproved”.
  • Message #4 is a lab supply order, but with a total value of $0. Therefore, it only matches the fourth SNS subscription, and is delivered to Lambda function “Lab-Cancelled”.
  • Messages #5 and #6 aren’t lab supply orders actually; one is an office supply order, and the other is a tech supply order. Therefore, they only match the fifth SNS subscription, and are both delivered to SQS queue “Common-Orders”.

Although each message only matched a single subscription, each was tested against the filter policy of every subscription in the topic. Hence, depending on which attributes are set on the incoming message, the message might actually match multiple subscriptions, and multiple deliveries will take place. Also, it is important to bear in mind that subscriptions with no filter policies catch every single message published to the topic, as a blank filter policy equates to a catch-all behavior.

Summary

Amazon SNS allows for both string and numeric filtering operators. As explained in this post, string operators allow for exact, prefix, and “anything-but” comparisons, while numeric operators allow for exact and range comparisons. These advanced filtering operators bring even more power and flexibility to your pub/sub messaging functionality and also allow you to simplify your architecture further by removing even more logic from your subscribers.

Message filtering can be implemented easily with existing AWS SDKs by applying message and subscription attributes across all SNS supported protocols (Amazon SQS, AWS Lambda, HTTP, SMS, email, and mobile push). SNS filtering operators for numeric matching, prefix matching, and blacklisting are available now in all AWS Regions, for no extra charge.

To experiment with these new filtering operators yourself, and continue learning, try the 10-minute Tutorial Filter Messages Published to Topics. For more information, see Filtering Messages with Amazon SNS in the SNS documentation.

Now Open AWS EU (Paris) Region

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/now-open-aws-eu-paris-region/

Today we are launching our 18th AWS Region, our fourth in Europe. Located in the Paris area, AWS customers can use this Region to better serve customers in and around France.

The Details
The new EU (Paris) Region provides a broad suite of AWS services including Amazon API Gateway, Amazon Aurora, Amazon CloudFront, Amazon CloudWatch, CloudWatch Events, Amazon CloudWatch Logs, Amazon DynamoDB, Amazon Elastic Compute Cloud (EC2), EC2 Container Registry, Amazon ECS, Amazon Elastic Block Store (EBS), Amazon EMR, Amazon ElastiCache, Amazon Elasticsearch Service, Amazon Glacier, Amazon Kinesis Streams, Polly, Amazon Redshift, Amazon Relational Database Service (RDS), Amazon Route 53, Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), Amazon Simple Storage Service (S3), Amazon Simple Workflow Service (SWF), Amazon Virtual Private Cloud, Auto Scaling, AWS Certificate Manager (ACM), AWS CloudFormation, AWS CloudTrail, AWS CodeDeploy, AWS Config, AWS Database Migration Service, AWS Direct Connect, AWS Elastic Beanstalk, AWS Identity and Access Management (IAM), AWS Key Management Service (KMS), AWS Lambda, AWS Marketplace, AWS OpsWorks Stacks, AWS Personal Health Dashboard, AWS Server Migration Service, AWS Service Catalog, AWS Shield Standard, AWS Snowball, AWS Snowball Edge, AWS Snowmobile, AWS Storage Gateway, AWS Support (including AWS Trusted Advisor), Elastic Load Balancing, and VM Import.

The Paris Region supports all sizes of C5, M5, R4, T2, D2, I3, and X1 instances.

There are also four edge locations for Amazon Route 53 and Amazon CloudFront: three in Paris and one in Marseille, all with AWS WAF and AWS Shield. Check out the AWS Global Infrastructure page to learn more about current and future AWS Regions.

The Paris Region will benefit from three AWS Direct Connect locations. Telehouse Voltaire is available today. AWS Direct Connect will also become available at Equinix Paris in early 2018, followed by Interxion Paris.

All AWS infrastructure regions around the world are designed, built, and regularly audited to meet the most rigorous compliance standards and to provide high levels of security for all AWS customers. These include ISO 27001, ISO 27017, ISO 27018, SOC 1 (Formerly SAS 70), SOC 2 and SOC 3 Security & Availability, PCI DSS Level 1, and many more. This means customers benefit from all the best practices of AWS policies, architecture, and operational processes built to satisfy the needs of even the most security sensitive customers.

AWS is certified under the EU-US Privacy Shield, and the AWS Data Processing Addendum (DPA) is GDPR-ready and available now to all AWS customers to help them prepare for May 25, 2018 when the GDPR becomes enforceable. The current AWS DPA, as well as the AWS GDPR DPA, allows customers to transfer personal data to countries outside the European Economic Area (EEA) in compliance with European Union (EU) data protection laws. AWS also adheres to the Cloud Infrastructure Service Providers in Europe (CISPE) Code of Conduct. The CISPE Code of Conduct helps customers ensure that AWS is using appropriate data protection standards to protect their data, consistent with the GDPR. In addition, AWS offers a wide range of services and features to help customers meet the requirements of the GDPR, including services for access controls, monitoring, logging, and encryption.

From Our Customers
Many AWS customers are preparing to use this new Region. Here’s a small sample:

Societe Generale, one of the largest banks in France and the world, has accelerated their digital transformation while working with AWS. They developed SG Research, an application that makes reports from Societe Generale’s analysts available to corporate customers in order to improve the decision-making process for investments. The new AWS Region will reduce latency between applications running in the cloud and in their French data centers.

SNCF is the national railway company of France. Their mobile app, powered by AWS, delivers real-time traffic information to 14 million riders. Extreme weather, traffic events, holidays, and engineering works can cause usage to peak at hundreds of thousands of users per second. They are planning to use machine learning and big data to add predictive features to the app.

Radio France, the French public radio broadcaster, offers seven national networks, and uses AWS to accelerate its innovation and stay competitive.

Les Restos du Coeur, a French charity that provides assistance to the needy, delivering food packages and participating in their social and economic integration back into French society. Les Restos du Coeur is using AWS for its CRM system to track the assistance given to each of their beneficiaries and the impact this is having on their lives.

AlloResto by JustEat (a leader in the French FoodTech industry), is using AWS to to scale during traffic peaks and to accelerate their innovation process.

AWS Consulting and Technology Partners
We are already working with a wide variety of consulting, technology, managed service, and Direct Connect partners in France. Here’s a partial list:

AWS Premier Consulting PartnersAccenture, Capgemini, Claranet, CloudReach, DXC, and Edifixio.

AWS Consulting PartnersABC Systemes, Atos International SAS, CoreExpert, Cycloid, Devoteam, LINKBYNET, Oxalide, Ozones, Scaleo Information Systems, and Sopra Steria.

AWS Technology PartnersAxway, Commerce Guys, MicroStrategy, Sage, Software AG, Splunk, Tibco, and Zerolight.

AWS in France
We have been investing in Europe, with a focus on France, for the last 11 years. We have also been developing documentation and training programs to help our customers to improve their skills and to accelerate their journey to the AWS Cloud.

As part of our commitment to AWS customers in France, we plan to train more than 25,000 people in the coming years, helping them develop highly sought after cloud skills. They will have access to AWS training resources in France via AWS Academy, AWSome days, AWS Educate, and webinars, all delivered in French by AWS Technical Trainers and AWS Certified Trainers.

Use it Today
The EU (Paris) Region is open for business now and you can start using it today!

Jeff;

 

Now Open – AWS China (Ningxia) Region

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/now-open-aws-china-ningxia-region/

Today we launched our 17th Region globally, and the second in China. The AWS China (Ningxia) Region, operated by Ningxia Western Cloud Data Technology Co. Ltd. (NWCD), is generally available now and provides customers another option to run applications and store data on AWS in China.

The Details
At launch, the new China (Ningxia) Region, operated by NWCD, supports Auto Scaling, AWS Config, AWS CloudFormation, AWS CloudTrail, Amazon CloudWatch, CloudWatch Events, Amazon CloudWatch Logs, AWS CodeDeploy, AWS Direct Connect, Amazon DynamoDB, Amazon Elastic Compute Cloud (EC2), Amazon Elastic Block Store (EBS), Amazon EC2 Systems Manager, AWS Elastic Beanstalk, Amazon ElastiCache, Amazon Elasticsearch Service, Elastic Load Balancing, Amazon EMR, Amazon Glacier, AWS Identity and Access Management (IAM), Amazon Kinesis Streams, Amazon Redshift, Amazon Relational Database Service (RDS), Amazon Simple Storage Service (S3), Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), AWS Support API, AWS Trusted Advisor, Amazon Simple Workflow Service (SWF), Amazon Virtual Private Cloud, and VM Import. Visit the AWS China Products page for additional information on these services.

The Region supports all sizes of C4, D2, M4, T2, R4, I3, and X1 instances.

Check out the AWS Global Infrastructure page to learn more about current and future AWS Regions.

Operating Partner
To comply with China’s legal and regulatory requirements, AWS has formed a strategic technology collaboration with NWCD to operate and provide services from the AWS China (Ningxia) Region. Founded in 2015, NWCD is a licensed datacenter and cloud services provider, based in Ningxia, China. NWCD joins Sinnet, the operator of the AWS China China (Beijing) Region, as an AWS operating partner in China. Through these relationships, AWS provides its industry-leading technology, guidance, and expertise to NWCD and Sinnet, while NWCD and Sinnet operate and provide AWS cloud services to local customers. While the cloud services offered in both AWS China Regions are the same as those available in other AWS Regions, the AWS China Regions are different in that they are isolated from all other AWS Regions and operated by AWS’s Chinese partners separately from all other AWS Regions. Customers using the AWS China Regions enter into customer agreements with Sinnet and NWCD, rather than with AWS.

Use it Today
The AWS China (Ningxia) Region, operated by NWCD, is open for business, and you can start using it now! Starting today, Chinese developers, startups, and enterprises, as well as government, education, and non-profit organizations, can leverage AWS to run their applications and store their data in the new AWS China (Ningxia) Region, operated by NWCD. Customers already using the AWS China (Beijing) Region, operated by Sinnet, can select the AWS China (Ningxia) Region directly from the AWS Management Console, while new customers can request an account at www.amazonaws.cn to begin using both AWS China Regions.

Jeff;

 

 

Amazon MQ – Managed Message Broker Service for ActiveMQ

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-mq-managed-message-broker-service-for-activemq/

Messaging holds the parts of a distributed application together, while also adding resiliency and enabling the implementation of highly scalable architectures. For example, earlier this year, Amazon Simple Queue Service (SQS) and Amazon Simple Notification Service (SNS) supported the processing of customer orders on Prime Day, collectively processing 40 billion messages at a rate of 10 million per second, with no customer-visible issues.

SQS and SNS have been used extensively for applications that were born in the cloud. However, many of our larger customers are already making use of open-sourced or commercially-licensed message brokers. Their applications are mission-critical, and so is the messaging that powers them. Our customers describe the setup and on-going maintenance of their messaging infrastructure as “painful” and report that they spend at least 10 staff-hours per week on this chore.

New Amazon MQ
Today we are launching Amazon MQ – a managed message broker service for Apache ActiveMQ that lets you get started in minutes with just three clicks! As you may know, ActiveMQ is a popular open-source message broker that is fast & feature-rich. It offers queues and topics, durable and non-durable subscriptions, push-based and poll-based messaging, and filtering.

As a managed service, Amazon MQ takes care of the administration and maintenance of ActiveMQ. This includes responsibility for broker provisioning, patching, failure detection & recovery for high availability, and message durability. With Amazon MQ, you get direct access to the ActiveMQ console and industry standard APIs and protocols for messaging, including JMS, NMS, AMQP, STOMP, MQTT, and WebSocket. This allows you to move from any message broker that uses these standards to Amazon MQ–along with the supported applications–without rewriting code.

You can create a single-instance Amazon MQ broker for development and testing, or an active/standby pair that spans AZs, with quick, automatic failover. Either way, you get data replication across AZs and a pay-as-you-go model for the broker instance and message storage.

Amazon MQ is a full-fledged part of the AWS family, including the use of AWS Identity and Access Management (IAM) for authentication and authorization to use the service API. You can use Amazon CloudWatch metrics to keep a watchful eye metrics such as queue depth and initiate Auto Scaling of your consumer fleet as needed.

Launching an Amazon MQ Broker
To get started, I open up the Amazon MQ Console, select the desired AWS Region, enter a name for my broker, and click on Next step:

Then I choose the instance type, indicate that I want to create a standby , and click on Create broker (I can select a VPC and fine-tune other settings in the Advanced settings section):

My broker will be created and ready to use in 5-10 minutes:

The URLs and endpoints that I use to access my broker are all available at a click:

I can access the ActiveMQ Web Console at the link provided:

The broker publishes instance, topic, and queue metrics to CloudWatch. Here are the instance metrics:

Available Now
Amazon MQ is available now and you can start using it today in the US East (Northern Virginia), US East (Ohio), US West (Oregon), EU (Ireland), EU (Frankfurt), and Asia Pacific (Sydney) Regions.

The AWS Free Tier lets you use a single-AZ micro instance for up to 750 hours and to store up to 1 gigabyte each month, for one year. After that, billing is based on instance-hours and message storage, plus charges Internet data transfer if the broker is accessed from outside of AWS.

Jeff;