Tag Archives: Amazon Simple Notification Service (SNS)

Building Multi-partner integration on AWS using Event-Driven Architecture

Post Syndicated from Vivek Kant original https://aws.amazon.com/blogs/architecture/building-multi-partner-integration-on-aws-using-event-driven-architecture/

Summary

Finserv MARKETS enables customers to buy financial services products such as credit cards, loans, insurance, and investments from various partners. Finserv integrates with a large number of partners in real time to provide services to customers.

Each partner has their own semantic APIs which can pose a challenge. There are also issues of latency and failures. We’ll show how our solution based on Event Driven Architecture (EDA) on AWS with Reactive design offers a solution to address these challenges.

Challenges with multi-partner integration

  • Latency – Making API calls to multiple partners increases latency of a service even in the scenario in which the calls are made in parallel.
  • Timeouts – If a partner’s APIs time out, this could impact overall service performance and availability.
  • Failures – A partner API failure could lead to the failure or major performance degradation of the overall service.
  • Customer Experience – Preserving the customer experience when depending on partner integration is a challenge, considering these technical issues.

Conceptual solution

In order to build this multi-partner platform, we discussed using traditional command-driven synchronous architecture. We also considered adopting an Event-Driven Architecture (EDA). Building our solution on EDA enabled us to deliver consistent customer experience while addressing failures and performance issues from partner APIs.

The diagram following shows a conceptual view of the solution:

Event Driven Conceptual view

Figure 1 – Conceptual View of Our Solution

The key components of the solution are:

1. User Interface: It is the single page application running on the user’s browser. For example, the UI makes an API call to business services to calculate insurance premiums with required parameters and receives a unique identifier. This enables the UI to be reactive and display the responses from the partner as they arrive.

2. Business Service: A microservice which provides APIs for:

• The user interface to submit and generate events for request for partner offer

• A callback to submit the response from the partner integration service in a reactive manner

3. Event Bus: The event bus infrastructure enables transportation, routing, and delivery of events to the right services. The business service raises a set of events to the event bus for a quote request. There is one event for each partner that is listened to by the reactive services.

4. Reactive Services: Services that consume events and call partner integration services for the calling partner API. On receiving the partner response, it calls the callback API on the business service. These services are organized by product domain (for example Motor Insurance).

5. Partner Integration Service: This service is responsible for integration with partners. The service translates the canonical request to partner-specific API calls. The service implements partner-specific security and error handling. There is one partner integration service per partner.

Realizing this solution on AWS

Amazon Web Services (AWS) offers cloud-native services enabling us to realize this solution.

Event Driven Architecture

Figure 2 – Realizing our solution on AWS

We used the following services to implement this architecture:

User Interface
We built the load balancing interface in Angular and host it on Apache Web Servers running Amazon Elastic Container Service (Amazon ECS). Elastic Load Balancing (ELB) distributes traffic evenly across the containers. Running a container gives us the required flexibility and scalability needed.

Business Service
The business service is a microservice built using Spring Boot and running on Amazon ECS containers. It uses an ELB used for service load balancing. The choice of Spring Boot and Amazon ECS gives flexibility and scalability through cluster auto scaling.

Data Store
We use polyglot architecture for data storage. For different product journeys and depending on data lifecycle, we choose either Amazon Aurora Postgres or Amazon ElastiCache for Redis. This gives us the right mix of performance and required durability for each business use case.

Event Bus
We evaluated Amazon Kinesis and Amazon Simple Notification Service (Amazon SNS) and came to the conclusion that for our volume and use cases, Amazon SNS offers the right capability. We implemented this by defining topics, for example, a four-wheeler insurance quote. This topic is subscribed by reactive services for each partner integration, where the partner service is called and a quote is generated. For downstream functionality where the API is to be called of the selected partner (for example, insurance policy issuance or credit card application submission), we chose Amazon Simple Queue Service (SQS). Amazon SQS provides simple queuing for asynchronous processing.

Reactive Services
These services are built using two different technologies. Spring Boot microservice running in an Amazon ECS container and AWS Lambda functions. Spring Boot is chosen due to our team’s familiarity of this technology; however, our plan is to move completely towards usage of Lambda functions for all asynchronous reactive services.

Partner Integration Service
Partner integration services provides the abstraction layer between partner API and canonical API, and is called by reactive services synchronously. In some cases, the error is passed back to reactive service to decide on retry. In other cases, for example, a policy issuance API retry is built into this service using exponential backoff strategy.

Partner API tracking
Partner API tracking gives us the right way of tracking the partner request and proactively address failures. We use Amazon Elasticsearch Service and Kibana for tracking. We can implement a circuit breaker pattern to shut down any partner for a period of time should failures reach a given threshold.

How this solution addresses multi-partner integration

This solution is built on the foundation of Event Driven Architecture with reactive design and is able to address the following challenges:

Latency
Since the user interface asynchronously polls for the response, it receives the partner response as soon as it arrives. This makes the response latency on the fastest partner API rather than slowest.

Timeout
Timeouts are set in the UI for polling, so if a partner API doesn’t respond, we time it out without any degradation. These timeouts are set based on the established user experience benchmarks. For example, how long do we let a user wait to see all insurance quotes before degrading their experience?

Failures
In case of an API failure, the UI will time out. Our API tracking can also enable a circuit breaker pattern to take a partner offline in the event of persistent failures.

Customer Experience
The solution gives a consistent experience to the user. Users can make their choice from the partner quotes/offers in a reactive way as received, rather than waiting for all offers to be shown. This design meets our customer requirements, derived from our Voice of Customer (VoC) study sessions.

Conclusion

In the digital space, APIs are the most common mechanism for system integration. Building a solution that is scalable, resilient, and provides the best customer experience is challenging. Event Driven Architecture with Reactive design offers a solution to address these issues. We process over 5000+ requests every day in Insurance and Credit domains. We’ve been able to achieve required availability of over 99.9%, while maintaining a positive customer experience on this platform.

Introducing message archiving and analytics for Amazon SNS

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/introducing-message-archiving-and-analytics-for-amazon-sns/

This blog post is courtesy of Sebastian Caceres (AWS Consultant, DevOps), Otavio Ferreira (Sr. Manager, Amazon SNS), Prachi Sharma and Mary Gao (Software Engineers, Amazon SNS).

Today, we are announcing the release of a message delivery protocol for Amazon SNS based on Amazon Kinesis Data Firehose. This is a new way to integrate SNS with storage and analytics services, without writing custom code.

SNS provides topics for push-based, many-to-many pub/sub messaging to help you decouple distributed systems, microservices, and event-driven serverless applications. As applications grow, so does the need to archive messages to meet compliance goals. These archives can also provide important operational and business insights.

Previously, custom code was required to create data pipelines, using general-purpose SNS subscription endpoints, such as Amazon SQS queues or AWS Lambda functions. You had to manage data transformation, data buffering, data compression, and the upload to data stores.

Overview

With the new native integration between SNS and Kinesis Data Firehose, you can send messages to storage and analytics services, using a purpose-built SNS subscription type.

Once you configure a subscription, messages published to the SNS topic are sent to the subscribed Kinesis Data Firehose delivery stream. The messages are then delivered to the destination endpoint configured in the delivery stream, which can be an Amazon S3 bucket, an Amazon Redshift table, or an Amazon Elasticsearch Service index.

You can also use a third-party service provider as the destination of a delivery stream, including Datadog, New Relic, MongoDB, and Splunk. No custom code is required to bridge the services. For more information, see Fanout to Kinesis Data Firehose streams, in the SNS Developer Guide.

Amazon SNS subscriber types with Amazon Kinesis Data Firehose.

The new Kinesis Data Firehose subscription type and its destinations are part of the application-to-application (A2A) messaging offering of SNS. The addition of this subscription type expands the SNS A2A offering to include the following use cases:

  • Run analytics on SNS messages, using Amazon Kinesis Data Analytics, Amazon Elasticsearch Service, or Amazon Redshift as a delivery stream destination. You can use this option to gain insights and detect anomalies in workloads.
  • Index and search SNS messages, using Amazon Elasticsearch Service as a delivery stream destination. From there, you can create dashboards using Kibana, a data visualization and exploration tool.
  • Store SNS messages for backup and auditing purposes, using S3 as a destination of choice. You can then use Amazon Athena to query the S3 bucket for analytics purposes.
  • Apply transformation to SNS messages. For example, you may obfuscate personally identifiable information (PII) or protected health information (PHI) using a Lambda function invoked by the delivery stream.
  • Feed SNS messages into cloud-based application monitoring and observability tools, using Datadog, New Relic, or Splunk as a destination. You can choose this option to enrich DevOps or marketing workflows.

As with all supported message delivery protocols, you can filter, monitor, and encrypt messages.

To simplify architecture and further avoid custom code, you can use an SNS subscription filter policy. This enables you to route only the relevant subset of SNS messages to the Kinesis Data Firehose delivery stream. For more information, see SNS message filtering.

To monitor the throughput, you can check the NumberOfMessagesPublished and the NumberOfNotificationsDelivered metrics for SNS, and the IncomingBytes, IncomingRecords, DeliveryToS3.Records and DeliveryToS3.Success metrics for Kinesis Data Firehose. For additional information, see Monitoring SNS topics using CloudWatch and Monitoring Kinesis Data Firehose using CloudWatch.

For security purposes, you can choose to have data encrypted at rest, using server-side encryption (SSE), in addition to encrypted in transit using HTTPS. For more information, see SNS SSE, Kinesis Data Firehose SSE, and S3 SSE.

Applying SNS message archiving and analytics in a use case

For example, consider an airline ticketing platform that operates in a regulated environment. The compliance framework requires that the company archives all ticket sales for at least 5 years.

Example architecture of a flight ticket selling platform.

The platform is based on an event-driven serverless architecture. It has a ticket seller Lambda function that publishes an event to an SNS topic for every ticket sold. The SNS topic fans out the event to subscribed systems that are interested in processing this type of event. In the preceding diagram, two systems are interested: one focused on payment processing, and another on fraud control. Each subscribed system is invoked by an SQS queue and an event processing Lambda function.

To meet the compliance goal on data retention, the airline company subscribes a Kinesis Data Firehose delivery stream to their existing SNS topic. They use an S3 bucket as the stream destination. After this, all events published to the SNS topic are archived in the S3 bucket.

The company can then use Athena to query the S3 bucket with standard SQL to run analytics and gain insights on ticket sales. For example, they can query for the most popular flight destinations or the most frequent flyers.

Subscribing a Kinesis Data Firehose stream to an SNS topic

You can set up a Kinesis Data Firehose subscription to an SNS topic using the AWS Management Console, the AWS CLI, or the AWS SDKs. You can also use AWS CloudFormation to automate the provisioning of these resources.

We use CloudFormation for this example. The provided CloudFormation template creates the following resources:

  • An SNS topic
  • An S3 bucket
  • A Kinesis Data Firehose delivery stream
  • A Kinesis Data Firehose subscription in SNS
  • Two SQS subscriptions in SNS
  • Two IAM roles with access to deliver messages:
    • From SNS to Kinesis Data Firehose
    • From Kinesis Data Firehose to S3

To provision the infrastructure, use the following template:

---
AWSTemplateFormatVersion: '2010-09-09'
Description: Template for creating an SNS archiving use case
Resources:
  ticketUploadStream:
    DependsOn:
    - ticketUploadStreamRolePolicy
    Type: AWS::KinesisFirehose::DeliveryStream
    Properties:
      S3DestinationConfiguration:
        BucketARN: !Sub 'arn:${AWS::Partition}:s3:::${ticketArchiveBucket}'
        BufferingHints:
          IntervalInSeconds: 60
          SizeInMBs: 1
        CompressionFormat: UNCOMPRESSED
        RoleARN: !GetAtt ticketUploadStreamRole.Arn
  ticketArchiveBucket:
    Type: AWS::S3::Bucket
  ticketTopic:
    Type: AWS::SNS::Topic
  ticketPaymentQueue:
    Type: AWS::SQS::Queue
  ticketFraudQueue:
    Type: AWS::SQS::Queue
  ticketQueuePolicy:
    Type: AWS::SQS::QueuePolicy
    Properties:
      PolicyDocument:
        Statement:
          Effect: Allow
          Principal:
            Service: sns.amazonaws.com
          Action:
            - sqs:SendMessage
          Resource: '*'
          Condition:
            ArnEquals:
              aws:SourceArn: !Ref ticketTopic
      Queues:
        - !Ref ticketPaymentQueue
        - !Ref ticketFraudQueue
  ticketUploadStreamSubscription:
    Type: AWS::SNS::Subscription
    Properties:
      TopicArn: !Ref ticketTopic
      Endpoint: !GetAtt ticketUploadStream.Arn
      Protocol: firehose
      SubscriptionRoleArn: !GetAtt ticketUploadStreamSubscriptionRole.Arn
  ticketPaymentQueueSubscription:
    Type: AWS::SNS::Subscription
    Properties:
      TopicArn: !Ref ticketTopic
      Endpoint: !GetAtt ticketPaymentQueue.Arn
      Protocol: sqs
  ticketFraudQueueSubscription:
    Type: AWS::SNS::Subscription
    Properties:
      TopicArn: !Ref ticketTopic
      Endpoint: !GetAtt ticketFraudQueue.Arn
      Protocol: sqs
  ticketUploadStreamRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Sid: ''
          Effect: Allow
          Principal:
            Service: firehose.amazonaws.com
          Action: sts:AssumeRole
  ticketUploadStreamRolePolicy:
    Type: AWS::IAM::Policy
    Properties:
      PolicyName: FirehoseticketUploadStreamRolePolicy
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Action:
          - s3:AbortMultipartUpload
          - s3:GetBucketLocation
          - s3:GetObject
          - s3:ListBucket
          - s3:ListBucketMultipartUploads
          - s3:PutObject
          Resource:
          - !Sub 'arn:aws:s3:::${ticketArchiveBucket}'
          - !Sub 'arn:aws:s3:::${ticketArchiveBucket}/*'
      Roles:
      - !Ref ticketUploadStreamRole
  ticketUploadStreamSubscriptionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            Service:
            - sns.amazonaws.com
          Action:
          - sts:AssumeRole
      Policies:
      - PolicyName: SNSKinesisFirehoseAccessPolicy
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
          - Action:
            - firehose:DescribeDeliveryStream
            - firehose:ListDeliveryStreams
            - firehose:ListTagsForDeliveryStream
            - firehose:PutRecord
            - firehose:PutRecordBatch
            Effect: Allow
            Resource:
            - !GetAtt ticketUploadStream.Arn

To test, publish a message to the SNS topic. After the delivery stream buffer interval of 60 seconds, the message appears in the destination S3 bucket. For information on message formats, see Amazon SNS message formats in Amazon Kinesis Data Firehose destinations.

Cleaning up

After testing, avoid incurring usage charges by deleting the resources you created during the walkthrough. If you used the CloudFormation template, delete all the objects from the S3 bucket before deleting the stack.

Conclusion

In this post, we show how SNS delivery to Kinesis Data Firehose enables you to integrate SNS with storage and analytics services. The example shows how to create an SNS subscription to use a Kinesis Data Firehose delivery stream to store SNS messages in an S3 bucket.

You can adapt this configuration for your needs for storage, encryption, data transformation, and data pipeline architecture. For more information, see Fanout to Kinesis Data Firehose streams in the SNS Developer Guide.

For details on pricing, see SNS pricing and Kinesis Data Firehose pricing. For more serverless learning resources, visit Serverless Land.

Discovering sensitive data in AWS CodeCommit with AWS Lambda

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/discovering-sensitive-data-in-aws-codecommit-with-aws-lambda-2/

This post is courtesy of Markus Ziller, Solutions Architect.

Today, git is a de facto standard for version control in modern software engineering. The workflows enabled by git’s branching capabilities are a major reason for this. However, with git’s distributed nature, it can be difficult to reliably remove changes that have been committed from all copies of the repository. This is problematic when secrets such as API keys have been accidentally committed into version control. The longer it takes to identify and remove secrets from git, the more likely that the secret has been checked out by another user.

This post shows a solution that automatically identifies credentials pushed to AWS CodeCommit in near-real-time. I also show three remediation measures that you can use to reduce the impact of secrets pushed into CodeCommit:

  • Notify users about the leaked credentials.
  • Lock the repository for non-admins.
  • Hard reset the CodeCommit repository to a healthy state.

I use the AWS Cloud Development Kit (CDK). This is an open source software development framework to model and provision cloud application resources. Using the CDK can reduce the complexity and amount of code needed to automate the deployment of resources.

Overview of solution

The services in this solution are AWS Lambda, AWS CodeCommit, Amazon EventBridge, and Amazon SNS. These services are part of the AWS serverless platform. They help reduce undifferentiated work around managing servers, infrastructure, and the parts of the application that add less value to your customers. With serverless, the solution scales automatically, has built-in high availability, and you only pay for the resources you use.

Solution architecture

This diagram outlines the workflow implemented in this blog:

  1. After a developer pushes changes to CodeCommit, it emits an event to an event bus.
  2. A rule defined on the event bus routes this event to a Lambda function.
  3. The Lambda function uses the AWS SDK for JavaScript to get the changes introduced by commits pushed to the repository.
  4. It analyzes the changes for secrets. If secrets are found, it publishes another event to the event bus.
  5. Rules associated with this event type then trigger invocations of three Lambda functions A, B, and C with information about the problematic changes.
  6. Each of the Lambda functions runs a remediation measure:
    • Function A sends out a notification to an SNS topic that informs users about the situation (A1).
    • Function B locks the repository by setting a tag with the AWS SDK (B2). It sends out a notification about this action (B2).
    • Function C runs git commands that remove the problematic commit from the CodeCommit repository (C2). It also sends out a notification (C1).

Walkthrough

The following walkthrough explains the required components, their interactions and how the provisioning can be automated via CDK.

For this walkthrough, you need:

Checkout and deploy the sample stack:

  1. After completing the prerequisites, clone the associated GitHub repository by running the following command in a local directory:
    git clone [email protected]:aws-samples/discover-sensitive-data-in-aws-codecommit-with-aws-lambda.git
  2. Open the repository in a local editor and review the contents of cdk/lib/resources.ts, src/handlers/commits.ts, and src/handlers/remediations.ts.
  3. Follow the instructions in the README.md to deploy the stack.

The CDK will deploy resources for the following services in your account.

Using CodeCommit to manage your git repositories

The CDK creates a new empty repository called TestRepository and adds a tag RepoState with an initial value of ok. You later use this tag in the LockRepo remediation strategy to restrict access.

It also creates two IAM groups with one user in each. Members of the CodeCommitSuperUsers group are always able to access the repository, while members of the CodeCommitUsers group can only access the repository when the value of the tag RepoState is not locked.

I also import the CodeCommitSystemUser into the CDK. Since the user requires git credentials in a downloaded CSV file, it cannot be created by the CDK. Instead it must be created as described in the README file.

The following CDK code sets up all the described resources:

const TAG_NAME = "RepoState";

const superUsers = new Group(this, "CodeCommitSuperUsers", { groupName: "CodeCommitSuperUsers" });
superUsers.addUser(new User(this, "CodeCommitSuperUserA", {
    password: new Secret(this, "CodeCommitSuperUserPassword").secretValue,
    userName: "CodeCommitSuperUserA"
}));

const users = new Group(this, "CodeCommitUsers", { groupName: "CodeCommitUsers" });
users.addUser(new User(this, "User", {
    password: new Secret(this, "CodeCommitUserPassword").secretValue,
    userName: "CodeCommitUserA"
}));

const systemUser = User.fromUserName(this, "CodeCommitSystemUser", props.codeCommitSystemUserName);

const repo = new Repository(this, "Repository", {
    repositoryName: "TestRepository",
    description: "The repository to test this project out",
});
Tags.of(repo).add(TAG_NAME, "ok");

users.addToPolicy(new PolicyStatement({
    effect: Effect.ALLOW,
    actions: ["*"],
    resources: [repo.repositoryArn],
    conditions: {
        StringNotEquals: {
            [`aws:ResourceTag/${TAG_NAME}`]: "locked"
        }
    }
}));

superUsers.addToPolicy(new PolicyStatement({
    effect: Effect.ALLOW,
    actions: ["*"],
    resources: [repo.repositoryArn]
}));

Using EventBridge to pass events between components

I use EventBridge, a serverless event bus, to connect the Lambda functions together. Many AWS services like CodeCommit are natively integrated into EventBridge and publish events about changes in their environment.

repo.onCommit is a higher-level CDK construct. It creates the required resources to invoke a Lambda function for every commit to a given repository. The created events rule looks like this:

EventBridge rule definition

Note that this event rule only matches commit events in TestRepository. To send commits of all repositories in that account to the inspecting Lambda function, remove the resources filter in the event pattern.

CodeCommit Repository State Change is a default event that is published by CodeCommit if changes are made to a repository. In addition, I define CodeCommit Security Event, a custom event, which Lambda publishes to the same event bus if secrets are discovered in the inspected code.

The sample below shows how you can set up Lambda functions as targets for both type of events.

const DETAIL_TYPE = "CodeCommit Security Event";
const eventBus = new EventBus(this, "CodeCommitEventBus", {
    eventBusName: "CodeCommitSecurityEvents"
});

repo.onCommit("AnyCommitEvent", {
    ruleName: "CallLambdaOnAnyCodeCommitEvent",
    target: new targets.LambdaFunction(commitInspectLambda)
});


new Rule(this, "CodeCommitSecurityEvent", {
    eventBus,
    enabled: true,
    ruleName: "CodeCommitSecurityEventRule",
    eventPattern: {
        detailType: [DETAIL_TYPE]
    },
    targets: [
        new targets.LambdaFunction(lockRepositoryLambda),
        new targets.LambdaFunction(raiseAlertLambda),
        new targets.LambdaFunction(forcefulRevertLambda)
    ]
});

Using Lambda functions to run remediation measures

AWS Lambda functions allow you to run code in response to events. The example defines four Lambda functions.

By comparing the delta to its predecessor, the commitInspectLambda function analyzes if secrets are introduced by a commit. With the CDK, you can create a Lambda function with:

const myLambdaInCDK = new Function(this, "UniqueIdentifierRequiredByCDK", {
    runtime: Runtime.NODEJS_12_X,
    handler: "<handlerfile>.<function name>",
    code: Code.fromAsset(path.join(__dirname, "..", "..", "src", "handlers")),
    // See git repository for complete code
});

The code for this Lambda function uses the AWS SDK for JavaScript to fetch the details of the commit, the differences introduced, and the new content.

The code checks each modified file line by line with a regular expression that matches typical secret formats. In src/handlers/regex.json, I provide a few regular expressions that match common secrets. You can extend this with your own patterns.

If a secret is discovered, a CodeCommit Security Event is published to the event bus. EventBridge then invokes all Lambda functions that are registered as targets with this event. This demo triggers three remediation measures.

The raiseAlertLambda function uses the AWS SDK for JavaScript to send out a notification to all subscribers (that is, CodeCommit administrators) on an SNS topic. It takes no further action.

SNS.publish({
    TopicArn: <TOPIC_ARN>,
    Subject: `[ACTION REQUIRED] Secrets discovered in <repo>`
    Message: `<Your message>
}

Notification about secrets discovered in a commit in TestRepository

The lockRepositoryLambda function uses the AWS SDK for JavaScript to change the RepoState tag from ok to locked. This restricts access to members of the CodeCommitSuperUsers IAM group.

CodeCommit.tagResource({
    resourceArn: event.detail.repositoryArn,
    tags: {
        RepoState: "locked"
    }
})

In addition, the Lambda function uses SNS to send out a notification. The forcefulRevertLambda function runs the following git commands:

git clone <repository>
git checkout <branch>
git reset –hard <previousCommitId>
git push origin <branch> --force

These commands reset the repository to the last accepted commit, by forcefully removing the respective commit from the git history of your CodeCommit repo. I advise you to handle this with care and only activate it on a real project if you fully understand the consequences of rewriting git history.

The Node.js v12 runtime for Lambda does not have a git runtime installed by default. You can add one by using the git-lambda2 Lambda layer. This allows you to run git commands from within the Lambda function.

Logs for the remediation measure Hard Reset

Finally, this Lambda function also sends out a notification. The complete code is available in the GitHub repo.

Using SNS to notify users

To notify users about secrets discovered and actions taken, you create an SNS topic and subscribe to it via email.

const topic = new Topic(this, "CodeCommitSecurityEventNotification", {
    displayName: "CodeCommitSecurityEventNotification",
});

topic.addSubscription(new subs.EmailSubscription(/* your email address */));

Testing the solution

You can test the deployed solution by running these two sets of commands. First, add a file with no credentials:

echo "Clean file - no credentials here" > clean_file.txt
git add clean_file.txt
git commit clean_file.txt -m "Adds clean_file.txt"
git push

Then add a file containing credentials:

SECRET_LIKE_STRING=$(cat /dev/urandom | env LC_CTYPE=C tr -dc 'a-zA-Z0-9' | fold -w 32 | head -n 1)
echo "secret=$SECRET_LIKE_STRING" > problematic_file.txt
git add problematic_file.txt
git commit problematic_file.txt -m "Adds secret-like string to problematic_file.txt"
git push

This first command creates, commits and pushes an unproblematic file clean_file.txt that will pass the checks of commitInspectLambda. The second command creates, commits, and pushes problematic_file.txt, which matches the regular expressions and triggers the remediation measures.

If you check your email, you soon receive notifications about actions taken by the Lambda functions.

Cleaning up

To avoid incurring charges, delete the resources by running cdk destroy and confirming the deletion.

Conclusion

This post demonstrates how you can implement a solution to discover secrets in commits to AWS CodeCommit repositories. It also defines different strategies to remediate this.

The CDK code to set up all components is minimal and can be extended for remediation measures. The template is portable between Regions and uses serverless technologies to minimize cost and complexity.

For more serverless learning resources, visit Serverless Land.

ICYMI: Serverless pre:Invent 2020

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/icymi-serverless-preinvent-2020/

During the last few weeks, the AWS serverless team has been releasing a wave of new features in the build-up to AWS re:Invent 2020. This post recaps some of the most important releases for serverless developers.

re:Invent is virtual and free to all attendees in 2020 – register here. See the complete list of serverless sessions planned and join the serverless DA team live on Twitch. Also, follow your DAs on Twitter for live recaps and Q&A during the event.

AWS re:Invent 2020

AWS Lambda

We launched Lambda Extensions in preview, enabling you to more easily integrate monitoring, security, and governance tools into Lambda functions. You can also build your own extensions that run code during Lambda lifecycle events, and there is an example extensions repo for starting development.

You can now send logs from Lambda functions to custom destinations by using Lambda Extensions and the new Lambda Logs API. Previously, you could only forward logs after they were written to Amazon CloudWatch Logs. Now, logging tools can receive log streams directly from the Lambda execution environment. This makes it easier to use your preferred tools for log management and analysis, including Datadog, Lumigo, New Relic, Coralogix, Honeycomb, or Sumo Logic.

Lambda Extensions API

Lambda launched support for Amazon MQ as an event source. Amazon MQ is a managed broker service for Apache ActiveMQ that simplifies deploying and scaling queues. This integration increases the range of messaging services that customers can use to build serverless applications. The event source operates in a similar way to using Amazon SQS or Amazon Kinesis. In all cases, the Lambda service manages an internal poller to invoke the target Lambda function.

We also released a new layer to make it simpler to integrate Amazon CodeGuru Profiler. This service helps identify the most expensive lines of code in a function and provides recommendations to help reduce cost. With this update, you can enable the profiler by adding the new layer and setting environment variables. There are no changes needed to the custom code in the Lambda function.

Lambda announced support for AWS PrivateLink. This allows you to invoke Lambda functions from a VPC without traversing the public internet. It provides private connectivity between your VPCs and AWS services. By using VPC endpoints to access the Lambda API from your VPC, this can replace the need for an Internet Gateway or NAT Gateway.

For developers building machine learning inferencing, media processing, high performance computing (HPC), scientific simulations, and financial modeling in Lambda, you can now use AVX2 support to help reduce duration and lower cost. By using packages compiled for AVX2 or compiling libraries with the appropriate flags, your code can then benefit from using AVX2 instructions to accelerate computation. In the blog post’s example, enabling AVX2 for an image-processing function increased performance by 32-43%.

Lambda now supports batch windows of up to 5 minutes when using SQS as an event source. This is useful for workloads that are not time-sensitive, allowing developers to reduce the number of Lambda invocations from queues. Additionally, the batch size has been increased from 10 to 10,000. This is now the same as the batch size for Kinesis as an event source, helping Lambda-based applications process more data per invocation.

Code signing is now available for Lambda, using AWS Signer. This allows account administrators to ensure that Lambda functions only accept signed code for deployment. Using signing profiles for functions, this provides granular control over code execution within the Lambda service. You can learn more about using this new feature in the developer documentation.

Amazon EventBridge

You can now use event replay to archive and replay events with Amazon EventBridge. After configuring an archive, EventBridge automatically stores all events or filtered events, based upon event pattern matching logic. You can configure a retention policy for archives to delete events automatically after a specified number of days. Event replay can help with testing new features or changes in your code, or hydrating development or test environments.

EventBridge archived events

EventBridge also launched resource policies that simplify managing access to events across multiple AWS accounts. This expands the use of a policy associated with event buses to authorize API calls. Resource policies provide a powerful mechanism for modeling event buses across multiple account and providing fine-grained access control to EventBridge API actions.

EventBridge resource policies

EventBridge announced support for Server-Side Encryption (SSE). Events are encrypted using AES-256 at no additional cost for customers. EventBridge also increased PutEvent quotas to 10,000 transactions per second in US East (N. Virginia), US West (Oregon), and Europe (Ireland). This helps support workloads with high throughput.

AWS Step Functions

Synchronous Express Workflows have been launched for AWS Step Functions, providing a new way to run high-throughput Express Workflows. This feature allows developers to receive workflow responses without needing to poll services or build custom solutions. This is useful for high-volume microservice orchestration and fast compute tasks communicating via HTTPS.

The Step Functions service recently added support for other AWS services in workflows. You can now integrate API Gateway REST and HTTP APIs. This enables you to call API Gateway directly from a state machine as an asynchronous service integration.

Step Functions now also supports Amazon EKS service integration. This allows you to build workflows with steps that synchronously launch tasks in EKS and wait for a response. In October, the service also announced support for Amazon Athena, so workflows can now query data in your S3 data lakes.

These new integrations help minimize custom code and provide built-in error handling, parameter passing, and applying recommended security settings.

AWS SAM CLI

The AWS Serverless Application Model (AWS SAM) is an AWS CloudFormation extension that makes it easier to build, manage, and maintains serverless applications. On November 10, the AWS SAM CLI tool released version 1.9.0 with support for cached and parallel builds.

By using sam build --cached, AWS SAM no longer rebuilds functions and layers that have not changed since the last build. Additionally, you can use sam build --parallel to build functions in parallel, instead of sequentially. Both of these new features can substantially reduce the build time of larger applications defined with AWS SAM.

Amazon SNS

Amazon SNS announced support for First-In-First-Out (FIFO) topics. These are used with SQS FIFO queues for applications that require strict message ordering with exactly once processing and message deduplication. This is designed for workloads that perform tasks like bank transaction logging or inventory management. You can also use message filtering in FIFO topics to publish updates selectively.

SNS FIFO

AWS X-Ray

X-Ray now integrates with Amazon S3 to trace upstream requests. If a Lambda function uses the X-Ray SDK, S3 sends tracing headers to downstream event subscribers. With this, you can use the X-Ray service map to view connections between S3 and other services used to process an application request.

AWS CloudFormation

AWS CloudFormation announced support for nested stacks in change sets. This allows you to preview changes in your application and infrastructure across the entire nested stack hierarchy. You can then review those changes before confirming a deployment. This is available in all Regions supporting CloudFormation at no extra charge.

The new CloudFormation modules feature was released on November 24. This helps you develop building blocks with embedded best practices and common patterns that you can reuse in CloudFormation templates. Modules are available in the CloudFormation registry and can be used in the same way as any native resource.

Amazon DynamoDB

For customers using DynamoDB global tables, you can now use your own encryption keys. While all data in DynamoDB is encrypted by default, this feature enables you to use customer managed keys (CMKs). DynamoDB also announced support for global tables in the Europe (Milan) and Europe (Stockholm) Regions. This feature enables you to scale global applications for local access in workloads running in different Regions and replicate tables for higher availability and disaster recovery (DR).

The DynamoDB service announced the ability to export table data to data lakes in Amazon S3. This enables you to use services like Amazon Athena and AWS Lake Formation to analyze DynamoDB data with no custom code required. This feature does not consume table capacity and does not impact performance and availability. To learn how to use this feature, see this documentation.

AWS Amplify and AWS AppSync

You can now use existing Amazon Cognito user pools and identity pools for Amplify projects, making it easier to build new applications for an existing user base. AWS Amplify Console, which provides a fully managed static web hosting service, is now available in the Europe (Milan), Middle East (Bahrain), and Asia Pacific (Hong Kong) Regions. This service makes it simpler to bring automation to deploying and hosting single-page applications and static sites.

AWS AppSync enabled AWS WAF integration, making it easier to protect GraphQL APIs against common web exploits. You can also implement rate-based rules to help slow down brute force attacks. Using AWS Managed Rules for AWS WAF provides a faster way to configure application protection without creating the rules directly. AWS AppSync also recently expanded service availability to the Asia Pacific (Hong Kong), Middle East (Bahrain), and China (Ningxia) Regions, making the service now available in 21 Regions globally.

Still looking for more?

Join the AWS Serverless Developer Advocates on Twitch throughout re:Invent for live Q&A, session recaps, and more! See this page for the full schedule.

For more serverless learning resources, visit Serverless Land.

Tracking the latest server images in Amazon EC2 Image Builder pipelines

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/tracking-the-latest-server-images-in-amazon-ec2-image-builder-pipelines/

This post courtesy of Anoop Rachamadugu, Cloud Architect at AWS

The Amazon EC2 Image Builder service helps users to build and maintain server images. The images created by EC2 Image Builder can be used with Amazon Elastic Compute Cloud (EC2) and on-premises. Image Builder reduces the effort of keeping images up-to-date and secure by providing a graphical interface, built-in automation, and AWS-provided security settings. Customers have told us that they manage multiple server images and are looking for ways to track the latest server images created by the pipelines.

In this blog post, I walk through a solution that uses AWS Lambda and AWS Systems Manager (SSM) Parameter Store. It tracks and updates the latest Amazon Machine Image (AMI) IDs every time an Image Builder pipeline is run. With Lambda, you pay only for what you use. You are charged based on the number of requests for your functions and the time it takes for your code to run. In this case, the Lambda function is invoked upon the completion of the image builder pipeline. Standard SSM parameters are available at no additional charge.

Users can reference the SSM parameters in automation scripts and AWS CloudFormation templates providing access to the latest AMI ID for your EC2 infrastructure. Consider the use case of updating Amazon Machine Image (AMI) IDs for the EC2 instances in your CloudFormation templates. Normally, you might map AMI IDs to specific instance types and Regions. Then to update these, you would manually change them in each of your templates. With the SSM parameter integration, your code remains untouched and a CloudFormation stack update operation automatically fetches the latest Parameter Store value.

Overview

This solution uses a Lambda function written in Python that subscribes to an Amazon Simple Notification Service (SNS) topic. The Lambda function and the SNS topic are deployed using AWS SAM CLI. Once deployed, the SNS topic must be configured in an existing Image Builder pipeline. This results in the Lambda function being invoked at the completion of the Image Builder pipeline.

When a Lambda function subscribes to an SNS topic, it is invoked with the payload of the published messages. The Lambda function receives the message payload as an input parameter. The Lambda function first checks the message payload to see if the image status is available. If the image state is available, it retrieves the AMI ID from the message payload and updates the SSM parameter.

EC2 Image builder architecture diagram

EC2 Image builder architecture diagram

Prerequisites

To get started with this solution, the following is required:

Deploying the solution

The solution consists of two files, which can be downloaded from the amazon-ec2-image-builder GitHub repository.

  1. The Python file image-builder-lambda-update-ssm.py contains the code for the Lambda function. It first checks the SNS message payload to determine if the image is available. If it’s available, it extracts the AMI ID from the SNS message payload and updates the SSM parameter specified.The ‘ssm_parameter_name’ variable specifies the SSM parameter path where the AMI ID should be stored and updated. The Lambda function finishes by adding tags to the SSM parameter.
  2. The template.yaml file is an AWS SAM template. It deploys the Lambda function, SNS topic, and IAM role required for the Lambda function. I use Python 3.7 as the runtime and assign a memory of 256 MB for the Lambda function. The IAM policy gives the Lambda function permissions to retrieve and update SSM parameters. Deploy this application using the AWS SAM CLI guided deploy:
    sam deploy --guided

After deploying the application, note the ARN of the created SNS topic. Next, update the infrastructure settings of an existing Image Builder pipeline with this newly created SNS topic. This results in the Lambda function being invoked upon the completion of the image builder pipeline.

Configuration details

Configuration details

Verifying the solution

After the completion of the image builder pipeline, use the AWS CLI or check the AWS Management Console to verify the updated SSM parameter. To verify via AWS CLI, run the following commands to retrieve and list the tags attached to the SSM parameter:

aws ssm get-parameter --name ‘/ec2-imagebuilder/latest’
aws ssm list-tags-for-resource --resource-type "Parameter" --resource-id ‘/ec2-imagebuilder/latest’

To verify via the AWS Management Console, navigate to the Parameter Store under AWS Systems Manager. Search for the parameter /ec2-imagebuilder/latest:

AWS Systems Manager: Parameter Store

AWS Systems Manager: Parameter Store

Select the Tags tab to view the tags attached to the SSM parameter:

Image builder tags list

Image builder tags list

Referencing the SSM Parameter in CloudFormation templates

Users can reference the SSM parameters in automation scripts and AWS CloudFormation templates providing access to the latest AMI ID for your EC2 infrastructure. This sample code shows how to reference the SSM parameter in a CloudFormation template.

Parameters :
  LatestAmiId :
    Type : 'AWS::SSM::Parameter::Value<AWS::EC2::Image::Id>'
    Default: ‘/ec2-imagebuilder/latest’

Resources :
  Instance :
    Type : 'AWS::EC2::Instance'
    Properties :
      ImageId : !Ref LatestAmiId

Conclusion

In this blog post, I demonstrate a solution that allows users to track and update the latest AMI ID created by the Image Builder pipelines. The Lambda function retrieves the AMI ID of the image created by a pipeline and update an AWS Systems Manager parameter. This Lambda function is triggered via an SNS topic configured in an Image Builder pipeline.

The solution is deployed using AWS SAM CLI. I also note how users can reference Systems Manager parameters in AWS CloudFormation templates providing access to the latest AMI ID for your EC2 infrastructure.

The amazon-ec2-image-builder-samples GitHub repository provides a number of examples for getting started with EC2 Image Builder. Image Builder can make it easier for you to build virtual machine (VM) images.

Getting started with RPA using AWS Step Functions and Amazon Textract

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/getting-started-with-rpa-using-aws-step-functions-and-amazon-textract/

This post is courtesy of Joe Tringali, Solutions Architect.

Many organizations are using robotic process automation (RPA) to automate workflow, back-office processes that are labor-intensive. RPA, as software bots, can often handle many of these activities. Often RPA workflows contain repetitive manual tasks that must be done by humans, such as viewing invoices to find payment details.

AWS Step Functions is a serverless function orchestrator and workflow automation tool. Amazon Textract is a fully managed machine learning service that automatically extracts text and data from scanned documents. Combining these services, you can create an RPA bot to automate the workflow and enable employees to handle more complex tasks.

In this post, I show how you can use Step Functions and Amazon Textract to build a workflow that enables the processing of invoices. Download the code for this solution from https://github.com/aws-samples/aws-step-functions-rpa.

Overview

The following serverless architecture can process scanned invoices in PDF or image formats for submitting payment information to a database.

Example architecture
To implement this architecture, I use single-purpose Lambda functions and Step Functions to build the workflow:

  1. Invoices are scanned and loaded into an Amazon Simple Storage Service (S3) bucket.
  2. The loading of an invoice into Amazon S3 triggers an AWS Lambda function to be invoked.
  3. The Lambda function starts an asynchronous Amazon Textract job to analyze the text and data of the scanned invoice.
  4. The Amazon Textract job publishes a completion notification message with a status of “SUCCEEDED” or “FAILED” to an Amazon Simple Notification Service (SNS) topic.
  5. SNS sends the message to an Amazon Simple Queue Service (SQS) queue that is subscribed to the SNS topic.
  6. The message in the SQS queue triggers another Lambda function.
  7. The Lambda function initiates a Step Functions state machine to process the results of the Amazon Textract job.
  8. For an Amazon Textract job that completes successfully, a Lambda function saves the document analysis into an Amazon S3 bucket.
  9. The loading of the document analysis to Amazon S3 triggers another Lambda function.
  10. The Lambda function retrieves the text and data of the scanned invoice to find the payment information. It writes an item to an Amazon DynamoDB table with a status indicating if the invoice can be processed.
  11. If the DynamoDB item contains the payment information, another Lambda function is invoked.
  12. The Lambda function archives the processed invoice into another S3 bucket.
  13. If the DynamoDB item does not contain the payment information, a message is published to an Amazon SNS topic requesting that the invoice be reviewed.

Amazon Textract can extract information from the various invoice images and associate labels with the data. You must then handle the various labels that different invoices may associate with the payee name, due date, and payment amount.

Determining payee name, due date and payment amount

After the document analysis has been saved to S3, a Lambda function retrieves the text and data of the scanned invoice to find the information needed for payment. However, invoices can use a variety of labels for the same piece of data, such a payment’s due date.

In the example invoices included with this blog, the payment’s due date is associated with the labels “Pay On or Before”, “Payment Due Date” and “Payment Due”. Payment amounts can also have different labels, such as “Total Due”, “New Balance Total”, “Total Current Charges”, and “Please Pay”. To address this, I use a series of helper functions in the app.py file in the process_document_analysis folder of the GitHub repo.

In app.py, there is the following get_ky_map helper function:

def get_kv_map(blocks):
    key_map = {}
    value_map = {}
    block_map = {}
    for block in blocks:
        block_id = block['Id']
        block_map[block_id] = block
        if block['BlockType'] == "KEY_VALUE_SET":
            if 'KEY' in block['EntityTypes']:
                key_map[block_id] = block
            else:
                value_map[block_id] = block
    return key_map, value_map, block_map

The get_kv_map function is invoked by the Lambda function handler. It iterates over the “Blocks” element of the document analysis produced by Amazon Textract to create dictionaries of keys (labels) and values (data) associated with each block identified by Amazon Textract. It then invokes the following get_kv_relationship helper function:

def get_kv_relationship(key_map, value_map, block_map):
    kvs = {}
    for block_id, key_block in key_map.items():
        value_block = find_value_block(key_block, value_map)
        key = get_text(key_block, block_map)
        val = get_text(value_block, block_map)
        kvs[key] = val
    return kvs

The get_kv_relationship function merges the key and value dictionaries produced by the get_kv_map function to create a single Python key value dictionary where labels are the keys to the dictionary and the invoice’s data are the values. The handler then invokes the following get_line_list helper function:

def get_line_list(blocks):
    line_list = []
    for block in blocks:
        if block['BlockType'] == "LINE":
            if 'Text' in block: 
                line_list.append(block["Text"])
    return line_list

Extracting payee names is more complex because the data may not be labeled. The payee may often differ from the entity sending the invoice. With the Amazon Textract analysis in a format more easily consumable by Python, I use the following get_payee_name helper function to parse and extract the payee:

def get_payee_name(lines):
    payee_name = ""
    payable_to = "payable to"
    payee_lines = [line for line in lines if payable_to in line.lower()]
    if len(payee_lines) > 0:
        payee_line = payee_lines[0]
        payee_line = payee_line.strip()
        pos = payee_line.lower().find(payable_to)
        if pos > -1:
            payee_line = payee_line[pos + len(payable_to):]
            if payee_line[0:1] == ':':
                payee_line = payee_line[1:]
            payee_name = payee_line.strip()
    return payee_name

The get_amount helper function searches the key value dictionary produced by the get_kv_relationship function to retrieve the payment amount:

def get_amount(kvs, lines):
    amount = None
    amounts = [search_value(kvs, amount_tag) for amount_tag in amount_tags if search_value(kvs, amount_tag) is not None]
    if len(amounts) > 0:
        amount = amounts[0]
    else:
        for idx, line in enumerate(lines):
            if line.lower() in amount_tags:
                amount = lines[idx + 1]
                break
    if amount is not None:
        amount = amount.strip()
        if amount[0:1] == '$':
            amount = amount[1:]
    return amount

The amount_tags variable contains a list of possible labels associated with the payment amount:

amount_tags = ["total due", "new balance total", "total current charges", "please pay"]

Similarly, the get_due_date helper function searches the key value dictionary produced by the get_kv_relationship function to retrieve the payment due date:

def get_due_date(kvs):
    due_date = None
    due_dates = [search_value(kvs, due_date_tag) for due_date_tag in due_date_tags if search_value(kvs, due_date_tag) is not None]
    if len(due_dates) > 0:
        due_date = due_dates[0]
    if due_date is not None:
        date_parts = due_date.split('/')
        if len(date_parts) == 3:
            due_date = datetime(int(date_parts[2]), int(date_parts[0]), int(date_parts[1])).isoformat()
        else:
            date_parts = [date_part for date_part in re.split("\s+|,", due_date) if len(date_part) > 0]
            if len(date_parts) == 3:
                datetime_object = datetime.strptime(date_parts[0], "%b")
                month_number = datetime_object.month
                due_date = datetime(int(date_parts[2]), int(month_number), int(date_parts[1])).isoformat()
    else:
        due_date = datetime.now().isoformat()
    return due_date

The due_date_tag contains a list of possible labels associated with the payment due:

due_date_tags = ["pay on or before", "payment due date", "payment due"]

If all required elements needed to issue a payment are found, it adds an item to the DynamoDB table with a status attribute of “Approved for Payment”. If the Lambda function cannot determine the value of one or more required elements, it adds an item to the DynamoDB table with a status attribute of “Pending Review”.

Payment Processing

If the item in the DynamoDB table is marked “Approved for Payment”, the processed invoice is archived. If the item’s status attribute is marked “Pending Review”, an SNS message is published to an SNS Pending Review topic. You can subscribe to this topic so that you can add additional labels to the Python code for determining payment due dates and payment amounts.

Note that the Lambda functions are single-purpose functions, and all workflow logic is contained in the Step Functions state machine. This diagram shows the various tasks (states) of a successful workflow.

State machine workflow

For more information about this solution, download the code from the GitHub repo (https://github.com/aws-samples/aws-step-functions-rpa).

Prerequisites

Before deploying the solution, you must install the following prerequisites:

  1. Python.
  2. AWS Command Line Interface (AWS CLI) – for instructions, see Installing the AWS CLI.
  3. AWS Serverless Application Model Command Line Interface (AWS SAM CLI) – for instructions, see Installing the AWS SAM CLI.

Deploying the solution

The solution creates the following S3 buckets with names suffixed by your AWS account ID to prevent a global namespace collision of your S3 bucket names:

  • scanned-invoices-<YOUR AWS ACCOUNT ID>
  • invoice-analyses-<YOUR AWS ACCOUNT ID>
  • processed-invoices-<YOUR AWS ACCOUNT ID>

The following steps deploy the example solution in your AWS account. The solution deploys several components including a Step Functions state machine, Lambda functions, S3 buckets, a DynamoDB table for payment information, and SNS topics.

AWS CloudFormation requires an S3 bucket and stack name for deploying the solution. To deploy:

  1. Download code from GitHub repo (https://github.com/aws-samples/aws-step-functions-rpa).
  2. Run the following command to build the artifacts locally on your workstation:sam build
  3. Run the following command to create a CloudFormation stack and deploy your resources:sam deploy --guided --capabilities CAPABILITY_NAMED_IAM

Monitor the progress and wait for the completion of the stack creation process from the AWS CloudFormation console before proceeding.

Testing the solution

To test the solution, upload the PDF test invoices to the S3 bucket named scanned-invoices-<YOUR AWS ACCOUNT ID>.

A Step Functions state machine with the name <YOUR STACK NAME>-ProcessedScannedInvoiceWorkflow runs the workflow. Amazon Textract document analyses are stored in the S3 bucket named invoice-analyses-<YOUR AWS ACCOUNT ID>, and processed invoices are stored in the S3 bucket named processed-invoices-<YOUR AWS ACCOUNT ID>. Processed payments are found in the DynamoDB table named <YOUR STACK NAME>-invoices.

You can monitor the status of the workflows from the Step Functions console. Upon completion of the workflow executions, review the items added to DynamoDB from the Amazon DynamoDB console.

Cleanup

To avoid ongoing charges for any resources you created in this blog post, delete the stack:

  1. Empty the three S3 buckets created during deployment using the S3 console:
    – scanned-invoices-<YOUR AWS ACCOUNT ID>
    – invoice-analyses-<YOUR AWS ACCOUNT ID>
    – processed-invoices-<YOUR AWS ACCOUNT ID>
  2. Delete the CloudFormation stack created during deployment using the CloudFormation console.

Conclusion

In this post, I showed you how to use a Step Functions state machine and Amazon Textract to automatically extract data from a scanned invoice. This eliminates the need for a person to perform the manual step of reviewing an invoice to find payment information to be fed into a backend system. By replacing the manual steps of a workflow with automation, an organization can free up their human workforce to handle more value-added tasks.

To learn more, visit AWS Step Functions and Amazon Textract for more information. For more serverless learning resources, visit https://serverlessland.com.

 

Application integration patterns for microservices: Running distributed RFQs

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/application-integration-patterns-running-distributed-rfqs/

This post is courtesy of Dirk Fröhner, Principal Solutions Architect.

The first blog in this series introduces asynchronous messaging for building loosely coupled systems that can scale, operate, and evolve individually. It considers messaging as a communications model for microservices architectures. Part 2 dives into fan-out strategies and applies the respective patterns to a concrete use case.

In this post, I look at how to apply messaging patterns to help coordinate distributed requests and responses. Specifically, I focus on a composite pattern called scatter-gather, as presented in the book “Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions” (Hohpe and Woolf, 2004).

I also show how a client can communicate with a backend via synchronous REST API operations while asynchronous messaging is applied internally for processing.

Overview

The use case is for Wild Rydes, a fictional application that replaces traditional taxis with unicorns. It’s used in several hands-on AWS workshops that illustrate serverless development concepts.

Wild Rydes wants to allow customers to initiate requests for quotation (RFQs) for their rides. This allows unicorns to make special offers to potential customers within a defined schedule. A customer can send their ride details and ask for quotations from all unicorns that are within a certain vicinity. The customer can then choose the best offer.

Wild Rydes

The scatter-gather pattern

The scatter-gather pattern can be used to implement this use-case on the server side. This pattern is ideal for requesting responses from multiple parties, then aggregating and processing that data.

As presented by Hohpe and Woolf, the scatter-gather pattern is a composite pattern that illustrates how to “broadcast a message to multiple recipients and re-aggregate the responses back into a single message”. The pattern is illustrated in the following diagram.

Scatter-gather architecture

The flow starts with the Requester to initiate the broadcast to all potential Responders. This can be architected in a loosely coupled manner using pub-sub messaging with Amazon SNS or Amazon MQ, as shown in this blog post.

All responders must send their answers somewhere for aggregation and processing. This can also be architected in a loosely coupled manner using a message queue with Amazon SQS or Amazon MQ, as described in this blog post.

The Aggregator component consumes the individual responses from the response queue. It forwards the aggregate to the Processor component for final processing. Both Aggregator and Processor can be part of the same application or process. If separated, they can be decoupled through messaging. The Requester can also be part of the same application or process as Aggregator and Processor.

Explaining the architecture and API

In this section, I walk through the use-case and explain how it can be architected and implemented. I show how the scatter-gather pattern works in the backend, and the client-to-backend communication.

Submit instant ride RFQ

To initiate such an RFQ, the customer app communicates with the ride booking service on the backend. The ride booking service exposes a REST API. By default, an RFQ runs for five minutes, but Wild Rydes is working on a feature to let a customer individually set that value.

A request to submit an instant-ride RFQ contains start and destination locations for the ride and the customer ID:

POST /<submit-instant-ride-rfq-resource-path> HTTP/1.1
...

{
    "from": "...",
    "to": "...",
    "customer": "..."
}

The RFQ is a lengthy process so the client app should not expect an immediate response. Instead, the API accepts the RFQ, creates an RFQ task resource, and returns to the client. The response contains a URL to request an update for the status. It also provides an estimated time for the end of the RFQ:

HTTP/1.1 202 Accepted
...

{
    "links": {
        "self": "http://.../<rfq-task-resource-path>",
        "...": "..."
    },
    "status": "running",
    "eta": "..."
}

The following architecture shows this interaction, excluding the process after a new RFQ is submitted.

Client app interaction

Processing the RFQ

The backend uses the scatter-gather pattern to publish the RFQ to unicorns and collect responses for aggregation and processing.

Backend architecture

1. The ride booking service acts as the requester in the scatter-gather pattern. Following a new RFQ from the client app, it publishes the details into an SNS topic. This topic is related to the location of the ride’s starting point since customers need quotes from unicorns within the vicinity. These messages are the green request messages.

2. The unicorn management service maintains instances of unicorn management resources and subscribes them to RFQ topics related to their current location. These resources receive the RFQ request messages and handle the interaction with the Wild Rydes unicorn app.

3. The unicorns in the vicinity are notified through the Wild Rydes unicorn app about the new RFQ and can react if they are available. Notification options between the unicorn management service and the Wild Rydes unicorn app include push notifications and web sockets.

4. Every addressed unicorn can now submit their quote. All quotes go back through the unicorn management resources and the unicorn management service into the RFQ response queue. They act as the responders in the sense of the scatter-gather pattern.

5. The ride booking service also acts as aggregator and processor in the sense of the scatter-gather pattern. It uses SQS to consume messages from an RFQ response queue that eventually contains the RFQ responses from the involved unicorns. It starts doing so immediately after it publishes the details of a new RFQ into the RFQ topic. The messages from the RFQ response queue relate to the blue response messages.

The ride booking service consumes all incoming responses from that queue. This continues until the deadline or all participating unicorns have answered, whatever occurs first. The aggregator responsibility can be as simple as persisting the details of each incoming RFQ response into an Amazon DynamoDB table.

To match incoming responses to the right RFQ, it uses a fundamental integration pattern, correlation ID. In this pattern, a requester adds a unique ID to an outgoing message and each responder is asked to forward this ID in their response.

Also, responders must know where to send their responses to. To keep this dynamic, there is another fundamental integration pattern: return address. It suggests that a requester adds meta information into outgoing messages that indicate the address for their responses. In this architecture, this is the ARN of the SQS queue that acts as the RFQ response queue. This supports an option to simplify the response management: the RFQ response queue is a dedicated queue per customer.

Lastly, the processor responsibility in the ride booking service reads the RFQ responses from the DynamoDB table. It converts the data to JSON for the Wild Rydes customer app.

Check RFQ status

During the RFQ processing, a customer may want to know how many responses have already arrived, or if the results are already available. After submitting an instant ride RFQ, the client receives a representation of the running task. It can use the self-link to request an update:

GET /<rfq-task-resource-path> HTTP/1.1

While the task is running, a response from the ride booking service comes back with the respective status value and the count of responses that have already arrived:

HTTP/1.1 200 OK
...

{
    "links": {
        "self": "http://.../<rfq-task-resource-path>",
        "...": "..."
    },
    "status": "running",
    "responses-received": 2,
    "eta": "..."
}

After the RFQ is completed

An RFQ is completed if either the time is up or all unicorns have answered. The result of the RFQ is then available to the customer. If the client requests an update to the task representation, the response indicates this by redirecting to the RFQ result:

HTTP/1.1 303 See Other
Location: <url-of-rfq-result-resource>

Requesting a representation of the results resource, the client receives the quotes of all the participating unicorns. The frontend customer app can visualize these accordingly:

HTTP/1.1 200 OK
...

{
    "links": { ... },
    "from": "...",
    "to": "...",
    "customer": "...",
    "quotes": [ ... ]
}

The ride booking service can also use means of active notifications to make the customer app aware once the RFQ result is ready, including the link to the RFQ result. Examples for this include push notifications and web sockets.

Conclusion

In this blog, I present the scatter-gather pattern, which is a composite pattern based on pub-sub and point-to-point messaging channels. It also employs correlation ID and return address. I show how this is implemented in the Wild Rydes example application. You can use this integration pattern for communication in your microservices.

I cover how synchronous API communication between end user client and backend can work along with asynchronous messaging for request processing internally.

To learn more:

For more serverless learning resources, visit https://serverlessland.com.

Introducing Amazon SNS FIFO – First-In-First-Out Pub/Sub Messaging

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/introducing-amazon-sns-fifo-first-in-first-out-pub-sub-messaging/

When designing a distributed software architecture, it is important to define how services exchange information. For example, the use of asynchronous communication decouples components and simplifies scaling, reducing the impact of changes and making it easier to release new features.

The two most common forms of asynchronous service-to-service communication are message queues and publish/subscribe messaging:

  • With message queues, messages are stored on the queue until they are processed and deleted by a consumer. On AWS, Amazon Simple Queue Service (SQS) provides a fully managed message queuing service with no administrative overhead.
  • With pub/sub messaging, a message published to a topic is delivered to all subscribers to the topic. On AWS, Amazon Simple Notification Service (SNS) is a fully managed pub/sub messaging service that enables message delivery to a large number of subscribers. Each subscriber can also set a filter policy to receive only the messages that it cares about.

You can use topics when you want to fan out messages to multiple applications, and queues when you want to send messages to one application. Using topics and queues together, you can decouple microservices, distributed systems, and serverless applications.

With SQS, you can use FIFO (First-In-First-Out) queues to preserve the order in which messages are sent and received, and to avoid that a message is processed more than once.

Introducing SNS FIFO Topics
Today, we are adding similar capabilities for pub/sub messaging with the introduction of SNS FIFO topics, providing strict message ordering and deduplicated message delivery to one or more subscribers.

FIFO topics manage ordering and deduplication similar to FIFO queues:

Ordering – You configure a message group by including a message group ID when publishing a message to a FIFO topic. For each message group ID, all messages are sent and delivered in order of their arrival. For example, to ensure the delivery of messages related to the same customer in order, you can publish these messages to the topic using the customer’s account number as the message group ID. There is no limit in the number of message groups with FIFO topics and queues. You don’t need to declare in advance the message group ID, any value will work. If you don’t have a logical distinction between messages, you can simply use the same message group ID for all and have a single group of ordered messages. The message group ID is passed to any subscribed FIFO queue.

Deduplication – Distributed systems (like SNS) and client applications sometimes generate duplicate messages. You can avoid duplicated message deliveries from the topic in two ways: either by enabling content-based deduplication on the topic, or by adding a deduplication ID to the messages that you publish. With message content-based deduplication, SNS uses a SHA-256 hash to generate the message deduplication ID using the body of the message. After a message with a specific deduplication ID is published successfully, there is a 5-minute interval during which any message with the same deduplication ID is accepted but not delivered. If you subscribe a FIFO queue to a FIFO topic, the deduplication ID is passed to the queue and it is used by SQS to avoid duplicate messages being received.

You can use FIFO topics and queues together to simplify the implementation of applications where the order of operations and events is critical, or when you cannot tolerate duplicates. For example, to process financial operations and inventory updates, or to asynchronously apply commands that you receive from a client device. FIFO queues can use message filtering in FIFO topics to selectively receive only a subset of messages rather than every message published to the topic.

How to Use SNS FIFO Topics
A common scenario where FIFO topics can help is when you receive updates that need to be processed in order. For example, I can use a FIFO topic to receive updates from an application where my customers edit their account profiles. Then, I subscribe an SQS FIFO queue to the FIFO topic, and use the queue as trigger for a Lambda function that applies the account updates to an Amazon DynamoDB table used by my Customer management system that needs to be kept in sync.

The decoupling introduced by the FIFO topic makes it easier to add new functionality with minimal impact to existing applications. For example, to reward my loyal customers with additional promotions, I add a new Loyalty application that is storing information in a relational database managed by Amazon Aurora. To keep the customer’s information stored in the Loyalty database in sync with my other applications, I can subscribe a new FIFO queue to the same FIFO topic, and add a new Lambda function that receives customer updates in the same order as they are generated, and applies them to the Loyalty database. In this way, I don’t need to change code and configuration of other applications to integrate the new Loyalty app.

First, I create two FIFO queues in the SQS console, leaving all options to their defaults:

  • The customer.fifo queue to process updates in my Customer management system.
  • The loyalty.fifo queue to help me collect and store customer updates for the Loyalty application.

In the SNS console, I create the updates.fifo topic. I select FIFO as type, and enable Content-based message deduplication.

Then,  I subscribe the customer.fifo and loyalty.fifo queues to the topic.

To be able to receive messages, I add a statement to the access policy of both queues granting the updates.fifo topic permissions to send messages to the queues. For example, for the customer.fifo queue the statement is:

{
  "Effect": "Allow",
  "Principal": {
    "Service": "sns.amazonaws.com"
  },
  "Action": "SQS:SendMessage",
  "Resource": "arn:aws:sqs:us-east-2:123412341234:customer.fifo",
  "Condition": {
    "ArnLike": {
      "aws:SourceArn": "arn:aws:sns:us-east-2:123412341234:updates.fifo"
    }
  }
}

Now, I use the SNS console to publish 4 messages in sequence. For all messages, I use the same message group ID. In this way, they are all in the same message group. The only part that is different is the message body, where I use in order:

  • Update One
  • Update Two
  • Update Three
  • Update One

In the SQS console, I see that only 3 messages have been delivered to the FIFO queues:

Why is that? When I created the FIFO topics, I enabled content-based deduplication. The 4 messages were sent within the 5-minute deduplication window. The last message has been recognized as a duplicate of the first one and has not been delivered to the subscribed queues.

Let’s see the actual messages in the queues. I use the AWS Command Line Interface (CLI) to receive the messages from SQS, and the jq command-line JSON processor to format the output and get only the Message in the Body.

Here are the messages in the customer.fifo queue:

$ aws sqs receive-message --queue-url https://sqs.us-east-2.amazonaws.com/123412341234/customer.fifo --max-number-of-messages 10 | jq '.Messages[].Body | fromjson | .Message'

"Update One"
"Update Two"
"Update Three"

And these are the messages in the loyalty.fifo queue:

$ aws sqs receive-message --queue-url https://sqs.us-east-2.amazonaws.com/123412341234/loyalty.fifo --max-number-of-messages 10 | jq '.Messages[].Body | fromjson | .Message'

"Update One"
"Update Two"
"Update Three"

As expected, the 3 messages with unique content have been delivered to both queues in the same order as they were sent.

Available Now
You can use SNS FIFO topics in all commercial regions. You can process up to 300 transactions per second (TPS) per FIFO topic or FIFO queue. With SNS, you pay only for what you use, you can find more information in the pricing page.

To learn more, please see the documentation.

Danilo

Building event-driven architectures with Amazon SNS FIFO

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-event-driven-architectures-with-amazon-sns-fifo/

This post is courtesy of Christian Mueller, Principal Solutions Architect.

Developers increasingly adopt event-driven architectures to decouple their distributed applications. Often, these events must be propagated in a strictly ordered manner to all subscribed applications. Using Amazon SNS FIFO topics and Amazon SQS FIFO queues, you can address use cases that require end-to-end message ordering, deduplication, filtering, and encryption.

In this blog post, I introduce a sample event-driven architecture. I walk through an implementation based on Amazon SNS FIFO topics and Amazon SQS FIFO queues.

Common requirements in event-driven-architectures

In event-driven architectures, data consistency is a common business requirement. This is often translated into technical requirements such as zero message loss and strict message ordering. For example, if you update your domain object rapidly, you want to be sure that all events are received by each subscriber in exactly the order they occurred. This way, the current domain object state is what each subscriber received as the latest update event. Similarly, all update events should be received after the initial create event.

Before Amazon SNS FIFO, architects had to design applications to check if messages are received out of order before processing.

Comparing SNS and SNS FIFO

Another common challenge is preventing message duplicates when sending events to the messaging service. If an event publisher receives an error, such as a network timeout, the publisher does not know if the messaging service could receive and successfully process the message or not.

The client may retry, as this is the default behavior for some HTTP response codes in AWS SDKs. This can cause duplicate messages.

Before Amazon SNS FIFO, developers had to design receivers to be idempotent. In some cases, where the event cannot be idempotent, this requires the receiver to be implemented in an idempotent way. Often, this is done by adding a key-value store like Amazon DynamoDB or Amazon ElastiCache for Redis to the service. Using this approach, the receiver can track if the event has been seen before.

Exactly once processing and message deduplication

Exploring the recruiting agency example

This sample application models a recruitment agency with a job listings website. The application is composed of multiple services. I explain 3 of them in more detail.

Sample application architecture

A custom service, the anti-corruption service, receives a change data capture (CDC) event stream of changes from a relational database. This service translates the low-level technical database events into meaningful business events for the domain services for easy consumption. These business events are sent to the SNS FIFO “JobEvents.fifo“ topic. Here, interested services subscribe to these events and process them asynchronously.

In this domain, the analytics service is interested in all events. It has an SQS FIFO “AnalyticsJobEvents.fifo” queue subscribed to the SNS FIFO “JobEvents.fifo“ topic. It uses SQS FIFO as event source for AWS Lambda, which processes and stores these events in Amazon S3. S3 is object storage service with high scalability, data availability, durability, security, and performance. This allows you to use services like Amazon EMR, AWS Glue or Amazon Athena to get insights into your data to extract value.

The inventory service owns an SQS FIFO “InventoryJobEvents.fifo” queue, which is subscribed to the SNS FIFO “JobEvents.fifo“ topic. It is only interested in “JobCreated” and “JobDeleted” events, as it only tracks which jobs are currently available and stores this information in a DynamoDB table. Therefore, it uses an SNS filter policy to only receive these events, instead of receiving all events.

This sample application focuses on the SNS FIFO capabilities, so I do not explore other services subscribed to the SNS FIFO topic. This sample follows the SQS best practices and SNS redrive policy recommendations and configures dead-letter queues (DLQ). This is useful in case SNS cannot deliver an event to the subscribed SQS queue. It also helps if the function fails to process an event from the corresponding SQS FIFO queue multiple times. As a requirement in both cases, the attached SQS DLQ must be an SQS FIFO queue.

Deploying the application

To deploy the application using infrastructure as code, it uses the AWS Serverless Application Model (SAM). SAM provides shorthand syntax to express functions, APIs, databases, and event source mappings. It is expanded into AWS CloudFormation syntax during deployment.

To get started, clone the “event-driven-architecture-with-sns-fifo” repository, from here. Alternatively, download the repository as a ZIP file from here and extract it to a directory of your choice.

As a prerequisite, you must have SAM CLI, Python 3, and PIP installed. You must also have the AWS CLI configured properly.

Navigate to the root directory of this project and build the application with SAM. SAM downloads required dependencies and stores them locally. Execute the following commands in your terminal:

git clone https://github.com/aws-samples/event-driven-architecture-with-amazon-sns-fifo.git
cd event-driven-architecture-with-amazon-sns-fifo
sam build

You see the following output:

Deployment output

Now, deploy the application:

sam deploy --guided

Provide arguments for the deployments, such as the stack name and preferred AWS Region:

SAM guided deployment

After a successful deployment, you see the following output:

Successful deployment message

Learning more about the implementation

I explore the three services forming this sample application, and how they use the features of SNS FIFO.

Anti-corruption service

The anti-corruption service owns the SNS FIFO “JobEvents.fifo” topic, where it publishes business events related to job postings. It uses an SNS FIFO topic, as end-to-end ordering per job ID is required. SNS FIFO is configured not to perform content-based deduplication, as I require a unique message deduplication ID for each event for deduplication. The corresponding definition in the SAM template looks like this:

  JobEventsTopic:
    Type: AWS::SNS::Topic
    Properties:
      TopicName: JobEvents.fifo
      FifoTopic: true
      ContentBasedDeduplication: false

For simplicity, the anti-corruption function in the sample application doesn’t consume an external database CDC stream. It uses Amazon CloudWatch Events as an event source to trigger the function every minute.

I provide the SNS FIFO topic Amazon Resource Name (ARN) as an environment variable in the function. This makes this function more portable to deploy in different environments and stages. The function’s AWS Identity and Access Management (IAM) policy grants permissions to publish messages to only this SNS topic:

  AntiCorruptionFunction:
    Type: AWS::Serverless::
    Properties:
      CodeUri: anti-corruption-service/
      Handler: app.lambda_handler
      Runtime: python3.7
      MemorySize: 256
      Environment:
        Variables:
          TOPIC_ARN: !Ref JobEventsTopic
      Policies:
        - SNSPublishMessagePolicy
            TopicName: !GetAtt JobEventsTopic.TopicName
      Events:
        Trigger:
          Type: 
          Properties:
            Schedule: 'rate(1 minute)'

The anti-corruption function uses features in the SNS publish API, which allows you to define a “MessageDeduplicationId” and a “MessageGroupId”. The “MessageDeduplicationId” is used to filter out duplicate messages, which are sent to SNS FIFO within in 5-minute deduplication interval. The “MessageGroupId” is required, as SNS FIFO processes all job events for the same message group in a strictly ordered manner, isolated from other message groups processed through the same topic.

Another important aspect in this implementation is the use of “MessageAttributes”. We define a message attribute with the name “eventType” and values like “JobCreated”, “JobSalaryUpdated”, and “JobDeleted”. This allows subscribers to define SNS filter policies to only receive certain events they are interested in:

import boto3
from datetime import datetime
import json
import os
import random
import uuid

TOPIC_ARN = os.environ['TOPIC_ARN']

sns = boto3.client('sns')

def lambda_handler(event, context):
    jobId = str(random.randrange(0, 1000))

    send_job_created_event(jobId)
    send_job_updated_event(jobId)
    send_job_deleted_event(jobId)
    return

def send_job_created_event(jobId):
    messageId = str(uuid.uuid4())

    response = sns.publish(
        TopicArn=TOPIC_ARN,
        Subject=f'Job {jobId} created',
        MessageDeduplicationId=messageId,
        MessageGroupId=f'JOB-{jobId}',
        Message={...},
        MessageAttributes = {
            'eventType': {
                'DataType': 'String',
                'StringValue': 'JobCreated'
            }
        }
    )
    print('sent message and received response: {}'.format(response))
    return

def send_job_updated_event(jobId):
    messageId = str(uuid.uuid4())

    response = sns.publish(...)
    print('sent message and received response: {}'.format(response))
    return

def send_job_deleted_event(jobId):
    messageId = str(uuid.uuid4())

    response = sns.publish(...)
    print('sent message and received response: {}'.format(response))
    return

Analytics service

The analytics service owns an SQS FIFO “AnalyticsJobEvents.fifo” queue which is subscribed to the SNS FIFO “JobEvents.fifo” topic. Following best practices, I define redrive policies for the SQS FIFO queue and the SNS FIFO subscription in the template:

  AnalyticsJobEventsQueue:
    Type: AWS::SQS::Queue
    Properties:
      QueueName: AnalyticsJobEvents.fifo
      FifoQueue: true
      RedrivePolicy:
        deadLetterTargetArn: !GetAtt AnalyticsJobEventsQueueDLQ.Arn
        maxReceiveCount: 3

  AnalyticsJobEventsQueueToJobEventsTopicSubscription:
    Type: AWS::SNS::Subscription
    Properties:
      Endpoint: !GetAtt AnalyticsJobEventsQueue.Arn
      Protocol: sqs
      RawMessageDelivery: true
      TopicArn: !Ref JobEventsTopic
      RedrivePolicy: !Sub '{"deadLetterTargetArn": "${AnalyticsJobEventsSubscriptionDLQ.Arn}"}'

The analytics function uses SQS FIFO as an event source for Lambda. The S3 bucket name is an environment variable for the function, which increases the code portability across environments and stages. The IAM policy for this function only grants permissions write objects to this S3 bucket:

  AnalyticsFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: analytics-service/
      Handler: app.lambda_handler
      Runtime: python3.7
      MemorySize: 256
      Environment:
        Variables:
          BUCKET_NAME: !Ref AnalyticsBucket
      Policies:
        - S3WritePolicy:
            BucketName: !Ref AnalyticsBucket
      Events:
        Trigger:
          Type: SQS
          Properties:
            Queue: !GetAtt AnalyticsJobEventsQueue.Arn
            BatchSize: 10

View the function implementation at the GitHub repo.

Inventory service

The inventory service also owns an SQS FIFO “InventoryJobEvents.fifo” queue which is subscribed to the SNS FIFO “JobEvents.fifo” topic. It uses redrive policies for the SQS FIFO queue and the SNS FIFO subscription as well. This service is only interested in certain events, so uses an SNS filter policy to specify these events:

  InventoryJobEventsQueue:
    Type: AWS::SQS::Queue
    Properties:
      QueueName: InventoryJobEvents.fifo
      FifoQueue: true
      RedrivePolicy:
        deadLetterTargetArn: !GetAtt InventoryJobEventsQueueDLQ.Arn
        maxReceiveCount: 3

  InventoryJobEventsQueueToJobEventsTopicSubscription:
    Type: AWS::SNS::Subscription
    Properties:
      Endpoint: !GetAtt InventoryJobEventsQueue.Arn
      Protocol: sqs
      RawMessageDelivery: true
      TopicArn: !Ref JobEventsTopic
      FilterPolicy: '{"eventType":["JobCreated", "JobDeleted"]}'
      RedrivePolicy: !Sub '{"deadLetterTargetArn": "${InventoryJobEventsQueueSubscriptionDLQ.Arn}"}'

The inventory function also uses SQS FIFO as event source for Lambda. The DynamoDB table name is set as an environment variable, so the function can look up the name during initialization. The IAM policy grants read/write permissions for only this table:

  InventoryFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: inventory-service/
      Handler: app.lambda_handler
      Runtime: python3.7
      MemorySize: 256
      Environment:
        Variables:
          TABLE_NAME: !Ref InventoryTable
      Policies:
        - DynamoDBCrudPolicy:
            TableName: !Ref InventoryTable
      Events:
        Trigger:
          Type: SQS
          Properties:
            Queue: !GetAtt InventoryJobEventsQueue.Arn
            BatchSize: 10

View the function implementation at the GitHub repo.

Conclusion

Amazon SNS FIFO topics can simplify the design of event-driven architectures and reduce custom code in building such applications.

By using the native integration with Amazon SQS FIFO queues, you can also build architectures that fan out to thousands of subscribers. This pattern helps achieve data consistency, deduplication, filtering, and encryption in near real time, using managed services.

For information on regional availability and service quotas, see SNS endpoints and quotas and SQS endpoints and quotas. For more information on the FIFO functionality, see SNS FIFO and SQS FIFO in their Developer Guides.

Optimizing the cost of serverless web applications

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/optimizing-the-cost-of-serverless-web-applications/

Web application backends are one of the most frequent types of serverless use-case for customers. The pay-for-value model can make it cost-efficient to build web applications using serverless tools.

While serverless cost is generally correlated with level of usage, there are architectural decisions that impact cost efficiency. The impact of these choices is more significant as your traffic grows, so it’s important to consider the cost-effectiveness of different designs and patterns.

This blog post reviews some common areas in web applications where you may be able to optimize cost. It uses the Happy Path web application as a reference example, which you can read about in the introductory blog post.

Serverless web applications generally use a combination of the services in the following diagram. I cover each of these areas to highlight common areas for cost optimization.

Serverless architecture by AWS service

The API management layer: Selecting the right API type

Most serverless web applications use an API between the frontend client and the backend architecture. Amazon API Gateway is a common choice since it is a fully managed service that scales automatically. There are three types of API offered by the service – REST APIs, WebSocket APIs, and the more recent HTTP APIs.

HTTP APIs offer many of the features in the REST APIs service, but the cost is often around 70% less. It supports Lambda service integration, JWT authorization, CORS, and custom domain names. It also has a simpler deployment model than REST APIs. This feature set tends to work well for web applications, many of which mainly use these capabilities. Additionally, HTTP APIs will gain feature parity with REST APIs over time.

The Happy Path application is designed for 100,000 monthly active users. It uses HTTP APIs, and you can inspect the backend/template.yaml to see how to define these in the AWS Serverless Application Model (AWS SAM). If you have existing AWS SAM templates that are using REST APIs, in many cases you can change these easily:

REST to HTTP API

Content distribution layer: Optimizing assets

Amazon CloudFront is a content delivery network (CDN). It enables you to distribute content globally across 216 Points of Presence without deploying or managing any infrastructure. It reduces latency for users who are geographically dispersed and can also reduce load on other parts of your service.

A typical web application uses CDNs in a couple of different ways. First, there is the distribution of the application itself. For single-page application frameworks like React or Vue.js, the build processes create static assets that are ideal for serving over a CDN.

However, these builds may not be optimized and can be larger than necessary. Many frameworks offer optimization plugins, and the JavaScript community frequently uses Webpack to bundle modules and shrink deployment packages. Similarly, any media assets used in the application build should be optimized. You can use tools like Lighthouse to analyze your web apps to find images that can be resized or compressed.

Optimizing images

The second common CDN use-case for web apps is for user-generated content (UGC). Many apps allow users to upload images, which are then shared with other users. A typical photo from a 12-megapixel smartphone is 3–9 MB in size. This high resolution is not necessary when photos are rendered within web apps. Displaying the high-resolution asset results in slower download performance and higher data transfer costs.

The Happy Path application uses a Resizer Lambda function to optimize these uploaded assets. This process creates two different optimized images depending upon which component loads the asset.

Image sizes in front-end applications

The upload S3 bucket shows the original size of the upload from the smartphone:

The distribution S3 bucket contains the two optimized images at different sizes:

Optimized images in the distribution S3 bucket

The distribution file sizes are 98–99% smaller. For a busy web application, using optimized image assets can make a significant difference to data transfer and CloudFront costs.

Additionally, you can convert to highly optimized file formats such as WebP to reduce file size even further. Not all browsers support this format, but you can use CSS on the frontend to fall back to other types if needed:

<img src="myImage.webp" onerror="this.onerror=null; this.src='myImage.jpg'">

The data layer

AWS offers many different database and storage options that can be useful for web applications. Billing models vary by service and Region. By understanding the data access and storage requirements of your app, you can make informed decisions about the right service to use.

Generally, it’s more cost-effective to store binary data in S3 than a database. First, when the data is uploaded, you can upload directly to S3 with presigned URLs instead of proxying data via API Gateway or another service.

If you are using Amazon DynamoDB, it’s best practice to store larger items in S3 and include a reference token in a table item. Part of DynamoDB pricing is based on read capacity units (RCUs). For binary items such as images, it is usually more cost-efficient to use S3 for storage.

Many web developers who are new to serverless are familiar with using a relational database, so choose Amazon RDS for their database needs. Depending upon your use-case and data access patterns, it may be more cost effective to use DynamoDB instead. RDS is not a serverless service so there are monthly charges for the underlying compute instance. DynamoDB pricing is based upon usage and storage, so for many web apps may be a lower-cost choice.

Integration layer

This layer includes services like Amazon SQS, Amazon SNS, and Amazon EventBridge, which are essential for decoupling serverless applications. Each of these have a request-based pricing component, where 64 KB of a payload is billed as one request. For example, a single SQS message with a 256 KB payload is billed as four requests. There are two optimization methods common for web applications.

1. Combine messages

Many messages sent to these services are much smaller than 64 KB. In some applications, the publishing service can combine multiple messages to reduce the total number of publish actions to SNS. Additionally, by either eliminating unused attributes in the message or compressing the message, you can store more data in a single request.

For example, a publishing service may be able to combine multiple messages together in a single publish action to an SNS topic:

  • Before optimization, a publishing service sends 100,000,000 1KB-messages to an SNS topic. This is charged as 100 million messages for a total cost of $50.00.
  • After optimization, the publishing service combines messages to send 1,562,500 64KB-messages to an SNS topic. This is charged as 1,562,500 messages for a total cost of $0.78.

2. Filter messages

In many applications, not every message is useful for a consuming service. For example, an SNS topic may publish to a Lambda function, which checks the content and discards the message based on some criteria. In this case, it’s more cost effective to use the native filtering capabilities of SNS. The service can filter messages and only invoke the Lambda function if the criteria is met. This lowers the compute cost by only invoking Lambda when necessary.

For example, an SNS topic receives messages about customer orders and forwards these to a Lambda function subscriber. The function is only interested in canceled orders and discards all other messages:

  • Before optimization, the SNS topic sends all messages to a Lambda function. It evaluates the message for the presence of an order canceled attribute. On average, only 25% of the messages are processed further. While SNS does not charge for delivery to Lambda functions, you are charged each time the Lambda service is invoked, for 100% of the messages.
  • After optimization, using an SNS subscription filter policy, the SNS subscription filters for canceled orders and only forwards matching messages. Since the Lambda function is only invoked for 25% of the messages, this may reduce the total compute cost by up to 75%.

3. Choose a different messaging service

For complex filtering options based upon matching patterns, you can use EventBridge. The service can filter messages based upon prefix matching, numeric matching, and other patterns, combining several rules into a single filter. You can create branching logic within the EventBridge rule to invoke downstream targets.

EventBridge offers a broader range of targets than SNS destinations. In cases where you publish from an SNS topic to a Lambda function to invoke an EventBridge target, you could use EventBridge instead and eliminate the Lambda invocation. For example, instead of routing from SNS to Lambda to AWS Step Functions, instead create an EventBridge rule that routes events directly to a state machine.

Business logic layer

Step Functions allows you to orchestrate complex workflows in serverless applications while eliminating common boilerplate code. The Standard Workflow service charges per state transition. Express Workflows were introduced in December 2019, with pricing based on requests and duration, instead of transitions.

For workloads that are processing large numbers of events in shorter durations, Express Workflows can be more cost-effective. This is designed for high-volume event workloads, such as streaming data processing or IoT data ingestion. For these cases, compare the cost of the two workflow types to see if you can reduce cost by switching across.

Lambda is the on-demand compute layer in serverless applications, which is billed by requests and GB-seconds. GB-seconds is calculated by multiplying duration in seconds by memory allocated to the function. For a function with a 1-second duration, invoked 1 million times, here is how memory allocation affects the total cost in the US East (N. Virginia) Region:

Memory (MB) GB/S Compute cost Total cost
128 125,000 $ 2.08 $ 2.28
512 500,000 $ 8.34 $ 8.54
1024 1,000,000 $ 16.67 $ 16.87
1536 1,500,000 $ 25.01 $ 25.21
2048 2,000,000 $ 33.34 $ 33.54
3008 2,937,500 $ 48.97 $ 49.17

There are many ways to optimize Lambda functions, but one of the most important choices is memory allocation. You can choose between 128 MB and 3008 MB, but this also impacts the amount of virtual CPU as memory increases. Since total cost is a combination of memory and duration, choosing more memory can often reduce duration and lower overall cost.

Instead of manually setting the memory for a Lambda function and running executions to compare duration, you can use the AWS Lambda Power Tuning tool. This uses Step Functions to run your function against varying memory configurations. It can produce a visualization to find the optimal memory setting, based upon cost or execution time.

Optimizing costs with the AWS Lambda Power Tuning tool

Conclusion

Web application backends are one of the most popular workload types for serverless applications. The pay-per-value model works well for this type of workload. As traffic grows, it’s important to consider the design choices and service configurations used to optimize your cost.

Serverless web applications generally use a common range of services, which you can logically split into different layers. This post examines each layer and suggests common cost optimizations helpful for web app developers.

To learn more about building web apps with serverless, see the Happy Path series. For more serverless learning resources, visit https://serverlessland.com.

Building resilient serverless patterns by combining messaging services

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-resilient-no-code-serverless-patterns-by-combining-messaging-services/

In “Choosing between messaging services for serverless applications”, I explain the features and differences between the core AWS messaging services. Amazon SQS, Amazon SNS, and Amazon EventBridge provide queues, publish/subscribe, and event bus functionality for your applications. Individually, these are robust, scalable services that are fundamental building blocks of serverless architectures.

However, you can also combine these services to solve specific challenges in distributed architectures. By doing this, you can use specific features of each service to build sophisticated patterns with little code. These combinations can make your applications more resilient and scalable, and reduce the amount of custom logic and architecture in your workload.

In this blog post, I highlight several important patterns for serverless developers. I also show how you use and deploy these integrations with the AWS Serverless Application Model (AWS SAM).

Examples in this post refer to code that can be downloaded from this GitHub repo. The README.md file explains how to deploy and run each example.

SNS to SQS: Adding resilience and throttling to message throughput

SNS has a robust retry policy that results in up to 100,010 delivery attempts over 23 days. If a downstream service is unavailable, it may be overwhelmed by retries when it comes back online. You can solve this issue by adding an SQS queue.

Adding an SQS queue between the SNS topic and its subscriber has two benefits. First, it adds resilience to message delivery, since the messages are durably stored in a queue. Second, it throttles the rate of messages to the consumer, helping smooth out traffic bursts caused by the service catching up with missed messages.

To build this in an AWS SAM template, you first define the two resources, and the SNS subscription:

  MySqsQueue:
    Type: AWS::SQS::Queue

  MySnsTopic:
    Type: AWS::SNS::Topic
    Properties:
      Subscription:
        - Protocol: sqs
          Endpoint: !GetAtt MySqsQueue.Arn

Finally, you provide permission to the SNS topic to publish to the queue, using the AWS::SQS::QueuePolicy resource:

  SnsToSqsPolicy:
    Type: AWS::SQS::QueuePolicy
    Properties:
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Sid: "Allow SNS publish to SQS"
            Effect: Allow
            Principal: "*"
            Resource: !GetAtt MySqsQueue.Arn
            Action: SQS:SendMessage
            Condition:
              ArnEquals:
                aws:SourceArn: !Ref MySnsTopic
      Queues:
        - Ref: MySqsQueue

To test this, you can publish a message to the SNS topic and then inspect the SQS queue length using the AWS CLI:

aws sns publish --topic-arn "arn:aws:sns:us-east-1:123456789012:sns-sqs-MySnsTopic-ABC123ABC" --message "Test message"
aws sqs get-queue-attributes --queue-url "https://sqs.us-east-1.amazonaws.com/123456789012/sns-sqs-MySqsQueue- ABC123ABC " --attribute-names ApproximateNumberOfMessages

This results in the following output:

CLI output

Another usage of this pattern is when you want to filter messages in architectures using an SQS queue. By placing the SNS topic in front of the queue, you can use the message filtering capabilities of SNS. This ensures that only the messages you need are published to the queue. To use message filtering in AWS SAM, use the AWS:SNS:Subcription resource:

  QueueSubcription:
    Type: 'AWS::SNS::Subscription'
    Properties:
      TopicArn: !Ref MySnsTopic
      Endpoint: !GetAtt MySqsQueue.Arn
      Protocol: sqs
      FilterPolicy:
        type:
        - orders
        - payments 
      RawMessageDelivery: 'true'

EventBridge to SNS: combining features of both services

Both SNS and EventBridge have different characteristics in terms of targets, and integration with broader features. This table compares the major differences between the two services:

Amazon SNS Amazon EventBridge
Number of targets 10 million (soft) 5
Limits 100,000 topics. 12,500,000 subscriptions per topic. 100 event buses. 300 rules per event bus.
Input transformation No Yes – see details.
Message filtering Yes – see details. Yes, including IP address matching – see details.
Format Raw or JSON JSON
Receive events from AWS CloudTrail No Yes
Targets HTTP(S), SMS, SNS Mobile Push, Email/Email-JSON, SQS, Lambda functions 15 targets including AWS LambdaAmazon SQSAmazon SNSAWS Step FunctionsAmazon Kinesis Data StreamsAmazon Kinesis Data Firehose.
SaaS integration No Yes – see integration partners.
Schema Registry integration No Yes – see details.
Dead-letter queues supported Yes No
Public visibility Can create public topics Cannot create public buses
Cross-Region You can subscribe your AWS Lambda functions to an Amazon SNS topic in any Region. Targets must be same Region. You can publish across Region to another event bus.

In this pattern, you configure an SNS topic as a target of an EventBridge rule:

SNS topic as a target for an EventBridge rule

In the AWS SAM template, you declare the resources in the preceding diagram as follows:

Resources:
  MySnsTopic:
    Type: AWS::SNS::Topic

  EventRule: 
    Type: AWS::Events::Rule
    Properties: 
      Description: "EventRule"
      EventPattern: 
        account: 
          - !Sub '${AWS::AccountId}'
        source:
          - "demo.cli"
      Targets: 
        - Arn: !Ref MySnsTopic
          Id: "SNStopic"

The default bus already exists in every AWS account, so there is no need to declare it. For the event bus to publish matching events to the SNS topic, you define permissions using the AWS::SNS::TopicPolicy resource:

  EventBridgeToToSnsPolicy:
    Type: AWS::SNS::TopicPolicy
    Properties: 
      PolicyDocument:
        Statement:
        - Effect: Allow
          Principal:
            Service: events.amazonaws.com
          Action: sns:Publish
          Resource: !Ref MySnsTopic
      Topics: 
        - !Ref MySnsTopic       

EventBridge has a limit of five targets per rule. In cases where you must send events to hundreds or thousands of targets, publishing to SNS first and then subscribing those targets to the topic works around this limit. Both services have different targets, and this pattern allows you to deliver EventBridge events to SMS, HTTP(s), email and SNS mobile push.

You can transform and filter the message using these services, often without needing an AWS Lambda function. SNS does not support input transformation but you can do this in an EventBridge rule. Message filtering is possible in both services but EventBridge provides richer content filtering capabilities.

AWS CloudTrail can log and monitor activity across services in your AWS account. It can be a useful source for events, allowing you to respond dynamically to objects in Amazon S3 or react to changes in your environment, for example. This natively integrates with EventBridge, allowing you to ingest events at scale from dozens of services.

Using EventBridge enables you to source events from outside your AWS account, offering integrations with a list of software as a service (SaaS) providers. This capability allows you to receive events from your accounts with SaaS providers like Zendesk, PagerDuty, and Auth0. These events are delivered to a partner event bus in your account, and can then be filtered and routed to an SNS topic.

Additionally, this pattern allows you to deliver events to Lambda functions in other AWS accounts and in other AWS Regions. You can invoke Lambda from SNS topics in other Regions and accounts. It’s also possible to make SNS topics publicly read-only, making them extensible endpoints that other third parties can consume from. SNS has comprehensive access control, which you can incorporate into this pattern.

Cross-account publishing

EventBridge to SQS: Building fault-tolerant microservices

EventBridge can route events to targets such as microservices. In the case of downstream failures, the service retries events for up to 24 hours. For workloads where you need a longer period of time to store and retry messages, you can deliver the events to an SQS queue in each microservice. This durably stores those events until the downstream service recovers. Additionally, this pattern protects the microservice from large bursts of traffic by throttling the delivery of messages.

Fault-tolerant microservices architecture

The resources declared in the AWS SAM template are similar to the previous examples, but it uses the AWS::SQS::QueuePolicy resource to grant the appropriate permission to EventBridge:

  EventBridgeToToSqsPolicy:
    Type: AWS::SQS::QueuePolicy
    Properties:
      PolicyDocument:
        Statement:
        - Effect: Allow
          Principal:
            Service: events.amazonaws.com
          Action: SQS:SendMessage
          Resource:  !GetAtt MySqsQueue.Arn
      Queues:
        - Ref: MySqsQueue

Conclusion

You can combine these services in your architectures to implement patterns that solve complex challenges, often with little code required. This blog post shows three examples that implement message throttling and queueing, integrating SNS and EventBridge, and building fault tolerant microservices.

To learn more building decoupled architectures, see this Learning Path series on EventBridge. For more serverless learning resources, visit https://serverlessland.com.

Choosing between messaging services for serverless applications

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/choosing-between-messaging-services-for-serverless-applications/

Most serverless application architectures use a combination of different AWS services, microservices, and AWS Lambda functions. Messaging services are important in allowing distributed applications to communicate with each other, and are fundamental to most production serverless workloads.

Messaging services can improve the resilience, availability, and scalability of applications, when used appropriately. They can also enable your applications to communicate beyond your workload or even the AWS Cloud, and provide extensibility for future service features and versions.

In this blog post, I compare the primary messaging services offered by AWS and how you can use these in your serverless application architectures. I also show how you use and deploy these integrations with the AWS Serverless Application Model (AWS SAM).

Examples in this post refer to code that can be downloaded from this GitHub repository. The README.md file explains how to deploy and run each example.

Overview

Three of the most useful messaging patterns for serverless developers are queues, publish/subscribe, and event buses. In AWS, these are provided by Amazon SQS, Amazon SNS, and Amazon EventBridge respectively. All of these services are fully managed and highly available, so there is no infrastructure to manage. All three integrate with Lambda, allowing you to publish messages via the AWS SDK and invoke functions as targets. Each of these services has an important role to play in serverless architectures.

SNS enables you to send messages reliably between parts of your infrastructure. It uses a robust retry mechanism for when downstream targets are unavailable. When the delivery policy is exhausted, it can optionally send those messages to a dead-letter queue for further processing. SNS uses topics to logically separate messages into channels, and your Lambda functions interact with these topics.

SQS provides queues for your serverless applications. You can use a queue to send, store, and receive messages between different services in your workload. Queues are an important mechanism for providing fault tolerance in distributed systems, and help decouple different parts of your application. SQS scales elastically, and there is no limit to the number of messages per queue. The service durably persists messages until they are processed by a downstream consumer.

EventBridge is a serverless event bus service, simplifying routing events between AWS services, software as a service (SaaS) providers, and your own applications. It logically separates routing using event buses, and you implement the routing logic using rules. You can filter and transform incoming messages at the service level, and route events to multiple targets, including Lambda functions.

Integrating an SQS queue with AWS SAM

The first example shows an AWS SAM template defining a serverless application with two Lambda functions and an SQS queue:

Producer-consumer example

You can declare an SQS queue in an AWS SAM template with the AWS::SQS::Queue resource:

  MySqsQueue:
    Type: AWS::SQS::Queue

To publish to the queue, the publisher function must have permission to send messages. Using an AWS SAM policy template, you can apply policy that enables send messaging to one specific queue:

      Policies:
        - SQSSendMessagePolicy:
            QueueName: !GetAtt MySqsQueue.QueueName

The AWS SAM template passes the queue name into the Lambda function as an environment variable. The function uses the sendMessage method of the AWS.SQS class to publish the message:

const AWS = require('aws-sdk')
AWS.config.region = process.env.AWS_REGION 
const sqs = new AWS.SQS({apiVersion: '2012-11-05'})

// The Lambda handler
exports.handler = async (event) => {
  // Params object for SQS
  const params = {
    MessageBody: `Message at ${Date()}`,
    QueueUrl: process.env.SQSqueueName
  }
  
  // Send to SQS
  const result = await sqs.sendMessage(params).promise()
  console.log(result)
}

When the SQS queue receives the message, it publishes to the consuming Lambda function. To configure this integration in AWS SAM, the consumer function is granted the SQSPollerPolicy policy. The function’s event source is set to receive messages from the queue in batches of 10:

  QueueConsumerFunction:
    Type: AWS::Serverless::Function 
    Properties:
      CodeUri: code/
      Handler: consumer.handler
      Runtime: nodejs12.x
      Timeout: 3
      MemorySize: 128
      Policies:  
        - SQSPollerPolicy:
            QueueName: !GetAtt MySqsQueue.QueueName
      Events:
        MySQSEvent:
          Type: SQS
          Properties:
            Queue: !GetAtt MySqsQueue.Arn
            BatchSize: 10

The payload for the consumer function is the message from SQS. This is an array of messages up to the batch size, containing a body attribute with the publishing function’s MessageBody. You can see this in the CloudWatch log for the function:

CloudWatch log result

Integrating an SNS topic with AWS SAM

The second example shows an AWS SAM template defining a serverless application with three Lambda functions and an SNS topic:

SNS fanout to Lambda functions

You declare an SNS topic and the subscribing Lambda functions with the AWS::SNS:Topic resource:

  MySnsTopic:
    Type: AWS::SNS::Topic
    Properties:
      Subscription:
        - Protocol: lambda
          Endpoint: !GetAtt TopicConsumerFunction1.Arn    
        - Protocol: lambda
          Endpoint: !GetAtt TopicConsumerFunction2.Arn

You provide the SNS service with permission to invoke the Lambda functions but defining an AWS::Lambda::Permission for each:

  TopicConsumerFunction1Permission:
    Type: 'AWS::Lambda::Permission'
    Properties:
      Action: 'lambda:InvokeFunction'
      FunctionName: !Ref TopicConsumerFunction1
      Principal: sns.amazonaws.com

The SNSPublishMessagePolicy policy template grants permission to the publishing function to send messages to the topic. In the function, the publish method of the AWS.SNS class handles publishing:

const AWS = require('aws-sdk')
AWS.config.region = process.env.AWS_REGION 
const sns = new AWS.SNS({apiVersion: '2012-11-05'})

// The Lambda handler
exports.handler = async (event) => {
  // Params object for SNS
  const params = {
    Message: `Message at ${Date()}`,
    Subject: 'New message from publisher',
    TopicArn: process.env.SNStopic
  }
  
  // Send to SQS
  const result = await sns.publish(params).promise()
  console.log(result)
}

The payload for the consumer functions is the message from SNS. This is an array of messages, containing subject and message attributes from the publishing function. You can see this in the CloudWatch log for the function:

CloudWatch log result

Differences between SQS and SNS configurations

SQS queues and SNS topics offer different functionality, though both can publish to downstream Lambda functions.

An SQS message is stored on the queue for up to 14 days until it is successfully processed by a subscriber. SNS does not retain messages so if there are no subscribers for a topic, the message is discarded.

SNS topics may broadcast to multiple targets. This behavior is called fan-out. It can be used to parallelize work across Lambda functions or send messages to multiple environments (such as test or development). An SNS topic can have up to 12,500,000 subscribers, providing highly scalable fan-out capabilities. The targets may include HTTP/S endpoints, SMS text messaging, SNS mobile push, email, SQS, and Lambda functions.

In AWS SAM templates, you can retrieve properties such as ARNs and names of queues and topics, using the following intrinsic functions:

Amazon SQS Amazon SNS
Channel type Queue Topic
Get ARN !GetAtt MySqsQueue.Arn !Ref MySnsTopic
Get name !GetAtt MySqsQueue.QueueName !GetAtt MySnsTopic.TopicName

Integrating with EventBridge in AWS SAM

The third example shows the AWS SAM template defining a serverless application with two Lambda functions and an EventBridge rule:

EventBridge integration with AWS SAM

The default event bus already exists in every AWS account. You declare a rule that filters events in the event bus using the AWS::Events::Rule resource:

  EventRule: 
    Type: AWS::Events::Rule
    Properties: 
      Description: "EventRule"
      EventPattern: 
        source: 
          - "demo.event"
        detail: 
          state: 
            - "new"
      State: "ENABLED"
      Targets: 
        - Arn: !GetAtt EventConsumerFunction.Arn
          Id: "ConsumerTarget"

The rule describes an event pattern specifying matching JSON attributes. Events that match this pattern are routed to the list of targets. You provide the EventBridge service with permission to invoke the Lambda functions in the target list:

  PermissionForEventsToInvokeLambda: 
    Type: AWS::Lambda::Permission
    Properties: 
      FunctionName: 
        Ref: "EventConsumerFunction"
      Action: "lambda:InvokeFunction"
      Principal: "events.amazonaws.com"
      SourceArn: !GetAtt EventRule.Arn

The AWS SAM template uses an IAM policy statement to grant permission to the publishing function to put events on the event bus:

  EventPublisherFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: code/
      Handler: publisher.handler
      Timeout: 3
      Runtime: nodejs12.x
      Policies:
        - Statement:
          - Effect: Allow
            Resource: '*'
            Action:
              - events:PutEvents      

The publishing function then uses the putEvents method of the AWS.EventBridge class, which returns after the events have been durably stored in EventBridge:

const AWS = require('aws-sdk')
AWS.config.update({region: 'us-east-1'})
const eventbridge = new AWS.EventBridge()

exports.handler = async (event) => {
  const params = {
    Entries: [ 
      {
        Detail: JSON.stringify({
          "message": "Hello from publisher",
          "state": "new"
        }),
        DetailType: 'Message',
        EventBusName: 'default',
        Source: 'demo.event',
        Time: new Date 
      }
    ]
  }
  const result = await eventbridge.putEvents(params).promise()
  console.log(result)
}

The payload for the consumer function is the message from EventBridge. This is an array of messages, containing subject and message attributes from the publishing function. You can see this in the CloudWatch log for the function:

CloudWatch log result

Comparing SNS with EventBridge

SNS and EventBridge have many similarities. Both can be used to decouple publishers and subscribers, filter messages or events, and provide fan-in or fan-out capabilities. However, there are differences in the list of targets and features for each service, and your choice of service depends on the needs of your use-case.

EventBridge offers two newer capabilities that are not available in SNS. The first is software as a service (SaaS) integration. This enables you to authorize supported SaaS providers to send events directly from their EventBridge event bus to partner event buses in your account. This replaces the need for polling or webhook configuration, and creates a highly scalable way to ingest SaaS events directly into your AWS account.

The second feature is the Schema Registry, which makes it easier to discover and manage OpenAPI schemas for events. EventBridge can infer schemas based on events routed through an event bus by using schema discovery. This can be used to generate code bindings directly to your IDE for type-safe languages like Python, Java, and TypeScript. This can help accelerate development by automating the generation of classes and code directly from events.

This table compares the major features of both services:

Amazon SNS Amazon EventBridge
Number of targets 10 million (soft) 5
Availability SLA 99.9% 99.99%
Limits 100,000 topics. 12,500,000 subscriptions per topic. 100 event buses. 300 rules per event bus.
Publish throughput Varies by Region. Soft limits. Varies by Region. Soft limits.
Input transformation No Yes – see details.
Message filtering Yes – see details. Yes, including IP address matching – see details.
Message size maximum 256 KB 256 KB
Billing Per 64 KB
Format Raw or JSON JSON
Receive events from AWS CloudTrail No Yes
Targets HTTP(S), SMS, SNS Mobile Push, Email/Email-JSON, SQS, Lambda functions. 15 targets including AWS LambdaAmazon SQSAmazon SNSAWS Step FunctionsAmazon Kinesis Data StreamsAmazon Kinesis Data Firehose.
SaaS integration No Yes – see integrations.
Schema Registry integration No Yes – see details.
Dead-letter queues supported Yes No
FIFO ordering available No No
Public visibility Can create public topics Cannot create public buses
Pricing $0.50/million requests + variable delivery cost + data transfer out cost. SMS varies. $1.00/million events. Free for AWS events. No charge for delivery.
Billable request size 1 request = 64 KB 1 event = 64 KB
AWS Free Tier eligible Yes No
Cross-Region You can subscribe your AWS Lambda functions to an Amazon SNS topic in any Region. Targets must be in the same Region. You can publish across Regions to another event bus.
Retry policy
  • For SQS/Lambda, exponential backoff over 23 days.
  • For SMTP, SMS and Mobile push, exponential backoff over 6 hours.
At-least-once event delivery to targets, including retry with exponential backoff for up to 24 hours.

Conclusion

Messaging is an important part of serverless applications and AWS services provide queues, publish/subscribe, and event routing capabilities. This post reviews the main features of SNS, SQS, and EventBridge and how they provide different capabilities for your workloads.

I show three example applications that publish and consume events from the three services. I walk through AWS SAM syntax for deploying these resources in your applications. Finally, I compare differences between the services.

To learn more building decoupled architectures, see this Learning Path series on EventBridge. For more serverless learning resources, visit https://serverlessland.com.

BBVA: Architecture for Large-Scale Macie Implementation

Post Syndicated from Neel Sendas original https://aws.amazon.com/blogs/architecture/bbva-architecture-for-large-scale-macie-implementation/

This post was co-written by Andrew Alaniz , Technical Information Security Officer, and Brady Pratt, Cloud Security Enginner, both at BBVA USA.

Introduction

Data Loss Prevention (DLP) is a common topic among companies that work with any type of sensitive data. One of the challenges is that many people either don’t fully understand what DLP is, or rather, have their own definition of what it is. Regardless of one’s interpretation of DLP, one thing is certain: before you can control data loss, you need to locate find the data sources.

If an organization can’t identify its data, it can’t protect it. BBVA USA, a bank holding company, turned to AWS for advice, and decided to use Amazon Macie to accomplish this in Amazon Simple Storage Service (Amazon S3). Macie is a fully managed data security and data privacy service that uses machine learning and pattern matching to discover and protect your sensitive data in AWS. This blog post will share some of the design and architecture we used to deploy Macie using services, such as AWS Lambda and Amazon CloudWatch.

Data challenges in Amazon S3

Although all S3 buckets are private by default, everyone is aware of the challenges of unsecured S3 buckets exposing data publicly. Amazon has provided a way to prevent that by removing the ability to make buckets public. As with other data storage mechanisms, this doesn’t stop anyone from storing sensitive data within AWS and exposing it another way. With Macie, one can classify the data stored in S3, centrally, and through AWS Organizations.

Recommended architecture

We can break the Macie architecture into two main parts: S3 discovery and evaluation, and S3 sensitive data discovery:

Macie architecture

The setup of discovery and evaluation is simple and straightforward, and should be enabled through Amazon Organizations and across all accounts. The cost of this piece is minimal, and it provides valuable insights into the compliance state of S3 buckets.

Once setup of discovery and evaluation is completed, we are a ready to move to the next step and configure discovery jobs for our S3 buckets. The architecture includes the use of S3, Amazon CloudWatch Events, Amazon EventBridge, and Lambda. All of the execution should happen in a centralized account, but the event triggers should come from each individual account.

Architectural considerations

When determining the architectural design of the solution, consider a few main components:

Centralization

Utilize AWS Organizations: Macie allows native integration with AWS Organizations. This is a significant advantage for Macie. Additionally, within AWS Organizations, it allows the delegation of the Macie master account to a subordinate account. The benefit of this is that it allows centralized management while allowing for the compartmentalization of roles.

Ease of management

One of the most challenging things to manage is non-conforming configurations. It’s much easier to manage a standard way to create, name, and configure settings. Once the classification jobs were ready to be created, we had to take into consideration the following when deploying Macie for our use case:

  1. Macie classifies content in a single job across one account.
  2. If you submit multiple jobs that contain the same bucket, Macie will scan the objects multiple times.
  3. Macie jobs are immutable.

Due to these considerations, we decided to create one job per S3 bucket. This allows administrators to search more easily for jobs related to findings.

Cost considerations

Macie plays an essential role, not only in identifying data and improving data collection, but also in compliance. We needed to make a decision about how to determine if an S3 bucket would be included in a classification job. Initially, we considered including all buckets no matter what. The logic here was that even if we make an assumption that a bucket would never have sensitive data in it, an entity with the right role could always add something at a later date.

Finally, we implemented a solution to tag specific buckets that were known to have immutable properties and which would never allow sensitive data to be added. We could do this because we knew exactly what data was in the bucket, who or what created the bucket, and exactly who or what had access to the bucket.

An example of this type of bucket is the S3 bucket used to store VPC Flow Logs. We know that this bucket is only created by provisioning scripts and is only going to store VPC flow logs that contain no sensitive data based on data classification standards. Also, only VPC services and specific security services can access this bucket for anything other than READ. This is controlled organizationally and can be tagged with a simple ignore key/value pair upon creation.

Deploying Macie at BBVA USA

BBVA USA developed an approach to working within AWS that allows guardrails to be applied as accounts are created. One of those guardrails identifies if developers have stored sensitive data in an account. BBVA needed to be able to do this, and do it at scale. If there is a roadblock or a challenge with AWS services, the first place BBVA looks is to support, but the second place is the Technical Account Manager.

After initiating conversations with its account team, BBVA determined that AWS Macie was the tool to help them with this challenge.

With the help of its technical account manager (TAM), BBVA was able to meet with the Macie Product team and discuss the best options for deploying at scale. Through these conversations, they were even able to influence the Macie product roadmap.

Getting Macie ready to deploy at scale was actually quite simple once the architectural pattern was designed.

Initial job creation

In order to set up jobs for each existing bucket in the organization, it’s a matter of scripting the job creation and adding each bucket from each account into its own job, which is pretty straightforward.

Job creation for new buckets

The recommended architecture and implementation for existing buckets:

  1. Whenever a new S3 bucket is added to Organization accounts, trigger a CloudWatch Event in the target account.
  2. Set up a cross account EventBridge to consume the Event. Using the EventBridge allows for a simpler configuration and centralized management of both Events and Lambda.
  3. Trigger a Lambda function in a delegated Macie admin account, which creates classification jobs to apply Macie to all the newly created S3 buckets.
  4. Repeat the same process when a bucket is deleted by triggering a cancel job.

Evaluate the state of S3 buckets

To evaluate the S3 accounts, turn on Macie at the organization master account and delegate administration to a subordinate account used for Macie. This enables management consolidation of security features into a centralized security account. This helps further restrict access from those that may need access to the master billing account. Finally, enable Macie by default on all organization accounts.

Evaluate the state of S3 buckets

Conclusion

BBVA USA worked directly with the Macie product team by leveraging its relationship with the AWS account team and Enterprise Support. This allowed the company to eventually deploy Macie quickly and at scale. Through Macie, the company is able to track any changes to configurations on buckets that allow a bucket to be public, shared, or replicated with external accounts and if the encryption policies are disabled. Using Macie, BBVA was able to identify buckets that contained sensitive information and put in another control to bolster its AWS governance profile.

Building well-architected serverless applications: Approaching application lifecycle management – part 3

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/building-well-architected-serverless-applications-approaching-application-lifecycle-management-part-3/

This series of blog posts uses the AWS Well-Architected Tool with the Serverless Lens to help customers build and operate applications using best practices. In each post, I address the nine serverless-specific questions identified by the Serverless Lens along with the recommended best practices. See the Introduction post for a table of contents and explanation of the example application.

Question OPS2: How do you approach application lifecycle management?

This post continues part 2 of this Operational Excellence question where I look at deploying to multiple stages using temporary environments, and rollout deployments. In part 1, I cover using infrastructure as code with version control to deploy applications in a repeatable manner.

Good practice: Use configuration management

Use environment variables and configuration management systems to make and track configuration changes. These systems reduce errors caused by manual processes, reduce the level of effort to deploy changes, and help isolate configuration from business logic.

Environment variables are suited for infrequently changing configuration options such as logging levels, and database connection strings. Configuration management systems are for dynamic configuration that might change frequently or contain sensitive data such as secrets.

Environment variables

The serverless airline example used in this series uses AWS Amplify Console environment variables to store application-wide settings.

For example, the Stripe payment keys for all branches, and names for individual branches, are visible within the Amplify Console in the Environment variables section.

AWS Amplify environment variables

AWS Amplify environment variables

AWS Lambda environment variables are set up as part of the function configuration stored using the AWS Serverless Application Model (AWS SAM).

For example, the airline booking ReserveBooking AWS SAM template sets global environment variables including the LOG_LEVEL with the following code.

Globals:
    Function:
        Environment:
            Variables:
                LOG_LEVEL: INFO

This is visible in the AWS Lambda console within the function configuration.

AWS Lambda environment variables in console

AWS Lambda environment variables in console

See the AWS Documentation for more information on using AWS Lambda environment variables and also how to store sensitive data. Amazon API Gateway can also pass stage-specific metadata to Lambda functions.

Dynamic configuration

Dynamic configuration is also stored in configuration management systems to specify external values and is unique to each environment. This configuration may include values such as an Amazon Simple Notification Service (Amazon SNS) topic, Lambda function name, or external API credentials. AWS System Manager Parameter Store, AWS Secrets Manager, and AWS AppConfig have native integrations with AWS CloudFormation to store dynamic configuration. For more information, see the examples for referencing dynamic configuration from within AWS CloudFormation.

For the serverless airline application, dynamic configuration is stored in AWS Systems Manager Parameter Store. During CloudFormation stack deployment, a number of parameters are stored in Systems Manager. For example, in the booking service AWS SAM template, the booking SNS topic ARN is stored.

BookingTopicParameter:
    Type: "AWS::SSM::Parameter"
    Properties:
        Name: !Sub /${Stage}/service/booking/messaging/bookingTopic
        Description: Booking SNS Topic ARN
        Type: String
        Value: !Ref BookingTopic

View the stored SNS topic value by navigating to the Parameter Store console, and search for BookingTopic.

Finding Systems Manager Parameter Store values

Finding Systems Manager Parameter Store values

Select the Parameter name and see the Amazon SNS ARN.

Viewing SNS topic value

Viewing SNS topic value

The loyalty service then references this value within another stack.

When the Amplify Console Makefile deploys the loyalty service, it retrieves this value for the booking service from Parameter Store, and references it as a parameter-override. The deployment is also parametrized with the $${AWS_BRANCH} environment variable if there are multiple environments within the same AWS account and Region.

sam deploy \
	--parameter-overrides \
	BookingSNSTopic=/$${AWS_BRANCH}/service/booking/messaging/bookingTopic

Environment variables and configuration management systems help with managing application configuration.

Improvement plan summary

  1. Use environment variables for configuration options that change infrequently such as logging levels, and database connection strings.
  2. Use a configuration management system for dynamic configuration that might change frequently or contain sensitive data such as secrets.

Best practice: Use CI/CD including automated testing across separate accounts

Continuous integration/delivery/deployment is one of the cornerstones of cloud application development and a vital part of a DevOps initiative.

Explanation of CI/CD stages

Explanation of CI/CD stages

Building CI/CD pipelines increases software delivery quality and feedback time for detecting and resolving errors. I cover how to deploy multiple stages in isolated environments and accounts, which helps with creating separate testing CI/CD pipelines in part 2. As the serverless airline example is using AWS Amplify Console, this comes with a built-in CI/CD pipeline.

Automate the build, deployment, testing, and rollback of the workload using KPI and operational alerts. This eases troubleshooting, enables faster remediation and feedback time, and enables automatic and manual rollback/roll-forward should an alert trigger.

I cover metrics, KPIs, and operational alerts in this series in the Application Health part 1, and part 2 posts. I cover rollout deployments with traffic shifting based on metrics in this question’s part 2.

CI/CD pipelines should include integration, and end-to-end tests. I cover local unit testing for Lambda and API Gateway in part 2.

Add an optional testing stage to Amplify Console to catch regressions before pushing code to production. Use the test step to run any test commands at build time using any testing framework of your choice. Amplify Console has deeper integration with the Cypress test suite that allows you to generate a UI report for your tests. Here is an example to set up end-to-end tests with Cypress.

Cypress testing example

Cypress testing example

There are a number of AWS and third-party solutions to host code and create CI/CD pipelines for serverless applications.

AWS Code Suite

AWS Code Suite

For more information on how to use the AWS Code* services together, see the detailed Quick Start deployment guide Serverless CI/CD for the Enterprise on AWS.

All these AWS services have a number of integrations with third-party products so you can integrate your serverless applications with your existing tools. For example, CodeBuild can build from GitHub and Atlassian Bitbucket repositories. CodeDeploy integrates with a number of developer tools and configuration management systems. CodePipeline has a number of pre-built integrations to use existing tools for your serverless applications. For more information specifically on using CircleCI for serverless applications, see Simplifying Serverless CI/CD with CircleCI and the AWS Serverless Application Model.

Improvement plan summary

  1. Use a continuous integration/continuous deployment (CI/CD) pipeline solution that deploys multiple stages in isolated environments/accounts.
  2. Automate testing including but not limited to unit, integration, and end-to-end tests.
  3. Favor rollout deployments over all-at-once deployments for more resilience, and gradually learn what metrics best determine your workload’s health to appropriately alert on.
  4. Use a deployment system that supports traffic shifting as part of your pipeline, and rollback/roll-forward traffic to previous versions if an alert is triggered.

Good practice: Review function runtime deprecation policy

Lambda functions created using AWS provided runtimes follow official long-term support deprecation policies. Third-party provided runtime deprecation policy may differ from official long-term support. Review your runtime deprecation policy and have a mechanism to report on runtimes that, if deprecated, may affect your workload to operate as intended.

Review the AWS Lambda runtime policy support page to understand the deprecation schedule for your runtime.

AWS Health provides ongoing visibility into the state of your AWS resources, services, and accounts. Use the AWS Personal Health Dashboard for a personalized view and automate custom notifications to communication channels other than your AWS Account email.

Use AWS Config to report on AWS Lambda function runtimes that might be near their deprecation. Run compliance and operational checks with AWS Config for Lambda functions.

If you are unable to migrate to newer runtimes within the deprecation schedule, use AWS Lambda custom runtimes as an interim solution.

Improvement plan summary

  1. Identify and report runtimes that might deprecate and their support policy.

Conclusion

Introducing application lifecycle management improves the development, deployment, and management of serverless applications. In part 1, I cover using infrastructure as code with version control to deploy applications in a repeatable manner. This reduces errors caused by manual processes and gives you more confidence your application works as expected. In part 2, I cover prototyping new features using temporary environments, and rollout deployments to gradually shift traffic to new application code.

In this post I cover configuration management, CI/CD for serverless applications, and managing function runtime deprecation.

In an upcoming post, I will cover the first Security question from the Well-Architected Serverless Lens – Controlling access to serverless APIs.

Building scalable serverless applications with Amazon S3 and AWS Lambda

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-scalable-serverless-applications-with-amazon-s3-and-aws-lambda/

Well-designed serverless applications are typically a combination of managed services connected by custom business logic. One of the most powerful combinations for enterprise application development is Amazon S3 and AWS Lambda. S3 is a highly durable, highly available object store that scales to meet your storage needs. Lambda runs custom code in response to events, automatically scaling with the size of the workload. When you use the two services together, they can provide a scalable core for serverless solutions.

This blog post shows how to design and deploy serverless applications designed around S3 events. The solutions presented use AWS services to create scalable serverless architectures, using minimal custom code. This is the conclusion of a series showing how the S3-to-Lambda pattern can implement the following business solutions:

Bringing the compute layer to the data

Much traditional software operates by bringing data to the compute layer. This means that processes run on batches of data in files, databases, and other sources. This is inherently harder to scale as data volumes grow, often needing a fleet of servers to scale out at peak times. For the developer, this creates operational overhead to ensure that the compute capacity is keeping pace with the data volume.

The S3-to-Lambda serverless pattern instead brings the compute layer to the data. As data arrives, the compute process scales up and down automatically to meet the demand. This allows developers to focus on building business logic for a single item of data, and the execution at scale is handled by the Lambda service.

The image optimization application is a good example for comparing the traditional and serverless approaches. For a busy media site, capturing hundreds of images per minute in an S3 bucket, the operations overhead becomes clearer. A script running on a server must scale up across multiple instances to keep pace with this level of traffic. Compare this to the Lambda-based approach, which scales on-demand. The code itself does not change, whether it is used for a single image or thousands.

Receiving and processing events from S3 in custom code

S3 raises events when objects are put, copied, or deleted in a bucket. It also raises a broad number of other notifications, such as when lifecycle events occur. You can configure S3 to invoke Lambda from these events by using the S3 console, Lambda console, AWS CLI, or AWS Serverless Application Model (SAM) templates.

S3 passes details of the event, not the object itself, to the Lambda function in a JSON object. This object contains an array of records, so it’s possible to receive more than one S3 event per invocation:

S3 passes event details to Lambda

As the Lambda handler may receive more than one record, it should iterate through the records collection. It’s best practice to keep the handler small and generic, calling out to the business logic in a separate function or file:

const processEvent = require('my-custom-logic’)

// A Node.js Lambda handler
exports.handler = async (event) => {

  // Capture event – can be used to create mock events
  console.log (JSON.stringify(event, null, 2))  

  // Handle each incoming S3 object in the event
  await Promise.all(
    event.Records.map(async (event) => {
      try {
	  // Pass each event to the business logic handler
        await processEvent(event)
      } catch (err) {
        console.error('Handler error: ', err)
      }
    })
  )
}

This code example takes advantage of concurrent asynchronous executions available in Node.js but similar constructs are available in many other languages. This means that multiple objects are processed in parallel to minimize the overall function execution time.

Instead of handling and logging any errors within the function’s code, it’s also possible to use destinations for asynchronous invocations. You use an On failure condition to route the error to various potential targets, including another Lambda function or other AWS services. For complex applications or those handling large volumes, this provides greater control for managing events that fail processing.

During the development process, you can debug and test the S3-to-Lambda integration locally. First, capture a sample event during development to create a mock event for local testing. The sample applications in this series each use a test harness so the developer can test the handler on a local machine. The test harness invokes the handler locally, providing mock environment variables:

// Mock event
const event = require('./localTestEvent')

// Mock environment variables
process.env.AWS_REGION = 'us-east-1'
process.env.localTest = true
process.env.language = 'en'

// Lambda handler
const { handler } = require('./app')

const main = async () => {
  console.time('localTest')
  await handler(event)
  console.timeEnd('localTest')
}

main().catch(error => console.error(error))

Scaling up when more data arrives

The Lambda service scales up if S3 sends multiple events simultaneously. How this works depends on several factors. If the target Lambda function has sufficient concurrency available, and if any active instances of the function are already processing events, the Lambda service scales up.

Lambda scaling up as events queue grows

The function does not scale up if the reserved concurrency is set to 1 or the scaling capacity is fully consumed for a Region in your account. In this case, the events from S3 are queued internally until a Lambda instance is available for processing. You can request to increase the regional concurrency limit by submitting a request in the Support Center console. You may also intend to perform one-at-a-time processing by setting the reserved concurrency to 1.

One-at-a-time processing with Lambda

Generally, multiple instances of a function are invoked simultaneously when S3 receives multiple objects, to process the events as quickly as possible. It’s this rapid scaling and parallelization in both S3 and Lambda that make this pattern such a powerful core architecture for many applications.

Amazon SNS and Amazon SQS integrations

The native S3 to Lambda integration provides a reliable way to invoke one function per prefix or suffix-pattern per bucket. In example, invoking a function when object keys end in .pdf in a single bucket. This works well for the vast majority of use-cases but you may want to invoke multiple Lambda functions per S3 event.

In this case, S3 can publish notifications to SNS, where events are delivered to a range of targets. These include Lambda functions, SQS queues, HTTP endpoints, email, text messages and push notifications. SNS provides fan-out capability, enabling one event to be delivered to multiple destinations, such as Lambda functions or web hooks, for example.

In busy applications, the volume of S3 events may be too large for a downstream system, such as a non-serverless service. In this case, you can also use an SQS queue as a notification target. After events are published to a queue, they can be consumed by Lambda functions and other services. The queue acts as a buffer and can help smooth out traffic for systems consuming these events. See the DynamoDB importer repository for an example.

Uploading data to S3 in upstream applications

You may have upstream services in your architecture that generate the data stored in S3. Some upstream workloads have spiky usage patterns and large numbers of users, like web or mobile applications. You may increase the performance and throughput of these workloads by uploading directly to S3. This avoids proxying binary data through an API Gateway endpoint or web server.

For example, for a mobile application uploading user photos, S3 and Lambda can handle the upload process for large of numbers of users:

  1. The upstream process, in this case a mobile client, requests a presigned URL from an API Gateway endpoint.
  2. This invokes a Lambda function that requests a presigned URL for the S3 bucket, and returns this back via the API call.
  3. The mobile client sends the data directly to the presigned S3 URL using HTTPS POST. The upload is managed directly by S3.

This simple pattern can be a scalable and cost-effective way to upload large binary data into your applications. After the object successfully uploads, the S3 put event can then asynchronously invoke downstream workflows.

Visit this repository to see an example of a serverless S3 uploader application. You can also see a walkthrough of this process in this YouTube video.

Developing larger applications

As you develop larger serverless applications, it often becomes more practical to split applications into multiple services and repositories for separate teams. Often, individual services must integrate with existing S3 buckets, not create these in the application templates. You may also have to integrate a single service with multiple S3 buckets.

In decoupling larger applications with Amazon EventBridge, I show how you can decouple services within an application using an event bus. This pattern helps separate the producers and consumers of events in your workload. This can make each service become more independent and more resilient to changes with the overall application.

This example demonstrates how the document repository solution can be refactored into several smaller applications that communicate using events. This uses Amazon EventBridge as the event router coordinating the flow. Each application contains a SAM template that defines the EventBridge rule to filter for events, and publishes data back to the event bus after processing is complete.

One of the major benefits to using an event-based architecture is that development teams retain flexibility even as the application grows. It allows developers to separate AWS resources like S3 buckets and DynamoDB tables, from the compute resources, like Lambda functions. This decoupling can simplify the deployment process, help avoid building monoliths, and reduce the cognitive load of developing in large applications.

Conclusion

S3 and Lambda are two highly scalable AWS services that can be powerful when combined in serverless applications. In this post, I summarize many of the patterns shown across this series. I explain the integration pattern and the scaling behavior, and how you can use mock events for local testing and development. You can also use SNS and SQS in some applications for fan-out and buffering of events.

Upstream applications can upload data directly to S3 to achieve greater scalability by avoiding proxies. For larger applications, I show how using an event-based architecture modeled around EventBridge can help decouple application services. This can promote service independence, and help maintain flexibility as applications grow.

To learn more about the S3-to-Lambda architecture pattern, watch the YouTube video series, or explore the articles listed at top of this post.

Serving Billions of Ads in Just 100 ms Using Amazon Elasticache for Redis

Post Syndicated from Rodrigo Asensio original https://aws.amazon.com/blogs/architecture/serving-billions-of-ads-with-amazon-elasticache-for-redis/

This post was co-written with Lucas Ceballos, CTO of Smadex

Introduction

Showing ads may seem to be a simple task, but it’s not. Showing the right ad to the right user is an incredibly complex challenge that involves multiple disciplines such as artificial intelligence, data science, and software engineering. Doing it one million times per second with a 100-ms constraint is even harder.

In the ad-tech business, speed and infrastructure costs are the keys to success. The less the final user waits for an ad, the higher the probability of that user clicking on the ad. Doing that while keeping infrastructure costs under control is crucial for business profitability.

About Smadex

Smadex is the leading mobile-first programmatic advertising platform specifically built to deliver best user acquisition performance and complete transparency.

Its state-of-the-art digital signal processing (DSP) technology provides advertisers with the tools they need to achieve their goals and ROI, with measurable results from web forms, post-app install events, store visits, and sales.

Smadex advertising architecture

What does showing ads look like under the hood? At Smadex, our technology works based on the OpenRTB (Real-Time Bidding) protocol.

RTB is a means by which advertising inventory is bought and sold on a per-impression basis, via programmatic instantaneous auction, which is similar to financial markets.

To show ads, we participate in auctions deciding in real time which ad to show and how much to bid trying to optimize the cost of every impression.

High level diagram

  1. The final user browses the publisher’s website or app.
  2. Ad-exchange is called to start a new auction.
  3. Smadex receives the bid request and has to decide which ad to show and how much to offer in just 100 ms (and this is happening one million times per second).
  4. If Smadex won the auction, the ad must be sent and rendered on the publisher’s website or app.
  5. In the end, the user interacts with the ad sending new requests and events to Smadex platform.

Flow of data

As you can see in the previous diagram, showing ads is just one part of the challenge. After the ad is shown, the final user interacts with it in multiple ways, such as clicking it, installing an application, subscribing to a service, etc. This happens during a determined period that we call the “attribution window.” All of those interactions must be tracked and linked to the original bid transaction (using the request_id parameter).

Doing this is complicated: billions of bid transactions must be stored and available so that they can be quickly accessed every time the user interacts with the ad. The longer we store the transactions, the longer we can “wait” for an interaction to take place, and the better for our business and our clients, too.

Detailed diagram

Challenge #1: Cost

The challenge is: What kind of database can store billions of records per day, with at least a 30-day retention capacity (attribution window), be accessed by key-value, and all by spending as little as possible?

The answer is…none! Based on our research, all the available options that met the technical requirements were way out of our budget.

So…how to solve it? Here is when creativity and the combination of different AWS services comes into place.

We started to analyze the time dispersion of the events trying to find some clues. The interesting thing we spotted was that 90% of what we call “post-bid events” (impression, click, install, etc.) happened within one hour after the auction took place.

That means that we can process 90% of post-bid events by storing just one hour of bids.

Under our current workload, in one hour we participate in approximately 3.7 billion auctions generating 100 million bid records of an average 600 bytes each. This adds up to 55 gigabytes per hour, an easier amount of data to process.

Instead of thinking about one single database to store all the bid requests, we decided to split bids into two different categories:

  • Hot Bid: A request that took place within the last hour (small amount and frequently accessed)
  • Cold Bid: A request that took place more than our hour ago (huge amount and infrequently accessed)

Amazon ElastiCache for Redis is the best option to store 55 GB of data in memory, which gives us the ability to query in a key-value way with the lowest possible latency.

Hot Bids flow

Hot Bids flow diagram

  1. Every new bid is a hot bid by definition so it’s going to be stored in the hot bids Redis cluster.
  2. At the moment of the user interaction with the ad, the Smadex tracker component receives an HTTPS notification, including the bid request UUID that originated it.
  3. Based on the date of occurrence extracted from the received UUID, the tracker component can determine if it’s looking for a hot bid or not. If it’s a hot bid, the tracker reads it directly from Redis performing a key-value lookup query.

It’s been easy so far but what to do with the other 29 days and 23 hours we need to store?

Challenge #2: Performance

As we previously mentioned, cold bids are a huge infrequently accessed number of records with only 10% of post-bid events pointing to them. That sounds like a good use case for an inexpensive and slower data store like Amazon S3.

Thanks to the S3 low-cost storage prices combined with the ability to query S3 objects directly using Amazon Athena, we were able to optimize our costs by storing and querying cold bids by implementing a serverless architecture.

Cold Bids Flow

Cold Bids flow diagram

  1. Incoming bids are buffered by Fluentd and flushed to S3 every one minute in JSON format. Every single file flushed to S3 contains all the bids processed by a specific EC2 instance for one minute.
  2. An AWS Lambda function is automatically triggered on every new PutObject event from S3. This function transforms the JSON records to Parquet format and will save it back the S3 bucket, but this time into a specific partition folder based on file creation timestamp.
  3. As seen on the hot bids flow, the tracker component will determine if it’s looking for a hot or a cold bid based on the extracted timestamp of the request UUID. In this case, the cold bid will be retrieved by running an Amazon Athena look-up query leveraging the use of partitions and Parquet format to reduce as much as possible the latency and data that needs to be scanned.

Conclusion

Thanks to this combined approach using different technologies and a variety of AWS services we were able to extend our attribution window from 30 to 90 days while reducing the infrastructure costs by 45%.

 

 

Introduction to Messaging for Modern Cloud Architecture

Post Syndicated from Sam Dengler original https://aws.amazon.com/blogs/architecture/introduction-to-messaging-for-modern-cloud-architecture/

We hope you’ve enjoyed reading our posts on best practices for your serverless applications. The posts in the following series will focus on best practices when introducing messaging patterns into your applications. Let’s review some core messaging concepts and see how they can be used to address challenges when designing modern cloud architectures.

Introduction

Applications can communicate information with each other using messages, a mechanism for packaging a data payload and associated metadata. The application that sends a message is called the producer and the application that receives the message is called the consumer. Producers and consumers exchange messages using a variety of transportation channels, for example point-to-point requests, message queues, subscription topics, or event buses. These transportation channels have differently characteristics that make them useful when implementing message communication patterns. Dependencies emerge when producers and consumers exchange messages, which is called coupling.

Synchronous Communication

synchronous communication

Message communication is called synchronous when the producer sends a message to the consumer and waits for a response before the producer continues its processing logic. An example of synchronous communication over a point-to-point channel is when a HTTP client makes a request to a HTTP service, waits for the service to process the request, and then applies logic to the HTTP response to determine how to proceed.

Synchronous communication patterns are more straightforward to implement, however they create tight coupling between producers and consumers. Tight coupling can cause problems due to traffic spikes and failures propagating directly throughout the application. For example, in a three-tier architecture, when the application experiences a spike in client traffic, the web tier directly translates the traffic spike as pressure on downstream resources (the logic and data tiers), which may not scale to meet the sudden demand. Likewise, downstream resource failure in the logic or data tier directly impacts the web tier from responding to client requests. Applications can mimic a synchronous experience, for example a status spinner, using asynchronous communication with a polling or push notification strategy.

Asynchronous Communication

Asynchronous communication

Message communication is called asynchronous when the producer sends a message to the consumer and proceeds without waiting for the response. An example of asynchronous communication over a message queue channel is when a client publishes a message to a queue, and after the queue acknowledges receipt of the message, the publisher proceeds without waiting for the consumer to process the message.

Asynchronous communication patterns are implemented using transportation channels such as queues, topics, and event buses to create loose coupling between producers and consumers. Loose coupling increases an architecture’s resiliency to failure and ability to handle traffic spikes because it creates an indirection between producer and consumer communication, enabling them to operate independently of each other. Using the three-tier architecture example, a message queue can be introduced between the web, logic, and data tiers to enable each to scale independently of each other. When the application experiences a spike in client traffic, the web tier translates the traffic spike as more messages to the queue for processing, however the logic tier may continue to process messages off the queue without being directly impacted.

Considerations and Next Steps

Although asynchronous communication patterns can benefit modern cloud architectures, there are tradeoffs to consider. Asynchronous messaging adds latency to end-to-end processing time due to the addition of middleware. Producers and consumers take a dependency on the middleware stack, which must also scale to meet demand and be resilient to failure. Care must be taken to appropriately configure producers, consumers, and middleware to handle errors so that messages are not lost, more monitoring is required to ensure proper operations, and multiple logs must be correlated to troubleshoot and diagnose problems.

Amazon MQ, Amazon KinesisAmazon Simple Queue Service (SQS), Amazon Simple Notification Service (SNS), and Amazon EventBridge are highly available, large scale, failure resistant managed services that can be used to implement asynchronous messaging patterns. You can explore these services at the AWS Messaging page and their integration into Serverless Architectures in the free new digital course, Architecting Serverless Solutions. You can also visit the AWS Event-Driven Architecture page to learn how to apply messaging patterns to build event-driven solutions. The upcoming posts in this series will explore these AWS services to help ensure message patterns are implemented using best practices when applied to modern cloud architecture.

Creating a Seamless Handoff Between Amazon Pinpoint and Amazon Connect

Post Syndicated from Brent Meyer original https://aws.amazon.com/blogs/messaging-and-targeting/creating-a-seamless-handoff-between-amazon-pinpoint-and-amazon-connect/

Note: This post was written by Ilya Pupko, Senior Consultant for the AWS Digital User Engagement team.


Time to read 5 minutes
Learning level Intermediate (200)
Services used Amazon Pinpoint, Amazon SNS, AWS Lambda, Amazon Lex, Amazon Connect

Your customers deserve to have helpful communications with your brand, regardless of the channel that you use to interact with them. There are many situations in which you might have to move customers from one channel to another—for example, when a customer is interacting with a chatbot over SMS, but their needs suddenly change to require voice assistance. To create a great customer experience, your communications with your customers should be seamless across all communication channels.

Welcome aboard Customer Obsessed Airlines

In this post, we look at a scenario that involves our fictitious airline, Customer Obsessed Airlines. Severe storms in one area of the country have caused Customer Obsessed Airlines to cancel a large number of flights. Customer Obsessed Airlines has to notify all of the affected customers of the cancellations right away. But most importantly, to keep customers as happy as possible in this unfortunate and unavoidable situation, Customer Obsessed Airlines has to make it easy for customers to rebook their flights.

Fortunately, Customer Obsessed Airlines has implemented the solution that’s outlined later in this post. This solution uses Amazon Pinpoint to send messages to a targeted segment of customers—in this case, the specific customers who were booked on the affected flights. Some of these customers might have straightforward travel itineraries that can simply be rebooked through interactions with a chatbot. Other customers who have more complex itineraries, or those who simply prefer to interact with a human over the phone, can be handed off to an agent in your call center.

About the solution

The solution that we’ll build to handle this scenario can be deployed in under an hour. The following diagram illustrates the interactions in this solution.

At a high level, this solution uses the following workflow:

  1. An event occurs. Automated impact analysis systems trigger the creation of custom segments—in this case, all passengers whose flights were cancelled.
  2. Amazon Pinpoint sends a message to the affected passengers through their preferred channels. Amazon Pinpoint supports the email, SMS, push, and voice channels, but in this example, we focus exclusively on SMS.
  3. Passengers who receive the message can respond. When they do, they interact with a chatbot that helps them book a different flight.
  4. If a passenger requests a live agent, or if their situation can’t be handled by a chatbot, then Amazon Pinpoint passes information about the customer’s situation and communication history to Amazon Connect. The passenger is entered into a queue. When the passenger reaches the front of the queue, they receive a phone call from an agent.
  5. After being re-booked, the passenger receives a written confirmation of the changes to their itinerary through their preferred channel. Passengers are also given the option of providing feedback on their interaction when the process is complete.

To build this solution, we use Amazon Pinpoint to segment our customers based on their attributes (such as which flight they’ve booked), and to deliver messages to those segments.

We also use Amazon Connect to manage the voice calling part of the solution, and Amazon Lex to power the chatbot. Finally, we connect these services using logic that’s defined in AWS Lambda functions.

Setting up the solution

Step 1: Set up Amazon Pinpoint and link it with Amazon Lex

The first step in setting up this solution is to create a new Amazon Pinpoint project and configure the SMS channel. When that’s done, you can create an Amazon Lex chatbot and link it to the Amazon Pinpoint project.

We described this process in detail in an earlier blog post. Complete the procedures in Create an SMS Chatbot with Amazon Pinpoint and Amazon Lex, and then proceed to step 2.

Step 2: Set up Amazon Connect and link it with your Amazon Lex chatbot

By completing step 1, we’ve created a system that can send messages to our passengers and receive messages from them. The next step is to create a way for passengers to communicate with our call center.

The Amazon Connect Administrator Guide provides instructions for linking an Amazon Lex bot to an Amazon Connect instance. For complete procedures, see Add an Amazon Lex Bot.

When you complete these procedures, link your Amazon Connect instance to the same Amazon Lex bot that you created in step 1. This step is intended to provide customers with a consistent, cohesive experience across channels.

Step 3: Set up an Amazon Connect callback queue and use Amazon Pinpoint keyword logic to trigger it

Now that we’ve configured Amazon Pinpoint and Amazon Connect, we can connect them.

Linking the two services makes it possible for passengers to request additional assistance. Traditionally, passengers in this situation would have to call a call center themselves and then wait on hold for an agent to become available. However, in this solution, our call center calls the passenger directly as soon as an agent is available. When the agent calls the passenger, the agent has all of the information about the passenger’s issue, as well as a transcript of the passenger’s interactions with your chatbot.

To implement an automatic callback mechanism, use the Amazon Pinpoint Connect Callback Requestor, which is available on the AWS GitHub page.

Next steps

By completing the preceding three steps, you can send messages to a subset of your users based on the criteria you choose and the type of message you want to send. Your customers can interact with your message by replying with questions. When they do, a chatbot responds intelligently and appropriately.

You can add to this solution by expanding it to cover other communication channels, such as push notifications. You can also automate the initial communication by integrating the solution with your systems of record.

We’re excited to see what you build using the solution that we outlined in this post. Let us know of your ideas and your successes in the comments.

Application analytics pipeline with Amazon EventBridge

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/application-analytics-pipeline-with-amazon-eventbridge/

This post is courtesy of Rajdeep Tarat, Solutions Architect and Venugopal Pai, Solutions Architect

Customers across industry verticals collect, analyze, and derive insights from end-user application analytics using solutions such as Google Analytics and MixPanel. While these solutions provide built-in dashboards for marketing analytics, it can be difficult to reuse the raw event data.

Setting up a pipeline to move the raw event data into AWS opens up possibilities for various rule-based, statistical, and machine learning algorithms to derive deep insights about end-user behavior. Additionally, the raw event data can be enriched with other transactional data points available within the customer’s AWS environment.

This post uses the Segment Partner integration in Amazon EventBridge to pipe the data into your AWS environment. Segment allows you to collect, unify, and connect end-user application analytics into AWS using Amazon EventBridge as a destination.

Segment already supports direct, optimized connections to many AWS services such as Amazon Redshift, Amazon Personalize, Amazon Kinesis, Amazon Kinesis Data Firehose, AWS Lambda, and Amazon S3. The EventBridge destination is a good choice for customers who want the flexibility and centralization that EventBridge offers.

EventBridge makes it easy to build scalable event-driven application pipelines by handling event ingestion, delivery, security, authorization, and error handling for you. The architecture of this pipeline is shown below:Segment architecture

In the diagram, end-user applications send the data into Segment, which is routed to each of the configured destinations (for example, EventBridge). Once the data reaches EventBridge, it is again routed to multiple targets. With this approach you can continue using existing solutions supported by Segment in parallel to the Amazon EventBridge destination.

This architecture makes the pipeline highly extensible and modular. Firstly, you can configure multiple Segment destinations to fan out the event data into other existing solutions in parallel to EventBridge. Marketing teams can continue to use their existing tools without any disruptions while the data is seamlessly pumped into AWS. Within the AWS Cloud, EventBridge can again route the event data to up to five targets per rule.

The following section provides a walkthrough of setting up the Segment integration with EventBridge, and configuring two targets within the AWS Cloud.

  • The first target uses an Amazon Kinesis Data Firehose to deliver the data to an S3 bucket. From the S3 bucket, multiple AWS services can use the data (learn more about using S3 as a data lake).
  • The second target posts the event data to an SNS topic. From here, the data can be consumed by subscribers for the topic.

Walkthrough

To set up the pipeline, you must configure the Segment partner integration in EventBridge, and then set up the targets where analytics data is sent.

Amazon EventBridge – Segment partner integration:

  1. From the Amazon EventBridge console, navigate to the Partner Event Sources > Segment setup page. Copy your AWS Account ID from here.
    Segment setup
  2. On the Segment destination setup page, use the Amazon EventBridge integration. Enter the AWS Account ID and select a Region (learn more about setting up a Segment destination).
    EventBridge settings

Create the event bus:

  1. After linking the Segment Destination with the AWS Account ID, send a test event from Segment to create a Partner Event Source. This also creates an Event bus with the same name. This is done by firing a test event from the Event Tester in the Segment Dashboard.
    Event Tester
  2. After the first test event is fired, the Partner Event Source and the corresponding event bus is created in the EventBridge console.
    Partner event sources

Create rules:

  1. A rule watches for incoming events and routes them to specific targets that are configured. Start by creating a new rule and entering a name.
    Creating rules
  2. For Event Pattern, select the Predefined pattern by Service, and select Service Partners > Segment.
    Define event pattern
  3. Under Select event bus, select the Custom or partner event bus, and the name of the event bus created.
    Select event bus

Configuring multiple targets for the event bus:

  1. For Kinesis streams, select Kinesis stream from the Target dropdown, and the name of the stream. For more details on creating a Kinesis data stream, read this documentation.
    Select targets
  2. For SNS topic, choose Add Target and repeat the same steps to add an SNS topic instead. For more details on creating an SNS topic, read this documentation.
    SNS as target
  3. You can optionally tag the resource, then choose Create.

The pipeline is ready to send data to the targets configured via the event bus. You can now send test events from the Segment dashboard and monitor Kinesis Data Firehose or by setting up subscribers for the SNS topic.

Conclusion

This post shows how customers can capture end-user application analytics using the partner solution Segment in real time, and ingest data into Amazon EventBridge. The data routing is made extensible using multiple Segment destinations (for third-party solutions), and using multiple rules in EventBridge (for multiple destinations within the AWS Cloud).

To learn more about Amazon EventBridge integrations, read the EventBridge documentation.

ICYMI: Serverless Q4 2019

Post Syndicated from Rob Sutter original https://aws.amazon.com/blogs/compute/icymi-serverless-q4-2019/

Welcome to the eighth edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share the most recent product launches, feature enhancements, blog posts, webinars, Twitch live streams, and other interesting things that you might have missed!

In case you missed our last ICYMI, checkout what happened last quarter here.

The three months comprising the fourth quarter of 2019

AWS re:Invent

AWS re:Invent 2019

re:Invent 2019 dominated the fourth quarter at AWS. The serverless team presented a number of talks, workshops, and builder sessions to help customers increase their skills and deliver value more rapidly to their own customers.

Serverless talks from re:Invent 2019

Chris Munns presenting 'Building microservices with AWS Lambda' at re:Invent 2019

We presented dozens of sessions showing how customers can improve their architecture and agility with serverless. Here are some of the most popular.

Videos

Decks

You can also find decks for many of the serverless presentations and other re:Invent presentations on our AWS Events Content.

AWS Lambda

For developers needing greater control over performance of their serverless applications at any scale, AWS Lambda announced Provisioned Concurrency at re:Invent. This feature enables Lambda functions to execute with consistent start-up latency making them ideal for building latency sensitive applications.

As shown in the below graph, provisioned concurrency reduces tail latency, directly impacting response times and providing a more responsive end user experience.

Graph showing performance enhancements with AWS Lambda Provisioned Concurrency

Lambda rolled out enhanced VPC networking to 14 additional Regions around the world. This change brings dramatic improvements to startup performance for Lambda functions running in VPCs due to more efficient usage of elastic network interfaces.

Illustration of AWS Lambda VPC to VPC NAT

New VPC to VPC NAT for Lambda functions

Lambda now supports three additional runtimes: Node.js 12, Java 11, and Python 3.8. Each of these new runtimes has new version-specific features and benefits, which are covered in the linked release posts. Like the Node.js 10 runtime, these new runtimes are all based on an Amazon Linux 2 execution environment.

Lambda released a number of controls for both stream and async-based invocations:

  • You can now configure error handling for Lambda functions consuming events from Amazon Kinesis Data Streams or Amazon DynamoDB Streams. It’s now possible to limit the retry count, limit the age of records being retried, configure a failure destination, or split a batch to isolate a problem record. These capabilities help you deal with potential “poison pill” records that would previously cause streams to pause in processing.
  • For asynchronous Lambda invocations, you can now set the maximum event age and retry attempts on the event. If either configured condition is met, the event can be routed to a dead letter queue (DLQ), Lambda destination, or it can be discarded.

AWS Lambda Destinations is a new feature that allows developers to designate an asynchronous target for Lambda function invocation results. You can set separate destinations for success and failure. This unlocks new patterns for distributed event-based applications and can replace custom code previously used to manage routing results.

Illustration depicting AWS Lambda Destinations with success and failure configurations

Lambda Destinations

Lambda also now supports setting a Parallelization Factor, which allows you to set multiple Lambda invocations per shard for Kinesis Data Streams and DynamoDB Streams. This enables faster processing without the need to increase your shard count, while still guaranteeing the order of records processed.

Illustration of multiple AWS Lambda invocations per Kinesis Data Streams shard

Lambda Parallelization Factor diagram

Lambda introduced Amazon SQS FIFO queues as an event source. “First in, first out” (FIFO) queues guarantee the order of record processing, unlike standard queues. FIFO queues support messaging batching via a MessageGroupID attribute that supports parallel Lambda consumers of a single FIFO queue, enabling high throughput of record processing by Lambda.

Lambda now supports Environment Variables in the AWS China (Beijing) Region and the AWS China (Ningxia) Region.

You can now view percentile statistics for the duration metric of your Lambda functions. Percentile statistics show the relative standing of a value in a dataset, and are useful when applied to metrics that exhibit large variances. They can help you understand the distribution of a metric, discover outliers, and find hard-to-spot situations that affect customer experience for a subset of your users.

Amazon API Gateway

Screen capture of creating an Amazon API Gateway HTTP API in the AWS Management Console

Amazon API Gateway announced the preview of HTTP APIs. In addition to significant performance improvements, most customers see an average cost savings of 70% when compared with API Gateway REST APIs. With HTTP APIs, you can create an API in four simple steps. Once the API is created, additional configuration for CORS and JWT authorizers can be added.

AWS SAM CLI

Screen capture of the new 'sam deploy' process in a terminal window

The AWS SAM CLI team simplified the bucket management and deployment process in the SAM CLI. You no longer need to manage a bucket for deployment artifacts – SAM CLI handles this for you. The deployment process has also been streamlined from multiple flagged commands to a single command, sam deploy.

AWS Step Functions

One powerful feature of AWS Step Functions is its ability to integrate directly with AWS services without you needing to write complicated application code. In Q4, Step Functions expanded its integration with Amazon SageMaker to simplify machine learning workflows. Step Functions also added a new integration with Amazon EMR, making EMR big data processing workflows faster to build and easier to monitor.

Screen capture of an AWS Step Functions step with Amazon EMR

Step Functions step with EMR

Step Functions now provides the ability to track state transition usage by integrating with AWS Budgets, allowing you to monitor trends and react to usage on your AWS account.

You can now view CloudWatch Metrics for Step Functions at a one-minute frequency. This makes it easier to set up detailed monitoring for your workflows. You can use one-minute metrics to set up CloudWatch Alarms based on your Step Functions API usage, Lambda functions, service integrations, and execution details.

Step Functions now supports higher throughput workflows, making it easier to coordinate applications with high event rates. This increases the limits to 1,500 state transitions per second and a default start rate of 300 state machine executions per second in US East (N. Virginia), US West (Oregon), and Europe (Ireland). Click the above link to learn more about the limit increases in other Regions.

Screen capture of choosing Express Workflows in the AWS Management Console

Step Functions released AWS Step Functions Express Workflows. With the ability to support event rates greater than 100,000 per second, this feature is designed for high-performance workloads at a reduced cost.

Amazon EventBridge

Illustration of the Amazon EventBridge schema registry and discovery service

Amazon EventBridge announced the preview of the Amazon EventBridge schema registry and discovery service. This service allows developers to automate discovery and cataloging event schemas for use in their applications. Additionally, once a schema is stored in the registry, you can generate and download a code binding that represents the schema as an object in your code.

Amazon SNS

Amazon SNS now supports the use of dead letter queues (DLQ) to help capture unhandled events. By enabling a DLQ, you can catch events that are not processed and re-submit them or analyze to locate processing issues.

Amazon CloudWatch

Amazon CloudWatch announced Amazon CloudWatch ServiceLens to provide a “single pane of glass” to observe health, performance, and availability of your application.

Screenshot of Amazon CloudWatch ServiceLens in the AWS Management Console

CloudWatch ServiceLens

CloudWatch also announced a preview of a capability called Synthetics. CloudWatch Synthetics allows you to test your application endpoints and URLs using configurable scripts that mimic what a real customer would do. This enables the outside-in view of your customers’ experiences, and your service’s availability from their point of view.

CloudWatch introduced Embedded Metric Format, which helps you ingest complex high-cardinality application data as logs and easily generate actionable metrics. You can publish these metrics from your Lambda function by using the PutLogEvents API or using an open source library for Node.js or Python applications.

Finally, CloudWatch announced a preview of Contributor Insights, a capability to identify who or what is impacting your system or application performance by identifying outliers or patterns in log data.

AWS X-Ray

AWS X-Ray announced trace maps, which enable you to map the end-to-end path of a single request. Identifiers show issues and how they affect other services in the request’s path. These can help you to identify and isolate service points that are causing degradation or failures.

X-Ray also announced support for Amazon CloudWatch Synthetics, currently in preview. CloudWatch Synthetics on X-Ray support tracing canary scripts throughout the application, providing metrics on performance or application issues.

Screen capture of AWS X-Ray Service map in the AWS Management Console

X-Ray Service map with CloudWatch Synthetics

Amazon DynamoDB

Amazon DynamoDB announced support for customer-managed customer master keys (CMKs) to encrypt data in DynamoDB. This allows customers to bring your own key (BYOK) giving you full control over how you encrypt and manage the security of your DynamoDB data.

It is now possible to add global replicas to existing DynamoDB tables to provide enhanced availability across the globe.

Another new DynamoDB capability to identify frequently accessed keys and database traffic trends is currently in preview. With this, you can now more easily identify “hot keys” and understand usage of your DynamoDB tables.

Screen capture of Amazon CloudWatch Contributor Insights for DynamoDB in the AWS Management Console

CloudWatch Contributor Insights for DynamoDB

DynamoDB also released adaptive capacity. Adaptive capacity helps you handle imbalanced workloads by automatically isolating frequently accessed items and shifting data across partitions to rebalance them. This helps reduce cost by enabling you to provision throughput for a more balanced workload instead of over provisioning for uneven data access patterns.

Amazon RDS

Amazon Relational Database Services (RDS) announced a preview of Amazon RDS Proxy to help developers manage RDS connection strings for serverless applications.

Illustration of Amazon RDS Proxy

The RDS Proxy maintains a pool of established connections to your RDS database instances. This pool enables you to support a large number of application connections so your application can scale without compromising performance. It also increases security by enabling IAM authentication for database access and enabling you to centrally manage database credentials using AWS Secrets Manager.

AWS Serverless Application Repository

The AWS Serverless Application Repository (SAR) now offers Verified Author badges. These badges enable consumers to quickly and reliably know who you are. The badge appears next to your name in the SAR and links to your GitHub profile.

Screen capture of SAR Verifiedl developer badge in the AWS Management Console

SAR Verified developer badges

AWS Developer Tools

AWS CodeCommit launched the ability for you to enforce rule workflows for pull requests, making it easier to ensure that code has pass through specific rule requirements. You can now create an approval rule specifically for a pull request, or create approval rule templates to be applied to all future pull requests in a repository.

AWS CodeBuild added beta support for test reporting. With test reporting, you can now view the detailed results, trends, and history for tests executed on CodeBuild for any framework that supports the JUnit XML or Cucumber JSON test format.

Screen capture of AWS CodeBuild

CodeBuild test trends in the AWS Management Console

Amazon CodeGuru

AWS announced a preview of Amazon CodeGuru at re:Invent 2019. CodeGuru is a machine learning based service that makes code reviews more effective and aids developers in writing code that is more secure, performant, and consistent.

AWS Amplify and AWS AppSync

AWS Amplify added iOS and Android as supported platforms. Now developers can build iOS and Android applications using the Amplify Framework with the same category-based programming model that they use for JavaScript apps.

Screen capture of 'amplify init' for an iOS application in a terminal window

The Amplify team has also improved offline data access and synchronization by announcing Amplify DataStore. Developers can now create applications that allow users to continue to access and modify data, without an internet connection. Upon connection, the data synchronizes transparently with the cloud.

For a summary of Amplify and AppSync announcements before re:Invent, read: “A round up of the recent pre-re:Invent 2019 AWS Amplify Launches”.

Illustration of AWS AppSync integrations with other AWS services

Q4 serverless content

Blog posts

October

November

December

Tech talks

We hold several AWS Online Tech Talks covering serverless tech talks throughout the year. These are listed in the Serverless section of the AWS Online Tech Talks page.

Here are the ones from Q4:

Twitch

October

There are also a number of other helpful video series covering Serverless available on the AWS Twitch Channel.

AWS Serverless Heroes

We are excited to welcome some new AWS Serverless Heroes to help grow the serverless community. We look forward to some amazing content to help you with your serverless journey.

AWS Serverless Application Repository (SAR) Apps

In this edition of ICYMI, we are introducing a section devoted to SAR apps written by the AWS Serverless Developer Advocacy team. You can run these applications and review their source code to learn more about serverless and to see examples of suggested practices.

Still looking for more?

The Serverless landing page has much more information. The Lambda resources page contains case studies, webinars, whitepapers, customer stories, reference architectures, and even more Getting Started tutorials. We’re also kicking off a fresh series of Tech Talks in 2020 with new content providing greater detail on everything new coming out of AWS for serverless application developers.

Throughout 2020, the AWS Serverless Developer Advocates are crossing the globe to tell you more about serverless, and to hear more about what you need. Follow this blog to keep up on new launches and announcements, best practices, and examples of serverless applications in action.

You can also follow all of us on Twitter to see latest news, follow conversations, and interact with the team.

Chris Munns: @chrismunns
Eric Johnson: @edjgeek
James Beswick: @jbesw
Moheeb Zara: @virgilvox
Ben Smith: @benjamin_l_s
Rob Sutter: @rts_rob
Julian Wood: @julian_wood

Happy coding!