Tag Archives: Amazon EventBridge

Getting Started with Event-Driven Architecture

Post Syndicated from Talia Nassi original https://aws.amazon.com/blogs/compute/getting-started-with-event-driven-architecture/

In modern application development, event-driven architecture is becoming more prominent because it can make building applications in the cloud easier. Event-driven architecture allows you to decouple your services, which increases developer velocity and can make applications easier to debug. It can also help remove the bottleneck that occurs when features expand across different teams, allowing those teams to progress more independently.

One way to think about an application is as a system that reacts to events from other places, including from within the application itself. In this approach, you model the system’s interaction with its surroundings as an exchange of events: inputs to the application and outputs from it are both events, which the application receives and creates. At its core, this is event-driven architecture.

API-driven architecture vs. event-driven architecture

Commands/APIs:

  • Synchronous
  • Have an intent
  • Directed to a target
  • Examples: “CreateAccount”, “AddProduct”

Events:

  • Asynchronous
  • State a fact
  • Happened in the past
  • Examples: “AccountCreated”, “ProductAdded”

A common way of making components of an application work together is through an API-driven, request-response architecture. For example, you query a list of orders from an Orders API, and the Orders API responds with a list of orders. This is synchronous architecture: the system asking for the orders waits for the response and cannot move on until the response comes back. In this approach, you send commands that are directed to a target (for example, “place this order” or “add this record to the database”).

sync vs async

In a synchronous model, the client makes a request to Service A. Service A calls Service B, but then Service A waits for Service B to respond before it continues on and eventually responds to the client.

In asynchronous, event-driven architecture, there is no response path. The service surfaces the event and then immediately moves forward. The trade-off here is that there’s no direct channel for Service B to pass back information to Service A, besides confirming it received the event. But in many cases, you don’t need that explicit coupling between the request and response channels.

An event is something that happened. For example, a new account is created, or an item is dropped into an Amazon S3 bucket. Events are immutable, which means you cannot change them. Once an event happens, you cannot undo it. For example, if there is an event raised when an order is placed, there can be another event for an order being cancelled. Events can come from various places such as messaging systems or databases.

Events are JSON objects that tell you information about something that happened in your application. In event-driven architecture, events represent facts. Each component of the application raises an event whenever anything changes. Other components listen and decide what to do with it and how they would like to react.

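The original post shows this event as a screenshot. A representative S3 “Object Created” event, in the format EventBridge delivers it, looks roughly like the following (the ID, timestamp, and object size are placeholders):

{
  "version": "0",
  "id": "17793124-05d4-b198-2fde-7ededc63b103",
  "detail-type": "Object Created",
  "source": "aws.s3",
  "account": "123456789012",
  "time": "2022-01-01T00:00:00Z",
  "region": "us-east-1",
  "resources": ["arn:aws:s3:::sam-app-sourcebucket"],
  "detail": {
    "version": "0",
    "bucket": { "name": "sam-app-sourcebucket" },
    "object": { "key": "brad.jpeg", "size": 1024 },
    "reason": "PutObject"
  }
}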

In the event above, S3 raises the event when you put the image into an Amazon S3 bucket. The event source is an S3 bucket named sam-app-sourcebucket. The object that is put into the bucket is called “brad.jpeg”.

Request-driven applications typically use directed commands to coordinate downstream functions to complete an activity and are often tightly coupled. This makes it harder to determine when errors occur in your application. Event-driven applications create events that are observable by other services and systems. However, the event producer is unaware of which consumers, if any, are listening. Typically, these are loosely coupled.

Events are observable. Any service that is authorized can watch an event. Consider a coffee shop example where there is a barista, who makes coffee, and a pastry chef, who makes pastries. When a customer enters the coffee shop and orders a cup of coffee, the barista starts to make the coffee, and the pastry chef takes no action.

However, if a customer comes into the coffee shop and orders a chocolate croissant, then the pastry chef starts making the chocolate croissant, and the barista takes no action. The pastry chef is only interested in orders relating to pastries, and the barista is only interested in orders relating to coffee.

In an ecommerce application, like Amazon.com, there are different departments that respond to different events. You can place orders through Whole Foods, Amazon Fresh, and Amazon.com. When you place an order with Amazon Fresh, the subscribers to that event take action and fulfill your order.


Event-driven architecture and command-driven architecture also differ in the ways that they store state. In a typical command-driven architecture, you have only one component store a particular piece of data, and other components ask that component for the data when needed.

In event-driven architecture, every component stores all the data it needs and listens to update events for that data. In command-driven architecture, the component that stores the data is responsible for updating it. In event-driven architecture, the component that owns the data only has to ensure that new events are raised whenever the data is updated.
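
As a minimal sketch of this idea (the event name and fields below are invented for illustration, not a real AWS event schema), a consumer can keep its own copy of the data it cares about and update that copy whenever a relevant event arrives:

# Minimal sketch: a consumer keeps a local copy of product prices and
# updates it whenever a "ProductPriceUpdated" event arrives.
# The event name and fields are illustrative, not from a real AWS schema.
local_prices = {}

def handle_event(event):
    if event.get("detail-type") == "ProductPriceUpdated":
        detail = event["detail"]
        local_prices[detail["productId"]] = detail["price"]

def get_price(product_id):
    # Reads are served from local state; no call to the owning service is needed.
    return local_prices.get(product_id)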

Benefits of using event-driven architecture

Decoupling event sources and event targets

Many applications are built as a monolith, where the components are tightly coupled and highly dependent on each other. This is problematic when there are bugs and you are trying to pinpoint exactly which part of the application is failing. Decoupled architectures are composed of components or services that are loosely coupled. In an event-driven, decoupled architecture, you broadcast events without caring who responds to them. This saves time because events can be queued and forwarded whenever the receiver is ready to process them. This allows you to build scalable, highly modifiable systems.

Decoupled applications enable teams to act more independently, which increases their velocity. For example, with an API-based integration, if my team wants to know about some change that happened in another team’s microservice, I have to ask that team to make an API call to my service. That means I have to deal with authentication, coordination with the other team over the structure of the API call, etc. This causes back and forth between teams, which slows down development time. With an event-driven application, you can subscribe to events sent from a microservice and the event router (for example, Amazon EventBridge) takes care of routing the event and handling authentication.

Decoupled applications also allow you to build new features faster. Adding new features or extending existing ones is simpler with event-driven architectures. This is because you only have to choose the event you need to trigger your new feature, and subscribe to it. There’s no need to modify any of your existing services to add new functionality.

Write less code

When you build applications using event-driven architecture, often you write less code because you only need to consider new events, as well as which service is subscribed to those events. For example, if you are building new features for your application, all you have to do is consider the existing events and then add senders and receivers as necessary. In this way, you speed up development time because each functional unit is smaller and there is often less code.

Better extensibility

An application built this way is highly extensible. Other teams can extend features and add functionality without impacting other microservices. By publishing events using EventBridge, the application integrates with existing systems, but also enables any future application to integrate as an event consumer. Producers of events have no knowledge of event consumers, which can help simplify the microservice logic.

Enhancing team collaboration

A common process to build applications is to work with your product managers and business stakeholders to gather requirements. Developers then translate those requirements into code. However, there may be a disconnect between the product requirements and the code. When you use events, everyone in the business understands the logic. You define the events in an application (for example, a customer adds an item to their shopping cart or a customer account is created) and that becomes your product requirements. Whenever that action happens, it produces an event, and whoever is interested can take action on that event.

For example, a marketing manager could be interested whenever a customer creates a new account. One way to choreograph this in event-driven architecture is to have a Marketing event bus that listens for the New Account event. There could also be other teams that are interested, such as the Analytics team, who also subscribe to that event. Each team/service can subscribe to events that are relevant to them. Event-driven architecture is a great way for businesses to describe their business problems and represent them.

Conclusion

This post introduces events, and then compares event-driven architecture to command-driven, request-response architecture. It also explains the benefits of event-driven architecture, including decoupling event sources and targets, writing less code, having better extensibility, and enhancing team collaboration.

For more serverless learning resources, visit Serverless Land.

ICYMI: Serverless Q1 2022

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/icymi-serverless-q1-2022/

Welcome to the 16th edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share all the most recent product launches, feature enhancements, blog posts, webinars, Twitch live streams, and other interesting things that you might have missed!


In case you missed our last ICYMI, check out what happened last quarter here.

AWS Lambda

Lambda now offers larger ephemeral storage for functions, up to 10 GB. Previously, the storage was set to 512 MB. There are several common use-cases that can benefit from expanded temporary storage, including extract-transform load (ETL) jobs, machine learning inference, and data processing workloads. To see how to configure the amount of /tmp storage in AWS SAM, deploy this Serverless Land Pattern.
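
Outside of AWS SAM, you can also change this setting on an existing function with the AWS CLI. This is a sketch only; the function name is a placeholder, and the size is specified in MB, from 512 to 10,240:

aws lambda update-function-configuration \
    --function-name my-function \
    --ephemeral-storage '{"Size": 10240}'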

Ephemeral storage settings

For Node.js developers, Lambda now supports ES Modules and top-level await for Node.js 14. This enables developers to use a wider range of JavaScript packages in functions. With top-level await, when used with Provisioned Concurrency, this can improve cold-start performance when using asynchronous initialization.

For .NET developers, Lambda now supports .NET 6 as both a managed runtime and container base image. You can now use new features of the runtime such as improved logging, simplified function definitions using top-level statements, and improved performance using source generators.

The Lambda console now allows you to share test events with other developers in your team, using granular IAM permissions. Previously, test events were only visible to the builder who created them. To learn about creating sharable test events, read this documentation.

Amazon EventBridge

Amazon EventBridge Schema Registry helps you create code bindings from event schemas for use directly in your preferred IDE. You can generate these code bindings for a schema by using the EventBridge console, APIs, or AWS SDK toolkits for Jetbrains (Intellij, PyCharm, Webstorm, Rider) and VS Code. This feature now supports Go, in addition to Java, Python, and TypeScript, and is available at no additional cost.

AWS Step Functions

Developers can test state machines locally using Step Functions Local, and the service recently announced mocked service integrations for local testing. This allows you to define sample output from AWS service integrations and combine them into test cases to validate workflow control. This new feature introduces a robust way to test state machines in isolation.

Amazon DynamoDB

Amazon DynamoDB now supports limiting the number of items processed in a PartiQL operation, using an optional parameter on each request. The service also increased default Service Quotas, which can help simplify the use of large numbers of tables. The per-account, per-Region quota increased from 256 to 2,500 tables.
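
A sketch of how the new limit can be applied with the AWS SDK for Python (the statement, table, and key values are illustrative):

import boto3

dynamodb = boto3.client('dynamodb')

# Process at most 10 items for this PartiQL request.
# The table name and key value below are illustrative.
response = dynamodb.execute_statement(
    Statement='SELECT * FROM "Orders" WHERE "CustomerId" = ?',
    Parameters=[{'S': '12345'}],
    Limit=10
)
items = response['Items']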

AWS AppSync

AWS AppSync added support for custom response headers, allowing you to define additional headers to send to clients in response to an API call. You can now use the new resolver utility $util.http.addResponseHeaders() to configure additional headers in the response for a GraphQL API operation.

Serverless blog posts

January

Jan 6 – Using Node.js ES modules and top-level await in AWS Lambda

Jan 6 – Validating addresses with AWS Lambda and the Amazon Location Service

Jan 20 – Introducing AWS Lambda batching controls for message broker services

Jan 24 – Migrating AWS Lambda functions to Arm-based AWS Graviton2 processors

Jan 31 – Using the circuit breaker pattern with AWS Step Functions and Amazon DynamoDB

Jan 31 – Mocking service integrations with AWS Step Functions Local

February

Feb 8 – Capturing client events using Amazon API Gateway and Amazon EventBridge

Feb 10 – Introducing AWS Virtual Waiting Room

Feb 14 – Building custom connectors using the Amazon AppFlow Custom Connector SDK

Feb 22 – Building TypeScript projects with AWS SAM CLI

Feb 24 – Introducing the .NET 6 runtime for AWS Lambda

March

Mar 6 – Migrating a monolithic .NET REST API to AWS Lambda

Mar 7 – Decoding protobuf messages using AWS Lambda

Mar 8 – Building a serverless image catalog with AWS Step Functions Workflow Studio

Mar 9 – Composing AWS Step Functions to abstract polling of asynchronous services

Mar 10 – Building serverless multi-Region WebSocket APIs

Mar 15 – Using organization IDs as principals in Lambda resource policies

Mar 16 – Implementing mutual TLS for Java-based AWS Lambda functions

Mar 21 – Running cross-account workflows with AWS Step Functions and Amazon API Gateway

Mar 22 – Sending events to Amazon EventBridge from AWS Organizations accounts

Mar 23 – Choosing the right solution for AWS Lambda external parameters

Mar 28 – Using larger ephemeral storage for AWS Lambda

Mar 29 – Using AWS Step Functions and Amazon DynamoDB for business rules orchestration

Mar 31 – Optimizing AWS Lambda function performance for Java

First anniversary of Serverless Land Patterns

Serverless Patterns Collection

The DA team launched the Serverless Patterns Collection in March 2021 as a repository of serverless examples that demonstrate integrating two or more AWS services. Each pattern uses an infrastructure as code (IaC) framework to automate the deployment. These can simplify the creation and configuration of the services used in your applications.

The Serverless Patterns Collection is both an educational resource to help developers understand how to join different services, and an aid for developers that are getting started with building serverless applications.

The collection has just celebrated its first anniversary. It now contains 239 patterns for CDK, AWS SAM, Serverless Framework, and Terraform, covering 30 AWS services. We have expanded example runtimes to include .NET, Java, Rust, Python, Node.js and TypeScript. We’ve served tens of thousands of developers in the first year and we’re just getting started.

Many thanks to our contributors and community. You can also contribute your own patterns.

Videos

YouTube: youtube.com/serverlessland

Serverless Office Hours – Tues 10 AM PT

Weekly live virtual office hours. In each session we talk about a specific topic or technology related to serverless and open it up to helping you with your real serverless challenges and issues. Ask us anything you want about serverless technologies and applications.

YouTube: youtube.com/serverlessland
Twitch: twitch.tv/aws


FooBar Serverless YouTube channel

The Developer Advocate team is delighted to welcome Marcia Villalba onboard. Marcia was an AWS Serverless Hero before joining AWS over two years ago, and she has created one of the most popular serverless YouTube channels. You can view all of Marcia’s videos at https://www.youtube.com/c/FooBar_codes.


AWS Summits

AWS Global Summits are free events that bring the cloud computing community together to connect, collaborate, and learn about AWS. This year, we have restarted in-person Summits at major cities around the world.

The next 4 Summits planned are Paris (April 12), San Francisco (April 20-21), London (April 27), and Madrid (May 4-5). To find and register for your nearest AWS Summit, visit the AWS Summits homepage.

Still looking for more?

The Serverless landing page has more information. The Lambda resources page contains case studies, webinars, whitepapers, customer stories, reference architectures, and even more Getting Started tutorials.

You can also follow the Serverless Developer Advocacy team on Twitter to see the latest news, follow conversations, and interact with the team.

Choosing the right solution for AWS Lambda external parameters

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/choosing-the-right-solution-for-aws-lambda-external-parameters/

This post is written by Thomas Moore, Solutions Architect, Serverless.

When using AWS Lambda to build serverless applications, customers often need to retrieve parameters from an external source at runtime. This allows you to share parameter values across multiple functions or microservices, providing a single source of truth for updates. A common example is retrieving database connection details from an external source and then using the retrieved hostname, user name, and password to connect to the database:

Lambda function retrieving database credentials from an external source

AWS provides a number of options to store parameter data, including AWS Systems Manager Parameter Store, AWS AppConfig, Amazon S3, and Lambda environment variables. This blog explores the different types of parameter data that you may need to store. I cover considerations for choosing the right parameter solution and how to retrieve and cache parameter data efficiently within the Lambda function execution environment.

Common use cases

Common parameter examples include:

  • Securely storing secret data, such as credentials or API keys.
  • Database connection details such as hostname, port, and credentials.
  • Schema data (for example, a structured JSON response).
  • TLS certificate for mTLS or JWT validation.
  • Email template.
  • Tenant configuration in a multitenant system.
  • Details of external AWS resources to communicate with such as an Amazon SQS queue URL, Amazon EventBridge event bus name, or AWS Step Functions ARN.

Key considerations

There are a number of key considerations when choosing the right solution for external parameter data.

  1. Cost – how much does it cost to store the data and retrieve it via an API call?
  2. Security – what encryption and fine-grained access control is required?
  3. Performance – what are the retrieval latency requirements?
  4. Data size – how much data is there to store and retrieve?
  5. Update frequency – how often does the parameter change and how does the function handle stale parameters?
  6. Access scope – do multiple functions or services access the parameter?

These considerations help to determine where to store the parameter data and how often to retrieve it.

For example, a 4KB parameter that updates hourly and is used by hundreds of functions needs to be optimized for low retrieval costs and high performance. Choosing a solution that supports low-cost API GET requests at a high transactions-per-second (TPS) rate would be better than one optimized for storing large data.

AWS service options

There are a number of AWS services available to store external parameter data.

Amazon S3

S3 is an object storage service offering 99.999999999% (11 9s) of data durability and virtually unlimited scalability at low cost. Objects can be up to 5 TB in size in any format, making S3 a good solution to store larger parameter data.

Amazon DynamoDB

Amazon DynamoDB is a fully managed, serverless, key-value NoSQL database designed for single-digit millisecond performance at any scale. Due to the high performance of this service, it’s a great place to store parameters when low retrieval latency is important.

AWS Secrets Manager

AWS Secrets Manager makes it easier to rotate, manage, and retrieve secret data. This makes it the ideal place to store sensitive parameters such as passwords and API keys.

AWS Systems Manager Parameter Store

Parameter Store provides a centralized store to manage configuration data. This data can be plaintext or encrypted using AWS Key Management Service (KMS). Parameters can be tagged and organized into hierarchies for simpler management. Parameter Store is a good default choice for general-purpose parameters in AWS. The standard version (no additional charge) can store parameters up to 4 KB in size and the advanced version (additional charges apply) up to 8 KB.
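
For example, you can create an encrypted parameter with the AWS CLI (the parameter name and value here are placeholders):

aws ssm put-parameter \
    --name /my/parameter \
    --value "my-parameter-value" \
    --type SecureString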

For a code example using Parameter Store for Lambda parameters, see the Serverless Land pattern.

AWS AppConfig

AppConfig is a capability of AWS Systems Manager to create, manage, and quickly deploy application configurations. AppConfig allows you to validate changes during roll-outs and automatically roll back if there is an error. AppConfig deployment strategies help to manage configuration changes safely.

AppConfig also provides a Lambda extension to retrieve and locally cache configuration data. This results in fewer API calls and reduced function duration, reducing costs.

AWS Lambda environment variables

You can store parameter data as Lambda environment variables as part of the function’s version-specific configuration. Lambda environment variables are stored during function creation or updates. You can access these variables directly from your code without needing to contact an external source. Environment variables are ideal for parameter values that don’t need updating regularly and help make function code reusable across different environments. However, unlike the other options, values cannot be accessed centrally by multiple functions or services.
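
Reading an environment variable is a local lookup rather than an API call. A minimal Python sketch (the variable name is hypothetical):

import os

# DATABASE_HOST is a hypothetical variable set in the function's configuration.
DATABASE_HOST = os.environ['DATABASE_HOST']

def lambda_handler(event, context):
    # The value is read from the local environment; no external call is made.
    return {'databaseHost': DATABASE_HOST}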

Lambda execution lifecycle

It is worth understanding the Lambda execution lifecycle, which has a number of stages. This helps to decide when to handle parameter retrieval within your Lambda code, including cache management.

Lambda execution lifecycle

When a Lambda function is invoked for the first time, or when Lambda is scaling to handle additional requests, an execution environment is created. The first phase in the execution environment’s lifecycle is initialization (Init), during which the code outside the main handler function runs. This is known as a cold start.

The execution environment can then be re-used for subsequent invocations. This means that the Init phase does not need to run again and only the main handler function code runs. This is known as a warm start.

An execution environment can only run a single invocation at a time. Concurrent invocations require additional execution environments. When a new execution environment is required, this starts a new Init phase, which runs the cold start process.

Caching and updates

Retrieving the parameter during Init

As Lambda execution environments are re-used, you can improve the performance and reduce the cost of retrieving an external parameter by caching the value. Writing the value to memory or the Lambda /tmp file system allows it to be available during subsequent invokes in the same execution environment.

This approach reduces API calls, as they are not made during every invocation. However, this can cause an out-of-date parameter and potentially different values across concurrent execution environments.

The following Python example shows how to retrieve a Parameter Store value outside the Lambda handler function during the Init phase.

import boto3
ssm = boto3.client('ssm', region_name='eu-west-1')
parameter = ssm.get_parameter(Name='/my/parameter')
def lambda_handler(event, context):
    # My function code...
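
If the value can change while an execution environment stays warm, a common middle ground is to cache the value in memory with a simple time-to-live (TTL) and refresh it after the TTL expires. The following is a sketch only; the 5-minute TTL is an arbitrary choice:

import time
import boto3

ssm = boto3.client('ssm', region_name='eu-west-1')

CACHE_TTL_SECONDS = 300  # arbitrary choice: refresh at most every 5 minutes
_cached_value = None
_cache_expires = 0

def get_cached_parameter(name):
    global _cached_value, _cache_expires
    # Re-fetch the parameter only when the cached copy has expired.
    if _cached_value is None or time.time() > _cache_expires:
        _cached_value = ssm.get_parameter(Name=name)['Parameter']['Value']
        _cache_expires = time.time() + CACHE_TTL_SECONDS
    return _cached_value

def lambda_handler(event, context):
    parameter = get_cached_parameter('/my/parameter')
    # My function code...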

Retrieving the parameter on every invocation

Another option is to retrieve the parameter during every invocation by making the API call inside the handler code. This keeps the value up to date, but can lead to higher retrieval costs and longer function durations due to the added API call during every invocation.

The following Python example shows this approach:

import boto3
ssm = boto3.client('ssm', region_name='eu-west-1')
def lambda_handler(event, context):
    parameter = ssm.get_parameter(Name='/my/parameter')
    # My function code...

Using AWS AppConfig Lambda extension

AppConfig allows you to retrieve and cache values from the service using a Lambda extension. The extension retrieves the values and makes them available via a local HTTP server. The Lambda function then queries the local HTTP server for the value. The AppConfig extension refreshes the values at a configurable poll interval, which defaults to 45 seconds. This improves performance and reduces costs, as the function only needs to make a local HTTP call.

The following Python code example shows how to access the cached parameters.

import urllib.request
def lambda_handler(event, context):
    url = f'http://localhost:2772/applications/application_name/environments/environment_name/configurations/configuration_name'
    config = urllib.request.urlopen(url).read()
    # My function code...

For caching secret values using a Lambda extension local HTTP cache and AWS Secrets Manager, see the AWS Prescriptive Guidance documentation.

Using Lambda Powertools for Python or Java

Lambda Powertools for Python or Lambda Powertools for Java contains utilities to manage parameter caching. You can configure the cache interval, which defaults to 5 seconds. Supported parameter stores include Secrets Manager, AWS Systems Manager Parameter Store, AppConfig, and DynamoDB. You also have the option to bring your own provider. The following example shows the Powertools for Python parameters utility retrieving a single value from Systems Manager Parameter Store.

from aws_lambda_powertools.utilities import parameters
def handler(event, context):
    value = parameters.get_parameter("/my/parameter")
    # My function code…
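
To hold a value for longer than the default, you can pass a cache duration when retrieving it. A sketch using the utility’s max_age argument (the 300-second value is an arbitrary choice):

from aws_lambda_powertools.utilities import parameters

def handler(event, context):
    # Cache this value for up to 5 minutes instead of the 5-second default.
    value = parameters.get_parameter("/my/parameter", max_age=300)
    # My function code...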

Security

Parameter security is a key consideration. You should evaluate encryption at rest, in-transit, private network access, and fine-grained permissions for each external parameter solution based on the use case.

All services highlighted in this post support server-side encryption at rest, and you can choose to use AWS KMS to manage your own keys. When accessing parameters using the AWS SDK and CLI tools, connections are encrypted in transit using TLS by default. You can force most to use TLS 1.2.

To access parameters from inside an Amazon Virtual Private Cloud (Amazon VPC) without internet access, you can use AWS PrivateLink and create a VPC endpoint for each service. All the services mentioned in this post support AWS PrivateLink connections.

Use AWS Identity and Access Management (IAM) policies to manage which users or roles can access specific parameters.
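
For example, a sketch of an identity-based IAM policy that only allows reading parameters under a specific path (the Region, account ID, and parameter path are placeholders):

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["ssm:GetParameter", "ssm:GetParameters"],
    "Resource": "arn:aws:ssm:eu-west-1:111122223333:parameter/my/*"
  }]
}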

General guidance

This blog explores a number of considerations to make when using an external source for Lambda parameters. The correct solution is use-case dependent. There are some general guidelines when selecting an AWS service.

  • For general-purpose low-cost parameters, use AWS Systems Manager Parameter Store.
  • For single function, small parameters, use Lambda environment variables.
  • For secret values that require automatic rotation, use AWS Secrets Manager.
  • When you need a managed cache, use the AWS AppConfig Lambda extension or Lambda Powertools for Python/Java.
  • For items larger than 400 KB, use Amazon S3.
  • When access frequency is high, and low latency is required, use Amazon DynamoDB.

Conclusion

External parameters provide a central source of truth across distributed systems, allowing for efficient updates and code reuse. This blog post highlights a number of considerations when using external parameters with Lambda to help you choose the most appropriate solution for your use case.

Consider how you cache and reuse parameters inside the Lambda execution environment. Doing this correctly can help you reduce costs and improve the performance of your Lambda functions.

There are a number of services to choose from to store parameter data. These include DynamoDB, S3, Parameter Store, Secrets Manager, AppConfig, and Lambda environment variables. Each comes with a number of advantages, depending on the use case. This blog guidance, along with the AWS documentation and Service Quotas, can help you select the most appropriate service for your workload.

For more serverless learning resources, visit Serverless Land.

Sending events to Amazon EventBridge from AWS Organizations accounts

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/sending-events-to-amazon-eventbridge-from-aws-organizations-accounts/

This post is written by Elinisa Canameti, Associate Cloud Architect, and Iris Kraja, Associate Cloud Architect.

AWS Organizations provides a hierarchical grouping of multiple accounts. This helps larger projects establish a central managing system that governs the infrastructure. This can help with meeting budgetary and security requirements.

Amazon EventBridge is a serverless event-driven service that delivers events from customized applications, AWS services, or software as a service (SaaS) applications to targets.

This post shows how to send events from multiple accounts in AWS Organizations to the management account using AWS CDK. This uses AWS CloudFormation StackSets to deploy the infrastructure in the member’s accounts.

Solution overview

This blog post describes a centralized event management strategy in a multi-account setup. The AWS accounts are organized using the AWS Organizations service into one management account and many member accounts. You can explore and deploy the reference solution from this GitHub repository.

In the example, there are events from three different AWS services: Amazon CloudWatch, AWS Config, and Amazon GuardDuty. These are sent from the member accounts to the management account.

The management account publishes to an Amazon SNS topic, which sends emails when these events occur. AWS CloudFormation StackSets are created in the management account to deploy the infrastructure in the member’s accounts.

Overview

To fully automate the process of adding and removing accounts to the organization unit, an EventBridge rule is triggered:

  • When an account is moved into or out of the organization unit (MoveAccount event).
  • When an account is removed from the organization (RemoveAccountFromOrganization event).

These events invoke an AWS Lambda function, which updates the management EventBridge rule with the additional source accounts.

Prerequisites

  1. At least one AWS account, which represents the member’s account.
  2. An AWS account, which is the management account.
  3. Install the AWS Command Line Interface (AWS CLI).
  4. Install AWS CDK.

Set up the environment with AWS Organizations

1. Log in to the main account used to manage your organization.

2. Select the AWS Organizations service. Choose Create Organization. Once you create the organization, you receive a verification email.

3. After verification is completed, you can add other customer accounts into the organization.

4. AWS sends an invitation to each account added under the root account.

5. To see the invitation, log in to each of the accounts and search for the AWS Organizations service. You can find the invitation option listed on the side.

6. Accept the invitation from the root account.

7. Create an Organization Unit (OU) and place in it all the member accounts that should propagate events to the management account. The OU identifier is used later to deploy the StackSet.

8. Finally, enable trusted access in the organization to be able to create StackSets and deploy resources from the management account to the member accounts.

Management account

After the initial deployment of the solution in the management account, a StackSet is created. It’s configured to deploy the member account infrastructure in all the members of the organization unit.

When you add a new account in the organization unit, this StackSet automatically deploys the resources specified. Read the Member account section for more information about resources in this stack.

All events coming from the member account pass through the custom event bus in the management account. To allow other accounts to put events in the management account, the resource policy of the event bus grants permission to every account in the organization:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "AllowAllAccountsInOrganizationToPutEvents",
    "Effect": "Allow",
    "Principal": "*",
    "Action": "events:PutEvents",
    "Resource": "arn:aws:events:us-east-1:xxxxxxxxxxxx:event-bus/CentralEventBus ",
    "Condition": {
      "StringEquals": {
        "aws:PrincipalOrgID": "o-xxxxxxxx"
      }
    }
  }]
}

You must configure EventBridge in the management account to handle the events coming from the member accounts. In this example, you send an email with the event information using SNS. Create a new rule with the following event pattern:

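The original post shows the pattern as a screenshot. Based on the rule that the Lambda function creates later in this post, it matches on the member account IDs and the three event sources; the account IDs below are placeholders:

{
  "account": ["111111111111", "222222222222"],
  "source": ["aws.cloudwatch", "aws.config", "aws.guardduty"]
}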

When you add or remove accounts in the organization unit, an EventBridge rule invokes a Lambda function to update the rule in the management account. This rule reacts to two events from the organization’s source: MoveAccount and RemoveAccountFromOrganization.

Event pattern

// EventBridge to trigger updateRuleFunction Lambda whenever a new account is added or removed from the organization.
new Rule(this, 'AWSOrganizationAccountMemberChangesRule', {
   ruleName: 'AWSOrganizationAccountMemberChangesRule',
   eventPattern: {
      source: ['aws.organizations'],
      detailType: ['AWS API Call via CloudTrail'],
      detail: {
        eventSource: ['organizations.amazonaws.com'],
        eventName: [
          'RemoveAccountFromOrganization',
          'MoveAccount'
        ]
      }
    },
    targets: [
      new LambdaFunction(updateRuleFunction)
    ]
});

Custom resource Lambda function

The custom resource Lambda function is executed to run custom logic whenever the CloudFormation stack is created, updated or deleted.

// Cloudformation Custom resource event
switch (event.RequestType) {
  case "Create":
    await putRule(accountIds);
    break;
  case "Update":
    await putRule(accountIds);
    break;
  case "Delete":
    await deleteRule()
}

async function putRule(accounts) {
  await eventBridgeClient.putRule({
    Name: rule.name,
    Description: rule.description,
    EventBusName: eventBusName,
    EventPattern: JSON.stringify({
      account: accounts,
      source: rule.sources
    })
  }).promise();
  await eventBridgeClient.putTargets({
    Rule: rule.name,
    Targets: [
      {
        Arn: snsTopicArn,
        Id: `snsTarget-${rule.name}`
      }
    ]
  }).promise();
}

Amazon EventBridge triggered Lambda function code

// New AWS account moved into the organization unit or out of it.
if (eventName === 'MoveAccount' || eventName === 'RemoveAccountFromOrganization') {
   await putRule(accountIds);
}

All events generated from the member accounts are then sent to an SNS topic, which has an email address as an endpoint. More sophisticated targets can be configured depending on the application’s needs, including, but not limited to, a Step Functions state machine, a Kinesis stream, or an SQS queue.

Member account

In the member account, we use an Amazon EventBridge rule to route all the events coming from Amazon CloudWatch, AWS Config, and Amazon GuardDuty to the event bus created in the management account.

    const rule = {
      name: 'MemberEventBridgeRule',
      sources: ['aws.cloudwatch', 'aws.config', 'aws.guardduty'],
      description: 'The Rule propagates all Amazon CloudWatch Events, AWS Config Events, AWS Guardduty Events to the management account'
    }

    const cdkRule = new Rule(this, rule.name, {
      description: rule.description,
      ruleName: rule.name,
      eventPattern: {
        source: rule.sources,
      }
    });
    cdkRule.addTarget({
      bind(_rule: IRule, generatedTargetId: string): RuleTargetConfig {
        return {
          arn: `arn:aws:events:${process.env.REGION}:${process.env.CDK_MANAGEMENT_ACCOUNT}:event-bus/${eventBusName.valueAsString}`,
          id: generatedTargetId,
          role: publishingRole
        };
      }
    });

Deploying the solution

Bootstrap the management account:

npx cdk bootstrap \
    --profile <MANAGEMENT ACCOUNT AWS PROFILE> \
    --cloudformation-execution-policies arn:aws:iam::aws:policy/AdministratorAccess \
    aws://<MANAGEMENT ACCOUNT ID>/<REGION>

To deploy the stack, use the cdk-deploy-to.sh script and pass as arguments the management account ID, Region, AWS Organization ID, AWS Organization Unit ID, and an accessible email address.

sh ./cdk-deploy-to.sh <MANAGEMENT ACCOUNT ID> <REGION> <AWS ORGANIZATION ID> <MEMBER ORGANIZATION UNIT ID> <EMAIL ADDRESS> AwsOrganizationsEventBridgeSetupManagementStack 

When you receive the subscription email after the management stack is deployed, make sure to confirm the subscription to the SNS topic.

After the deployment is completed, a StackSet starts deploying the infrastructure to the member accounts. You can view this process in the AWS Management Console, under the AWS CloudFormation service in the management account as shown in the following image, or by logging in to a member account and viewing its CloudFormation stacks.

Stack output

Testing the environment

This project deploys an Amazon CloudWatch billing alarm to the member accounts to test that you receive email notifications when this metric is in alarm. Using the member account’s credentials, run the following AWS CLI command to change the alarm status to “In Alarm”:

aws cloudwatch set-alarm-state --alarm-name 'BillingAlarm' --state-value ALARM --state-reason "testing only" 

You receive emails at the email address configured as an endpoint on the Amazon SNS topic.

Conclusion

This blog post shows how to use AWS Organizations to organize your application’s accounts by using organization units and how to centralize event management using Amazon EventBridge across accounts in the organization. A fully automated solution is provided to ensure that adding new accounts to the organization unit is efficient.

For more serverless learning resources, visit https://serverlessland.com.

Audit AWS service events with Amazon EventBridge and Amazon Kinesis Data Firehose

Post Syndicated from Anand Shah original https://aws.amazon.com/blogs/big-data/audit-aws-service-events-with-amazon-eventbridge-and-amazon-kinesis-data-firehose/

Amazon EventBridge is a serverless event bus that makes it easy to build event-driven applications at scale using events generated from your applications, integrated software as a service (SaaS) applications, and AWS services. Many AWS services generate EventBridge events. When an AWS service in your account emits an event, it goes to your account’s default event bus.

A few examples of such events include an Amazon EC2 instance changing state, a new parameter being created in AWS Systems Manager Parameter Store, and Amazon GuardDuty generating a finding.

By default, these AWS service-generated events are transient and therefore not retained. This post shows how you can forward AWS service-generated events or custom events to Amazon Simple Storage Service (Amazon S3) for long-term storage, analysis, and auditing purposes using EventBridge rules and Amazon Kinesis Data Firehose.

Solution overview

In this post, we provide a working example of AWS service-generated events ingested to Amazon S3. To make sure we have some service events available on the default event bus, we use Parameter Store, a capability of AWS Systems Manager, to store a new parameter manually. This action generates a new event, which is ingested by the following pipeline.

Architecture Diagram

The pipeline includes the following steps:

  1. AWS service-generated events (for example, a new parameter created in Parameter Store) go to the default event bus in EventBridge.
  2. The EventBridge rule matches all events and forwards those to Kinesis Data Firehose.
  3. Kinesis Data Firehose delivers events to the S3 bucket partitioned by detail-type and receipt time using its dynamic partitioning capability.
  4. The S3 bucket stores the delivered events, and their respective event schema is registered to the AWS Glue Data Catalog using an AWS Glue crawler.
  5. You query events using Amazon Athena.

Deploy resources using AWS CloudFormation

We use AWS CloudFormation templates to create all the necessary resources for the ingestion pipeline. This removes opportunities for manual error, increases efficiency, and provides consistent configurations over time. The template is also available on GitHub.

Complete the following steps:

  1. Choose Launch Stack to open the AWS CloudFormation console with the template preloaded.
  2. Acknowledge that the template may create AWS Identity and Access Management (IAM) resources.
  3. Choose Create stack.

The template takes about 10 minutes to complete and creates the following resources in your AWS account:

  • An S3 bucket to store event data.
  • A Firehose delivery stream with dynamic partitioning configuration. Dynamic partitioning enables you to continuously partition streaming data in Kinesis Data Firehose by using keys within the data (for example, customer_id or transaction_id) and then deliver the data grouped by these keys into corresponding S3 prefixes.
  • An EventBridge rule that forwards all events from the default event bus to Kinesis Data Firehose.
  • An AWS Glue crawler that references the path to the event data in the S3 bucket. The crawler inspects data landed to Amazon S3 and registers tables as per the schema with the AWS Glue Data Catalog.
  • Athena named queries for you to query the data processed by this example.

Trigger a service event

After you create the CloudFormation stack, you trigger a service event.

  1. On the AWS CloudFormation console, navigate to the Outputs tab for the stack.
  2. Choose the link for the key CreateParameter.

Create Parameter

You’re redirected to the Systems Manager console to create a new parameter.

  1. For Name, enter a name (for example, my-test-parameter).
  2. For Value, enter the test value of your choice (for example, test-value).

My Test parameter

  1. Leave everything else as default and choose Create parameter.

This step saves the new Systems Manager parameter and pushes the parameter-created event to the default EventBridge event bus, as shown in the following code:

{
  "version": "0",
  "id": "6a7e4feb-b491-4cf7-a9f1-bf3703497718",
  "detail-type": "Parameter Store Change",
  "source": "aws.ssm",
  "account": "123456789012",
  "time": "2017-05-22T16:43:48Z",
  "region": "us-east-1",
  "resources": [
    "arn:aws:ssm:us-east-1:123456789012:parameter/foo"
  ],
  "detail": {
    "operation": "Create",
    "name": "my-test-parameter",
    "type": "String",
    "description": ""
  }
}

Discover the event schema

After the event is triggered by saving the parameter, wait at least 2 minutes for the event to be ingested via Kinesis Data Firehose to the S3 bucket. Now complete the following steps to run an AWS Glue crawler to discover and register the event schema in the Data Catalog:

  1. On the AWS Glue console, choose Crawlers in the navigation pane.
  2. Select the crawler with the name starting with S3EventDataCrawler.
  3. Choose Run crawler.

Run Crawler

This step runs the crawler, which takes about 2 minutes to complete. The crawler discovers the schema from all events and registers it as tables in the Data Catalog.

Query the event data

When the crawler is complete, you can start querying event data. To query the event, complete the following steps:

  1. On the AWS CloudFormation console, navigate to the Outputs tab for your stack.
  2. Choose the link for the key AthenaQueries.

Athena Queries

You’re redirected to the Saved queries tab on the Athena console. If you’re running Athena queries for the first time, set up your S3 output bucket. For instructions, see Working with Query Results, Recent Queries, and Output Files.

  1. Search for Blog to find the queries created by this post.
  2. Choose the query Blog – Query Parameter Store Events.

Find Athena Saved Queries

The query opens on the Athena console.

  1. Choose Run query.

You can update the query to search the event you created earlier.

  1. Apply a WHERE clause with the parameter name you selected earlier:
SELECT * FROM "AwsDataCatalog"."eventsdb-randomId"."parameter_store_change"
WHERE detail.name = 'Your parameter name'

You can also choose the link next to the key CuratedBucket from the CloudFormation stack outputs to see paths and the objects loaded to the S3 bucket from other event sources. Similarly, you can query them via Athena.

Clean up

Complete the following steps to delete your resources and stop incurring costs:

  1. On the AWS CloudFormation console, select the stack you created and choose Delete.
  2. On the Amazon S3 console, find the bucket with the name starting with eventbridge-firehose-blog-curatedbucket.
  3. Select the bucket and choose Empty.
  4. Enter permanently delete to confirm the choice.
  5. Select the bucket again and choose Delete.
  6. Confirm the action by entering the bucket name when prompted.
  7. On the Systems Manager console, go to the parameter store and delete the parameter you created earlier.

Summary

This post demonstrates how to use an EventBridge rule to redirect AWS service-generated events or custom events to Amazon S3 using Kinesis Data Firehose, for long-term storage, analysis, querying, and audit purposes.

For more information, see the Amazon EventBridge User Guide. To learn more about AWS service events supported by EventBridge, see Events from AWS services.


About the Author

Anand Shah is a Big Data Prototyping Solution Architect at AWS. He works with AWS customers and their engineering teams to build prototypes using AWS analytics services and purpose-built databases. Anand helps customers solve the most challenging problems using the art of the possible technology. He enjoys beaches in his leisure time.

Capturing client events using Amazon API Gateway and Amazon EventBridge

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/capturing-client-events-using-amazon-api-gateway-and-amazon-eventbridge/

This post is written by Tim Bruce, Senior Solutions Architect, DevAx.

Event producers are one of the three main components in an event-driven architecture. Event producers create and publish events to event routers, which send them to event consumers. Any portion of a system, including a mobile or web client, can be an event producer.

To extend the event model to your mobile and web clients, you must implement standards for security, messaging formats, and event storage.

This post shows how to build a client-enabled event-handling solution. It uses Amazon EventBridge, Amazon API Gateway, AWS Lambda, and Amazon Cognito. This architecture supports routing client events to internal and external destinations. It provides a blueprint that you can use to simplify the integration.

Overview

This example creates a RESTful API using API Gateway. It sends events directly to EventBridge without the need for compute services. In production, you have more requirements than only receiving and forwarding events. Additional requirements include security, user identification, validation, enrichment, transformation, event forwarding, and storing.

In this example, API Gateway provides security and user identification by invoking a Lambda authorizer. The authorizer generates a policy and returns client identification to API Gateway. API Gateway then performs request validation and message enrichment before forwarding the events to EventBridge.

EventBridge evaluates the events against rules and forwards the events to targets. The rules apply transformation to the events and forward an event to up to five targets. Targets include AWS services, such as Amazon Kinesis Data Firehose, and many third-party solutions, such as Zendesk, with HTTPS endpoints.

Lastly, Kinesis Data Firehose provides a cost-effective solution to store events into an Amazon S3 bucket. Before storing the events, Kinesis Data Firehose transforms records via Lambda transformers. It also partitions records using data in the record or calculated data via a Lambda function. Kinesis Data Firehose uses this partitioning data to create keys in the bucket and store matching records within the keys.

Example architecture

Example architecture

The example consists of the following resources defined in the AWS SAM template:

Data flow

Data flow

  1. Application clients collect or generate the events.
  2. The client sends the events to API Gateway as URL-encoded JSON. The client includes the user’s JWT in an authorization header with the request for validation.
  3. The Lambda authorizer validates the JWT with Amazon Cognito and returns the user’s unique clientID value to API Gateway.
  4. API Gateway transforms the request into events, appending clientId, the bus name, and environment.
  5. API Gateway sends the events to EventBridge.
  6. EventBridge rules match the events and:
    1. Forwards all client events to Kinesis Data Firehose.
    2. Forwards client events with detail.eventType of “loyaltypurchase” to Zendesk.
  7. Kinesis Data Firehose receives the records.
  8. The Kinesis Data Firehose data transformation processes each record, moving the client ID to the detail object.
  9. Kinesis Data Firehose partitions the records and stores them in an S3 bucket.

Overall design

The following sections discuss details of the solution, starting from the event in a web or mobile client. This solution requires the client to create an HTTPS request, including the user’s JWT as an authorization header.

{"entries": [{"entry": "{\"eventType\": \"searching\", \"schemaVersion\":1, \"data\": {\"searchTerm\":\"games\"}}"}]}

The preceding JSON shows a sample request body for this solution. The top-level item “entries” is an array of “entry” items. API Gateway will translate each “entry” to the event-detail field in EventBridge events. The client must escape the data for “entry” to prevent translation errors.

API Gateway and Lambda authorizer

API Gateway receives the request and validates the JWT by invoking the Lambda authorizer. The authorizer generates a policy allowing the request for valid tokens. It adds the Amazon Cognito “custom:clientId” custom attribute to the response context before returning the response to API Gateway. The “custom:clientId” attribute is a unique client identifier in the form of a UUID that downstream systems can use to retrieve data about the customer.

API Gateway validates the request by matching the request body against a model. Models represent what a request should look like. A mapping template then transforms valid requests to the format required by EventBridge. Mapping templates use the Velocity Template Language (VTL) to do this.

VTL template
This mapping template uses a #foreach loop to process the array “entries” from the request body. The process enriches each event with the user’s “custom:clientId” and stage variables for bus name and environment from API Gateway.

Integration request

The preceding API Gateway AWS integration enables API Gateway to send the events to EventBridge without using compute services, such as Lambda or Amazon EC2. The integration and IAM execution role enable API Gateway to call the EventBridge PutEvents API to do this.

EventBridge rules and transformations

EventBridge rules match events against criteria, transform the events, and forward the events to targets. There are two rules in this example. One processes events for Zendesk tickets and the other forwards data to Kinesis Data Firehose to store events for triage and analytics.

This example creates service tickets in the Zendesk ticketing system. The tickets trigger agents to contact customers who are expecting a call to complete their purchases. Because the software client sends the event directly, this reduces time-to-action for back-office processes and helps improve customer satisfaction.

Matching EventBridge rule

This rule matches client event messages for loyalty purchases and forwards details to the Zendesk API. The rule includes a transformation, which selects a portion of the event before sending the information to the target.

EventBridge uses an API destination to store details about the HTTP endpoint and usage policies. Additionally, an EventBridge connection and an AWS Secrets Manager secret store details. These include the authentication policy and authentication credentials to connect to the API destination.

Zendesk dashboard

Successfully processed events open tickets in Zendesk using the API destination. Agents now have a list of customers to contact.

Enterprises often require storing the events for troubleshooting or analytics. EventBridge does not include a newline between records when forwarding events to Kinesis Data Firehose. Because of this, it may be more challenging to discern each record when analyzing the data.

Rule to transform events
A rule for all client events changes this behavior. This AWS CloudFormation snippet defines the rule that will transform each event, adding a new line after each. The “\n” character in the InputTemplate field adds the separator between records before forwarding the data to Kinesis Data Firehose.

Afterward, Kinesis Data Firehose receives each record separated by a new line, enabling both triage and analytics without extra overhead.

Kinesis Data Firehose to S3

Kinesis Data Firehose is a cost-effective way to batch and write records to S3. It offers optional transformation capabilities by invoking a Lambda function. This example uses a Lambda function that moves the “clientID” field to the detail section of the event record.

Kinesis Data Firehose to S3

Kinesis Data Firehose also supports dynamic partitioning of records when writing to S3. It selects data from the records or data calculated by a Lambda function. In this example, it selects data from the records to store data in separate folders in S3.

Event durability considerations

You can extend this example using an EventBridge archive and Amazon Kinesis Data Streams. Archiving allows you to create an encrypted archive of matching events. You can define the data retention in days, from one through indefinite. You can replay events from your archive when you must re-process data.

Kinesis Data Streams is a serverless data streaming solution. The EventBridge rule for all records can forward data to Kinesis Data Streams instead of Kinesis Data Firehose. Multiple applications can consume the Kinesis Data Streams. Kinesis Data Firehose would consume this stream of data and store it in S3.

Prerequisites

You need the following prerequisites to deploy the example solution:

Implementation

The full source of the solution is in the GitHub repository and is deployed with AWS SAM.

  1. Create a Secrets Manager secret using the AWS CLI:
    aws secretsmanager create-secret --name proto/Zendesk --secret-string '{"username":"<YOUR EMAIL>","apiKey":"<YOUR APIKEY>"}'
  2. Clone the solution repository using git:
    git clone https://github.com/aws-samples/client-event-sample
  3. Build the AWS SAM project:
    sam build --use-container
  4. Deploy the project using AWS SAM:
    sam deploy --guided --capabilities CAPABILITY_NAMED_IAM
  5. From the outputs from the deployment, set the following shell variables:
    APPCLIENTID=<output APPCLIENTID>
    APIID=<output APIID>
    REGION=<region you deployed to>
  6. Create a user in Amazon Cognito using the AWS CLI:
    aws cognito-idp sign-up --client-id $APPCLIENTID --username <YOUR USER ID> --password <YOUR PASSWORD> --user-attributes Name=email,Value=<YOUR EMAIL>
  7. After you receive the confirmation code, confirm the user using the AWS CLI:
    aws cognito-idp confirm-sign-up --client-id $APPCLIENTID --username <userid> --confirmation-code <confirmation code>
  8. Test the user login with the AWS CLI:
    aws cognito-idp initiate-auth --auth-flow USER_PASSWORD_AUTH --client-id $APPCLIENTID --auth-parameters USERNAME=<YOUR USER ID>,PASSWORD=<YOUR PASSWORD>

If successful, this returns a JSON web token (JWT).

Testing the client event solution

  1. The sample repository includes an event generator in the util directory. The generator uses your credentials and simulates events from a user’s software client. From the utils directory, run the generator:
    python3 generator.py --minutes <minutes to run generator> --batch <batch size from 1-10> \
      --errors <True|False> --userid <YOUR USER ID> --password <YOUR PASSWORD> \
      --region $REGION --appclientid $APPCLIENTID --apiid $APIID
  2. Log in to your Zendesk console and view the created tickets.
  3. After five minutes, review the “clientevents” bucket to view the event records.

Cleaning up

To remove the example:

  1. Delete the data stored in the clientevents buckets created from the template.
  2. Delete the stack using the command:
    sam delete --stack-name clientevents
  3. Delete the secret using the command:
    aws secretsmanager delete-secret --secret-id <arn of secret>

Conclusion

This post shows how to send client events to an API and EventBridge to enable new customer experiences. The example covers enabling new experiences by creating a way for software clients to send events with minimal custom code. This blueprint shows how you can include client events in your solution, featuring validation, enrichment, transformation, and storage.

You can modify the example code provided here for use in your organization. This enables your client software to register events without modifying backend code.

For more serverless learning resources, visit Serverless Land.

How to enrich AWS Security Hub findings with account metadata

Post Syndicated from Siva Rajamani original https://aws.amazon.com/blogs/security/how-to-enrich-aws-security-hub-findings-with-account-metadata/

In this blog post, we’ll walk you through how to deploy a solution to enrich AWS Security Hub findings with additional account-related metadata, such as the account name, the Organization Unit (OU) associated with the account, security contact information, and account tags. Account metadata can help you search findings, create insights, and better respond to and remediate findings.

AWS Security Hub ingests findings from multiple AWS services, including Amazon GuardDuty, Amazon Inspector, Amazon Macie, AWS Firewall Manager, AWS Identity and Access Management (IAM) Access Analyzer, and AWS Systems Manager Patch Manager. Findings from each service are normalized into the AWS Security Finding Format (ASFF), so you can review findings in a standardized format and take action quickly. You can use AWS Security Hub to provide a single view of all security-related findings, and to set up alerts, automate remediation, and export specific findings to third‑party incident management systems.

The Security or DevOps teams responsible for investigating, responding to, and remediating Security Hub findings may need additional account metadata beyond the account ID, to determine what to do about the finding or where to route it. For example, determining whether the finding originated from a development or production account can be key to determining the priority of the finding and the type of remediation action needed. Having this metadata information in the finding allows customers to create custom insights in Security Hub to track which OUs or applications (based on account tags) have the most open security issues. This blog post demonstrates a solution to enrich your findings with account metadata to help your Security and DevOps teams better understand and improve their security posture.

Solution Overview

In this solution, you use a combination of AWS Security Hub, Amazon EventBridge, and AWS Lambda to ingest the findings and automatically enrich them with account-related metadata by querying the AWS Organizations and AWS Account Management APIs. The solution architecture is shown in Figure 1 below:

Figure 1: Solution Architecture and workflow for metadata enrichment

The solution workflow includes the following steps:

  1. New findings and updates to existing Security Hub findings from all the member accounts flow into the Security Hub administrator account. Security Hub generates Amazon EventBridge events for the findings.
  2. An EventBridge rule created as part of the solution in the Security Hub administrator account will trigger a Lambda function configured as a target every time an EventBridge notification for a new or updated finding imported into Security Hub matches the EventBridge rule shown below:
    {
      "detail-type": ["Security Hub Findings - Imported"],
      "source": ["aws.securityhub"],
      "detail": {
        "findings": {
          "RecordState": ["ACTIVE"],
          "UserDefinedFields": {
            "findingEnriched": [{
              "exists": false
            }]
          }
        }
      }
    }

  3. The Lambda function uses the account ID from the event payload to retrieve both the account information and the alternate contact information from the AWS Organizations and Account management service API. The following code within the helper.py constructs the account_details object representing the account information to enrich the finding:
    def get_account_details(account_id, role_name):
        account_details ={}
        organizations_client = AwsHelper().get_client('organizations')
        response = organizations_client.describe_account(AccountId=account_id)
        account_details["Name"] = response["Account"]["Name"]
        response = organizations_client.list_parents(ChildId=account_id)
        ou_id = response["Parents"][0]["Id"]
        if ou_id and response["Parents"][0]["Type"] == "ORGANIZATIONAL_UNIT":
            response = organizations_client.describe_organizational_unit(OrganizationalUnitId=ou_id)
            account_details["OUName"] = response["OrganizationalUnit"]["Name"]
        elif ou_id:
            account_details["OUName"] = "ROOT"
        if role_name:
            account_client = AwsHelper().get_session_for_role(role_name).client("account")
        else:
            account_client = AwsHelper().get_client('account')
        try:
            response = account_client.get_alternate_contact(
                AccountId=account_id,
                AlternateContactType='SECURITY'
            )
            if response['AlternateContact']:
                print("contact :{}".format(str(response["AlternateContact"])))
                account_details["AlternateContact"] = response["AlternateContact"]
        except account_client.exceptions.AccessDeniedException as error:
            #Potentially due to calling alternate contact on Org Management account
            print(error.response['Error']['Message'])
        
        response = organizations_client.list_tags_for_resource(ResourceId=account_id)
        results = response["Tags"]
        while "NextToken" in response:
            response = organizations_client.list_tags_for_resource(ResourceId=account_id, NextToken=response["NextToken"])
            results.extend(response["Tags"])
        
        account_details["tags"] = results
        AccountHelper.logger.info("account_details: %s" , str(account_details))
        return account_details

  4. The Lambda function updates the finding using the Security Hub BatchUpdateFindings API to add the account related data into the Note and UserDefinedFields attributes of the SecurityHub finding:
    #lookup and build the finding note and user defined fields  based on account Id
    enrichment_text, tags_dict = enrich_finding(account_id, assume_role_name)
    logger.debug("Text to post: %s" , enrichment_text)
    logger.debug("User defined Fields %s" , json.dumps(tags_dict))
    #add the Note to the finding and add a userDefinedField to use in the event bridge rule and prevent repeat lookups
    response = secHubClient.batch_update_findings(
        FindingIdentifiers=[
            {
                'Id': enrichment_finding_id,
                'ProductArn': enrichment_finding_arn
            }
        ],
        Note={
            'Text': enrichment_text,
            'UpdatedBy': enrichment_author
        },
        UserDefinedFields=tags_dict
    )

    Note: All state change events published by AWS services through Amazon EventBridge are free of charge. The AWS Lambda free tier includes 1M free requests per month and 400,000 GB-seconds of compute time per month at the time of publication of this post. If you process 2M requests per month, the estimated cost for this solution would be approximately $7.20 USD per month.

Prerequisites

    1. Your AWS organization must have all features enabled.
    2. This solution requires that you have AWS Security Hub enabled in an AWS multi-account environment which is integrated with AWS Organizations. The AWS Organizations management account must designate a Security Hub administrator account, which can view data from and manage configuration for its member accounts. Follow these steps to designate a Security Hub administrator account for your AWS organization.
    3. All member accounts are tagged per your organization’s tagging strategy, and their security alternate contact is filled in. If the tags or alternate contacts are not available, the enrichment is limited to the account name and the organizational unit name.
    4. Trusted access must be enabled with AWS Organizations for AWS Account Management service. This will enable the AWS Organizations management account to call the AWS Account Management API operations (such as GetAlternateContact) for other member accounts in the organization. Trusted access can be enabled either by using AWS Management Console or by using AWS CLI and SDKs.

      The following AWS CLI example enables trusted access for AWS Account Management in the calling account’s organization.

      aws organizations enable-aws-service-access --service-principal account.amazonaws.com

    5. An IAM role with read-only access to look up the GetAlternateContact details must be created in the Organizations management account, with a trust policy that allows the Security Hub administrator account to assume the role.

    Solution Deployment

    This solution consists of two parts:

    1. Create an IAM role in your Organizations management account, giving it necessary permissions as described in the Create the IAM role procedure below.
    2. Deploy the Lambda function and the other associated resources to your Security Hub administrator account

    Create the IAM role

    Using the console, AWS CLI, or AWS API

    Follow the Creating a role to delegate permissions to an IAM user instructions to create an IAM role using the console, AWS CLI, or AWS API in the AWS Organizations management account with the role name account-contact-readonly, based on the trust and permission policy template provided below. You will need the account ID of your Security Hub administrator account.

    The IAM trust policy allows the Security Hub administrator account to assume the role in your Organization management account.

    IAM Role trust policy

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": "arn:aws:iam::<SH administrator Account ID>:root"
          },
          "Action": "sts:AssumeRole",
          "Condition": {}
        }
      ]
    }

    Note: Replace <SH administrator Account ID> with the account ID of your Security Hub administrator account. Once the solution is deployed, you should update the principal in the trust policy shown above to use the new IAM role created for the solution.

    IAM Permission Policy

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "account:GetAlternateContact"
                ],
                "Resource": "arn:aws:account::<Org. Management Account id>:account/o-*/*"
            }
        ]
    }

    The IAM permission policy allows the Security Hub administrator account to look up the alternate contact information for the member accounts.

    Make a note of the Role ARN for the IAM role similar to this format:

    arn:aws:iam::<Org. Management Account id>:role/account-contact-readonly. 
    			

    You will need this while deploying the solution in the next procedure.

    Using AWS CloudFormation

    Alternatively, you can use the provided CloudFormation template to create the role in the management account. The IAM role ARN is available in the Outputs section of the created CloudFormation stack.

    Deploy the Solution to your Security Hub administrator account

    You can deploy the solution using either the AWS Management Console, or from the GitHub repository using the AWS SAM CLI.

    Note: If you have designated an aggregation Region within the Security Hub administrator account, you can deploy this solution only in the aggregation Region. Otherwise, you need to deploy this solution separately in each Region of the Security Hub administrator account where Security Hub is enabled.

    To deploy the solution using the AWS Management Console

    1. In your Security Hub administrator account, launch the template by choosing the Launch Stack button below, which creates the stack in the us-east-1 Region.

      Note: If your Security Hub aggregation Region is different from us-east-1, or you want to deploy the solution in a different AWS Region, you can deploy the solution from the GitHub repository described in the next section.

    2. On the Quick create stack page, for Stack name, enter a unique stack name for this account; for example, aws-security-hub–findings-enrichment-stack, as shown in Figure 2 below.
      Figure 2: Quick Create CloudFormation stack for the Solution

    3. For ManagementAccount, enter the AWS Organizations management account ID.
    4. For OrgManagementAccountContactRole, enter the role ARN of the role you created previously in the Create IAM role procedure.
    5. Choose Create stack.
    6. Once the stack is created, go to the Resources tab and take note of the name of the IAM Role which was created.
    7. Update the principal element of the IAM role trust policy which you previously created in the Organization management account in the Create the IAM role procedure above, replacing it with the role name you noted down, as shown below.
      Figure 3: Update Management Account Role’s Trust

    To deploy the solution from the GitHub Repository and AWS SAM CLI

    1. Install the AWS SAM CLI
    2. Download or clone the GitHub repository using the following commands:
      $ git clone https://github.com/aws-samples/aws-security-hub-findings-account-data-enrichment.git
      $ cd aws-security-hub-findings-account-data-enrichment

    3. Update the content of the profile.txt file with the profile name you want to use for the deployment.
    4. To create a new bucket for deployment artifacts, run create-bucket.sh, specifying the Region as an argument:
      $ ./create-bucket.sh us-east-1

    5. Deploy the solution to the account by running the deploy.sh script, specifying the Region as an argument:
      $ ./deploy.sh us-east-1

    6. Once the stack is created, go to the Resources tab and take note of the name of the IAM Role which was created.
    7. Update the principal element of the IAM role trust policy which you previously created in the Organization management account in the Create the IAM role procedure above, replacing it with the role name you noted down, as shown below.
      "AWS": "arn:aws:iam::<SH Delegated Account ID>: role/<Role Name>"

    Using the enriched attributes

    To test that the solution is working as expected, you can create a standalone security group with an ingress rule that allows traffic from the internet. This will trigger a finding in Security Hub, which will be populated with the enriched attributes. You can then use these enriched attributes to filter and create custom insights, or take specific response or remediation actions.

    To generate a sample Security Hub finding using AWS CLI

    1. Create a Security Group using following AWS CLI command:
      aws ec2 create-security-group --group-name TestSecHubEnrichmentSG --description "Test Security Hub enrichment function"

    2. Make a note of the security group ID from the output, and use it in Step 3 below.
    3. Add an ingress rule to the security group which allows unrestricted traffic on port 100:
      aws ec2 authorize-security-group-ingress --group-id <Replace Security group ID> --protocol tcp --port 100 --cidr 0.0.0.0/0

    Within a few minutes, a new finding is generated in Security Hub, warning about the unrestricted ingress rule in the TestSecHubEnrichmentSG security group. For any new or updated findings that do not have the UserDefinedFields attribute findingEnriched set to true, the solution enriches the finding with account-related fields in both the Note and UserDefinedFields sections of the Security Hub finding.

    To see and filter the enriched finding

    1. Go to Security Hub and click on Findings on the left-hand navigation.
    2. Click in the filter field at the top to add additional filters. Choose a filter field of AWS Account ID, a filter match type of is, and a value of the AWS Account ID where you created the TestSecHubEnrichmentSG security group.
    3. Add one more filter. Choose a filter field of Resource type, a filter match type of is, and the value of AwsEc2SecurityGroup.
    4. Identify the finding for security group TestSecHubEnrichmentSG with updates to Note and UserDefinedFields, as shown in Figures 4 and 5 below:
      Figure 4: Account metadata enrichment in Security Hub finding’s Note field

      Figure 5: Account metadata enrichment in Security Hub finding’s UserDefinedFields field

      Note: The actual attributes you will see as part of the UserDefinedFields may be different from the above screenshot. Attributes shown will depend on your tagging configuration and the alternate contact configuration. At a minimum, you will see the AccountName and OU fields.

    5. Once you confirm that the solution is working as expected, delete the stand-alone security group TestSecHubEnrichmentSG, which was created for testing purposes.

    Create custom insights using the enriched attributes

    You can use the attributes available in the UserDefinedFields of the Security Hub finding to filter findings. This lets you generate custom Security Hub insights and reports tailored to your organization’s needs. The example shown in Figure 6 below creates a custom Security Hub insight for findings grouped by severity for a specific owner, using the Owner attribute within the UserDefinedFields object of the Security Hub finding.

    Figure 6: Custom Insight with Account metadata filters
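    You can also create a similar insight programmatically. The boto3 sketch below is illustrative only; it assumes an Owner key exists in UserDefinedFields and groups active findings by severity.

      import boto3

      securityhub = boto3.client("securityhub")

      # Group active findings for a specific owner by severity.
      securityhub.create_insight(
          Name="Active findings by severity for owner team-payments",  # placeholder
          GroupByAttribute="SeverityLabel",
          Filters={
              "RecordState": [{"Value": "ACTIVE", "Comparison": "EQUALS"}],
              "UserDefinedFields": [
                  {"Key": "Owner", "Value": "team-payments", "Comparison": "EQUALS"}
              ],
          },
      )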

    EventBridge rule for response or remediation action using enriched attributes

    You can also use the attributes in the UserDefinedFields object of the Security Hub finding within an EventBridge rule to take specific response or remediation actions based on the attribute values. In the example below, the Environment attribute is used within the EventBridge rule configuration to trigger specific actions only when the value matches PROD.

    {
      "detail-type": ["Security Hub Findings - Imported"],
      "source": ["aws.securityhub"],
      "detail": {
        "findings": {
          "RecordState": ["ACTIVE"],
          "UserDefinedFields": {
            "Environment": "PROD"
          }
        }
      }
    }

    Conclusion

    This blog post walks you through a solution to enrich AWS Security Hub findings with AWS account-related metadata using Amazon EventBridge notifications and AWS Lambda. By enriching the Security Hub findings with account-related information, your security teams gain better visibility, additional insights, and an improved ability to create targeted reports for specific accounts or business teams, helping them prioritize and improve the overall security response. To learn more, see:

 
If you have feedback about this post, submit comments in the Comments section below. If you have any questions about this post, start a thread on the AWS Security Hub forum.

Want more AWS Security news? Follow us on Twitter.

Siva Rajamani

Siva Rajamani is a Boston-based Enterprise Solutions Architect at AWS. Siva enjoys working closely with customers to accelerate their AWS cloud adoption and improve their overall security posture.

Prashob Krishnan

Prashob Krishnan is a Denver-based Technical Account Manager at AWS. Prashob is passionate about security. He enjoys working with customers to solve their technical challenges and help build a secure scalable architecture on the AWS Cloud.

Automatically resolve Security Hub findings for resources that no longer exist

Post Syndicated from Kris Normand original https://aws.amazon.com/blogs/security/automatically-resolve-security-hub-findings-for-resources-that-no-longer-exist/

In this post, you’ll learn how to automatically resolve AWS Security Hub findings for previously deleted Amazon Web Services (AWS) resources. By using an event-driven solution, you can automatically resolve findings for AWS and third-party service integrations.

Security Hub provides a comprehensive view of your security alerts and security posture across your AWS accounts. Security Hub provides a single place that aggregates, organizes, and prioritizes your security alerts (also called findings) from multiple AWS services and partner solutions. Security Hub lets you assign workflow statuses of NEW, NOTIFIED, SUPPRESSED, or RESOLVED to findings. These statuses help you understand the state of your security findings and identify which need attention. As AWS resources are spun up and down during the course of normal business activities, there might be findings in Security Hub for those resources. AWS Security Hub findings backed by AWS Config are automatically archived when AWS Config identifies that a resource has been deleted. However, for some AWS service integrations—such as Amazon GuardDuty and third-party partner products—findings aren’t automatically resolved or archived when a resource is deleted. This can result in orphaned findings for resources that no longer exist.

In this post, we show you how to use an event-driven architecture to automatically resolve findings for all providers—AWS and third-party—for resources that have been deleted. Automatically resolving these findings reduces alert fatigue by decreasing noise, allowing your security team to focus on investigating and remediating high fidelity findings.

A common use case for automatically resolving findings is for Amazon Elastic Compute Cloud (Amazon EC2) instances that are ephemeral in nature. For example, Amazon EC2 instances that are part of an Amazon EC2 Auto Scaling group. EC2 instances can scale to thousands of nodes multiple times a day depending on the workload. Without automatically resolving these findings, you could end up with one or more findings for each instance. By automatically resolving findings for the deleted resources, your teams can focus on investigating and remediating findings that affect active resources.

Prerequisites

This solution assumes that you have Security Hub and AWS Config configured across all of your AWS accounts. Instructions for configuring Security Hub and its dependencies can be found in the Security Hub user guide. Ensure you have configured Security Hub to use a delegated administrator account, which centralizes findings from all member accounts.

Solution overview

In Security Hub, the investigation status of a finding is tracked using the workflow status attribute. The workflow status attribute for new findings is initially set to NEW. You can change the workflow status of a finding by either selecting it in the AWS Security Hub console, or by automating the change of workflow status by using the AWS Command Line Interface (AWS CLI) or Security Hub SDKs. The usual workflow for a finding, whether managed manually or through automation, is NEW, NOTIFIED, then RESOLVED or SUPPRESSED.

In this solution, we show you how to automatically set the workflow status to RESOLVED for all applicable findings when an EC2 instance, Amazon Simple Storage Service (Amazon S3) bucket, or an AWS Identity and Access Management (IAM) role is deleted. This event-driven solution uses Amazon EventBridge event patterns—which can be easily customized to meet your specific business needs—to invoke the resolution workflow on Delete or Terminate API calls. An EventBridge event bus is used to forward all Delete or Terminate API calls to your Security Hub delegated administrator account. Event patterns are used to filter for specific events and forward them to a target. With this solution, you filter for specific Delete and Terminate events, identified by the event name. The target for matching events is an AWS Lambda function. The invocation of this function includes context around the event which includes the metadata for the resource that was just deleted or terminated. This function queries the Security Hub GetFindings API for all findings for the resource with a status of NEW or NOTIFIED. The function then sets the workflow status to RESOLVED for all findings for the Amazon Resource Name (ARN) of the given resource by calling the BatchUpdateFindings Security Hub API.
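A condensed sketch of that Lambda logic is shown below. It is not the exact function from the solution; the filter values and note text follow the behavior described above, and pagination and ARN extraction are simplified.

    import boto3

    securityhub = boto3.client("securityhub")

    def resolve_findings_for_resource(resource_arn):
        """Set workflow status to RESOLVED for NEW/NOTIFIED findings on a deleted resource."""
        filters = {
            "ResourceId": [{"Value": resource_arn, "Comparison": "EQUALS"}],
            "WorkflowStatus": [
                {"Value": "NEW", "Comparison": "EQUALS"},
                {"Value": "NOTIFIED", "Comparison": "EQUALS"},
            ],
        }
        identifiers = []
        paginator = securityhub.get_paginator("get_findings")
        for page in paginator.paginate(Filters=filters):
            identifiers.extend(
                {"Id": f["Id"], "ProductArn": f["ProductArn"]} for f in page["Findings"]
            )

        # BatchUpdateFindings accepts up to 100 finding identifiers per call.
        for i in range(0, len(identifiers), 100):
            securityhub.batch_update_findings(
                FindingIdentifiers=identifiers[i:i + 100],
                Workflow={"Status": "RESOLVED"},
                Note={
                    "Text": "Resource deleted; finding resolved automatically.",
                    "UpdatedBy": "DeletedResourceFindingResolver",
                },
            )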

Solution architecture

Figure 1: Solution architecture overview

Figure 1 shows the deletion of an AWS resource in a Security Hub member account being forwarded to the EventBridge event bus in the Security Hub administrator account. The process flow is as follows:

  1. In a Security Hub member account, a user deletes or terminates a resource through the AWS Management Console, AWS CLI, or SDK.
  2. AWS CloudTrail logs the user activity and automatically forwards an event to EventBridge.
  3. An EventBridge event pattern filters for the delete or terminate API call, and generates an event.
  4. The event is forwarded to the event bus in the Security Hub administrator account.
  5. In the Security Hub administrator account, an event pattern is used to filter for all delete or terminate API calls.
  6. Matching events generate an EventBridge event.
  7. The target for this event is the Lambda function to resolve Security Hub findings for the recently deleted resource.
  8. The Lambda function generates a list of all findings for the recently deleted resource and updates the workflow status for each finding to RESOLVED in the Security Hub delegated administrator account.
  9. The workflow status propagates from the Security Hub delegated administrator account to the member accounts of Security Hub.

To deploy the solution

In the Security Hub administrator account, complete the following steps:

  1. In the following resource policy, replace <Region> with the AWS Region where the solution is deployed, <AccountID> with the Security Hub administrator account ID, and <OrgID> with the ID of the organization within your AWS Organizations implementation.
    {
      "Version": "2012-10-17",
      "Statement": [{
        "Sid": "allow_all_accounts_from_organization_to_put_events",
        "Effect": "Allow",
        "Principal": "*",
        "Action": "events:PutEvents",
        "Resource": "arn:aws:events:<Region>:<AccountID>:event-bus/default",
        "Condition": {
          "StringEquals": {
            "aws:PrincipalOrgID": "<OrgID>"
          }
        }
      }]
    }
    

  2. Add the edited resource policy to the default EventBridge event bus to allow all accounts in your organization to send delete events for IAM roles, EC2 instances, and S3 buckets to the default event bus in the Security Hub administrator account.

    Note: You can also choose to specify a list of accounts to receive events from. For more information about configuring a resource policy see Managing event bus permissions in Permissions for Amazon EventBridge event buses.

  3. Deploy the AWS CloudFormation template that creates the required resources.

    Launch Stack Button

In each Security Hub member account, deploy the CloudFormation template. You will need to specify the Security Hub administrator AWS account ID to deploy the stack.

Launch Stack Button

Tip: CloudFormation StackSets can be used to deploy stacks across all accounts in your organization. For more information, see Working with AWS CloudFormation StackSets.

Note: With CloudFormation StackSets, the template isn’t deployed in the StackSet administrator account by default. The CloudFormation stack must be deployed separately in the StackSet administrator account.

Note: Security Hub now supports cross-Region aggregation of findings. If you have Security Hub cross-Region aggregation enabled, the solution in this post will work for findings in all aggregated Regions.

Next steps

Understanding and fixing the root cause for Security Hub findings will improve your security posture and reduce the number of future findings. As a best practice, you should periodically analyze the findings for resources that have been automatically resolved by this solution to identify trends so that your team can investigate and fix root causes. You can use the filter below in the Security Hub console to view all findings automatically resolved by this solution:

To analyze findings

  1. Open the Security Hub console and select Findings.
  2. Check to see that Workflow status is RESOLVED and Note updated by is DeletedResourceFindingResolver.
  3. (Optional) You can also create a custom insight for these findings by adding Group by: ProductName to the filter.
  4. Select Create Insight as shown in Figure 2.
Figure 2: AWS Security Hub

Note: You can expand the solution to include other resource types based on your requirements, such as security groups, Amazon Relational Database Service (Amazon RDS) databases, and IAM users by updating the event pattern in the EventBridge rule and modifying the Lambda function code.

Conclusion

In this post, we showed how you can automatically resolve findings for deleted EC2, IAM, and S3 resources by using the Security Hub GetFindings and BatchUpdateFindings API actions. We showed you how to configure EventBridge patterns and rules to initiate a Lambda function through a centralized event bus to address these findings for resources across your organization.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the Security Hub forum. To start your 30-day free trial of Security Hub, visit AWS Security Hub.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Kris Normand

Kris is a Senior Security Consultant with AWS Professional Services. He partners with Chief Information Security Officers to lead their digital transformation efforts on AWS. When not working on security and compliance initiatives with customers, Kris enjoys traveling, hiking, and spending time with his family. He is also a veteran of the U.S Air Force.

Author

Cory Smith

Cory is a Senior Security Consultant with AWS Professional Services based in San Antonio, TX. He partners with customers to deliver key business outcomes in a secure and compliant manner within AWS. He’s passionate about solving complex technical problems with new and innovative solutions.

Author

Kafayat Adeyemi

Kafayat is a Senior Technical Account Manager at AWS based in Atlanta, GA. She is passionate about security and works with enterprise customers to build, deploy, and manage secure and scalable workloads on AWS. Outside of work, she loves to travel, bake, and spend time with her family.

Author

Justin Kontny

Justin is a Senior Security Consultant with the AWS Global Security Practice, a part of our Worldwide Professional Services Organization. He helps customers improve their security posture as they migrate their most sensitive workloads to AWS. Justin has a passion for detective controls and scaling security via automation.

ICYMI: Serverless Q4 2021

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/icymi-serverless-q4-2021/

Welcome to the 15th edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share all of the most recent product launches, feature enhancements, blog posts, webinars, Twitch live streams, and other interesting things that you might have missed!

Q4 calendar

In case you missed our last ICYMI, check out what happened last quarter here.

AWS Lambda

For developers using Amazon MSK as an event source, Lambda has expanded authentication options to include IAM, in addition to SASL/SCRAM. Lambda also now supports mutual TLS authentication for Amazon MSK and self-managed Kafka as an event source.

Lambda also launched features to make it easier to operate across AWS accounts. You can now invoke Lambda functions from Amazon SQS queues in different accounts. You must grant permission to the Lambda function’s execution role and have SQS grant cross-account permissions. For developers using container packaging for Lambda functions, Lambda also now supports pulling images from Amazon ECR in other AWS accounts. To learn about the permissions required, see this documentation.

The service now supports a partial batch response when using SQS as an event source for both standard and FIFO queues. When messages fail to process, Lambda marks the failed messages and allows reprocessing of only those messages. This helps to improve processing performance and may reduce compute costs.

Lambda launched content filtering options for functions using SQS, DynamoDB, and Kinesis as an event source. You can specify up to five filter criteria that are combined using OR logic. This uses the same content filtering language that’s used in Amazon EventBridge, and can dramatically reduce the number of downstream Lambda invocations.

Amazon EventBridge

Previously, you could consume Amazon S3 events in EventBridge via CloudTrail. Now, EventBridge receives events from the S3 service directly, making it easier to build serverless workflows triggered by activity in S3. You can use content filtering in rules to identify relevant events and forward these to 18 service targets, including AWS Lambda. You can also use event archive and replay, making it possible to reprocess events in testing, or in the event of an error.

AWS Step Functions

The AWS Batch console has added support for visualizing Step Functions workflows. This makes it easier to combine these services to orchestrate complex workflows over business-critical batch operations, such as data analysis or overnight processes.

Additionally, Amazon Athena has also added console support for visualizing Step Functions workflows. This can help when building distributed data processing pipelines, allowing Step Functions to orchestrate services such as AWS Glue, Amazon S3, or Amazon Kinesis Data Firehose.

Synchronous Express Workflows now supports AWS PrivateLink. This enables you to start these workflows privately from within your virtual private clouds (VPCs) without traversing the internet. To learn more about this feature, read the What’s New post.

Amazon SNS

Amazon SNS announced support for token-based authentication when sending push notifications to Apple devices. This creates a secure, stateless communication channel between SNS and the Apple Push Notification service (APNs).

SNS also launched the new PublishBatch API which enables developers to send up to 10 messages to SNS in a single request. This can reduce cost by up to 90%, since you need fewer API calls to publish the same number of messages to the service.
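For example, a batch publish with boto3 might look like the following sketch; the topic ARN and messages are placeholders.

    import boto3

    sns = boto3.client("sns")

    # Publish up to 10 messages in a single request.
    response = sns.publish_batch(
        TopicArn="arn:aws:sns:us-east-1:111122223333:orders",   # placeholder
        PublishBatchRequestEntries=[
            {"Id": str(i), "Message": f"order-{i} placed"} for i in range(10)
        ],
    )
    print(response["Successful"], response["Failed"])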

Amazon SQS

Amazon SQS released an enhanced DLQ management experience for standard queues. This allows you to redrive messages from a DLQ back to the source queue. This can be configured in the AWS Management Console, as shown here.

Amazon DynamoDB

The NoSQL Workbench for DynamoDB is a tool to simplify designing, visualizing, and querying DynamoDB tables. The tool now supports importing sample data from CSV files and exporting the results of queries.

DynamoDB announced the new Standard-Infrequent Access table class. Use this for tables that store infrequently accessed data to reduce your costs by up to 60%. You can switch to the new table class without an impact on performance or availability and without changing application code.

AWS Amplify

AWS Amplify now allows developers to override Amplify-generated IAM, Amazon Cognito, and S3 configurations. This makes it easier to customize the generated resources to best meet your application’s requirements. To learn more about the “amplify override auth” command, visit the feature’s documentation.

Similarly, you can also add custom AWS resources using the AWS Cloud Development Kit (CDK) or AWS CloudFormation. In another new feature, developers can then export Amplify backends as CDK stacks and incorporate them into their deployment pipelines.

AWS Amplify UI has launched a new Authenticator component for React, Angular, and Vue.js. Aside from the visual refresh, this provides the easiest way to incorporate social sign-in in your frontend applications with zero-configuration setup. It also includes more customization options and form capabilities.

AWS launched AWS Amplify Studio, which automatically translates designs made in Figma to React UI component code. This enables you to connect UI components visually to backend data, providing a unified interface that can accelerate development.

AWS AppSync

You can now use custom domain names for AWS AppSync GraphQL endpoints. This enables you to specify a custom domain for both GraphQL API and Realtime API, and have AWS Certificate Manager provide and manage the certificate.

To learn more, read the feature’s documentation page.

News from other services

Serverless blog posts

October

November

December

AWS re:Invent breakouts

AWS re:Invent was held in Las Vegas from November 29 to December 3, 2021. The Serverless DA team presented numerous breakouts, workshops and chalk talks. Rewatch all our breakout content:

Serverlesspresso

We also launched an interactive serverless application at re:Invent to help customers get caffeinated!

Serverlesspresso is a contactless, serverless order management system for a physical coffee bar. The architecture comprises several serverless apps that support an ordering process from a customer’s smartphone to a real espresso bar. The customer can check the virtual line, place an order, and receive a notification when their drink is ready for pickup.

Serverlesspresso booth

You can learn more about the architecture and download the code repo at https://serverlessland.com/reinvent2021/serverlesspresso. You can also see a video of the exhibit.

Videos

Serverless Land videos

Serverless Office Hours – Tues 10 AM PT

Weekly live virtual office hours. In each session we talk about a specific topic or technology related to serverless and open it up to helping you with your real serverless challenges and issues. Ask us anything you want about serverless technologies and applications.

YouTube: youtube.com/serverlessland
Twitch: twitch.tv/aws

October

November

December

Still looking for more?

The Serverless landing page has more information. The Lambda resources page contains case studies, webinars, whitepapers, customer stories, reference architectures, and even more Getting Started tutorials.

You can also follow the Serverless Developer Advocacy team on Twitter to see the latest news, follow conversations, and interact with the team.

Use AWS Step Functions to Monitor Services Choreography

Post Syndicated from Vito De Giosa original https://aws.amazon.com/blogs/architecture/use-aws-step-functions-to-monitor-services-choreography/

Organizations frequently need access to quick visual insight on the status of complex workflows. This involves collaboration across different systems. If your customer requires assistance on an order, you need an overview of the fulfillment process, including payment, inventory, dispatching, packaging, and delivery. If your products are expensive assets such as cars, you must track each item’s journey instantly.

Modern applications use event-driven architectures to manage the complexity of system integration at scale. These often use choreography for service collaboration. Instead of directly invoking systems to perform tasks, services interact by exchanging events through a centralized broker. Complex workflows are the result of actions each service initiates in response to events produced by other services. Services do not directly depend on each other. This increases flexibility, development speed, and resilience.

However, choreography can introduce two main challenges for the visibility of your workflow.

  1. It obfuscates the workflow definition. The sequence of events emitted by individual services implicitly defines the workflow. There is no formal statement that describes steps, permitted transitions, and possible failures.
  2. It might be harder to understand the status of workflow executions. Services act independently, based on events. You can implement distributed tracing to collect information related to a single execution across services. However, getting visual insights from traces may require custom applications. This increases time to market (TTM) and cost.

To address these challenges, we will show you how to use AWS Step Functions to model choreographies as state machines. The solution enables stakeholders to gain visual insights on workflow executions, identify failures, and troubleshoot directly from the AWS Management Console.

This GitHub repository provides a Quick Start and examples on how to model choreographies.

Modeling choreographies with Step Functions

Monitoring a choreography requires a formal representation of the distributed system behavior, such as state machines. State machines are mathematical models representing the behavior of systems through states and transitions. States model situations in which the system can operate. Transitions define which input causes a change from the current state to the next. They occur when a new event happens. Figure 1 shows a state machine modeling an order workflow.

Figure 1. Order workflow

The solution in this post uses Amazon States Language to describe a choreography as a Step Functions state machine. The state machine pauses, using Task states combined with a callback integration pattern. It then waits for the next event to be published on the broker. Choice states control transitions to the next state by inspecting event payloads. Figure 2 shows how the workflow in Figure 1 translates to a Step Functions state machine.

Figure 2. Order workflow translated into Step Functions state machine

Figure 3 shows the architecture for monitoring choreographies with Step Functions.

Figure 3. Choreography monitoring with AWS Step Functions

  1. Services involved in the choreography publish events to Amazon EventBridge. There are two configured rules. The first rule matches the first event of the choreography sequence, Order Placed in the example. The second rule matches any other event of the sequence. Event payloads contain a correlation id (order_id) to group them by workflow instance.
  2. The first rule invokes an AWS Lambda function, which starts a new execution of the choreography state machine. The correlation id is passed in the name parameter, so you can quickly identify an execution in the AWS Management Console.
  3. The state machine uses Task states with AWS SDK service integrations, to directly call Amazon DynamoDB. Tasks are configured with a callback pattern. They issue a token, which is stored in DynamoDB with the execution name. Then, the workflow pauses.
  4. A service publishes another event on the event bus.
  5. The second rule invokes another Lambda function with the event payload.
  6. The function uses the correlation id to retrieve the task token from DynamoDB.
  7. The function invokes the Step Functions SendTaskSuccess API, with the token and the event payload as parameters (see the sketch after this list).
  8. The state machine resumes the execution and uses Choice states to transition to the next state. If the choreography definition expects the received event payload, it selects the next state and the process will restart from Step # 3. The state machine transitions to a Fail state when it receives an unexpected event.
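The following sketch illustrates steps 6 and 7 as a Lambda handler. It is not the code from the referenced repository; the table name, key, and attribute names are assumptions.

    import json
    import boto3

    dynamodb = boto3.resource("dynamodb")
    stepfunctions = boto3.client("stepfunctions")

    # Placeholder name for the token store described above.
    TOKEN_TABLE = dynamodb.Table("choreography-task-tokens")

    def handler(event, context):
        """Resume the paused choreography state machine for this event's order."""
        order_id = event["detail"]["order_id"]

        # Step 6: look up the task token stored by the waiting Task state.
        item = TOKEN_TABLE.get_item(Key={"order_id": order_id})["Item"]

        # Step 7: resume the execution, passing the event payload as task output.
        stepfunctions.send_task_success(
            taskToken=item["task_token"],
            output=json.dumps(event["detail"]),
        )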

Increased visibility with Step Functions console

Modeling service choreographies as Step Functions Standard Workflows increases visibility with out-of-the-box features.

1. You can centrally track events produced by distributed components. Step Functions records full execution history for 90 days after the execution completes. You’ll be able to capture detailed information about the input and output of each state, including event payloads. Additionally, state machines integrate with Amazon CloudWatch to publish execution logs and metrics.

2. You can monitor choreographies visually. The Step Functions console displays a list of executions with information such as execution id, status, and start date (see Figure 4).

Figure 4. Step Functions workflow dashboard

After you’ve selected an execution, a graph inspector is displayed (see Figure 5). It shows states, transitions, and marks individual states with colors. This identifies at a glance, successful tasks, failures, and tasks that are still in progress.

Figure 5. Step Functions graph inspector

3. You can implement event-driven automation. Step Functions can capture execution status changes by emitting events directly to EventBridge (see Figure 6). Additionally, you can emit events by setting alarms on top of the metrics that Step Functions publishes to CloudWatch. You can respond to events by initiating corrective actions, sending notifications, or integrating with third-party solutions, such as issue tracking systems.

Figure 6. Automation with Step Functions, EventBridge, and CloudWatch alarms

Enabling access to AWS Step Functions console

Stakeholders need secure access to the Step Functions console. This requires mechanisms to authenticate users and authorize read-only access to specific Step Functions workflows.

AWS Single Sign-On authenticates users by directly managing identities or through federation. SSO supports federation with Active Directory and SAML 2.0 compliant external identity providers (IdP). Users gain access to Step Functions state machines by assigning a permission set, which is a collection of AWS Identity and Access Management (IAM) policies. Additionally, with permission sets, you can configure a relay state, which is a URL to redirect the user after successful authentication. You can authenticate the user through the selected identity provider and immediately show the AWS Step Functions console with the workflow state machine already displayed. Figure 7 shows this process.

Figure 7. Access to Step Functions state machine with AWS SSO

  1. The user logs in through the selected identity provider.
  2. The SSO user portal uses the SSO endpoint to send the response from the previous step. SSO uses AWS Security Token Service (STS) to get temporary security credentials on behalf of the user. It then creates a console sign-in URL using those credentials and the relay state. Finally, it sends the URL back as a redirect.
  3. The browser redirects the user to the Step Functions console.

When the identity provider does not support SAML 2.0, SSO is not a viable solution. In this case, you can create a URL with a sign-in token for users to securely access the AWS Management Console. This approach uses STS AssumeRole to get temporary security credentials. Then, it uses credentials to obtain a sign-in token from the AWS federation endpoint. Finally, it constructs a URL for the AWS Management Console, which includes the token. It then distributes this to users to grant access. This is similar to the SSO process. However, it requires custom development.
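A simplified sketch of that custom flow is shown below; the role ARN, issuer, and destination URL are placeholders, and error handling is omitted.

    import json
    import urllib.parse
    import boto3
    import requests  # third-party dependency used only for the HTTP calls

    FEDERATION_ENDPOINT = "https://signin.aws.amazon.com/federation"

    def build_console_url(role_arn, destination_url):
        """Return a sign-in URL that opens the Step Functions console page."""
        creds = boto3.client("sts").assume_role(
            RoleArn=role_arn, RoleSessionName="choreography-viewer"
        )["Credentials"]

        session = json.dumps({
            "sessionId": creds["AccessKeyId"],
            "sessionKey": creds["SecretAccessKey"],
            "sessionToken": creds["SessionToken"],
        })

        # Exchange the temporary credentials for a sign-in token.
        token = requests.get(FEDERATION_ENDPOINT, params={
            "Action": "getSigninToken", "Session": session,
        }).json()["SigninToken"]

        # Build the console URL, redirecting to the state machine page.
        return (FEDERATION_ENDPOINT + "?Action=login"
                + "&Issuer=example.com"
                + "&Destination=" + urllib.parse.quote_plus(destination_url)
                + "&SigninToken=" + token)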

Conclusion

This post shows how you can increase visibility on choreographed business processes using AWS Step Functions. The solution provides detailed visual insights directly from the AWS Management Console, without requiring custom UI development. This reduces TTM and cost.

To learn more:

Find Public IPs of Resources – Use AWS Config for Vulnerability Assessment

Post Syndicated from Gurkamal Deep Singh Rakhra original https://aws.amazon.com/blogs/architecture/find-public-ips-of-resources-use-aws-config-for-vulnerability-assessment/

Systems vulnerability management is a key component of your enterprise security program. Its goal is to remediate OS, software, and applications vulnerabilities. Scanning tools can help identify and classify these vulnerabilities to keep the environment secure and compliant.

Typically, vulnerability scanning tools operate from internal or external networks to discover and report vulnerabilities. For internal scanning, the tools use the private IPs of the target systems in scope. For external scans, the target systems' public IP addresses are used. It is important that security teams always maintain an accurate inventory of all deployed resources' IP addresses. This ensures a comprehensive, consistent, and effective vulnerability assessment.

This blog discusses a scalable, serverless, and automated approach to discover public IP addresses assigned to resources in a single or multi-account environment in AWS, using AWS Config.

Single account is when you have all your resources in a single AWS account. A multi-account environment refers to many accounts under the same AWS Organization.

Understanding scope of solution

You may have good visibility into the private IPs assigned to your resources: Amazon Elastic Compute Cloud (Amazon EC2), Amazon Elastic Kubernetes Service (EKS) clusters, Elastic Load Balancing (ELB), and Amazon Elastic Container Service (Amazon ECS). But it may require some effort to establish a complete view of the existing public IPs. And these IPs can change over time, as new systems join and exit the environment.

An elastic network interface is a logical networking component in a Virtual Private Cloud (VPC) that represents a virtual network card. The elastic network interface routes traffic to other destinations/resources. Usually, you have to make Describe* API calls for the specific resource with an elastic network interface to get information about its configuration and IP address. These resource-specific API calls may be throttled and can also result in higher costs. Additionally, if there are tens or hundreds of accounts, it becomes exponentially more difficult to get the information into a single inventory.

AWS Config enables you to assess, audit, and evaluate the configurations of your AWS resources. The advanced query feature provides a single query endpoint and language to get current resource state metadata for a single account and Region, or multiple accounts and Regions. You can use configuration aggregators to run the same queries from a central account across multiple accounts and AWS Regions.

AWS Config supports a subset of structured query language (SQL) SELECT syntax, which enables you to perform property-based queries and aggregations on the current configuration item (CI) data. Advanced query is available at no additional cost to AWS Config customers in all AWS Regions (except China Regions) and AWS GovCloud (US).

AWS Organizations helps you centrally govern your environment. Its integration with other AWS services lets you define central configurations, security mechanisms, audit requirements, and resource sharing across accounts in your organization.

Choosing scope of advanced queries in AWS Config

When running advanced queries in AWS Config, you must choose the scope of the query. The scope defines the accounts you want to run the query against and is configured when you create an aggregator.

Following are the three possible scopes when running advanced queries:

  1. Single account and single Region
  2. Multiple accounts and multiple Regions
  3. AWS Organization accounts

Single account and single Region

Figure 1. AWS Config workflow for single account and single Region

The use case shown in Figure 1 addresses the need of customers operating within a single account and single Region. With AWS Config enabled for the individual account, you will use AWS Config advanced query feature to run SQL queries. These will give you resource metadata about associated public IPs. You do not require an aggregator for single-account and single Region.

In Figure 1.1, the advanced query returned results from a single account and all Availability Zones within the Region in which the query was run.

Figure 1.1 Advanced query returning results for a single account and single Region

Query for reference

SELECT
  resourceId,
  resourceName,
  resourceType,
  configuration.association.publicIp,
  availabilityZone,
  awsRegion
WHERE
  resourceType='AWS::EC2::NetworkInterface'
  AND configuration.association.publicIp>'0.0.0.0'

This query fetches the properties of all elastic network interfaces. The WHERE condition lists the elastic network interfaces using the resourceType property and finds all public IPs greater than 0.0.0.0. This is because elastic network interfaces can exist with only a private IP, in which case no public IP is assigned. For a list of supported resourceType values, refer to supported resource types for AWS Config.

Multiple accounts and multiple Regions

Figure 2. AWS Config monitoring workflow for multiple account and multiple Regions. The figure shows EC2, EKS, and Amazon ECS, but it can be any AWS resource having a public elastic network interface.

AWS Config enables you to monitor configuration changes against multiple accounts and multiple Regions via an aggregator, see Figure 2. An aggregator is an AWS Config resource type that collects AWS Config data from multiple accounts and Regions. You can choose the aggregator scope when running advanced queries in AWS Config. Remember to authorize the aggregator accounts to collect AWS Config configuration and compliance data.

Figure 2.1 Advanced query returning results from multiple Regions (awsRegion column) as highlighted in the diagram

This use case applies when you have AWS resources in multiple accounts (or span multiple organizations) and multiple Regions. Figure 2.1 shows the query results being returned from multiple AWS Regions.

Accounts in an AWS Organization

Figure 3. The workflow of accounts in an AWS Organization being monitored by AWS Config. This figure shows EC2, EKS, and Amazon ECS but it can be any AWS resource having a public elastic network interface.

An aggregator also enables you to monitor all the accounts in your AWS Organization (see Figure 3). When this option is chosen, AWS Config enables you to run advanced queries against the configuration history of all the accounts in your AWS Organization. Remember that an aggregator only aggregates data from the accounts and Regions that are specified when the aggregator is created.

Figure 3.1 Advanced query returning results from all accounts (accountId column) under an AWS Organization

In Figure 3.1, the query runs against all accounts in an AWS Organization. This organization-wide scope is handled by the aggregator, which automatically accumulates data from all accounts under the specified AWS Organization.

Common architecture workflow for discovering public IPs

Figure 4. High-level architecture pattern for discovering public IPs

The workflow shown in Figure 4 starts with Amazon EventBridge invoking an AWS Lambda function on a schedule. You can configure an Amazon EventBridge schedule via rate or cron expressions, which define the frequency. The Lambda function hosts the code that calls the AWS Config API to run an advanced query. The advanced query checks for all elastic network interfaces in your account(s), because any public resource launched in your account is assigned an elastic network interface.

When the results are returned, they can be stored in Amazon S3. These result files can be timestamped (via naming or S3 versioning) to keep a history of public IPs used in your account. The result set can then be fed into, or accessed by, the vulnerability scanning tool of your choice.
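
A minimal sketch of such a Lambda function follows, assuming the target bucket name is provided in a BUCKET_NAME environment variable; error handling and the aggregator variant of the query call are omitted for brevity.

import boto3
import json
import os
from datetime import datetime, timezone

QUERY = (
    "SELECT resourceId, resourceType, configuration.association.publicIp, awsRegion "
    "WHERE resourceType='AWS::EC2::NetworkInterface' "
    "AND configuration.association.publicIp>'0.0.0.0'"
)

config = boto3.client('config')
s3 = boto3.client('s3')

def lambda_handler(event, context):
    # Collect all pages of advanced query results
    results, token = [], None
    while True:
        kwargs = {'Expression': QUERY, 'Limit': 100}
        if token:
            kwargs['NextToken'] = token
        response = config.select_resource_config(**kwargs)
        results.extend(json.loads(r) for r in response['Results'])
        token = response.get('NextToken')
        if not token:
            break

    # Store a timestamped snapshot in S3 for the scanning tool to pick up
    key = 'public-ips/{}.json'.format(datetime.now(timezone.utc).strftime('%Y-%m-%dT%H-%M-%S'))
    s3.put_object(Bucket=os.environ['BUCKET_NAME'], Key=key, Body=json.dumps(results))
    return {'count': len(results), 'key': key}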

Note: AWS Config advanced queries can also be used to query IPv6 addresses. You can use the “configuration.ipv6Addresses” AWS Config property to get IPv6 addresses. When querying IPv6 addresses, remove “configuration.association.publicIp > ‘0.0.0.0’” condition from the preceding sample queries. For more information on available AWS Config properties and data types, refer to GitHub.

Conclusion

In this blog, we demonstrated how to extract public IP information from resources deployed in your account(s) using AWS Config and AWS Config advanced query. We discussed how you can support your vulnerability scanning process by identifying public IPs in your account(s) that can be fed into your scanning tool. This solution is serverless, automated, and scalable, which removes the undifferentiated heavy lifting required to manage your resources.

Learn more about AWS Config best practices:

Modernized Database Queuing using Amazon SQS and AWS Services

Post Syndicated from Scott Wainner original https://aws.amazon.com/blogs/architecture/modernized-database-queuing-using-amazon-sqs-and-aws-services/

A queuing system is composed of producers and consumers. A producer enqueues messages (writes messages to a database) and a consumer dequeues messages (reads messages from the database). Business applications requiring asynchronous communications often use the relational database management system (RDBMS) as the default message storage mechanism. But increased message volume, complexity, and size compete with the inherent functionality of the database. The RDBMS becomes a bottleneck for message delivery, while also impacting other traditional enterprise uses of the database.

In this blog, we will show how you can mitigate the RDBMS performance constraints by using Amazon Simple Queue Service (Amazon SQS), while retaining the intrinsic value of the stored relational data.

Problems with legacy queuing methods

Commercial databases such as Oracle offer Advanced Queuing (AQ) mechanisms, while SQL Server supports Service Broker for queuing. The database acts as a message queue system when incoming messages are captured along with metadata. A message stored in a database is often processed multiple times using a sequence of message extraction, transformation, and loading (ETL). The message is then routed for distribution to a set of recipients based on logic that is often also stored in the database.

The repetitive manipulation of messages and iterative attempts at distributing pending messages may create a backlog that interferes with the primary function of the database. This backpressure can propagate to other systems that are trying to store and retrieve data from the database and cause a performance issue (see Figure 1).

Figure 1. A relational database serving as a message queue.

There are several scenarios where the database can become a bottleneck for message processing:

Message metadata. Messages consist of the payload (the content of the message) and metadata that describes the attributes of the message. The metadata often includes routing instructions, message disposition, message state, and payload attributes.

  • The message metadata may require iterative transformation during the message processing. This creates an inefficient sequence of read, transform, and write processes. This is especially inefficient if the message attributes undergo multiple transformations that must be reflected in the metadata. The iterative read/write process of metadata consumes the database IOPS, and forces the database to scale vertically (add more CPU and more memory).
  • A new paradigm emerges when message management processes exist outside of the database. Here, the metadata is manipulated without interacting with the database, except to write the final message disposition. Application logic can be applied through functions such as AWS Lambda to transform the message metadata.

Message large object (LOB). A message may contain a large binary object that must be stored in the payload.

  • Storing large binary objects in the RDBMS is expensive. Manipulating them consumes the throughput of the database with iterative read/write operations. If the LOB must be transformed, then it becomes wasteful to store the object in the database.
  • An alternative approach offers a more efficient message processing sequence. The large object is stored external to the database in universally addressable object storage, such as Amazon Simple Storage Service (Amazon S3), and only a pointer to the object is stored in the database. Smaller elements of the message can be read from or written to the database, while large objects can be manipulated more efficiently in object storage (see the sketch after this list).
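
As a rough sketch of that pattern, the helper below writes the large payload to Amazon S3 and returns the small pointer record that would be stored in the database; the bucket name is a placeholder.

import uuid
import boto3

s3 = boto3.client('s3')

def store_payload(payload_bytes, bucket='message-payload-bucket'):
    # Write the large object to S3; only the returned pointer goes into the RDBMS
    key = 'payloads/{}.bin'.format(uuid.uuid4())
    s3.put_object(Bucket=bucket, Key=key, Body=payload_bytes)
    return {'bucket': bucket, 'key': key, 'size': len(payload_bytes)}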

Message fan-out. A message can be loaded into the database and analyzed for routing, where the same message must be distributed to multiple recipients.

  • Messages that require multiple recipients may require a copy of the message replicated for each recipient. The replication creates multiple writes and reads from the database, which is inefficient.
  • A new method captures only the routing logic and target recipients in the database. The message replication then occurs outside of the database in distributed messaging systems, such as Amazon Simple Notification Service (Amazon SNS).

Message queuing. Messages are often kept in the database until they are successfully processed for delivery. If a message is read from the database and determined to be undeliverable, then the message is kept there until a later attempt is successful.

  • An inoperable message delivery process can create backpressure on the database where iterative message reads are processed for the same message with unsuccessful delivery. This creates a feedback loop causing even more unsuccessful work for the database.
  • Try a message queuing system such as Amazon MQ or Amazon SQS, which offloads the message queuing from the database. These services offer efficient message retry mechanisms, and reduce iterative reads from the database (see the sketch after this list).
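
A minimal sketch of that offload, assuming a queue named pending-messages already exists: the producer enqueues the message instead of inserting a pending row, and the consumer relies on the queue's visibility timeout for retries rather than re-reading undelivered rows from the database.

import boto3

sqs = boto3.client('sqs')
queue_url = sqs.get_queue_url(QueueName='pending-messages')['QueueUrl']

# Producer: enqueue instead of writing a pending row to the database
sqs.send_message(QueueUrl=queue_url, MessageBody='{"orderId": "1234", "action": "notify"}')

# Consumer: receive, process, and delete; unprocessed messages reappear automatically
messages = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=20)
for message in messages.get('Messages', []):
    # ... process the message body here ...
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message['ReceiptHandle'])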

Sequenced message delivery. Messages may require ordered delivery where the delivery sequence is crucial for maintaining application integrity.

  • The application may capture the message order within database tables, but the sorting function still consumes processing capabilities. The order sequence must be sorted and maintained for each attempted message delivery.
  • Message order can be maintained outside of the database using a queue system, such as Amazon SQS, with first-in/first-out (FIFO) delivery (see the sketch after this list).
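
The sketch below assumes a FIFO queue named ordered-messages.fifo already exists; messages that share a MessageGroupId are delivered in the order they were sent, so the database no longer has to sort pending deliveries.

import boto3

sqs = boto3.client('sqs')
fifo_url = sqs.get_queue_url(QueueName='ordered-messages.fifo')['QueueUrl']

# Messages with the same MessageGroupId are delivered in the order they are sent
for step in ['created', 'packed', 'shipped']:
    sqs.send_message(
        QueueUrl=fifo_url,
        MessageBody='{"orderId": "1234", "status": "%s"}' % step,
        MessageGroupId='order-1234',
        MessageDeduplicationId='order-1234-' + step,
    )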

Message scheduling. Messages may also be queued with a scheduled delivery attribute. These messages require an event-driven architecture that initiates scheduled message delivery.

  • The database often uses trigger mechanisms to initiate message delivery. Message delivery may require a synchronized point in time for delivery (many messages at once), which can cause a spike in work at the scheduled interval. This impacts the database performance with artificially induced peak load intervals.
  • Event signals can be generated in systems such as Amazon EventBridge, which can coordinate the transmission of messages.

Message disposition. Each message maintains a message disposition state that describes the delivery state.

  • The database is often used as a logging system for message transmission status. The message metadata is updated with the disposition of the message, while the message remains in the database as an artifact.
  • An optimized technique is available using Amazon CloudWatch as a record of message disposition.

Modernized queuing architecture

Decoupling message queuing from the database improves database availability and enables greater message queue scalability. It also provides a more cost-effective use of the database, and mitigates backpressure created when database performance is constrained by message management.

The modernized architecture uses loosely coupled services, such as Amazon S3, AWS Lambda, Amazon MQ, Amazon SQS, Amazon SNS, Amazon EventBridge, and Amazon CloudWatch. This loosely coupled architecture lets each of the functional components scale vertically and horizontally, independently of the other functions required for message queue management.

Figure 2 depicts a message queuing architecture that uses Amazon SQS for message queuing and AWS Lambda for message routing, transformation, and disposition management. An RDBMS is still leveraged to retain metadata profiles, routing logic, and message disposition. The ETL processes are handled by AWS Lambda, while large objects are stored in Amazon S3. Finally, message fan-out distribution is handled by Amazon SNS, and the queue state is monitored and managed by Amazon CloudWatch and Amazon EventBridge.

Figure 2. Modernized queuing architecture using Amazon SQS

Conclusion

In this blog, we show how queuing functionality can be migrated from the RDBMS while minimizing changes to the business application. The RDBMS continues to play a central role in sourcing the message metadata, running routing logic, and storing message disposition. However, AWS services such as Amazon SQS offload queue management tasks related to the messages. AWS Lambda performs message transformation, queues the message, and transmits the message with massive scale, fault tolerance, and efficient message distribution.

Read more about the diverse capabilities of AWS messaging services:

By using AWS services, the RDBMS is no longer a performance bottleneck in your business applications. This improves scalability, and provides resilient, fault-tolerant, and efficient message delivery.

Read our blog on modernization of common database functions:

Serverless Scheduling with Amazon EventBridge, AWS Lambda, and Amazon DynamoDB

Post Syndicated from Peter Grman original https://aws.amazon.com/blogs/architecture/serverless-scheduling-with-amazon-eventbridge-aws-lambda-and-amazon-dynamodb/

Many applications perform scheduled tasks. For instance, you might want to automatically publish an article at a given time, change prices for offers which were defined weeks in advance, or notify customers 8 hours before a flight. These might be one-off tasks, or recurring ones.

On Unix-like operating systems, you might have opted for the cron utility. There are also similar alternatives for many web application frameworks, as well as advanced libraries, to schedule future one-off tasks. In a single-server environment, this might seem like a simple solution. However, when you run dozens of instances of your application server, it gets harder to rely on those libraries to schedule tasks reliably at least once, without taking up too many resources. If you decide to build a serverless application, you need a new approach altogether.

This post shows how you can build a scalable serverless job scheduler. You can use this method to scale to thousands, or even millions, of distributed jobs per minute. Because you are using serverless technologies, the underlying infrastructure is fully managed by AWS and you only pay for what you use. You can use this solution as an addition to your existing applications, regardless if they already use serverless technologies.

Similarly to a cron job running on a single instance, this solution uses an Amazon EventBridge rule, which starts new events periodically on a schedule. For recurring jobs, you would use this capability to start specific actions directly. This will work if you have only a few dozen periodic tasks, whose execution cycle can be defined as a cron expression. However, remember that there are limits to how many rules can be defined per event bus, and rules with a scheduled expression can only be defined on the default event bus. This post describes a method to multiplex a single Amazon EventBridge rule via an AWS Lambda function and Amazon DynamoDB, to scale beyond thousands of jobs. While this example focuses on one-off tasks, you can use the same approach for recurring jobs as well.

Overview of solution

The following diagram shows the architecture of the serverless scheduling solution.

Figure 1 – Architecture diagram showing Serverless Scheduling with Amazon EventBridge, AWS Lambda, and Amazon DynamoDB

Amazon EventBridge with scheduled expressions periodically starts an AWS Lambda function. An Amazon DynamoDB table stores the future jobs. The Lambda function queries the table for due jobs and distributes them via Amazon EventBridge to the workers.

The following services are used:

Amazon EventBridge: to initiate the serverless scheduling solution. Amazon EventBridge is a serverless event bus that makes it easier to build event-driven applications at scale. It can also schedule events based on time intervals or cron expressions.

In this solution, you’ll use EventBridge for two things:

  1. to periodically start the AWS Lambda function, which checks for new jobs to be executed, and
  2. to distribute those jobs to the workers.

Here, you can control the granularity of your job executions. The fastest rate possible is once every minute. But if you don’t need a 1-minute precision, you can also opt for once every 5 minutes, or even once every hour. Remember that you cannot control at which second the event is started. It might be at the beginning of the minute, in the middle, or at the end.

AWS Lambda: to execute the scheduler logic. AWS Lambda is a serverless, event-driven compute service that lets you run code without provisioning or managing servers. The Lambda function queries the jobs from DynamoDB and distributes them via EventBridge. Based on your requirements, you can adjust this to use different mechanisms to notify the workers about the jobs, such as HTTP APIs, gRPC calls, or AWS services like Amazon Simple Notification Service (SNS) or Amazon Simple Queue Service (SQS).

Amazon DynamoDB: to store scheduled jobs. Amazon DynamoDB is a fully managed, serverless, key-value NoSQL database designed to run high-performance applications at any scale. Defining the right data model is important to be able to scale to thousands or even millions of scheduled and processed jobs per minute. The DynamoDB table in this solution has a partition key “pk” and a sort key “sk”. For the Lambda function to be able to query all due jobs quickly and efficiently, jobs must be partitioned. For this, they are grouped together based on their scheduled times in intervals of 5 minutes. This value is the partition key “pk”. How to calculate this value is explained in detail when you test the solution.

The sort key “sk” contains the precise execution time concatenated with a unique identifier, such as a job ID, because the combination of “pk” and “sk” must be unique. To schedule a job in this example, you write it manually into the DynamoDB table. In your production code, you can abstract the synchronous DynamoDB access by implementing it in a shared library or using Amazon API Gateway. You could also schedule jobs from a Lambda function reacting to events in your system.
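
As an illustration of that abstraction, the following sketch computes the “pk” and “sk” values described in this section and writes a job item with boto3; the table name and helper are assumptions for illustration, not part of the deployed stack.

import uuid
import boto3
from datetime import datetime, timedelta, timezone

table = boto3.resource('dynamodb').Table('JobsTable')  # table name assumed for illustration

def schedule_job(due_time_utc, detail, detail_type):
    # Partition key: the due time truncated to a 5-minute bucket, for example "j#2015-03-20T09:45"
    bucket_minute = due_time_utc.minute - due_time_utc.minute % 5
    pk = 'j#{}{:02d}'.format(due_time_utc.strftime('%Y-%m-%dT%H:'), bucket_minute)
    # Sort key: the precise due timestamp plus a unique suffix, so equal timestamps don't collide
    sk = '{}Z#{}'.format(due_time_utc.strftime('%Y-%m-%dT%H:%M:%S.%f')[:-3], uuid.uuid4())
    table.put_item(Item={'pk': pk, 'sk': sk, 'detail': detail, 'detail_type': detail_type})
    return pk, sk

# Example: schedule a reminder two minutes from now
schedule_job(datetime.now(timezone.utc) + timedelta(minutes=2),
             {'action': 'send-reminder', 'userId': 'example-user'}, 'job-reminder')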

Amazon EventBridge: to distribute the jobs. The Lambda function uses Amazon EventBridge as an example to distribute the jobs. The workers which should receive the jobs, must configure the corresponding rules upfront. For testing purposes, this solution comes with a rule which logs all events from the Lambda function into Amazon CloudWatch Logs.

Walkthrough

In this section, you will deploy the solution and test it.

Deploying the solution

To deploy it in your account:

1.  Select Launch Stack.

Launch Stack

2. Select the Region where you want to launch your serverless scheduler.

3. Define a name for your stack. Leave the parameters with the default values for now and select Next.

parameters table

4. At the bottom of the page, acknowledge the required Capabilities and select Create stack.

5.  Wait until the status of the stack is CREATE_COMPLETE, this can take a minute or two.

Testing the solution

In this section, you test the serverless scheduler. First, you’ll schedule a job for some time in the near future. Afterwards, you will check that the job was logged in CloudWatch Logs at the time it was scheduled.

1. In the AWS Management Console, navigate to the DynamoDB service and select the Items sub-menu on the left side, between Tables and PartiQL editor.

2. Select the JobsTable which you created via the CloudFormation Stack; it should be empty for now:

Jobstable

3. Select Create item. Make sure you switch to the JSON editor at the top, and disable View DynamoDB JSON. Now copy this item into the editor:

{
  "pk": "j#2015-03-20T09:45",
  "sk": "2015-03-20T09:46:47.123Z#564ade05-efda-4a2e-a7db-933ad3c89a83",
  "detail": {
    "action": "send-reminder",
    "userId": "16f3a019-e3a5-47ed-8c46-f668347503d1",
    "taskId": "6d2f710d-99d8-49d8-9f52-92a56d0c6b81",
    "params": {
      "can_skip": false,
      "reminder_volume": 0.5
    }
  },
  "detail_type": "job-reminder"
}

Create table DynamoDB table

This is a sample job definition. You will need to adjust it to be started a few minutes from now. For this, adjust the first two attributes: the partition key “pk” and the sort key “sk”. Start with “sk”; this is the UTC timestamp for the due date of the job in ISO 8601 format (YYYY-MM-DDTHH:MM:SS.sssZ), followed by a separator (“#”) and a unique identifier, to make sure that multiple jobs can have the same due timestamp.

Afterwards, adjust “pk”. The “pk” is the ISO 8601 timestamp from the “sk”, reduced to the date and the time in hours and minutes. The minutes of the partition key must be an integer multiple of 5. This value represents the grouping of the jobs, so they can be queried quickly and efficiently by the Lambda function. For instance, 2021-11-26T13:31:55.000Z is in the future for me, and the corresponding partition would be 2021-11-26T13:30.

Note: your local time zone might not be UTC. You can get the current UTC time on timeanddate.com.

You can find in the following table for every “sk” minute the corresponding “pk” minute:

SK and PK table

The corresponding python code would be:

f'{(sk_minutes - sk_minutes % 5):02d}'

Create table DynamoDB table

4. Now that you defined your event in the near future, you can optionally adjust the content of the “detail” and “detail_type” attributes. These are forwarded to EventBridge as “detail” and “detail-type” and should be used by your workers to understand which task they are supposed to perform. You can find more details on EventBridge event structure in our documentation. After you configured the job correctly, select Create item.

5. It is time to navigate to CloudWatch Log groups and wait for the item to be due and to show up in the debug logs.

CloudWatch log groups

For now, the log streams should be empty:

Log streams screenshot

After the item was due, you should see a new log stream with the item “detail” and “detail_type” attributes logged.

If you don’t see a new log stream with the item, check back in your DynamoDB table, if the “sk” is in the UTC time zone and the minutes of the “pk” are a multiple of 5. You can consult the table at the end of step 3, to check for the correct “pk” minutes based on your “sk” minutes.

Log events screenshot

You might notice that the timestamp of the message is within a minute after the job was scheduled. In my example, I scheduled the job for 2021-11-26T13:31:55.000Z and it was put into EventBridge at 2021-11-26T13:32:33Z. The delay comes from the Lambda function only starting once per minute. As I mentioned in the beginning, the function also isn’t started at second 00 but at a random second within that minute.

Exploring the Lambda function

Now, let’s have a look at the core logic. For this, navigate to AWS Lambda in the AWS Management console and open the SchedulerFunction.

AWS Lambda screenshot

In the function configuration, you can see that it is triggered by EventBridge via a scheduled expression at the rate, which was defined in the CloudFormation Stack.

Function Configuration in Cloudformation Stack

When you open the Code tab, you can see that it is less than 100 lines of python code. The main part is the lambda_handler function:

def lambda_handler(event, context):
    event_time_in_utc = event['time']
    previous_partition, current_partition = get_partitions(event_time_in_utc)

    previous_jobs = query_jobs(previous_partition, event_time_in_utc)
    current_jobs = query_jobs(current_partition, event_time_in_utc)
    all_jobs = previous_jobs + current_jobs

    print('dispatching {} jobs'.format(len(all_jobs)))

    put_all_jobs_into_event_bridge(all_jobs)
    delete_all_jobs(all_jobs)

    print('dispatched and deleted {} jobs'.format(len(all_jobs)))

The function starts by calculating the current and previous partitions. This is done to ensure that no jobs stay unprocessed in the old partition, when a new one starts. Afterwards, jobs from these partitions are queried up to the current time, so no future jobs will be fetched from the current partition. Lastly, all jobs are put into EventBridge and deleted from the table.

Instead of pushing the jobs into EventBridge, they could be started via HTTP(S), gRPC, or pushed into other AWS services, like Amazon Simple Notification Service (SNS) or Amazon Simple Queue Service (SQS). Also remember that the communication with other AWS services is synchronous and does not use batching options when putting jobs into EventBridge or deleting them from the DynamoDB table. This is to keep the function simpler and easier to understand. When you plan to distribute thousands of jobs per minute, you’d want to adjust this, to improve the throughput of the Lambda function.
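
One way to add batching, sketched here under the assumption that the rest of the function stays as shown above: EventBridge PutEvents accepts up to 10 entries per call, and the DynamoDB batch writer groups deletes into BatchWriteItem calls.

import json
import boto3

events = boto3.client('events')
table = boto3.resource('dynamodb').Table('JobsTable')  # table name assumed for illustration

def put_all_jobs_into_event_bridge(all_jobs):
    # PutEvents accepts at most 10 entries per request, so send the jobs in chunks
    for start in range(0, len(all_jobs), 10):
        entries = [{
            'Source': 'serverless-scheduler',  # illustrative source value
            'DetailType': job['detail_type'],
            'Detail': json.dumps(job['detail'], default=str),  # default=str handles DynamoDB Decimals
        } for job in all_jobs[start:start + 10]]
        events.put_events(Entries=entries)

def delete_all_jobs(all_jobs):
    # The batch writer buffers deletes into BatchWriteItem calls of up to 25 items
    with table.batch_writer() as batch:
        for job in all_jobs:
            batch.delete_item(Key={'pk': job['pk'], 'sk': job['sk']})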

Cleaning up

To avoid incurring future charges, delete the CloudFormation Stack and all resources you created.

Conclusion

In this post, you learned how to build a serverless scheduling solution. Using only serverless technologies which scale automatically, don’t require maintenance, and offer a pay as you go pricing model, this scheduler solution can be implemented for use cases with varying throughput requirements for their scheduled jobs. These could range from publishing articles at a scheduled time to notifying hundreds of passengers per minute about their upcoming flight.

You can adjust the Lambda function to distribute the jobs with a technology more fitting to your application, as well as to handle recurring tasks. The grouping interval of 5 minutes for the partition key can also be adjusted based on your throughput requirements. It’s important to note that for this solution to work, the interval by which the jobs are grouped must be longer than the rate at which the Lambda function is started.

Give it a try and let us know your thoughts in the comments!

New – Use Amazon S3 Event Notifications with Amazon EventBridge

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-use-amazon-s3-event-notifications-with-amazon-eventbridge/

We launched Amazon EventBridge in mid-2019 to make it easy for you to build powerful, event-driven applications at any scale. Since that launch, we have added several important features including a Schema Registry, the power to Archive and Replay Events, support for Cross-Region Event Bus Targets, and API Destinations to allow you to send events to any HTTP API. With support for a very long list of destinations and the ability to do pattern matching, filtering, and routing of events, EventBridge is an incredibly powerful and flexible architectural component.

S3 Event Notifications
Today we are making it even easier for you to use EventBridge to build applications that react quickly and efficiently to changes in your S3 objects. This is a new, “directly wired” model that is faster, more reliable, and more developer-friendly than ever. You no longer need to make additional copies of your objects or write specialized, single-purpose code to process events.

At this point you might be thinking that you already had the ability to react to changes in your S3 objects, and wondering what’s going on here. Back in 2014 we launched S3 Event Notifications to SNS Topics, SQS Queues, and Lambda functions. This was (and still is) a very powerful feature, but using it at enterprise-scale can require coordination between otherwise-independent teams and applications that share an interest in the same objects and events. Also, EventBridge can already extract S3 API calls from CloudTrail logs and use them to do pattern matching & filtering. Again, very powerful and great for many kinds of apps (with a focus on auditing & logging), but we always want to do even better.

Net-net, you can now configure S3 Event Notifications to directly deliver to EventBridge! This new model gives you several benefits including:

Advanced Filtering – You can filter on many additional metadata fields, including object size, key name, and time range. This is more efficient than using Lambda functions that need to make calls back to S3 to get additional metadata in order to make decisions on the proper course of action. S3 only publishes events that match a rule, so you save money by only paying for events that are of interest to you.

Multiple Destinations – You can route the same event notification to your choice of 18 AWS services including Step Functions, Kinesis Firehose, Kinesis Data Streams, and HTTP targets via API Destinations. This is a lot easier than creating your own fan-out mechanism, and will also help you to deal with those enterprise-scale situations where independent teams want to do their own event processing.

Fast, Reliable Invocation – Patterns are matched (and targets are invoked) quickly and directly. Because S3 provides at-least-once delivery of events to EventBridge, your applications will be more reliable.

You can also take advantage of other EventBridge features, including the ability to archive and then replay events. This allows you to reprocess events in case of an error or if you add a new target to an event bus.

Getting Started
I can get started in minutes. I start by enabling EventBridge notifications on one of my S3 buckets (jbarr-public in this case). I open the S3 Console, find my bucket, open the Properties tab, scroll down to Event notifications, and click Edit:

I select On, click Save changes, and I’m ready to roll:

Now I use the EventBridge Console to create a rule. I start, as usual, by entering a name and a description:

Then I define a pattern that matches the bucket and the events of interest:

One pattern can match one or more buckets and one or more events; the following events are supported:

  • Object Created
  • Object Deleted
  • Object Restore Initiated
  • Object Restore Completed
  • Object Restore Expired
  • Object Tags Added
  • Object Tags Deleted
  • Object ACL Updated
  • Object Storage Class Changed
  • Object Access Tier Changed

Then I choose the default event bus, and set the target to an SNS topic (BucketAction) which publishes the messages to my Amazon email address:

I click Create, and I am all set. To test it out, I simply upload some files to my bucket and await the messages:

The message contains all of the interesting and relevant information about the event, and (after some unquoting and formatting), looks like this:

{
    "version": "0",
    "id": "2d4eba74-fd51-3966-4bfa-b013c9da8ff1",
    "detail-type": "Object Created",
    "source": "aws.s3",
    "account": "348414629041",
    "time": "2021-11-13T00:00:59Z",
    "region": "us-east-1",
    "resources": [
        "arn:aws:s3:::jbarr-public"
    ],
    "detail": {
        "version": "0",
        "bucket": {
            "name": "jbarr-public"
        },
        "object": {
            "key": "eb_create_rule_mid_1.png",
            "size": 99797,
            "etag": "7a72374e1238761aca7778318b363232",
            "version-id": "a7diKodKIlW3mHIvhGvVphz5N_ZcL3RG",
            "sequencer": "00618F003B7286F496"
        },
        "request-id": "4Z2S00BKW2P1AQK8",
        "requester": "348414629041",
        "source-ip-address": "72.21.198.68",
        "reason": "PutObject"
    }
}

My initial event pattern was very simple, and matched only the bucket name. I can use content-based filtering to write more complex and more interesting patterns. For example, I could use numeric matching to set up a pattern that matches events for objects that are smaller than 1 megabyte:

{
    "source": [
        "aws.s3"
    ],
    "detail-type": [
        "Object Created",
        "Object Deleted",
        "Object Tags Added",
        "Object Tags Deleted"
    ],

    "detail": {
        "bucket": {
            "name": [
                "jbarr-public"
            ]
        },
        "object" : {
            "size": [{"numeric" :["<=", 1048576 ] }]
        }
    }
}

Or, I could use prefix matching to set up a pattern that looks for objects uploaded to a “subfolder” (which doesn’t really exist) of a bucket:

"object": {
  "key" : [{"prefix" : "uploads/"}]
  }]
}

You can use all of this in conjunction with all of the existing EventBridge features, including Archive/Replay. You can also access the CloudWatch metrics for each of your rules:

Available Now
This feature is available now and you can start using it today in all commercial AWS Regions. You pay $1 for every 1 million events that match a rule; check out the EventBridge Pricing page for more information.

Jeff;

Expanding cross-Region event routing with Amazon EventBridge

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/expanding-cross-region-event-routing-with-amazon-eventbridge/

This post is written by Stephen Liedig, Sr Serverless Specialist SA.

In April 2021, AWS announced a new feature for Amazon EventBridge that allows you to route events from any commercial AWS Region to US East (N. Virginia), US West (Oregon), and Europe (Ireland). From today, you can now route events between any AWS Regions, except AWS GovCloud (US) and China.

EventBridge enables developers to create event-driven applications by routing events between AWS services, integrated software as a service (SaaS) applications, and your own applications. This helps you produce loosely coupled, distributed, and maintainable architectures. With these new capabilities, you can now route events across Regions and accounts using the same model used to route events to existing targets.

Cross-Region event routing with Amazon EventBridge makes it easier for customers to develop multi-Region workloads to:

  • Centralize your AWS events into one Region for auditing and monitoring purposes, such as aggregating security events for compliance reasons in a single account.
  • Replicate events from source to destination Regions to help synchronize data in cross-Region data stores.
  • Invoke asynchronous workflows in a different Region from a source event. For example, you can load balance from a target Region by routing events to another Region.

A previous post shows how cross-Region routing works. This blog post expands on these concepts and discusses a common use case for cross-Region event delivery – event auditing. This example explores how you can manage resources using AWS CloudFormation and EventBridge resource policies.

Multi-Region event auditing example walkthrough

Compliance is an important part of building event-driven applications and reacting to any potential policy or security violations. Customers use EventBridge to route security events from applications and globally distributed infrastructure into a single account for analysis. In many cases, they share specific AWS CloudTrail events with security teams. Customers also audit events from their custom-built applications to monitor sensitive data usage.

In this scenario, a company has their base of operations located in Asia Pacific (Singapore) with applications distributed across US East (N. Virginia) and Europe (Frankfurt). The applications in US East (N. Virginia) and Europe (Frankfurt) are using EventBridge for their respective applications and services. The security team in Asia Pacific (Singapore) wants to analyze events from the applications and CloudTrail events for specific API calls to monitor infrastructure security.

Reference architecture

To create the rules to receive these events:

  1. Create a new set of rules directly on all the event buses across the global infrastructure. Alternatively, delegate the responsibility of managing security rules to distributed teams that manage the event bus resources.
  2. Provide the security team with the ability to manage rules centrally, and control the lifecycle of rules on the global infrastructure.

Allowing the security team to manage the resources centrally provides more scalability. It is more consistent with the design principle that event consumers own and manage the rules that define how they process events.

Deploying the example application

The following code snippets are shortened for brevity. The full source code of the solution is in the GitHub repository. The solution uses AWS Serverless Application Model (AWS SAM) for deployment. Clone the repo and navigate to the solution directory:

git clone https://github.com/aws-samples/amazon-eventbridge-resource-policy-samples
cd ./patterns/cross-region-cross-account-pattern/

To allow the security team to start receiving events from any of the cross-Region accounts:

1. Create a security event bus in the Asia Pacific (Singapore) Region with a rule that processes events from the respective event sources.

ap-southeast-1 architecture

For simplicity, this example uses an Amazon CloudWatch Logs target to visualize the events arriving from cross-Region accounts:

  SecurityEventBus:
    Type: AWS::Events::EventBus
    Properties:
      Name: !Ref SecurityEventBusName

  # This rule processes events coming in from cross-Region accounts
  SecurityAnalysisRule:
    Type: AWS::Events::Rule
    Properties:
      Name: SecurityAnalysisRule
      Description: Analyze events from cross-Region event buses
      EventBusName: !GetAtt SecurityEventBus.Arn
      EventPattern:
        source:
          - anything-but: com.company.security
      State: ENABLED
      RoleArn: !GetAtt WriteToCwlRole.Arn
      Targets:
        - Id: SendEventToSecurityAnalysisRule
          Arn: !Sub "arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:${SecurityAnalysisRuleTarget}"

In this example, you set the event pattern to process any event from a source that is not from the security team’s own domain. This allows you to process events from any account in any Region. You can filter this further as needed.

2. Set an event bus policy on each default and custom event bus that the security team must receive events from.

Event bus policy

This policy allows the security team to create rules to route events to its own security event bus in the Asia Pacific (Singapore) Region. The following policy defines a custom event bus in Account 2 in US East (N. Virginia) and an AWS::Events::EventBusPolicy that sets the Principal as the security team account.

This allows the security team to manage rules on the CustomEventBus:

  CustomEventBus:
    Type: AWS::Events::EventBus
    Properties:
      Name: !Ref EventBusName

  SecurityServiceRuleCreationStatement:
    Type: AWS::Events::EventBusPolicy
    Properties:
      EventBusName: !Ref CustomEventBus # If you omit this, the default event bus is used.
      StatementId: "AllowCrossRegionRulesForSecurityTeam"
      Statement:
        Effect: "Allow"
        Principal:
          AWS: !Sub "arn:aws:iam::${SecurityAccountNo}:root"
        Action:
          - "events:PutRule"
          - "events:DeleteRule"
          - "events:DescribeRule"
          - "events:DisableRule"
          - "events:EnableRule"
          - "events:PutTargets"
          - "events:RemoveTargets"
        Resource:
          - !Sub 'arn:aws:events:${AWS::Region}:${AWS::AccountId}:rule/${CustomEventBus.Name}/*'
        Condition:
          StringEqualsIfExists:
            "events:creatorAccount": "${aws:PrincipalAccount}"

3. With the policies set on the cross-Region accounts, now create the rules. Because you cannot create CloudFormation resources across Regions, you must define the rules in separate templates. This also gives the ability to expand to other Regions.

Once the template is deployed to the cross-Region accounts, use EventBridge resource policies to propagate rule definitions across accounts in the same Region. The security account must have permission to create CloudFormation resources in the cross-Region accounts to deploy the rule templates.

Resource policies

There are two parts to the rule templates. The first specifies a role that allows EventBridge to assume a role to send events to the target event bus in the security account:

  # This IAM role allows EventBridge to assume the permissions necessary to send events
  # from the source event buses to the destination event bus.
  SourceToDestinationEventBusRole:
    Type: "AWS::IAM::Role"
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - events.amazonaws.com
            Action:
              - "sts:AssumeRole"
      Path: /
      Policies:
        - PolicyName: PutEventsOnDestinationEventBus
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action: "events:PutEvents"
                Resource:
                  - !Ref SecurityEventBusArn

The second is the definition of the rule resource. This requires the Amazon Resource Name (ARN) of the event bus where you want to put the rule, the ARN of the target event bus in the security account, and a reference to the SourceToDestinationEventBusRole role:

  SecurityAuditRule2:
    Type: AWS::Events::Rule
    Properties:
      Name: SecurityAuditRuleAccount2
      Description: Audit rule for the Security team in Singapore
      EventBusName: !Ref EventBusArnAccount2 # ARN of the custom event bus in Account 2
      EventPattern:
        source:
          - com.company.marketing
      State: ENABLED
      Targets:
        - Id: SendEventToSecurityEventBusArn
          Arn: !Ref SecurityEventBusArn
          RoleArn: !GetAtt SourceToDestinationEventBusRole.Arn

You can use the AWS SAM CLI to deploy this:

sam deploy -t us-east-1-rules.yaml \
  --stack-name us-east-1-rules \
  --region us-east-1 \
  --profile default \
  --capabilities=CAPABILITY_IAM \
  --parameter-overrides SecurityEventBusArn="arn:aws:events:ap-southeast-1:111111111111:event-bus/SecurityEventBus" EventBusArnAccount1="arn:aws:events:us-east-1:111111111111:event-bus/default" EventBusArnAccount2="arn:aws:events:us-east-1:222222222222:event-bus/custom-eventbus-account-2"

Testing the example application

With the rules deployed across the Regions, you can test by sending events to the event bus in Account 2:

  1. Navigate to the applications/account_2 directory. Here you find an events.json file, which you use as input for the put-events API call.
  2. Run the following command using the AWS CLI. This sends messages to the event bus in us-east-1 which are routed to the security event bus in ap-southeast-1:
    aws events put-events \
     --region us-east-1 \
     --profile [NAMED PROFILE FOR ACCOUNT 2] \
     --entries file://events.json
    

    If you have run this successfully, you see:
    Entries:
    - EventId: a423b35e-3df0-e5dc-b854-db9c42144fa2
    - EventId: 5f22aea8-51ea-371f-7a5f-8300f1c93973
    - EventId: 7279fa46-11a6-7495-d7bb-436e391cfcab
    - EventId: b1e1ecc1-03f7-e3ef-9aa4-5ac3c8625cc7
    - EventId: b68cea94-28e2-bfb9-7b1f-9b2c5089f430
    - EventId: fc48a303-a1b2-bda8-8488-32daa5f809d8
    FailedEntryCount: 0

  3. Navigate to the Amazon CloudWatch console to see a collection of log entries with the events you published. The log group is /aws/events/SecurityAnalysisRule.

Congratulations, you have successfully sent your first events across accounts and Regions!

Conclusion

With cross-Region event routing in EventBridge, you can now route events to and from any AWS Region. This post explains how to manage and configure cross-Region event routing using CloudFormation and EventBridge resource policies to simplify rule propagation across your global event bus infrastructure. Finally, I walk through an example you can deploy to your AWS account.

For more serverless learning resources, visit Serverless Land.

Create a serverless event-driven workflow to ingest and process Microsoft data with AWS Glue and Amazon EventBridge

Post Syndicated from Venkata Sistla original https://aws.amazon.com/blogs/big-data/create-a-serverless-event-driven-workflow-to-ingest-and-process-microsoft-data-with-aws-glue-and-amazon-eventbridge/

Microsoft SharePoint is a document management system for storing files, organizing documents, and sharing and editing documents in collaboration with others. Your organization may want to ingest SharePoint data into your data lake, combine the SharePoint data with other data that’s available in the data lake, and use it for reporting and analytics purposes. AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development. AWS Glue provides all the capabilities needed for data integration so that you can start analyzing your data and putting it to use in minutes instead of months.

Organizations often manage their data on SharePoint in the form of files and lists, and you can use this data for easier discovery, better auditing, and compliance. SharePoint as a data source is not a typical relational database and the data is mostly semi structured, which is why it’s often difficult to join the SharePoint data with other relational data sources. This post shows how to ingest and process SharePoint lists and files with AWS Glue and Amazon EventBridge, which enables you to join other data that is available in your data lake. We use SharePoint REST APIs with a standard open data protocol (OData) syntax. OData advocates a standard way of implementing REST APIs that allows for SQL-like querying capabilities. OData helps you focus on your business logic while building RESTful APIs without having to worry about the various approaches to define request and response headers, query options, and so on.

AWS Glue event-driven workflows

Unlike a traditional relational database, SharePoint data may or may not change frequently, and it’s difficult to predict the frequency at which your SharePoint server generates new data, which makes it difficult to plan and schedule data processing pipelines efficiently. Running data processing frequently can be expensive, whereas scheduling pipelines to run infrequently can deliver cold data. Similarly, triggering pipelines from an external process can increase complexity, cost, and job startup time.

AWS Glue supports event-driven workflows, a capability that lets developers start AWS Glue workflows based on events delivered by EventBridge. The main reason to choose EventBridge in this architecture is because it allows you to process events, update the target tables, and make information available to consume in near-real time. Because frequency of data change in SharePoint is unpredictable, using EventBridge to capture events as they arrive enables you to run the data processing pipeline only when new data is available.

To get started, you simply create a new AWS Glue trigger of type EVENT and place it as the first trigger in your workflow. You can optionally specify a batching condition. Without event batching, the AWS Glue workflow is triggered every time an EventBridge rule matches, which may result in multiple concurrent workflows running. AWS Glue protects you by setting default limits that restrict the number of concurrent runs of a workflow. You can increase the required limits by opening a support case. Event batching allows you to configure the number of events to buffer or the maximum elapsed time before firing the particular trigger. When the batching condition is met, a workflow run is started. For example, you can trigger your workflow when 100 files are uploaded in Amazon Simple Storage Service (Amazon S3) or 5 minutes after the first upload. We recommend configuring event batching to avoid too many concurrent workflows, and optimize resource usage and cost.
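
As an illustration of that configuration, the following sketch creates an EVENT trigger with a batching condition using boto3. The trigger, workflow, and job names and the batching values are placeholders, and this assumes the workflow and job already exist.

import boto3

glue = boto3.client('glue')

# Start the workflow only after 100 matching events arrive, or 5 minutes
# after the first event, whichever comes first (values are illustrative).
glue.create_trigger(
    Name='s3-ingest-event-trigger',
    WorkflowName='sharepoint-data-processing-workflow',
    Type='EVENT',
    EventBatchingCondition={'BatchSize': 100, 'BatchWindow': 300},
    Actions=[{'JobName': 'convert-to-parquet-job'}],
)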

To illustrate this solution better, consider the following use case for a wine manufacturing and distribution company that operates across multiple countries. They currently host all their transactional system’s data on a data lake in Amazon S3. They also use SharePoint lists to capture feedback and comments on wine quality and composition from their suppliers and other stakeholders. The supply chain team wants to join their transactional data with the wine quality comments in SharePoint data to improve their wine quality and manage their production issues better. They want to capture those comments from the SharePoint server within an hour and publish that data to a wine quality dashboard in Amazon QuickSight. With an event-driven approach to ingest and process their SharePoint data, the supply chain team can consume the data in less than an hour.

Overview of solution

In this post, we walk through a solution to set up an AWS Glue job to ingest SharePoint lists and files into an S3 bucket and an AWS Glue workflow that listens to S3 PutObject data events captured by AWS CloudTrail. This workflow is configured with an event-based trigger to run when an AWS Glue ingest job adds new files into the S3 bucket. The following diagram illustrates the architecture.

To make it simple to deploy, we captured the entire solution in an AWS CloudFormation template that enables you to automatically ingest SharePoint data into Amazon S3. SharePoint uses ClientID and TenantID credentials for authentication and OAuth 2.0 for authorization.

The template helps you perform the following steps:

  1. Create an AWS Glue Python shell job to make the REST API call to the SharePoint server and ingest files or lists into Amazon S3.
  2. Create an AWS Glue workflow with a starting trigger of EVENT type.
  3. Configure CloudTrail to log data events, such as PutObject API calls to CloudTrail.
  4. Create a rule in EventBridge to forward the PutObject API events to AWS Glue when they’re emitted by CloudTrail.
  5. Add an AWS Glue event-driven workflow as a target to the EventBridge rule. The workflow gets triggered when the SharePoint ingest AWS Glue job adds new files to the S3 bucket.

Prerequisites

For this walkthrough, you should have the following prerequisites:

Configure SharePoint server authentication details

Before launching the CloudFormation stack, you need to set up your SharePoint server authentication details, namely, TenantID, Tenant, ClientID, ClientSecret, and the SharePoint URL in AWS Systems Manager Parameter Store of the account you’re deploying in. This makes sure that no authentication details are stored in the code and they’re fetched in real time from Parameter Store when the solution is running.

To create your AWS Systems Manager parameters, complete the following steps (or use the CLI sketch after the list):

  1. On the Systems Manager console, under Application Management in the navigation pane, choose Parameter Store.
    systems manager
  2. Choose Create Parameter.
  3. For Name, enter the parameter name /DataLake/GlueIngest/SharePoint/tenant.
  4. Leave the type as string.
  5. Enter your SharePoint tenant detail into the value field.
  6. Choose Create parameter.
  7. Repeat these steps to create the following parameters:
    1. /DataLake/GlueIngest/SharePoint/tenant
    2. /DataLake/GlueIngest/SharePoint/tenant_id
    3. /DataLake/GlueIngest/SharePoint/client_id/list
    4. /DataLake/GlueIngest/SharePoint/client_secret/list
    5. /DataLake/GlueIngest/SharePoint/client_id/file
    6. /DataLake/GlueIngest/SharePoint/client_secret/file
    7. /DataLake/GlueIngest/SharePoint/url/list
    8. /DataLake/GlueIngest/SharePoint/url/file
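
If you prefer to script this step, the same parameters can be created from the AWS CLI. The following is a sketch with placeholder values; repeat the pattern for the remaining parameters in the list above.

aws ssm put-parameter --name "/DataLake/GlueIngest/SharePoint/tenant" --type String --value "<your-tenant>"
aws ssm put-parameter --name "/DataLake/GlueIngest/SharePoint/tenant_id" --type String --value "<your-tenant-id>"
aws ssm put-parameter --name "/DataLake/GlueIngest/SharePoint/client_id/list" --type String --value "<client-id-for-lists>"
aws ssm put-parameter --name "/DataLake/GlueIngest/SharePoint/client_secret/list" --type String --value "<client-secret-for-lists>"
aws ssm put-parameter --name "/DataLake/GlueIngest/SharePoint/url/list" --type String --value "<sharepoint-list-url>"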

Deploy the solution with AWS CloudFormation

For a quick start of this solution, you can deploy the provided CloudFormation stack. This creates all the required resources in your account.

The CloudFormation template generates the following resources:

  • S3 bucket – Stores data, CloudTrail logs, job scripts, and any temporary files generated during the AWS Glue extract, transform, and load (ETL) job run.
  • CloudTrail trail with S3 data events enabled – Enables EventBridge to receive PutObject API call data in a specific bucket.
  • AWS Glue Job – A Python shell job that fetches the data from the SharePoint server.
  • AWS Glue workflow – A data processing pipeline that is comprised of a crawler, jobs, and triggers. This workflow converts uploaded data files into Apache Parquet format.
  • AWS Glue database – The AWS Glue Data Catalog database that holds the tables created in this walkthrough.
  • AWS Glue table – The Data Catalog table representing the Parquet files being converted by the workflow.
  • AWS Lambda function – The AWS Lambda function is used as an AWS CloudFormation custom resource to copy job scripts from an AWS Glue-managed GitHub repository and an AWS Big Data blog S3 bucket to your S3 bucket.
  • IAM roles and policies – We use the following AWS Identity and Access Management (IAM) roles:
    • LambdaExecutionRole – Runs the Lambda function that has permission to upload the job scripts to the S3 bucket.
    • GlueServiceRole – Runs the AWS Glue job that has permission to download the script, read data from the source, and write data to the destination after conversion.
    • EventBridgeGlueExecutionRole – Has permissions to invoke the NotifyEvent API for an AWS Glue workflow.
    • IngestGlueRole – Runs the AWS Glue job that has permission to ingest data into the S3 bucket.

To launch the CloudFormation stack, complete the following steps:

  1. Sign in to the AWS CloudFormation console.
  2. Choose Launch Stack:
  3. Choose Next.
  4. For pS3BucketName, enter the unique name of your new S3 bucket.
  5. Leave pWorkflowName and pDatabaseName as the default.

cloud formation 1

  1. For pDatasetName, enter the SharePoint list name or file name you want to ingest.
  2. Choose Next.

cloud formation 2

  1. On the next page, choose Next.
  2. Review the details on the final page and select I acknowledge that AWS CloudFormation might create IAM resources.
  3. Choose Create.

It takes a few minutes for the stack creation to complete; you can follow the progress on the Events tab.

You can run the ingest AWS Glue job either on a schedule or on demand. As the job successfully finishes and ingests data into the raw prefix of the S3 bucket, the AWS Glue workflow runs and transforms the ingested raw CSV files into Parquet files and loads them into the transformed prefix.

Review the EventBridge rule

The CloudFormation template created an EventBridge rule to forward S3 PutObject API events to AWS Glue. Let’s review the configuration of the EventBridge rule:

  1. On the EventBridge console, under Events, choose Rules.
  2. Choose the rule s3_file_upload_trigger_rule-<CloudFormation-stack-name>.
  3. Review the information in the Event pattern section.

event bridge

The event pattern shows that this rule is triggered when an S3 object is uploaded to s3://<bucket_name>/data/SharePoint/tablename_raw/. CloudTrail captures the PutObject API calls made and relays them as events to EventBridge.
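
For reference, an event pattern for CloudTrail-captured S3 data events generally takes the following shape. This is a sketch with a placeholder bucket name and prefix, not the exact pattern generated by the stack.

{
  "source": ["aws.s3"],
  "detail-type": ["AWS API Call via CloudTrail"],
  "detail": {
    "eventSource": ["s3.amazonaws.com"],
    "eventName": ["PutObject"],
    "requestParameters": {
      "bucketName": ["<bucket_name>"],
      "key": [{ "prefix": "data/SharePoint/tablename_raw/" }]
    }
  }
}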

  1. In the Targets section, you can verify that this EventBridge rule is configured with an AWS Glue workflow as a target.

event bridge target section

Run the ingest AWS Glue job and verify the AWS Glue workflow is triggered successfully

To test the workflow, we run the ingest-glue-job-SharePoint-file job using the following steps:

  1. On the AWS Glue console, select the ingest-glue-job-SharePoint-file job.

glue job

  1. On the Action menu, choose Run job.

glue job action menu

  1. Choose the History tab and wait until the job succeeds.

glue job history tab

You can now see the CSV files in the raw prefix of your S3 bucket.

csv file s3 location

Now the workflow should be triggered.

  1. On the AWS Glue console, validate that your workflow is in the RUNNING state.

glue workflow running status

  1. Choose the workflow to view the run details.
  2. On the History tab of the workflow, choose the current or most recent workflow run.
  3. Choose View run details.

glue workflow visual

When the workflow run status changes to Completed, let’s check the converted files in your S3 bucket.

  1. Switch to the Amazon S3 console, and navigate to your bucket.

You can see the Parquet files under s3://<bucket_name>/data/SharePoint/tablename_transformed/.

parquet file s3 location

Congratulations! Your workflow ran successfully based on S3 events triggered by uploading files to your bucket. You can verify everything works as expected by running a query against the generated table using Amazon Athena.

Sample wine dataset

Let’s analyze a sample red wine dataset. The following screenshot shows a SharePoint list that contains various readings that relate to the characteristics of the wine and an associated wine category. This is populated by various wine tasters from multiple countries.

redwine dataset

The following screenshot shows a supplier dataset from the data lake with wine categories ordered per supplier.

supplier dataset

We process the red wine dataset using this solution and use Athena to query the red wine data and supplier data where wine quality is greater than or equal to 7.

athena query and results

We can visualize the processed dataset using QuickSight.

Clean up

To avoid incurring unnecessary charges, you can use the AWS CloudFormation console to delete the stack that you deployed. This removes all the resources you created when deploying the solution.

Conclusion

Event-driven architectures provide access to near-real-time information and help you make business decisions on fresh data. In this post, we demonstrated how to ingest and process SharePoint data using AWS serverless services like AWS Glue and EventBridge. We saw how to configure a rule in EventBridge to forward events to AWS Glue. You can use this pattern for your analytical use cases, such as joining SharePoint data with other data in your lake to generate insights, or auditing SharePoint data and compliance requirements.


About the Author

Venkata Sistla is a Big Data & Analytics Consultant on the AWS Professional Services team. He specializes in building data processing capabilities and helping customers remove constraints that prevent them from leveraging their data to develop business insights.

Correlate security findings with AWS Security Hub and Amazon EventBridge

Post Syndicated from Marshall Jones original https://aws.amazon.com/blogs/security/correlate-security-findings-with-aws-security-hub-and-amazon-eventbridge/

In this blog post, we’ll walk you through deploying a solution to correlate specific AWS Security Hub findings from multiple AWS services that are related to a single AWS resource, which indicates an increased possibility that a security incident has happened.

AWS Security Hub ingests findings from multiple AWS services, including Amazon GuardDuty, Amazon Inspector, Amazon Macie, AWS Firewall Manager, AWS Identity and Access Management (IAM) Access Analyzer, and AWS Systems Manager Patch Manager. Findings from each service are normalized into the AWS Security Finding Format (ASFF), so that you can review findings in a standardized format and take action quickly. You can use AWS Security Hub to provide a single view of all security-related findings, where you can set up alerting, automatic remediation, and ingestion into third-party incident management systems for specific findings.

Although Security Hub can ingest a vast number of integrations and findings, it cannot create correlation rules like a Security Information and Event Management (SIEM) tool can. However, you can create such rules using EventBridge. It’s important to take a closer look when multiple AWS security services generate findings for a single resource, because this potentially indicates elevated risk. Depending on your environment, the initial number of findings in AWS Security Hub may be high, so you may need to prioritize which findings require immediate action. AWS Security Hub natively gives you the ability to filter findings by resource, account, and many other details. With the solution in this post, when one of these correlated sets of findings is detected, a new finding is created and pushed to AWS Security Hub by using the Security Hub BatchImportFindings API operation. You can then respond to these new security incident-oriented findings through ticketing, chat, or incident management systems.
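As a rough illustration of that last step, the following boto3 sketch imports a custom finding with the BatchImportFindings API. The field values are a minimal, hypothetical subset of the ASFF, not the exact payload the solution builds.

import boto3
from datetime import datetime, timezone

securityhub = boto3.client('securityhub')
now = datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%S.%fZ')

# Minimal, illustrative ASFF finding; identifiers and types are placeholders.
finding = {
    'SchemaVersion': '2018-10-08',
    'Id': 'correlated-finding/i-0123456789abcdef0',
    'ProductArn': 'arn:aws:securityhub:<REGION>:<AWS_ACCOUNT_ID>:product/<AWS_ACCOUNT_ID>/default',
    'GeneratorId': 'security-finding-correlation',
    'AwsAccountId': '<AWS_ACCOUNT_ID>',
    'Types': ['TTPs/Correlated Findings'],
    'CreatedAt': now,
    'UpdatedAt': now,
    'Severity': {'Label': 'CRITICAL'},
    'Title': 'Correlated security findings detected for a single resource',
    'Description': 'Multiple security services have generated findings for the same resource.',
    'Resources': [{'Type': 'AwsEc2Instance', 'Id': '<EC2_INSTANCE_ARN>'}]
}

securityhub.batch_import_findings(Findings=[finding])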

Prerequisites

This solution requires that you have AWS Security Hub enabled in your AWS account. In addition to AWS Security Hub, the following services must be enabled and integrated with AWS Security Hub:

Solution overview

In this solution, you will use a combination of AWS Security Hub, Amazon EventBridge, AWS Lambda, and Amazon DynamoDB to ingest and correlate specific findings that indicate a higher likelihood of a security incident. Each correlation is focused on multiple specific AWS security service findings for a single AWS resource.

The following list shows the correlated findings that are detected by this solution. The Description section for each finding correlation provides context for that correlation, the Remediation section provides general recommendations for remediation, and the Prevention/Detection section provides guidance to either prevent or detect one or more findings within the correlation. With the code provided, you can also add more correlations than those listed here by modifying the AWS Cloud Development Kit (CDK) code and AWS Lambda code. The Solution workflow section breaks down the flow of the solution. If you choose to implement automatic remediation, note that each finding correlation is created with the following AWS Security Finding Format (ASFF) fields:

- Severity: CRITICAL
- ProductArn: arn:aws:securityhub:<REGION>:<AWS_ACCOUNT_ID>:product/<AWS_ACCOUNT_ID>/default

These correlated findings are created as part of this solution:

  1. Any Amazon GuardDuty Backdoor findings and three critical common vulnerabilities and exposures (CVEs) from Amazon Inspector that are associated with the same Amazon Elastic Compute Cloud (Amazon EC2) instance.
    • Description: Amazon Inspector has found at least three critical CVEs on the EC2 instance. CVEs indicate that the EC2 instance is currently vulnerable or exposed. The EC2 instance is also performing backdoor activities. The combination of these two findings is a stronger indication of an elevated security incident.
    • Remediation: It’s recommended that you isolate the EC2 instance and follow standard protocol to triage the EC2 instance to verify if the instance has been compromised. If the instance has been compromised, follow your standard Incident Response process for post-instance compromise and forensics. Redeploy a backup of the EC2 instance by using an up-to-date hardened Amazon Machine Image (AMI) or apply all security-related patches to the redeployed EC2 instance.
    • Prevention/Detection: To mitigate or prevent an Amazon EC2 instance from missing critical security updates, consider using AWS Systems Manager Patch Manager to automate installing security-related patches for managed instances. Alternatively, you can provide developers with up-to-date hardened Amazon Machine Images (AMIs) by using Amazon EC2 Image Builder. For detection, you can set the AMI property called ‘DeprecationTime’ to indicate when the AMI will become outdated and respond accordingly.
  2. An Amazon Macie sensitive data finding and an Amazon GuardDuty S3 exfiltration finding for the same Amazon Simple Storage Service (Amazon S3) bucket.
    • Description: Amazon Macie has scanned an Amazon S3 bucket and found a possible match for sensitive data. Amazon GuardDuty has detected a possible exfiltration finding for the same Amazon S3 bucket. The combination of these findings indicates a higher risk security incident.
    • Remediation: It’s recommended that you review the source IP and/or IAM principal that is making the S3 object reads against the S3 bucket. If the source IP and/or IAM principal is not authorized to access sensitive data within the S3 bucket, follow your standard Incident Response process for S3 exfiltration. For example, you can restrict the IAM principal’s permissions, revoke existing credentials or unauthorized sessions, restrict access via the Amazon S3 bucket policy, or use the Amazon S3 Block Public Access feature.
    • Prevention/Detection: To mitigate or prevent exposure of sensitive data within Amazon S3, ensure that your Amazon S3 buckets use least-privilege bucket policies and are not publicly accessible. Alternatively, you can use the Amazon S3 Block Public Access feature. Review your AWS environment to make sure you are following Amazon S3 security best practices. For detection, you can use AWS Config to track and auto-remediate Amazon S3 buckets that do not have logging and encryption enabled or that are publicly accessible.
  3. An AWS Security Hub finding for an EC2 instance with a public IP address and an unrestricted VPC Security Group, an Amazon GuardDuty unusual network traffic behavior finding, and an Amazon GuardDuty brute force finding, all for the same EC2 instance.
    • Description: AWS Security Hub has detected an EC2 instance that has a public IP address attached and a VPC Security Group that allows traffic on ports other than 80 and 443. Amazon GuardDuty has also determined that the EC2 instance is the target of multiple brute force attempts and is communicating with a remote host on an unusual port that the EC2 instance has not previously used for network communication. The correlation of these lower-severity findings indicates a higher-severity security incident.
    • Remediation: It’s recommended that you isolate the EC2 instance and follow standard protocol to triage the EC2 instance to verify if the instance has been compromised. If the instance has been compromised, follow your standard Incident Response process for post-instance compromise and forensics.
    • Prevention/Detection: To mitigate or prevent these events from occurring within your AWS environment, determine whether the EC2 instance requires a public-facing IP address and review whether the VPC Security Group(s) have only the required rules configured. Review your AWS environment to make sure you are following Amazon EC2 best practices. For detection, consider implementing AWS Firewall Manager to continuously audit and limit VPC Security Groups.

The solution workflow, shown in Figure 1, is as follows:

  1. Security Hub ingests findings from integrated AWS security services.
  2. An EventBridge rule is invoked based on Security Hub findings from GuardDuty, Macie, Amazon Inspector, and Security Hub security standards.
  3. The EventBridge rule invokes a Lambda function to store the Security Hub finding, which is passed via EventBridge, in a DynamoDB table for further analysis.
  4. After the new findings are stored in DynamoDB, another Lambda function is invoked by using DynamoDB Streams. A time-to-live (TTL) attribute is set to delete finding entries that are older than 30 days.
  5. The second Lambda function looks at the resource associated with the new finding entry in the DynamoDB table. The Lambda function checks for specific Security Hub findings that are associated with the same resource (a rough sketch of this check follows Figure 1).

Figure 1: Architecture diagram describing the flow of the solution
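The following sketch illustrates the resource check in step 5. The table name, index, attribute names, and correlation logic are assumptions for illustration; the repository contains the actual implementation.

import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('SecurityHubFindings')  # hypothetical table name

def correlation_matched(resource_id, required_products):
    """Return True if stored findings for the resource cover every required product."""
    # Assumes a global secondary index keyed on the resource ID (hypothetical schema).
    resp = table.query(
        IndexName='resource-id-index',
        KeyConditionExpression=Key('ResourceId').eq(resource_id)
    )
    products_seen = {item.get('ProductName') for item in resp.get('Items', [])}
    return required_products.issubset(products_seen)

# Example: GuardDuty backdoor activity plus critical Inspector CVEs on the same EC2 instance.
if correlation_matched('<EC2_INSTANCE_ARN>', {'GuardDuty', 'Inspector'}):
    # A new CRITICAL finding would then be imported via BatchImportFindings (see the earlier sketch).
    pass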

Solution deployment

You can deploy the solution through either the AWS Management Console or the AWS Cloud Development Kit (AWS CDK).

To deploy the solution by using the AWS Management Console

In your account, launch the AWS CloudFormation template by choosing the following Launch Stack button. It will take approximately 10 minutes for the CloudFormation stack to complete.
Select the Launch Stack button to launch the template

To deploy the solution by using the AWS CDK

You can find the latest code in the aws-security GitHub repository where you can also contribute to the sample code. The following commands show how to deploy the solution by using the AWS CDK. First, the CDK initializes your environment and uploads the AWS Lambda assets to Amazon S3. Then, you can deploy the solution to your account. For <INSERT_AWS_ACCOUNT>, specify the account number, and for <INSERT_REGION>, specify the AWS Region that you want the solution deployed to.

cdk bootstrap aws://<INSERT_AWS_ACCOUNT>/<INSERT_REGION>

cdk deploy

Conclusion

In this blog post, we walked through a solution to use AWS services, including Amazon EventBridge, AWS Lambda, and Amazon DynamoDB, to correlate AWS Security Hub findings from multiple different AWS security services. The solution provides a framework to prioritize specific sets of findings that indicate a higher likelihood that a security incident has occurred, so that you can prioritize and improve your security response.

If you have feedback about this post, submit comments in the Comments section below. If you have any questions about this post, start a thread on the AWS Security Hub forum.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Marshall Jones

Marshall is a worldwide security specialist solutions architect at AWS. His background is in AWS consulting and security architecture, focused on a variety of security domains including edge, threat detection, and compliance. Today, he is focused on helping enterprise AWS customers adopt and operationalize AWS security services to increase security effectiveness and reduce risk.

Author

Jonathan Nguyen

Jonathan is a shared delivery team senior security consultant at AWS. His background is in AWS security, with a focus on threat detection and incident response. He helps enterprise customers develop a comprehensive AWS security strategy, deploy security solutions at scale, and train customers on AWS security best practices.

Building dynamic Amazon SNS subscriptions for auto scaling container workloads 

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-dynamic-amazon-sns-subscriptions-for-auto-scaling-container-workloads/

This post is written by Mithun Mallick, Senior Specialist Solutions Architect, App Integration.

Amazon Simple Notification Service (SNS) is a serverless publish/subscribe messaging service. It supports a push-based subscription model in which subscribers must register an endpoint to receive messages. Amazon Simple Queue Service (SQS) is one such endpoint, which is used by applications to receive messages published on an SNS topic.

With containerized applications, the container instances poll the queue and receive the messages. However, containerized applications can scale out for a variety of reasons. The creation of an SQS queue for each new container instance creates maintenance overhead for customers. You must also clean up the SNS-SQS subscription once the instance scales in.

This blog walks through a dynamic subscription solution, which automates the creation, subscription, and deletion of SQS queues for an Auto Scaling group of containers running in Amazon Elastic Container Service (ECS).

Overview

The solution is based on the use of events to achieve the dynamic subscription pattern. ECS uses the concept of tasks to create an instance of a container. You can find more details on ECS tasks in the ECS documentation.

This solution uses the events generated by ECS to manage the complete lifecycle of an SNS-SQS subscription. It uses the task ID as the name of the queue that is used by the ECS instance for pulling messages. More details on the ECS task ID can be found in the task documentation.

The solution also uses Amazon EventBridge to apply rules on ECS events and trigger an AWS Lambda function. The first rule detects the running state of an ECS task and triggers a Lambda function, which creates the SQS queue with the task ID as the queue name. It also grants permission to the queue and creates the SNS subscription on the topic.

As the container instance starts up, it can send a request to its metadata URL and retrieve the task ID. The task ID is used by the container instance to poll for messages. If the container instance terminates, ECS generates a task stopped event. This event matches a rule in Amazon EventBridge and triggers a Lambda function. The Lambda function retrieves the task ID, deletes the queue, and deletes the subscription from the SNS topic. The solution decouples the container instance from any overhead in maintaining queues, applying permissions, or managing subscriptions. The security permissions for all SNS-SQS management are handled by the Lambda functions.

This diagram shows the solution architecture:

Solution architecture

Events from ECS are sent to the default event bus. There are various events that are generated as part of the lifecycle of an ECS task. You can find more on the various ECS task states in the ECS task documentation. This solution uses ECS as the container orchestration service, but you can also use Amazon Elastic Kubernetes Service (EKS). For EKS, you must apply the rules for EKS task state events.

Walkthrough of the implementation

The code snippets are shortened for brevity. The full source code of the solution is in the GitHub repository. The solution uses AWS Serverless Application Model (AWS SAM) for deployment.

SNS topic

The SNS topic is used to send notifications to the ECS tasks. The following snippet from the AWS SAM template shows the definition of the SNS topic:

  SNSDynamicSubsTopic:
    Type: AWS::SNS::Topic
    Properties:
      TopicName: !Ref DynamicSubTopicName

Container instance

The container instance subscribes to the SNS topic using an SQS queue. The container image is a Java class that reads messages from an SQS queue and prints them in the logs. The following code shows some of the message processor implementation:

AmazonSQS sqs = AmazonSQSClientBuilder.defaultClient();
AmazonSQSResponder responder = AmazonSQSResponderClientBuilder.standard()
        .withAmazonSQS(sqs)
        .build();

SQSMessageConsumer consumer = SQSMessageConsumerBuilder.standard()
        .withAmazonSQS(responder.getAmazonSQS())
        .withQueueUrl(queue_url)
        .withConsumer(message -> {
            System.out.println("The message is " + message.getBody());
            sqs.deleteMessage(queue_url,message.getReceiptHandle());

        }).build();
consumer.start();

The queue_url is derived from the task ID of the ECS task, which is retrieved in the constructor of the class:

// The container environment exposes the task metadata endpoint (map holds the
// environment variables, for example from System.getenv())
String metaDataURL = map.get("ECS_CONTAINER_METADATA_URI_V4");

HttpGet request = new HttpGet(metaDataURL);
CloseableHttpResponse response = httpClient.execute(request);

HttpEntity entity = response.getEntity();
if (entity != null) {
    String result = EntityUtils.toString(entity);
    // Extract the task ARN from the metadata labels and keep only the task ID suffix
    String taskARN = JsonPath.read(result, "$['Labels']['com.amazonaws.ecs.task-arn']").toString();
    String[] arnTokens = taskARN.split("/");
    taskId = arnTokens[arnTokens.length-1];
    System.out.println("The task ID : "+taskId);
}

queue_url = sqs.getQueueUrl(taskId).getQueueUrl();

The queue URL is constructed from the task ID of the container. Each queue is dedicated to a single task, or instance of the container, running in ECS.

EventBridge rules

The following event pattern on the default event bus captures events that match the start of the container instance. The rule triggers a Lambda function:

      EventPattern:
        source:
          - aws.ecs
        detail-type:
          - "ECS Task State Change"
        detail:
          desiredStatus:
            - "RUNNING"
          lastStatus:  
            - "RUNNING"

The start rule routes events to a Lambda function that creates a queue with the name as the task ID. It creates the subscription to the SNS topic and grants permission on the queue to receive messages from the topic.

This event pattern matches STOPPED events of the container task. It also triggers a Lambda function to delete the queue and the associated subscription:

      EventPattern:
        source:
          - aws.ecs
        detail-type:
          - "ECS Task State Change"
        detail:
          desiredStatus:
            - "STOPPED"
          lastStatus:  
            - "STOPPED"

Lambda functions

There are two Lambda functions that perform the queue creation, subscription, authorization, and deletion.

The SNS-SQS-Subscription-Service

The following code creates the queue based on the task ID, applies policies, and subscribes it to the topic. It also stores the subscription ARN in an Amazon DynamoDB table:

# get the task id from the event
taskArn = event['detail']['taskArn']
taskArnTokens = taskArn.split('/')
taskId = taskArnTokens[len(taskArnTokens)-1]

# The queue name is the ECS task ID (queue_name = taskId; full derivation elided for brevity)
create_queue_resp = sqs_client.create_queue(QueueName=queue_name)

# queue_arn and topic_arn are resolved elsewhere (see the policy sketch after this snippet)
response = sns.subscribe(TopicArn=topic_arn, Protocol="sqs", Endpoint=queue_arn)
subscription_arn = response['SubscriptionArn']

ddbresponse = dynamodb.update_item(
    TableName=SQS_CONTAINER_MAPPING_TABLE,
    Key={
        'id': {
            'S' : taskId.strip()
        }
    },
    AttributeUpdates={
        'SubscriptionArn':{
            'Value': {
                'S': subscription_arn
            }
        }
    },
    ReturnValues="UPDATED_NEW"
)
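The policy step (“applies policies”) is elided from the snippet above. A minimal sketch of that grant, assuming the same boto3 clients and variables, could look like the following; it allows the SNS topic to deliver messages to the new queue.

import json

queue_url = create_queue_resp['QueueUrl']
queue_arn = sqs_client.get_queue_attributes(
    QueueUrl=queue_url,
    AttributeNames=['QueueArn']
)['Attributes']['QueueArn']

# Allow the SNS topic to send messages to this task's queue
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "sns.amazonaws.com"},
        "Action": "sqs:SendMessage",
        "Resource": queue_arn,
        "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}}
    }]
}

sqs_client.set_queue_attributes(
    QueueUrl=queue_url,
    Attributes={'Policy': json.dumps(policy)}
)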

The cleanup service

The cleanup function is triggered when the container instance is stopped. It fetches the subscription ARN from the DynamoDB table based on the taskId. It deletes the subscription from the topic and deletes the queue. You can modify this code to include any other cleanup actions or trigger a workflow. The main part of the function code is:

taskId = taskArnTokens[len(taskArnTokens)-1]

ddbresponse = dynamodb.get_item(TableName=SQS_CONTAINER_MAPPING_TABLE,Key={'id': { 'S' : taskId}})
# subscription_arn is read from the returned item and queue_url is looked up
# from the task ID (both elided here for brevity)
snsresp = sns.unsubscribe(SubscriptionArn=subscription_arn)

queuedelresp = sqs_client.delete_queue(QueueUrl=queue_url)

Conclusion

This blog shows an event-driven approach to handling dynamic SNS subscription requirements. It relies on ECS service events to trigger the appropriate Lambda functions. These create the subscription queue, subscribe it to a topic, and delete it once the container instance is terminated.

The approach also allows the container application logic to focus only on consuming and processing the messages from the queue. It does not need any additional permissions to subscribe or unsubscribe from the topic or apply any additional permissions on the queue. Although the solution has been presented using ECS as the container orchestration service, it can be applied for EKS by using its service events.

For more serverless learning resources, visit Serverless Land.

ICYMI: Serverless Q3 2021

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/icymi-serverless-q3-2021/

Welcome to the 15th edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share all of the most recent product launches, feature enhancements, blog posts, webinars, Twitch live streams, and other interesting things that you might have missed!

Q3 calendar

In case you missed our last ICYMI, check out what happened last quarter here.

AWS Lambda

You can now choose next-generation AWS Graviton2 processors in your Lambda functions. This Arm-based processor architecture can provide up to 19% better performance at 20% lower cost. You can configure functions to use Graviton2 in the AWS Management Console, API, CloudFormation, and CDK. We recommend using the AWS Lambda Power Tuning tool to see how your functions compare and to determine the price improvement you may see.

All Lambda runtimes built on Amazon Linux 2 support Graviton2, with the exception of versions approaching end-of-support. The AWS Free Tier for Lambda includes functions powered by both x86 and Arm-based architectures.

Create Lambda function with new arm64 option

You can also use the Python 3.9 runtime to develop Lambda functions. You can choose this runtime version in the AWS Management Console, AWS CLI, or AWS Serverless Application Model (AWS SAM). Version 3.9 includes a range of new features and performance improvements.

Lambda now supports Amazon MQ for RabbitMQ as an event source. This makes it easier to develop serverless applications that are triggered by messages in a RabbitMQ queue. This integration does not require a consumer application to monitor queues for updates. The connectivity with the Amazon MQ message broker is managed by the Lambda service.

Lambda has added support for up to 10 GB of memory and 6 vCPU cores in AWS GovCloud (US) Regions and in the Middle East (Bahrain), Asia Pacific (Osaka), and Asia Pacific (Hong Kong) Regions.

AWS Step Functions

Step Functions now integrates with the AWS SDK, supporting over 200 AWS services and 9,000 API actions. You can call services directly from the Amazon States Language definition in the resource field of the task state. This allows you to work with services like DynamoDB, AWS Glue Jobs, or Amazon Textract directly from a Step Functions state machine. To learn more, see the SDK integration tutorial.

AWS Amplify

The Amplify Admin UI now supports importing existing Amazon Cognito user pools and identity pools. This allows you to configure multi-platform apps to use the same user pools with different client IDs.

Amplify CLI now enables command hooks, allowing you to run custom scripts in the lifecycle of CLI commands. You can create bash scripts that run before, during, or after CLI commands. Amplify CLI has also added support for storing environment variables and secrets used by Lambda functions.

Amplify Geo is in developer preview and helps developers provide location-aware features to their frontend web and mobile applications. This uses the Amazon Location Service to provide map UI components.

Amazon EventBridge

The EventBridge schema registry now supports discovery of cross-account events. When the schema registry is enabled on a bus, it now generates schemas for events originating from another account. This helps organize and find events in multi-account applications.

Amazon DynamoDB

DynamoDB console

The new DynamoDB console experience is now the default for viewing and managing DynamoDB tables. This makes it easier to manage tables from the navigation pane and also provides a new dedicated Items page. There is also contextual guidance and step-by-step assistance to help you perform common tasks more quickly.

API Gateway

API Gateway can now authenticate clients using certificate-based mutual TLS. Previously, this feature only supported AWS Certificate Manager (ACM). Now, customers can use a server certificate issued by a third-party certificate authority or ACM Private CA. Read more about using mutual TLS authentication with API Gateway.

The Serverless Developer Advocacy team built the Amazon API Gateway CORS Configurator to help you configure cross-origin resource sharing (CORS) for REST and HTTP APIs. Fill in the information specific to your API and the AWS SAM configuration is generated for you.

Serverless blog posts

July

August

September

Tech Talks & Events

We hold AWS Online Tech Talks covering serverless topics throughout the year. These are listed in the Serverless section of the AWS Online Tech Talks page. We also regularly deliver talks at conferences and events around the world, speak on podcasts, and record videos you can find to learn in bite-sized chunks.

Here are some from Q3:

Videos

Serverless Land

Serverless Office Hours – Tues 10 AM PT

Weekly live virtual office hours. In each session, we talk about a specific topic or technology related to serverless and then open the floor to help you with your real serverless challenges and issues. Ask us anything you want about serverless technologies and applications.

July

August

September

DynamoDB Office Hours

Are you an Amazon DynamoDB customer with a technical question you need answered? If so, join us for weekly Office Hours on the AWS Twitch channel, led by Rick Houlihan, AWS principal technologist and Amazon DynamoDB expert. See upcoming and previous shows.

Still looking for more?

The Serverless landing page has more information. The Lambda resources page contains case studies, webinars, whitepapers, customer stories, reference architectures, and even more Getting Started tutorials.

You can also follow the Serverless Developer Advocacy team on Twitter to see the latest news, follow conversations, and interact with the team.

Detect Adversary Behavior in Milliseconds with CrowdStrike and Amazon EventBridge

Post Syndicated from Joby Bett original https://aws.amazon.com/blogs/architecture/detect-adversary-behavior-in-seconds-with-crowdstrike-and-amazon-eventbridge/

By integrating Amazon EventBridge with Falcon Horizon, CrowdStrike has developed a real-time, cloud-based solution that allows you to detect threats in less than a second. This solution uses AWS CloudTrail and EventBridge. CloudTrail allows governance, compliance, operational auditing, and risk auditing of your AWS account. EventBridge is a serverless event bus that makes it easier to build event-driven applications at scale.

In this blog post, we’ll cover the challenges presented by using traditional log file-based security monitoring. We will also discuss how CrowdStrike used EventBridge to create an innovative, real-time cloud security solution that enables high-speed, event-driven alerts that detect malicious actors in milliseconds.

Challenges of log file-based security monitoring

Being able to detect malicious actors in your environment is necessary to stay secure in the cloud. With the growing volume, velocity, and variety of cloud logs, log file-based monitoring makes it difficult to reveal adverse behaviors in time to stop breaches.

When an attack is in progress, a security operations center (SOC) analyst has an average of one minute to detect the threat, ten minutes to understand it, and one hour to contain it. If you cannot meet this 1/10/60 minute rule, you risk a costly breach that can move laterally and spread exponentially across the cloud estate.

Let’s look at a real-life scenario. When a malicious actor attempts a ransom attack that targets high-value data in an Amazon Simple Storage Service (Amazon S3) bucket, it can involve activities in various parts of the cloud services in a brief time window.

These example activities can involve:

  • AWS Identity and Access Management (IAM): account enumeration, disabling multi-factor authentication (MFA), account hijacking, privilege escalation, etc.
  • Amazon Elastic Compute Cloud (Amazon EC2): instance profile privilege escalation, file exchange tool installs, etc.
  • Amazon S3: bucket and object enumeration; impair bucket encryption and versioning; bucket policy manipulation; getObject, putObject, and deleteObject APIs, etc.

With siloed log file-based monitoring, detecting, understanding, and containing a ransom attack while still meeting the 1/10/60 rule is difficult. This is because log files are written in batches, and files are typically only created every 5 minutes. Once the log file is written, it still needs to be fetched and processed. This means that you lose the ability to dynamically correlate disparate activities.

To summarize, top-level challenges of log file-based monitoring are:

  1. Lag time between the breach and the detection
  2. Inability to correlate disparate activities to reveal sophisticated attack patterns
  3. Frequent false positive alarms that obscure true positives
  4. High operational cost of log file synchronizations and reprocessing
  5. Log analysis tool maintenance for fast growing log volume

Security and compliance is a shared responsibility between AWS and the customer. We protect the infrastructure that runs all of the services offered in the AWS Cloud. For abstracted services, such as Amazon S3, we operate the infrastructure layer, the operating system, and platforms, and customers access the endpoints to store and retrieve data.

You are responsible for managing your data (including encryption options), classifying assets, and using IAM tools to apply the appropriate permissions.

Indicators of attack by CrowdStrike with Amazon EventBridge

In real-world cloud breach scenarios, timeliness of observation, detection, and remediation is critical. CrowdStrike Falcon Horizon IOA is built on an event-driven architecture based on EventBridge and operates at a velocity that can outpace attackers.

CrowdStrike Falcon Horizon IOA performs the following core actions:

  • Observe: EventBridge streams CloudTrail log files across accounts to the CrowdStrike platform as activity occurs. Parallelism is enabled via event bus rules, which enables CrowdStrike to avoid the five-minute lag in fetching the log files and dynamically correlate disparate activities. The CrowdStrike platform observes end-to-end activities from AWS services and infrastructure hosted in the accounts protected by CrowdStrike.
  • Detect: Falcon Horizon invokes indicators of attack (IOA) detection algorithms that reveal adversarial or anomalous activities from the log file streams. It correlates new and historical events in real time while enriching the events with CrowdStrike threat intelligence data. Each IOA is prioritized with the likelihood of activity being malicious via scoring and mapped to the MITRE ATT&CK framework.
  • Remediate: The detected IOA is presented with remediation steps. Depending on the score, applying the remediations quickly can be critical before the attack spreads.
  • Prevent: Unremediated insecure configurations are revealed via indicators of misconfiguration (IOM) in Falcon Horizon. Applying the remediation steps from IOM can prevent future breaches.

Key differentiators of IOA from Falcon Horizon are:

  • Observability of wider attack surfaces with heterogeneous event sources
  • Detection of sophisticated tactics, techniques, and procedures (TTPs) with dynamic event correlation
  • Event enrichment with threat intelligence that aids prioritization and reduces alert fatigue
  • Low latency between malicious activity occurrence and corresponding detection
  • Insight into attacks for each adversarial event from MITRE ATT&CK framework

High-level architecture

Event-driven architectures provide advantages for integrating varied systems over legacy log file-based approaches. For securing cloud attack surfaces against the ever-evolving TTPs, a robust event-driven architecture at scale is a key differentiator.

CrowdStrike maximizes the advantages of event-driven architecture by integrating with EventBridge, as shown in Figure 1. EventBridge allows observing CloudTrail logs in event streams. It also simplifies log centralization from a number of accounts with its direct source-to-target integration across accounts, as follows:

  1. CrowdStrike hosts an EventBridge with central event buses that consume the stream of CloudTrail log events from a multitude of customer AWS accounts.
  2. Within customer accounts, EventBridge rules listen to the local CloudTrail and stream each activity as an event to the centralized EventBridge hosted by CrowdStrike (a sketch of such a cross-account rule follows Figure 1).
  3. CrowdStrike’s event-driven platform detects adversarial behaviors from the event streams in real time. The detection is performed against incoming events in conjunction with historical events. The context that comes from connecting new and historical events minimizes false positives and improves alert efficacy.
  4. Events are enriched with CrowdStrike threat intelligence data that provides additional insight of the attack to SOC analysts and incident responders.

Figure 1. CrowdStrike Falcon Horizon IOA architecture
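A cross-account rule like the one in step 2 can be expressed in a few API calls. The following boto3 sketch is illustrative rather than CrowdStrike’s actual configuration: the rule name, central event bus ARN, and IAM role are placeholders.

import json
import boto3

events = boto3.client('events')

# Match management API calls recorded by CloudTrail in this account (illustrative pattern).
events.put_rule(
    Name='forward-cloudtrail-to-central-bus',  # placeholder rule name
    EventPattern=json.dumps({"detail-type": ["AWS API Call via CloudTrail"]}),
    State='ENABLED'
)

# Forward matching events to the centralized event bus in the monitoring account.
events.put_targets(
    Rule='forward-cloudtrail-to-central-bus',
    Targets=[{
        'Id': 'central-event-bus',
        'Arn': 'arn:aws:events:<REGION>:<CENTRAL_ACCOUNT_ID>:event-bus/<BUS_NAME>',
        'RoleArn': 'arn:aws:iam::<AWS_ACCOUNT_ID>:role/<EVENTBRIDGE_FORWARD_ROLE>'
    }]
)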

As data is received by the centralized EventBridge, CrowdStrike relies on the unique customer ID and AWS Region in each event to provide integrity and isolation.

EventBridge allows relatively hassle-free customer onboarding by using cross-account rules to transfer customer CloudTrail data into one common event bus, which can then be used to filter and selectively forward the data into the Falcon Horizon platform for analysis and mitigation.

Conclusion

As your organization’s cloud footprint grows, visibility into end-to-end activities in a timely manner is critical for maintaining a safe environment for your business to operate. EventBridge allows event-driven monitoring of CloudTrail logs at scale.

CrowdStrike Falcon Horizon IOA, powered by EventBridge, observes end-to-end cloud activities at high speeds at scale. Paired with targeted detection algorithms from in-house threat detection experts and threat intelligence data, Falcon Horizon IOA combats emerging threats against the cloud control plane with its cutting-edge event-driven architecture.

Related information