Tag Archives: Amazon EventBridge

Getting Started with Event-Driven Architecture

Post Syndicated from Talia Nassi original https://aws.amazon.com/blogs/compute/getting-started-with-event-driven-architecture/

In modern application development, event-driven architecture is becoming more prominent because it can make building applications in the cloud easier. Event-driven architecture allows you to decouple your services, which increases developer velocity and can make applications easier to debug. It can also help remove the bottleneck that occurs when features expand across different teams, allowing those teams to progress more independently.

One way to think about an application is as a system that reacts to events from other places, including from within the application itself. In this approach, you model the system’s interaction with its surroundings as an exchange of events: inputs to the application and outputs from it are both events, which the application receives and creates. At its core, this is event-driven architecture.

API-driven architecture vs. event-driven architecture

Commands/APIs:

  • Synchronous
  • Have an intent
  • Directed to a target
  • Examples: “CreateAccount”, “AddProduct”

Events:

  • Asynchronous
  • State a fact
  • Happened in the past
  • Examples: “AccountCreated”, “ProductAdded”

A common way of making components of an application work together is through an API-driven, request-response architecture. For example, you query a list of orders from an Orders API, and the Orders API responds with a list of orders. This is synchronous architecture: the system asking for the orders waits for the response and cannot move on until the response comes back. In this approach, you send commands that are directed to a target (for example, “place this order” or “add this record to the database”).

sync vs async

In a synchronous model, the client makes a request to Service A. Service A calls Service B, but then Service A waits for Service B to respond before it continues on and eventually responds to the client.

In asynchronous, event-driven architecture, there is no response path. The service surfaces the event and then immediately moves forward. The trade-off here is that there’s no direct channel for Service B to pass back information to Service A, besides confirming it received the event. But in many cases, you don’t need that explicit coupling between the request and response channels.

An event is something that happened. For example, a new account is created, or an item is dropped into an Amazon S3 bucket. Events are immutable, which means you cannot change them. Once an event happens, you cannot undo it. For example, if there is an event raised when an order is placed, there can be another event for an order being cancelled. Events can come from various places such as messaging systems or databases.

Events are JSON objects that tell you information about something that happened in your application. In event-driven architecture, events represent facts. Each component of the application raises an event whenever anything changes. Other components listen and decide what to do with it and how they would like to react.

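The original post shows this event as a screenshot. A representative S3 “Object Created” event, in the format EventBridge delivers it, looks roughly like the following (the ID, timestamp, and object size are placeholders):

{
  "version": "0",
  "id": "17793124-05d4-b198-2fde-7ededc63b103",
  "detail-type": "Object Created",
  "source": "aws.s3",
  "account": "123456789012",
  "time": "2022-01-01T00:00:00Z",
  "region": "us-east-1",
  "resources": ["arn:aws:s3:::sam-app-sourcebucket"],
  "detail": {
    "version": "0",
    "bucket": { "name": "sam-app-sourcebucket" },
    "object": { "key": "brad.jpeg", "size": 1024 },
    "reason": "PutObject"
  }
}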

In the event above, S3 raises the event when you put the image into an Amazon S3 bucket. The event source is an S3 bucket named sam-app-sourcebucket. The object that is put into the bucket is called “brad.jpeg”.

Request-driven applications typically use directed commands to coordinate downstream functions to complete an activity and are often tightly coupled. This makes it harder to determine when errors occur in your application. Event-driven applications create events that are observable by other services and systems. However, the event producer is unaware of which consumers, if any, are listening. Typically, these are loosely coupled.

Events are observable. Any service that is authorized can watch an event. Consider a coffee shop example where there is a barista, who makes coffee, and a pastry chef, who makes pastries. When a customer enters the coffee shop and orders a cup of coffee, the barista starts to make the coffee, and the pastry chef takes no action.

However, if a customer comes into the coffee shop and orders a chocolate croissant, then the pastry chef starts making the chocolate croissant, and the barista takes no action. The pastry chef is only interested in orders relating to pastries, and the barista is only interested in orders relating to coffee.

In an ecommerce application, like Amazon.com, there are different departments that respond to different events. You can place orders through Whole Foods, Amazon Fresh, and Amazon.com. When you place an order with Amazon Fresh, the subscribers to that event take action and fulfill your order.


Event-driven architecture and command-driven architecture also differ in the ways that they store state. In a typical command-driven architecture, you have only one component store a particular piece of data, and other components ask that component for the data when needed.

In event-driven architecture, every component stores all the data it needs and listens to update events for that data. In command-driven architecture, the component that stores the data is responsible for updating it. In event-driven architecture, the component that owns the data only has to ensure that new events are raised whenever the data is updated.
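
As a minimal sketch of this idea (the event name and fields below are invented for illustration, not a real AWS event schema), a consumer can keep its own copy of the data it cares about and update that copy whenever a relevant event arrives:

# Minimal sketch: a consumer keeps a local copy of product prices and
# updates it whenever a "ProductPriceUpdated" event arrives.
# The event name and fields are illustrative, not from a real AWS schema.
local_prices = {}

def handle_event(event):
    if event.get("detail-type") == "ProductPriceUpdated":
        detail = event["detail"]
        local_prices[detail["productId"]] = detail["price"]

def get_price(product_id):
    # Reads are served from local state; no call to the owning service is needed.
    return local_prices.get(product_id)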

Benefits of using event-driven architecture

Decoupling event sources and event targets

Many applications are built as a monolith, where the components are tightly coupled and highly dependent on each other. This is problematic when there are bugs and you are trying to pinpoint exactly which part of the application is failing. Decoupled architectures are composed of components or services that are loosely coupled. In an event-driven, decoupled architecture, you broadcast events without caring who responds to them. This saves time because events can be queued and forwarded whenever the receiver is ready to process them. This allows you to build scalable, highly modifiable systems.

Decoupled applications enable teams to act more independently, which increases their velocity. For example, with an API-based integration, if my team wants to know about some change that happened in another team’s microservice, I have to ask that team to make an API call to my service. That means I have to deal with authentication, coordination with the other team over the structure of the API call, etc. This causes back and forth between teams, which slows down development time. With an event-driven application, you can subscribe to events sent from a microservice and the event router (for example, Amazon EventBridge) takes care of routing the event and handling authentication.

Decoupled applications also allow you to build new features faster. Adding new features or extending existing ones is simpler with event-driven architectures. This is because you only have to choose the event you need to trigger your new feature, and subscribe to it. There’s no need to modify any of your existing services to add new functionality.

Write less code

When you build applications using event-driven architecture, often you write less code because you only need to consider new events, as well as which service is subscribed to those events. For example, if you are building new features for your application, all you have to do is consider the existing events and then add senders and receivers as necessary. In this way, you speed up development time because each functional unit is smaller and there is often less code.

Better extensibility

An application built this way is highly extensible. Other teams can extend features and add functionality without impacting other microservices. By publishing events using EventBridge, the application integrates with existing systems, but also enables any future application to integrate as an event consumer. Producers of events have no knowledge of event consumers, which can help simplify the microservice logic.

Enhancing team collaboration

A common process to build applications is to work with your product managers and business stakeholders to gather requirements. Developers then translate those requirements into code. However, there may be a disconnect between the product requirements and the code. When you use events, everyone in the business understands the logic. You define the events in an application (for example, a customer adds an item to their shopping cart or a customer account is created) and that becomes your product requirements. Whenever that action happens, it produces an event, and whoever is interested can take action on that event.

For example, a marketing manager could be interested whenever a customer creates a new account. One way to choreograph this in event-driven architecture is to have a Marketing event bus that listens for the New Account event. There could also be other teams that are interested, such as the Analytics team, who also subscribe to that event. Each team/service can subscribe to events that are relevant to them. Event-driven architecture is a great way for businesses to describe their business problems and represent them.

Conclusion

This post introduces events, and then compares event-driven architecture to command-driven, request-response architecture. It also explains the benefits of event-driven architecture, including decoupling event sources and targets, writing less code, having better extensibility, and enhancing team collaboration.

For more serverless learning resources, visit Serverless Land.

ICYMI: Serverless Q1 2022

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/icymi-serverless-q1-2022/

Welcome to the 16th edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share all the most recent product launches, feature enhancements, blog posts, webinars, Twitch live streams, and other interesting things that you might have missed!


In case you missed our last ICYMI, check out what happened last quarter here.

AWS Lambda

Lambda now offers larger ephemeral storage for functions, up to 10 GB. Previously, the storage was set to 512 MB. There are several common use-cases that can benefit from expanded temporary storage, including extract-transform load (ETL) jobs, machine learning inference, and data processing workloads. To see how to configure the amount of /tmp storage in AWS SAM, deploy this Serverless Land Pattern.
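
Outside of AWS SAM, you can also change this setting on an existing function with the AWS CLI. This is a sketch only; the function name is a placeholder, and the size is specified in MB, from 512 to 10,240:

aws lambda update-function-configuration \
    --function-name my-function \
    --ephemeral-storage '{"Size": 10240}'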

Ephemeral storage settings

For Node.js developers, Lambda now supports ES Modules and top-level await for Node.js 14. This enables developers to use a wider range of JavaScript packages in functions. With top-level await, when used with Provisioned Concurrency, this can improve cold-start performance when using asynchronous initialization.

For .NET developers, Lambda now supports .NET 6 as both a managed runtime and container base image. You can now use new features of the runtime such as improved logging, simplified function definitions using top-level statements, and improved performance using source generators.

The Lambda console now allows you to share test events with other developers in your team, using granular IAM permissions. Previously, test events were only visible to the builder who created them. To learn about creating sharable test events, read this documentation.

Amazon EventBridge

Amazon EventBridge Schema Registry helps you create code bindings from event schemas for use directly in your preferred IDE. You can generate these code bindings for a schema by using the EventBridge console, APIs, or AWS SDK toolkits for Jetbrains (Intellij, PyCharm, Webstorm, Rider) and VS Code. This feature now supports Go, in addition to Java, Python, and TypeScript, and is available at no additional cost.

AWS Step Functions

Developers can test state machines locally using Step Functions Local, and the service recently announced mocked service integrations for local testing. This allows you to define sample output from AWS service integrations and combine them into test cases to validate workflow control. This new feature introduces a robust way to test state machines in isolation.

Amazon DynamoDB

Amazon DynamoDB now supports limiting the number of items processed in a PartiQL operation, using an optional parameter on each request. The service also increased default Service Quotas, which can help simplify the use of large numbers of tables. The per-account, per-Region quota increased from 256 to 2,500 tables.
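
A sketch of how the new limit can be applied with the AWS SDK for Python (the statement, table, and key values are illustrative):

import boto3

dynamodb = boto3.client('dynamodb')

# Process at most 10 items for this PartiQL request.
# The table name and key value below are illustrative.
response = dynamodb.execute_statement(
    Statement='SELECT * FROM "Orders" WHERE "CustomerId" = ?',
    Parameters=[{'S': '12345'}],
    Limit=10
)
items = response['Items']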

AWS AppSync

AWS AppSync added support for custom response headers, allowing you to define additional headers to send to clients in response to an API call. You can now use the new resolver utility $util.http.addResponseHeaders() to configure additional headers in the response for a GraphQL API operation.

Serverless blog posts

January

Jan 6 – Using Node.js ES modules and top-level await in AWS Lambda

Jan 6 – Validating addresses with AWS Lambda and the Amazon Location Service

Jan 20 – Introducing AWS Lambda batching controls for message broker services

Jan 24 – Migrating AWS Lambda functions to Arm-based AWS Graviton2 processors

Jan 31 – Using the circuit breaker pattern with AWS Step Functions and Amazon DynamoDB

Jan 31 – Mocking service integrations with AWS Step Functions Local

February

Feb 8 – Capturing client events using Amazon API Gateway and Amazon EventBridge

Feb 10 – Introducing AWS Virtual Waiting Room

Feb 14 – Building custom connectors using the Amazon AppFlow Custom Connector SDK

Feb 22 – Building TypeScript projects with AWS SAM CLI

Feb 24 – Introducing the .NET 6 runtime for AWS Lambda

March

Mar 6 – Migrating a monolithic .NET REST API to AWS Lambda

Mar 7 – Decoding protobuf messages using AWS Lambda

Mar 8 – Building a serverless image catalog with AWS Step Functions Workflow Studio

Mar 9 – Composing AWS Step Functions to abstract polling of asynchronous services

Mar 10 – Building serverless multi-Region WebSocket APIs

Mar 15 – Using organization IDs as principals in Lambda resource policies

Mar 16 – Implementing mutual TLS for Java-based AWS Lambda functions

Mar 21 – Running cross-account workflows with AWS Step Functions and Amazon API Gateway

Mar 22 – Sending events to Amazon EventBridge from AWS Organizations accounts

Mar 23 – Choosing the right solution for AWS Lambda external parameters

Mar 28 – Using larger ephemeral storage for AWS Lambda

Mar 29 – Using AWS Step Functions and Amazon DynamoDB for business rules orchestration

Mar 31 – Optimizing AWS Lambda function performance for Java

First anniversary of Serverless Land Patterns

Serverless Patterns Collection

The DA team launched the Serverless Patterns Collection in March 2021 as a repository of serverless examples that demonstrate integrating two or more AWS services. Each pattern uses an infrastructure as code (IaC) framework to automate the deployment. These can simplify the creation and configuration of the services used in your applications.

The Serverless Patterns Collection is both an educational resource to help developers understand how to join different services, and an aid for developers that are getting started with building serverless applications.

The collection has just celebrated its first anniversary. It now contains 239 patterns for CDK, AWS SAM, Serverless Framework, and Terraform, covering 30 AWS services. We have expanded example runtimes to include .NET, Java, Rust, Python, Node.js and TypeScript. We’ve served tens of thousands of developers in the first year and we’re just getting started.

Many thanks to our contributors and community. You can also contribute your own patterns.

Videos

YouTube: youtube.com/serverlessland

Serverless Office Hours – Tues 10 AM PT

Weekly live virtual office hours. In each session we talk about a specific topic or technology related to serverless and open it up to helping you with your real serverless challenges and issues. Ask us anything you want about serverless technologies and applications.

YouTube: youtube.com/serverlessland
Twitch: twitch.tv/aws


FooBar Serverless YouTube channel

The Developer Advocate team is delighted to welcome Marcia Villalba onboard. Marcia was an AWS Serverless Hero before joining AWS over two years ago, and she has created one of the most popular serverless YouTube channels. You can view all of Marcia’s videos at https://www.youtube.com/c/FooBar_codes.


AWS Summits

AWS Global Summits are free events that bring the cloud computing community together to connect, collaborate, and learn about AWS. This year, we have restarted in-person Summits at major cities around the world.

The next 4 Summits planned are Paris (April 12), San Francisco (April 20-21), London (April 27), and Madrid (May 4-5). To find and register for your nearest AWS Summit, visit the AWS Summits homepage.

Still looking for more?

The Serverless landing page has more information. The Lambda resources page contains case studies, webinars, whitepapers, customer stories, reference architectures, and even more Getting Started tutorials.

You can also follow the Serverless Developer Advocacy team on Twitter to see the latest news, follow conversations, and interact with the team.

Choosing the right solution for AWS Lambda external parameters

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/choosing-the-right-solution-for-aws-lambda-external-parameters/

This post is written by Thomas Moore, Solutions Architect, Serverless.

When using AWS Lambda to build serverless applications, customers often need to retrieve parameters from an external source at runtime. This allows you to share parameter values across multiple functions or microservices, providing a single source of truth for updates. A common example is retrieving database connection details from an external source and then using the retrieved hostname, user name, and password to connect to the database:

Lambda function retrieving database credentials from an external source

AWS provides a number of options to store parameter data, including AWS Systems Manager Parameter Store, AWS AppConfig, Amazon S3, and Lambda environment variables. This blog explores the different types of parameter data that you may need to store. I cover considerations for choosing the right parameter solution and how to retrieve and cache parameter data efficiently within the Lambda function execution environment.

Common use cases

Common parameter examples include:

  • Securely storing secret data, such as credentials or API keys.
  • Database connection details such as hostname, port, and credentials.
  • Schema data (for example, a structured JSON response).
  • TLS certificate for mTLS or JWT validation.
  • Email template.
  • Tenant configuration in a multitenant system.
  • Details of external AWS resources to communicate with such as an Amazon SQS queue URL, Amazon EventBridge event bus name, or AWS Step Functions ARN.

Key considerations

There are a number of key considerations when choosing the right solution for external parameter data.

  1. Cost – how much does it cost to store the data and retrieve it via an API call?
  2. Security – what encryption and fine-grained access control is required?
  3. Performance – what are the retrieval latency requirements?
  4. Data size – how much data is there to store and retrieve?
  5. Update frequency – how often does the parameter change and how does the function handle stale parameters?
  6. Access scope – do multiple functions or services access the parameter?

These considerations help to determine where to store the parameter data and how often to retrieve it.

For example, a 4KB parameter that updates hourly and is used by hundreds of functions needs to be optimized for low retrieval costs and high performance. Choosing a solution that supports low-cost API GET requests at a high transactions-per-second (TPS) rate would be better than one optimized for storing large data.

AWS service options

There are a number of AWS services available to store external parameter data.

Amazon S3

S3 is an object storage service offering 99.999999999% (11 9s) of data durability and virtually unlimited scalability at low cost. Objects can be up to 5 TB in size in any format, making S3 a good solution to store larger parameter data.

Amazon DynamoDB

Amazon DynamoDB is a fully managed, serverless, key-value NoSQL database designed for single-digit millisecond performance at any scale. Due to the high performance of this service, it’s a great place to store parameters when low retrieval latency is important.

AWS Secrets Manager

AWS Secrets Manager makes it easier to rotate, manage, and retrieve secret data. This makes it the ideal place to store sensitive parameters such as passwords and API keys.

AWS Systems Manager Parameter Store

Parameter Store provides a centralized store to manage configuration data. This data can be plaintext or encrypted using AWS Key Management Service (KMS). Parameters can be tagged and organized into hierarchies for simpler management. Parameter Store is a good default choice for general-purpose parameters in AWS. The standard version (no additional charge) can store parameters up to 4 KB in size and the advanced version (additional charges apply) up to 8 KB.
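
For example, you can create an encrypted parameter with the AWS CLI (the parameter name and value here are placeholders):

aws ssm put-parameter \
    --name /my/parameter \
    --value "my-parameter-value" \
    --type SecureString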

For a code example using Parameter Store for Lambda parameters, see the Serverless Land pattern.

AWS AppConfig

AppConfig is a capability of AWS Systems Manager to create, manage, and quickly deploy application configurations. AppConfig allows you to validate changes during roll-outs and automatically roll back if there is an error. AppConfig deployment strategies help to manage configuration changes safely.

AppConfig also provides a Lambda extension to retrieve and locally cache configuration data. This results in fewer API calls and reduced function duration, reducing costs.

AWS Lambda environment variables

You can store parameter data as Lambda environment variables as part of the function’s version-specific configuration. Lambda environment variables are stored during function creation or updates. You can access these variables directly from your code without needing to contact an external source. Environment variables are ideal for parameter values that don’t need updating regularly and help make function code reusable across different environments. However, unlike the other options, values cannot be accessed centrally by multiple functions or services.
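
Reading an environment variable is a local lookup rather than an API call. A minimal Python sketch (the variable name is hypothetical):

import os

# DATABASE_HOST is a hypothetical variable set in the function's configuration.
DATABASE_HOST = os.environ['DATABASE_HOST']

def lambda_handler(event, context):
    # The value is read from the local environment; no external call is made.
    return {'databaseHost': DATABASE_HOST}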

Lambda execution lifecycle

It is worth understanding the Lambda execution lifecycle, which has a number of stages. This helps to decide when to handle parameter retrieval within your Lambda code, including cache management.

Lambda execution lifecycle

When a Lambda function is invoked for the first time, or when Lambda is scaling to handle additional requests, an execution environment is created. The first phase in the execution environment’s lifecycle is initialization (Init), during which the code outside the main handler function runs. This is known as a cold start.

The execution environment can then be re-used for subsequent invocations. This means that the Init phase does not need to run again and only the main handler function code runs. This is known as a warm start.

An execution environment can only run a single invocation at a time. Concurrent invocations require additional execution environments. When a new execution environment is required, this starts a new Init phase, which runs the cold start process.

Caching and updates

Retrieving the parameter during Init

As Lambda execution environments are re-used, you can improve the performance and reduce the cost of retrieving an external parameter by caching the value. Writing the value to memory or the Lambda /tmp file system allows it to be available during subsequent invokes in the same execution environment.

This approach reduces API calls, as they are not made during every invocation. However, this can cause an out-of-date parameter and potentially different values across concurrent execution environments.

The following Python example shows how to retrieve a Parameter Store value outside the Lambda handler function during the Init phase.

import boto3
ssm = boto3.client('ssm', region_name='eu-west-1')
parameter = ssm.get_parameter(Name='/my/parameter')
def lambda_handler(event, context):
    # My function code...
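
If the value can change while an execution environment stays warm, a common middle ground is to cache the value in memory with a simple time-to-live (TTL) and refresh it after the TTL expires. The following is a sketch only; the 5-minute TTL is an arbitrary choice:

import time
import boto3

ssm = boto3.client('ssm', region_name='eu-west-1')

CACHE_TTL_SECONDS = 300  # arbitrary choice: refresh at most every 5 minutes
_cached_value = None
_cache_expires = 0

def get_cached_parameter(name):
    global _cached_value, _cache_expires
    # Re-fetch the parameter only when the cached copy has expired.
    if _cached_value is None or time.time() > _cache_expires:
        _cached_value = ssm.get_parameter(Name=name)['Parameter']['Value']
        _cache_expires = time.time() + CACHE_TTL_SECONDS
    return _cached_value

def lambda_handler(event, context):
    parameter = get_cached_parameter('/my/parameter')
    # My function code...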

Retrieving the parameter on every invocation

Another option is to retrieve the parameter during every invocation by making the API call inside the handler code. This keeps the value up to date, but can lead to higher retrieval costs and longer function durations due to the added API call during every invocation.

The following Python example shows this approach:

import boto3
ssm = boto3.client('ssm', region_name='eu-west-1')
def lambda_handler(event, context):
    parameter = ssm.get_parameter(Name='/my/parameter')
    # My function code...

Using AWS AppConfig Lambda extension

AppConfig allows you to retrieve and cache values from the service using a Lambda extension. The extension retrieves the values and makes them available via a local HTTP server. The Lambda function then queries the local HTTP server for the value. The AppConfig extension refreshes the values at a configurable poll interval, which defaults to 45 seconds. This improves performance and reduces costs, as the function only needs to make a local HTTP call.

The following Python code example shows how to access the cached parameters.

import urllib.request
def lambda_handler(event, context):
    url = f'http://localhost:2772/applications/application_name/environments/environment_name/configurations/configuration_name'
    config = urllib.request.urlopen(url).read()
    # My function code...

For caching secret values using a Lambda extension local HTTP cache and AWS Secrets Manager, see the AWS Prescriptive Guidance documentation.

Using Lambda Powertools for Python or Java

Lambda Powertools for Python or Lambda Powertools for Java contains utilities to manage parameter caching. You can configure the cache interval, which defaults to 5 seconds. Supported parameter stores include Secrets Manager, AWS Systems Manager Parameter Store, AppConfig, and DynamoDB. You also have the option to bring your own provider. The following example shows the Powertools for Python parameters utility retrieving a single value from Systems Manager Parameter Store.

from aws_lambda_powertools.utilities import parameters
def handler(event, context):
    value = parameters.get_parameter("/my/parameter")
    # My function code…
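
To hold a value for longer than the default, you can pass a cache duration when retrieving it. A sketch using the utility’s max_age argument (the 300-second value is an arbitrary choice):

from aws_lambda_powertools.utilities import parameters

def handler(event, context):
    # Cache this value for up to 5 minutes instead of the 5-second default.
    value = parameters.get_parameter("/my/parameter", max_age=300)
    # My function code...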

Security

Parameter security is a key consideration. You should evaluate encryption at rest, in-transit, private network access, and fine-grained permissions for each external parameter solution based on the use case.

All services highlighted in this post support server-side encryption at rest, and you can choose to use AWS KMS to manage your own keys. When accessing parameters using the AWS SDK and CLI tools, connections are encrypted in transit using TLS by default. You can force most to use TLS 1.2.

To access parameters from inside an Amazon Virtual Private Cloud (Amazon VPC) without internet access, you can use AWS PrivateLink and create a VPC endpoint for each service. All the services mentioned in this post support AWS PrivateLink connections.

Use AWS Identity and Access Management (IAM) policies to manage which users or roles can access specific parameters.
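
For example, a sketch of an identity-based IAM policy that only allows reading parameters under a specific path (the Region, account ID, and parameter path are placeholders):

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["ssm:GetParameter", "ssm:GetParameters"],
    "Resource": "arn:aws:ssm:eu-west-1:111122223333:parameter/my/*"
  }]
}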

General guidance

This blog explores a number of considerations to make when using an external source for Lambda parameters. The correct solution is use-case dependent. There are some general guidelines when selecting an AWS service.

  • For general-purpose low-cost parameters, use AWS Systems Manager Parameter Store.
  • For single function, small parameters, use Lambda environment variables.
  • For secret values that require automatic rotation, use AWS Secrets Manager.
  • When you need a managed cache, use the AWS AppConfig Lambda extension or Lambda Powertools for Python/Java.
  • For items larger than 400 KB, use Amazon S3.
  • When access frequency is high, and low latency is required, use Amazon DynamoDB.

Conclusion

External parameters provide a central source of truth across distributed systems, allowing for efficient updates and code reuse. This blog post highlights a number of considerations when using external parameters with Lambda to help you choose the most appropriate solution for your use case.

Consider how you cache and reuse parameters inside the Lambda execution environment. Doing this correctly can help you reduce costs and improve the performance of your Lambda functions.

There are a number of services to choose from to store parameter data. These include DynamoDB, S3, Parameter Store, Secrets Manager, AppConfig, and Lambda environment variables. Each comes with a number of advantages, depending on the use case. This blog guidance, along with the AWS documentation and Service Quotas, can help you select the most appropriate service for your workload.

For more serverless learning resources, visit Serverless Land.

Sending events to Amazon EventBridge from AWS Organizations accounts

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/sending-events-to-amazon-eventbridge-from-aws-organizations-accounts/

This post is written by Elinisa Canameti, Associate Cloud Architect, and Iris Kraja, Associate Cloud Architect.

AWS Organizations provides a hierarchical grouping of multiple accounts. This helps larger projects establish a central managing system that governs the infrastructure. This can help with meeting budgetary and security requirements.

Amazon EventBridge is a serverless event-driven service that delivers events from customized applications, AWS services, or software as a service (SaaS) applications to targets.

This post shows how to send events from multiple accounts in AWS Organizations to the management account using AWS CDK. This uses AWS CloudFormation StackSets to deploy the infrastructure in the member’s accounts.

Solution overview

This blog post describes a centralized event management strategy in a multi-account setup. The AWS accounts are organized using the AWS Organizations service into one management account and many member accounts. You can explore and deploy the reference solution from this GitHub repository.

In the example, there are events from three different AWS services: Amazon CloudWatch, AWS Config, and Amazon GuardDuty. These are sent from the member accounts to the management account.

The management account publishes to an Amazon SNS topic, which sends emails when these events occur. AWS CloudFormation StackSets are created in the management account to deploy the infrastructure in the member’s accounts.

Overview

To fully automate the process of adding and removing accounts to the organization unit, an EventBridge rule is triggered:

  • When an account is moved into or out of the organization unit (MoveAccount event).
  • When an account is removed from the organization (RemoveAccountFromOrganization event).

These events invoke an AWS Lambda function, which updates the management EventBridge rule with the additional source accounts.

Prerequisites

  1. At least one AWS account, which represents the member’s account.
  2. An AWS account, which is the management account.
  3. Install the AWS Command Line Interface (AWS CLI).
  4. Install AWS CDK.

Set up the environment with AWS Organizations

1. Log in to the main account used to manage your organization.

2. Select the AWS Organizations service. Choose Create Organization. Once you create the organization, you receive a verification email.

3. After verification is completed, you can add other customer accounts into the organization.

4. AWS sends an invitation to each account added under the root account.

5. To see the invitation, log in to each of the accounts and search for the AWS Organizations service. You can find the invitation option listed on the side.

6. Accept the invitation from the root account.

7. Create an Organization Unit (OU) and place in it all the member accounts that should propagate events to the management account. The OU identifier is used later to deploy the StackSet.

8. Finally, enable trusted access in the organization to be able to create StackSets and deploy resources from the management account to the member accounts.

Management account

After the initial deployment of the solution in the management account, a StackSet is created. It’s configured to deploy the member account infrastructure in all the members of the organization unit.

When you add a new account in the organization unit, this StackSet automatically deploys the resources specified. Read the Member account section for more information about resources in this stack.

All events coming from the member account pass through the custom event bus in the management account. To allow other accounts to put events in the management account, the resource policy of the event bus grants permission to every account in the organization:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "AllowAllAccountsInOrganizationToPutEvents",
    "Effect": "Allow",
    "Principal": "*",
    "Action": "events:PutEvents",
    "Resource": "arn:aws:events:us-east-1:xxxxxxxxxxxx:event-bus/CentralEventBus ",
    "Condition": {
      "StringEquals": {
        "aws:PrincipalOrgID": "o-xxxxxxxx"
      }
    }
  }]
}

You must configure EventBridge in the management account to handle the events coming from the member accounts. In this example, you send an email with the event information using SNS. Create a new rule with the following event pattern:

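The original post shows the pattern as a screenshot. Based on the rule that the Lambda function creates later in this post, it matches on the member account IDs and the three event sources; the account IDs below are placeholders:

{
  "account": ["111111111111", "222222222222"],
  "source": ["aws.cloudwatch", "aws.config", "aws.guardduty"]
}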

When you add or remove accounts in the organization unit, an EventBridge rule invokes a Lambda function to update the rule in the management account. This rule reacts to two events from the organization’s source: MoveAccount and RemoveAccountFromOrganization.

Event pattern

// EventBridge to trigger updateRuleFunction Lambda whenever a new account is added or removed from the organization.
new Rule(this, 'AWSOrganizationAccountMemberChangesRule', {
   ruleName: 'AWSOrganizationAccountMemberChangesRule',
   eventPattern: {
      source: ['aws.organizations'],
      detailType: ['AWS API Call via CloudTrail'],
      detail: {
        eventSource: ['organizations.amazonaws.com'],
        eventName: [
          'RemoveAccountFromOrganization',
          'MoveAccount'
        ]
      }
    },
    targets: [
      new LambdaFunction(updateRuleFunction)
    ]
});

Custom resource Lambda function

The custom resource Lambda function is executed to run custom logic whenever the CloudFormation stack is created, updated or deleted.

// Cloudformation Custom resource event
switch (event.RequestType) {
  case "Create":
    await putRule(accountIds);
    break;
  case "Update":
    await putRule(accountIds);
    break;
  case "Delete":
    await deleteRule()
}

async function putRule(accounts) {
  await eventBridgeClient.putRule({
    Name: rule.name,
    Description: rule.description,
    EventBusName: eventBusName,
    EventPattern: JSON.stringify({
      account: accounts,
      source: rule.sources
    })
  }).promise();
  await eventBridgeClient.putTargets({
    Rule: rule.name,
    Targets: [
      {
        Arn: snsTopicArn,
        Id: `snsTarget-${rule.name}`
      }
    ]
  }).promise();
}

Amazon EventBridge triggered Lambda function code

// New AWS account moved into the organization unit or out of it.
if (eventName === 'MoveAccount' || eventName === 'RemoveAccountFromOrganization') {
   await putRule(accountIds);
}

All events generated from the member accounts are then sent to an SNS topic, which has an email address as an endpoint. More sophisticated targets can be configured depending on the application’s needs, including, but not limited to, a Step Functions state machine, a Kinesis stream, or an SQS queue.

Member account

In the member account, we use an Amazon EventBridge rule to route all the events coming from Amazon CloudWatch, AWS Config, and Amazon GuardDuty to the event bus created in the management account.

    const rule = {
      name: 'MemberEventBridgeRule',
      sources: ['aws.cloudwatch', 'aws.config', 'aws.guardduty'],
      description: 'The Rule propagates all Amazon CloudWatch Events, AWS Config Events, AWS Guardduty Events to the management account'
    }

    const cdkRule = new Rule(this, rule.name, {
      description: rule.description,
      ruleName: rule.name,
      eventPattern: {
        source: rule.sources,
      }
    });
    cdkRule.addTarget({
      bind(_rule: IRule, generatedTargetId: string): RuleTargetConfig {
        return {
          arn: `arn:aws:events:${process.env.REGION}:${process.env.CDK_MANAGEMENT_ACCOUNT}:event-bus/${eventBusName.valueAsString}`,
          id: generatedTargetId,
          role: publishingRole
        };
      }
    });

Deploying the solution

Bootstrap the management account:

npx cdk bootstrap \
    --profile <MANAGEMENT ACCOUNT AWS PROFILE> \
    --cloudformation-execution-policies arn:aws:iam::aws:policy/AdministratorAccess \
    aws://<MANAGEMENT ACCOUNT ID>/<REGION>

To deploy the stack, use the cdk-deploy-to.sh script and pass as arguments the management account ID, Region, AWS Organization ID, AWS Organization Unit ID, and an accessible email address.

sh ./cdk-deploy-to.sh <MANAGEMENT ACCOUNT ID> <REGION> <AWS ORGANIZATION ID> <MEMBER ORGANIZATION UNIT ID> <EMAIL ADDRESS> AwsOrganizationsEventBridgeSetupManagementStack 

When you receive the subscription email after the management stack is deployed, make sure to confirm the subscription to the SNS topic.

After the deployment is completed, a StackSet starts deploying the infrastructure to the member accounts. You can view this process in the AWS Management Console, under the AWS CloudFormation service in the management account as shown in the following image, or by logging in to a member account and viewing its CloudFormation stacks.

Stack output

Testing the environment

This project deploys an Amazon CloudWatch billing alarm to the member accounts to test that you receive email notifications when this metric is in alarm. Using the member account’s credentials, run the following AWS CLI command to change the alarm status to “In Alarm”:

aws cloudwatch set-alarm-state --alarm-name 'BillingAlarm' --state-value ALARM --state-reason "testing only" 

You receive emails at the email address configured as an endpoint on the Amazon SNS topic.

Conclusion

This blog post shows how to use AWS Organizations to organize your application’s accounts by using organization units and how to centralize event management using Amazon EventBridge across accounts in the organization. A fully automated solution is provided to ensure that adding new accounts to the organization unit is efficient.

For more serverless learning resources, visit https://serverlessland.com.

Audit AWS service events with Amazon EventBridge and Amazon Kinesis Data Firehose

Post Syndicated from Anand Shah original https://aws.amazon.com/blogs/big-data/audit-aws-service-events-with-amazon-eventbridge-and-amazon-kinesis-data-firehose/

Amazon EventBridge is a serverless event bus that makes it easy to build event-driven applications at scale using events generated from your applications, integrated software as a service (SaaS) applications, and AWS services. Many AWS services generate EventBridge events. When an AWS service in your account emits an event, it goes to your account’s default event bus.

A few examples of such events include an Amazon EC2 instance changing state, a new parameter being created in AWS Systems Manager Parameter Store, and Amazon GuardDuty generating a finding.

By default, these AWS service-generated events are transient and therefore not retained. This post shows how you can forward AWS service-generated events or custom events to Amazon Simple Storage Service (Amazon S3) for long-term storage, analysis, and auditing purposes using EventBridge rules and Amazon Kinesis Data Firehose.

Solution overview

In this post, we provide a working example of AWS service-generated events ingested to Amazon S3. To make sure we have some service events available on the default event bus, we use Parameter Store, a capability of AWS Systems Manager, to store a new parameter manually. This action generates a new event, which is ingested by the following pipeline.

Architecture Diagram

The pipeline includes the following steps:

  1. AWS service-generated events (for example, a new parameter created in Parameter Store) go to the default event bus in EventBridge.
  2. The EventBridge rule matches all events and forwards those to Kinesis Data Firehose.
  3. Kinesis Data Firehose delivers events to the S3 bucket partitioned by detail-type and receipt time using its dynamic partitioning capability.
  4. The S3 bucket stores the delivered events, and their respective event schema is registered to the AWS Glue Data Catalog using an AWS Glue crawler.
  5. You query events using Amazon Athena.

Deploy resources using AWS CloudFormation

We use AWS CloudFormation templates to create all the necessary resources for the ingestion pipeline. This removes opportunities for manual error, increases efficiency, and provides consistent configurations over time. The template is also available on GitHub.

Complete the following steps:

  1. Choose Launch Stack to open the AWS CloudFormation console with the template preloaded.
  2. Acknowledge that the template may create AWS Identity and Access Management (IAM) resources.
  3. Choose Create stack.

The template takes about 10 minutes to complete and creates the following resources in your AWS account:

  • An S3 bucket to store event data.
  • A Firehose delivery stream with dynamic partitioning configuration. Dynamic partitioning enables you to continuously partition streaming data in Kinesis Data Firehose by using keys within the data (for example, customer_id or transaction_id) and then deliver the data grouped by these keys into corresponding S3 prefixes.
  • An EventBridge rule that forwards all events from the default event bus to Kinesis Data Firehose.
  • An AWS Glue crawler that references the path to the event data in the S3 bucket. The crawler inspects data landed to Amazon S3 and registers tables as per the schema with the AWS Glue Data Catalog.
  • Athena named queries for you to query the data processed by this example.

Trigger a service event

After you create the CloudFormation stack, you trigger a service event.

  1. On the AWS CloudFormation console, navigate to the Outputs tab for the stack.
  2. Choose the link for the key CreateParameter.

Create Parameter

You’re redirected to the Systems Manager console to create a new parameter.

  1. For Name, enter a name (for example, my-test-parameter).
  2. For Value, enter the test value of your choice (for example, test-value).

My Test parameter

  1. Leave everything else as default and choose Create parameter.

This step saves the new Systems Manager parameter and pushes the parameter-created event to the default EventBridge event bus, as shown in the following code:

{
  "version": "0",
  "id": "6a7e4feb-b491-4cf7-a9f1-bf3703497718",
  "detail-type": "Parameter Store Change",
  "source": "aws.ssm",
  "account": "123456789012",
  "time": "2017-05-22T16:43:48Z",
  "region": "us-east-1",
  "resources": [
    "arn:aws:ssm:us-east-1:123456789012:parameter/foo"
  ],
  "detail": {
    "operation": "Create",
    "name": "my-test-parameter",
    "type": "String",
    "description": ""
  }
}

Discover the event schema

After the event is triggered by saving the parameter, wait at least 2 minutes for the event to be ingested via Kinesis Data Firehose to the S3 bucket. Now complete the following steps to run an AWS Glue crawler to discover and register the event schema in the Data Catalog:

  1. On the AWS Glue console, choose Crawlers in the navigation pane.
  2. Select the crawler with the name starting with S3EventDataCrawler.
  3. Choose Run crawler.

Run Crawler

This step runs the crawler, which takes about 2 minutes to complete. The crawler discovers the schema from all events and registers it as tables in the Data Catalog.

Query the event data

When the crawler is complete, you can start querying event data. To query the event, complete the following steps:

  1. On the AWS CloudFormation console, navigate to the Outputs tab for your stack.
  2. Choose the link for the key AthenaQueries.

Athena Queries

You’re redirected to the Saved queries tab on the Athena console. If you’re running Athena queries for the first time, set up your S3 output bucket. For instructions, see Working with Query Results, Recent Queries, and Output Files.

  1. Search for Blog to find the queries created by this post.
  2. Choose the query Blog – Query Parameter Store Events.

Find Athena Saved Queries

The query opens on the Athena console.

  1. Choose Run query.

You can update the query to search the event you created earlier.

  1. Apply a WHERE clause with the parameter name you selected earlier:
SELECT * FROM "AwsDataCatalog"."eventsdb-randomId"."parameter_store_change"
WHERE detail.name = 'Your parameter name'

You can also choose the link next to the key CuratedBucket from the CloudFormation stack outputs to see paths and the objects loaded to the S3 bucket from other event sources. Similarly, you can query them via Athena.

Clean up

Complete the following steps to delete your resources and stop incurring costs:

  1. On the AWS CloudFormation console, select the stack you created and choose Delete.
  2. On the Amazon S3 console, find the bucket with the name starting with eventbridge-firehose-blog-curatedbucket.
  3. Select the bucket and choose Empty.
  4. Enter permanently delete to confirm the choice.
  5. Select the bucket again and choose Delete.
  6. Confirm the action by entering the bucket name when prompted.
  7. On the Systems Manager console, go to the parameter store and delete the parameter you created earlier.

Summary

This post demonstrates how to use an EventBridge rule to redirect AWS service-generated events or custom events to Amazon S3 using Kinesis Data Firehose, for long-term storage, analysis, querying, and audit purposes.

For more information, see the Amazon EventBridge User Guide. To learn more about AWS service events supported by EventBridge, see Events from AWS services.


About the Author

Anand Shah is a Big Data Prototyping Solution Architect at AWS. He works with AWS customers and their engineering teams to build prototypes using AWS analytics services and purpose-built databases. Anand helps customers solve the most challenging problems using the art of the possible technology. He enjoys beaches in his leisure time.

Capturing client events using Amazon API Gateway and Amazon EventBridge

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/capturing-client-events-using-amazon-api-gateway-and-amazon-eventbridge/

This post is written by Tim Bruce, Senior Solutions Architect, DevAx.

Event producers are one of the three main components in an event-driven architecture. Event producers create and publish events to event routers, which send them to event consumers. Any portion of a system, including a mobile or web client, can be an event producer.

To extend the event model to your mobile and web clients, you must implement standards for security, messaging formats, and event storage.

This post shows how to build a client-enabled event-handling solution. It uses Amazon EventBridge, Amazon API Gateway, AWS Lambda, and Amazon Cognito. This architecture supports routing client events to internal and external destinations. It provides a blueprint that you can use to simplify the integration.

Overview

This example creates a RESTful API using API Gateway. It sends events directly to EventBridge without the need for compute services. In production, you have more requirements than only receiving and forwarding events. Additional requirements include security, user identification, validation, enrichment, transformation, event forwarding, and storing.

In this example, API Gateway provides security and user identification by invoking a Lambda authorizer. The authorizer generates a policy and returns client identification to API Gateway. API Gateway then performs request validation and message enrichment before forwarding the events to EventBridge.

EventBridge evaluates the events against rules and forwards the events to targets. The rules apply transformation to the events and forward an event to up to five targets. Targets include AWS services, such as Amazon Kinesis Data Firehose, and many third-party solutions, such as Zendesk, with HTTPS endpoints.

Lastly, Kinesis Data Firehose provides a cost-effective solution to store events into an Amazon S3 bucket. Before storing the events, Kinesis Data Firehose transforms records via Lambda transformers. It also partitions records using data in the record or calculated data via a Lambda function. Kinesis Data Firehose uses this partitioning data to create keys in the bucket and store matching records within the keys.

Example architecture

Example architecture

The example consists of the following resources defined in the AWS SAM template:

Data flow

Data flow

  1. Application clients collect or generate the events.
  2. The client sends the events to API Gateway as URL-encoded JSON. The client includes the user’s JWT in an authorization header with the request for validation.
  3. The Lambda authorizer validates the JWT with Amazon Cognito and returns the user’s unique clientID value to API Gateway.
  4. API Gateway transforms the request into events, appending clientId, the bus name, and environment.
  5. API Gateway sends the events to EventBridge.
  6. EventBridge rules match the events and:
    1. Forwards all client events to Kinesis Data Firehose.
    2. Forwards client events with detail.eventType of “loyaltypurchase” to Zendesk.
  7. Kinesis Data Firehose receives the records.
  8. The Kinesis Data Firehose data transformation processes each record, moving the client ID to the detail object.
  9. Kinesis Data Firehose partitions the records and stores them in an S3 bucket.

Overall design

The following sections discuss details of the solution, starting from the event in a web or mobile client. This solution requires the client to create an HTTPS request, including the user’s JWT as an authorization header.

{"entries": [{"entry": "{\"eventType\": \"searching\", \"schemaVersion\":1, \"data\": {\"searchTerm\":\"games\"}}"}]}

The preceding JSON shows a sample request body for this solution. The top-level item “entries” is an array of “entry” items. API Gateway will translate each “entry” to the event-detail field in EventBridge events. The client must escape the data for “entry” to prevent translation errors.

API Gateway and Lambda authorizer

API Gateway receives the request and validates the JWT by invoking the Lambda authorizer. The authorizer generates a policy allowing the request for valid tokens. It adds the Amazon Cognito “custom:clientId” custom attribute to the response context before returning the response to API Gateway. The “custom:clientId” attribute is a unique client identifier in the form of a UUID that downstream systems can use to retrieve data about the customer.

API Gateway validates the request by matching the request body against a model. Models represent what a request should look like. A mapping template then transforms valid requests to the format required by EventBridge. Mapping templates use the Velocity Template Language (VTL) to do this.

VTL template
This mapping template uses a #foreach loop to process the array “entries” from the request body. The process enriches each event with the user’s “custom:clientId” and stage variables for bus name and environment from API Gateway.

Integration request

The preceding API Gateway AWS integration enables API Gateway to send the events to EventBridge without using compute services, such as Lambda or Amazon EC2. The integration and IAM execution role enable API Gateway to call the EventBridge PutEvents API to do this.

EventBridge rules and transformations

EventBridge rules match events against criteria, transform the events, and forward the events to targets. There are two rules in this example. One processes events for Zendesk tickets and the other forwards data to Kinesis Data Firehose to store events for triage and analytics.

This example creates service tickets in the Zendesk ticketing system. The tickets trigger agents to contact customers who are expecting a call to complete their purchases. Because the software client sends the event directly, this reduces time-to-action for back-office processes and helps improve customer satisfaction.

Matching EventBridge rule

This rule matches client event messages for loyalty purchases and forwards details to the Zendesk API. The rule includes a transformation, which selects a portion of the event before sending the information to the target.

EventBridge uses an API destination to store details about the HTTP endpoint and usage policies. Additionally, an EventBridge connection and an AWS Secrets Manager secret store details. These include the authentication policy and authentication credentials to connect to the API destination.

Zendesk dashboard

Successfully processed events open tickets in Zendesk using the API destination. Agents now have a list of customers to contact.

Enterprises often require storing the events for troubleshooting or analytics. EventBridge does not include a newline between records when forwarding events to Kinesis Data Firehose. Because of this, it may be more challenging to discern each record when analyzing the data.

Rule to transform events
A rule for all client events changes this behavior. This AWS CloudFormation snippet defines the rule that will transform each event, adding a new line after each. The “\n” character in the InputTemplate field adds the separator between records before forwarding the data to Kinesis Data Firehose.

Afterward, Kinesis Data Firehose receives each record separated by a new line, enabling both triage and analytics without extra overhead.

Kinesis Data Firehose to S3

Kinesis Data Firehose is a cost-effective way to batch and write records to S3. It offers optional transformation capabilities by invoking a Lambda function. This example uses a Lambda function that moves the “clientID” field to the detail section of the event record.

Kinesis Data Firehose to S3

Kinesis Data Firehose also supports dynamic partitioning of records when writing to S3. It selects data from the records or data calculated by a Lambda function. In this example, it selects data from the records to store data in separate folders in S3.

Event durability considerations

You can extend this example using an EventBridge archive and Amazon Kinesis Data Streams. Archiving allows you to create an encrypted archive of matching events. You can define the data retention in days, from one through indefinite. You can replay events from your archive when you must re-process data.

Kinesis Data Streams is a serverless data streaming solution. The EventBridge rule for all records can forward data to Kinesis Data Streams instead of Kinesis Data Firehose. Multiple applications can consume the Kinesis Data Streams. Kinesis Data Firehose would consume this stream of data and store it in S3.

Prerequisites

You need the following prerequisites to deploy the example solution:

Implementation

The full source of the solution is in the GitHub repository and is deployed with AWS SAM.

  1. Create a Secrets Manager secret using the AWS CLI:
    aws secretsmanager create-secret --name proto/Zendesk --secret-string '{"username":"<YOUR EMAIL>","apiKey":"<YOUR APIKEY>"}'
  2. Clone the solution repository using git:
    git clone https://github.com/aws-samples/client-event-sample
  3. Build the AWS SAM project:
    sam build --use-container
  4. Deploy the project using AWS SAM:
    sam deploy --guided --capabilities CAPABILITY_NAMED_IAM
  5. From the outputs from the deployment, set the following shell variables:
    APPCLIENTID=<output APPCLIENTID>
    APIID=<output APIID>
    REGION=<region you deployed to>
  6. Create a user in Amazon Cognito using the AWS CLI:
    aws cognito-idp sign-up --client-id $APPCLIENTID --username <YOUR USER ID> --password <YOUR PASSWORD> --user-attributes Name=email,Value=<YOUR EMAIL>
  7. After you receive the confirmation code, confirm the user using the AWS CLI:
    aws cognito-idp confirm-sign-up --client-id $APPCLIENTID --username <userid> --confirmation-code <confirmation code>
  8. Test the user login with the AWS CLI:
    aws cognito-idp initiate-auth --auth-flow USER_PASSWORD_AUTH --client-id $APPCLIENTID --auth-parameters USERNAME=<YOUR USER ID>,PASSWORD=<YOUR PASSWORD>

If successful, this returns a JSON web token (JWT).

Testing the client event solution

  1. The sample repository includes an event generator in the util directory. The generator uses your credentials and simulates events from a user’s software client. From the utils directory, run the generator:
    python3 generator.py --minutes <minutes to run generator> --batch <batch size from 1-10> \
      --errors <True|False> --userid <YOUR USER ID> --password <YOUR PASSWORD> \
      --region $REGION --appclientid $APPCLIENTID --apiid $APIID
  2. Log in to your Zendesk console and view the created tickets.
  3. After five minutes, review the “clientevents” bucket to view the event records.

Cleaning up

To remove the example:

  1. Delete the data stored in the clientevents buckets created from the template.
  2. Delete the stack using the command:
    sam delete --stack-name clientevents
  3. Delete the secret using the command:
    aws secretsmanager delete-secret --secret-id <arn of secret>

Conclusion

This post shows how to send client events to an API and EventBridge to enable new customer experiences. The example covers enabling new experiences by creating a way for software clients to send events with minimal custom code. This blueprint shows how you can include client events in your solution, featuring validation, enrichment, transformation, and storage.

You can modify the example code provided here for use in your organization. This enables your client software to register events without modifying backend code.

For more serverless learning resources, visit Serverless Land.

How to enrich AWS Security Hub findings with account metadata

Post Syndicated from Siva Rajamani original https://aws.amazon.com/blogs/security/how-to-enrich-aws-security-hub-findings-with-account-metadata/

In this blog post, we’ll walk you through how to deploy a solution to enrich AWS Security Hub findings with additional account-related metadata, such as the account name, the Organization Unit (OU) associated with the account, security contact information, and account tags. Account metadata can help you search findings, create insights, and better respond to and remediate findings.

AWS Security Hub ingests findings from multiple AWS services, including Amazon GuardDuty, Amazon Inspector, Amazon Macie, AWS Firewall Manager, AWS Identity and Access Management (IAM) Access Analyzer, and AWS Systems Manager Patch Manager. Findings from each service are normalized into the AWS Security Finding Format (ASFF), so you can review findings in a standardized format and take action quickly. You can use AWS Security Hub to provide a single view of all security-related findings, and to set up alerts, automate remediation, and export specific findings to third‑party incident management systems.

The Security or DevOps teams responsible for investigating, responding to, and remediating Security Hub findings may need additional account metadata beyond the account ID, to determine what to do about the finding or where to route it. For example, determining whether the finding originated from a development or production account can be key to determining the priority of the finding and the type of remediation action needed. Having this metadata information in the finding allows customers to create custom insights in Security Hub to track which OUs or applications (based on account tags) have the most open security issues. This blog post demonstrates a solution to enrich your findings with account metadata to help your Security and DevOps teams better understand and improve their security posture.

Solution Overview

In this solution, you use a combination of AWS Security Hub, Amazon EventBridge, and AWS Lambda to ingest the findings and automatically enrich them with account-related metadata by querying the AWS Organizations and AWS Account Management APIs. The solution architecture is shown in Figure 1 below:

Figure 1: Solution Architecture and workflow for metadata enrichment

The solution workflow includes the following steps:

  1. New findings and updates to existing Security Hub findings from all the member accounts flow into the Security Hub administrator account. Security Hub generates Amazon EventBridge events for the findings.
  2. An EventBridge rule created as part of the solution in the Security Hub administrator account will trigger a Lambda function configured as a target every time an EventBridge notification for a new or updated finding imported into Security Hub matches the EventBridge rule shown below:
    {
      "detail-type": ["Security Hub Findings - Imported"],
      "source": ["aws.securityhub"],
      "detail": {
        "findings": {
          "RecordState": ["ACTIVE"],
          "UserDefinedFields": {
            "findingEnriched": [{
              "exists": false
            }]
          }
        }
      }
    }

  3. The Lambda function uses the account ID from the event payload to retrieve both the account information and the alternate contact information from the AWS Organizations and Account management service API. The following code within the helper.py constructs the account_details object representing the account information to enrich the finding:
    def get_account_details(account_id, role_name):
        account_details ={}
        organizations_client = AwsHelper().get_client('organizations')
        response = organizations_client.describe_account(AccountId=account_id)
        account_details["Name"] = response["Account"]["Name"]
        response = organizations_client.list_parents(ChildId=account_id)
        ou_id = response["Parents"][0]["Id"]
        if ou_id and response["Parents"][0]["Type"] == "ORGANIZATIONAL_UNIT":
            response = organizations_client.describe_organizational_unit(OrganizationalUnitId=ou_id)
            account_details["OUName"] = response["OrganizationalUnit"]["Name"]
        elif ou_id:
            account_details["OUName"] = "ROOT"
        if role_name:
            account_client = AwsHelper().get_session_for_role(role_name).client("account")
        else:
            account_client = AwsHelper().get_client('account')
        try:
            response = account_client.get_alternate_contact(
                AccountId=account_id,
                AlternateContactType='SECURITY'
            )
            if response['AlternateContact']:
                print("contact :{}".format(str(response["AlternateContact"])))
                account_details["AlternateContact"] = response["AlternateContact"]
        except account_client.exceptions.AccessDeniedException as error:
            #Potentially due to calling alternate contact on Org Management account
            print(error.response['Error']['Message'])
        
        response = organizations_client.list_tags_for_resource(ResourceId=account_id)
        results = response["Tags"]
        while "NextToken" in response:
            response = organizations_client.list_tags_for_resource(ResourceId=account_id, NextToken=response["NextToken"])
            results.extend(response["Tags"])
        
        account_details["tags"] = results
        AccountHelper.logger.info("account_details: %s" , str(account_details))
        return account_details

  4. The Lambda function updates the finding using the Security Hub BatchUpdateFindings API to add the account related data into the Note and UserDefinedFields attributes of the SecurityHub finding:
    #lookup and build the finding note and user defined fields  based on account Id
    enrichment_text, tags_dict = enrich_finding(account_id, assume_role_name)
    logger.debug("Text to post: %s" , enrichment_text)
    logger.debug("User defined Fields %s" , json.dumps(tags_dict))
    #add the Note to the finding and add a userDefinedField to use in the event bridge rule and prevent repeat lookups
    response = secHubClient.batch_update_findings(
        FindingIdentifiers=[
            {
                'Id': enrichment_finding_id,
                'ProductArn': enrichment_finding_arn
            }
        ],
        Note={
            'Text': enrichment_text,
            'UpdatedBy': enrichment_author
        },
        UserDefinedFields=tags_dict
    )

    Note: All state change events published by AWS services through Amazon EventBridge are free of charge. The AWS Lambda free tier includes 1M free requests per month and 400,000 GB-seconds of compute time per month at the time of publication of this post. If you process 2M requests per month, the estimated cost for this solution would be approximately $7.20 USD per month.

Prerequisites

    1. Your AWS organization must have all features enabled.
    2. This solution requires that you have AWS Security Hub enabled in an AWS multi-account environment which is integrated with AWS Organizations. The AWS Organizations management account must designate a Security Hub administrator account, which can view data from and manage configuration for its member accounts. Follow these steps to designate a Security Hub administrator account for your AWS organization.
    3. All member accounts are tagged per your organization’s tagging strategy, and their security alternate contact is filled in. If the tags or alternate contacts are not available, the enrichment is limited to the account name and the organizational unit name.
    4. Trusted access must be enabled with AWS Organizations for AWS Account Management service. This will enable the AWS Organizations management account to call the AWS Account Management API operations (such as GetAlternateContact) for other member accounts in the organization. Trusted access can be enabled either by using AWS Management Console or by using AWS CLI and SDKs.

      The following AWS CLI example enables trusted access for AWS Account Management in the calling account’s organization.

      aws organizations enable-aws-service-access --service-principal account.amazonaws.com

    5. An IAM role with read-only access to look up the GetAlternateContact details must be created in the Organizations management account, with a trust policy that allows the Security Hub administrator account to assume the role.

    Solution Deployment

    This solution consists of two parts:

    1. Create an IAM role in your Organizations management account, giving it necessary permissions as described in the Create the IAM role procedure below.
    2. Deploy the Lambda function and the other associated resources to your Security Hub administrator account

    Create the IAM role

    Using the console, AWS CLI, or AWS API

    Follow the Creating a role to delegate permissions to an IAM user instructions to create an IAM role using the console, AWS CLI, or AWS API in the AWS Organizations management account with the role name account-contact-readonly, based on the trust and permission policy template provided below. You will need the account ID of your Security Hub administrator account.

    The IAM trust policy allows the Security Hub administrator account to assume the role in your Organization management account.

    IAM Role trust policy

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": "arn:aws:iam::<SH administrator Account ID>:root"
          },
          "Action": "sts:AssumeRole",
          "Condition": {}
        }
      ]
    }

    Note: Replace <SH administrator Account ID> with the account ID of your Security Hub administrator account. Once the solution is deployed, you should update the principal in the trust policy shown above to use the new IAM role created for the solution.

    IAM Permission Policy

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "account:GetAlternateContact"
                ],
                "Resource": "arn:aws:account::<Org. Management Account id>:account/o-*/*"
            }
        ]
    }

    The IAM permission policy allows the Security Hub administrator account to look up the alternate contact information for the member accounts.

    Make a note of the Role ARN for the IAM role similar to this format:

    arn:aws:iam::<Org. Management Account id>:role/account-contact-readonly. 
    			

    You will need this while deploying the solution in the next procedure.

    Using AWS CloudFormation

    Alternatively, you can use the provided CloudFormation template to create the role in the management account. The IAM role ARN is available in the Outputs section of the created CloudFormation stack.

    Deploy the Solution to your Security Hub administrator account

    You can deploy the solution using either the AWS Management Console, or from the GitHub repository using the AWS SAM CLI.

    Note: If you have designated an aggregation Region within the Security Hub administrator account, you can deploy this solution only in the aggregation Region. Otherwise, you need to deploy this solution separately in each Region of the Security Hub administrator account where Security Hub is enabled.

    To deploy the solution using the AWS Management Console

    1. In your Security Hub administrator account, launch the template by choosing the Launch Stack button below, which creates the stack in the us-east-1 Region.

      Note: If your Security Hub aggregation Region is different from us-east-1, or you want to deploy the solution in a different AWS Region, you can deploy the solution from the GitHub repository described in the next section.

    2. On the Quick create stack page, for Stack name, enter a unique stack name for this account; for example, aws-security-hub–findings-enrichment-stack, as shown in Figure 2 below.
      Figure 2: Quick Create CloudFormation stack for the Solution

    3. For ManagementAccount, enter the AWS Organizations management account ID.
    4. For OrgManagementAccountContactRole, enter the role ARN of the role you created previously in the Create IAM role procedure.
    5. Choose Create stack.
    6. Once the stack is created, go to the Resources tab and take note of the name of the IAM Role which was created.
    7. Update the principal element of the IAM role trust policy which you previously created in the Organization management account in the Create the IAM role procedure above, replacing it with the role name you noted down, as shown below.
      Figure 3: Update Management Account Role’s Trust

    To deploy the solution from the GitHub Repository and AWS SAM CLI

    1. Install the AWS SAM CLI
    2. Download or clone the GitHub repository using the following commands:
      $ git clone https://github.com/aws-samples/aws-security-hub-findings-account-data-enrichment.git
      $ cd aws-security-hub-findings-account-data-enrichment

    3. Update the content of the profile.txt file with the profile name you want to use for the deployment.
    4. To create a new bucket for deployment artifacts, run create-bucket.sh, specifying the Region as an argument:
      $ ./create-bucket.sh us-east-1

    5. Deploy the solution to the account by running the deploy.sh script, specifying the Region as an argument:
      $ ./deploy.sh us-east-1

    6. Once the stack is created, go to the Resources tab and take note of the name of the IAM Role which was created.
    7. Update the principal element of the IAM role trust policy which you previously created in the Organization management account in the Create the IAM role procedure above, replacing it with the role name you noted down, as shown below.
      "AWS": "arn:aws:iam::<SH Delegated Account ID>: role/<Role Name>"

    Using the enriched attributes

    To test that the solution is working as expected, you can create a standalone security group with an ingress rule that allows traffic from the internet. This will trigger a finding in Security Hub, which will be populated with the enriched attributes. You can then use these enriched attributes to filter and create custom insights, or take specific response or remediation actions.

    To generate a sample Security Hub finding using AWS CLI

    1. Create a Security Group using following AWS CLI command:
      aws ec2 create-security-group --group-name TestSecHubEnrichmentSG --description "Test Security Hub enrichment function"

    2. Make a note of the security group ID from the output, and use it in Step 3 below.
    3. Add an ingress rule to the security group which allows unrestricted traffic on port 100:
      aws ec2 authorize-security-group-ingress --group-id <Replace Security group ID> --protocol tcp --port 100 --cidr 0.0.0.0/0

    Within a few minutes, a new finding is generated in Security Hub, warning about the unrestricted ingress rule in the TestSecHubEnrichmentSG security group. For any new or updated findings that do not have the UserDefinedFields attribute findingEnriched set to true, the solution enriches the finding with account-related fields in both the Note and UserDefinedFields sections of the Security Hub finding.

    To see and filter the enriched finding

    1. Go to Security Hub and click on Findings on the left-hand navigation.
    2. Click in the filter field at the top to add additional filters. Choose a filter field of AWS Account ID, a filter match type of is, and a value of the AWS Account ID where you created the TestSecHubEnrichmentSG security group.
    3. Add one more filter. Choose a filter field of Resource type, a filter match type of is, and the value of AwsEc2SecurityGroup.
    4. Identify the finding for security group TestSecHubEnrichmentSG with updates to Note and UserDefinedFields, as shown in Figures 4 and 5 below:
      Figure 4: Account metadata enrichment in Security Hub finding’s Note field

      Figure 5: Account metadata enrichment in Security Hub finding’s UserDefinedFields field

      Note: The actual attributes you will see as part of the UserDefinedFields may be different from the above screenshot. Attributes shown will depend on your tagging configuration and the alternate contact configuration. At a minimum, you will see the AccountName and OU fields.

    5. Once you confirm that the solution is working as expected, delete the stand-alone security group TestSecHubEnrichmentSG, which was created for testing purposes.

    Create custom insights using the enriched attributes

    You can use the attributes available in the UserDefinedFields of the Security Hub finding to filter findings. This lets you generate custom Security Hub insights and reports tailored to your organization’s needs. The example shown in Figure 6 below creates a custom Security Hub insight for findings grouped by severity for a specific owner, using the Owner attribute within the UserDefinedFields object of the Security Hub finding.

    Figure 6: Custom Insight with Account metadata filters
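    You can also create a similar insight programmatically. The boto3 sketch below is illustrative only; it assumes an Owner key exists in UserDefinedFields and groups active findings by severity.

      import boto3

      securityhub = boto3.client("securityhub")

      # Group active findings for a specific owner by severity.
      securityhub.create_insight(
          Name="Active findings by severity for owner team-payments",  # placeholder
          GroupByAttribute="SeverityLabel",
          Filters={
              "RecordState": [{"Value": "ACTIVE", "Comparison": "EQUALS"}],
              "UserDefinedFields": [
                  {"Key": "Owner", "Value": "team-payments", "Comparison": "EQUALS"}
              ],
          },
      )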

    EventBridge rule for response or remediation action using enriched attributes

    You can also use the attributes in the UserDefinedFields object of the Security Hub finding within an EventBridge rule to take specific response or remediation actions based on the attribute values. In the example below, the Environment attribute is used within the EventBridge rule configuration to trigger specific actions only when the value matches PROD.

    {
      "detail-type": ["Security Hub Findings - Imported"],
      "source": ["aws.securityhub"],
      "detail": {
        "findings": {
          "RecordState": ["ACTIVE"],
          "UserDefinedFields": {
            "Environment": "PROD"
          }
        }
      }
    }

    Conclusion

    This blog post walks you through a solution to enrich AWS Security Hub findings with AWS account-related metadata using Amazon EventBridge notifications and AWS Lambda. By enriching the Security Hub findings with account-related information, your security teams gain better visibility, additional insights, and an improved ability to create targeted reports for specific accounts or business teams, helping them prioritize and improve the overall security response. To learn more, see:

 
If you have feedback about this post, submit comments in the Comments section below. If you have any questions about this post, start a thread on the AWS Security Hub forum.

Want more AWS Security news? Follow us on Twitter.

Siva Rajamani

Siva Rajamani is a Boston-based Enterprise Solutions Architect at AWS. Siva enjoys working closely with customers to accelerate their AWS cloud adoption and improve their overall security posture.

Prashob Krishnan

Prashob Krishnan is a Denver-based Technical Account Manager at AWS. Prashob is passionate about security. He enjoys working with customers to solve their technical challenges and help build a secure scalable architecture on the AWS Cloud.

Automatically resolve Security Hub findings for resources that no longer exist

Post Syndicated from Kris Normand original https://aws.amazon.com/blogs/security/automatically-resolve-security-hub-findings-for-resources-that-no-longer-exist/

In this post, you’ll learn how to automatically resolve AWS Security Hub findings for previously deleted Amazon Web Services (AWS) resources. By using an event-driven solution, you can automatically resolve findings for AWS and third-party service integrations.

Security Hub provides a comprehensive view of your security alerts and security posture across your AWS accounts. Security Hub provides a single place that aggregates, organizes, and prioritizes your security alerts (also called findings) from multiple AWS services and partner solutions. Security Hub lets you assign workflow statuses of NEW, NOTIFIED, SUPPRESSED, or RESOLVED to findings. These statuses help you understand the state of your security findings and identify which need attention. As AWS resources are spun up and down during the course of normal business activities, there might be findings in Security Hub for those resources. AWS Security Hub findings backed by AWS Config are automatically archived when AWS Config identifies that a resource has been deleted. However, for some AWS service integrations—such as Amazon GuardDuty and third-party partner products—findings aren’t automatically resolved or archived when a resource is deleted. This can result in orphaned findings for resources that no longer exist.

In this post, we show you how to use an event-driven architecture to automatically resolve findings for all providers—AWS and third-party—for resources that have been deleted. Automatically resolving these findings reduces alert fatigue by decreasing noise, allowing your security team to focus on investigating and remediating high fidelity findings.

A common use case for automatically resolving findings is for Amazon Elastic Compute Cloud (Amazon EC2) instances that are ephemeral in nature. For example, Amazon EC2 instances that are part of an Amazon EC2 Auto Scaling group. EC2 instances can scale to thousands of nodes multiple times a day depending on the workload. Without automatically resolving these findings, you could end up with one or more findings for each instance. By automatically resolving findings for the deleted resources, your teams can focus on investigating and remediating findings that affect active resources.

Prerequisites

This solution assumes that you have Security Hub and AWS Config configured across all of your AWS accounts. Instructions for configuring Security Hub and its dependencies can be found in the Security Hub user guide. Ensure you have configured Security Hub to use a delegated administrator account, which centralizes findings from all member accounts.

Solution overview

In Security Hub, the investigation status of a finding is tracked using the workflow status attribute. The workflow status attribute for new findings is initially set to NEW. You can change the workflow status of a finding by either selecting it in the AWS Security Hub console, or by automating the change of workflow status by using the AWS Command Line Interface (AWS CLI) or Security Hub SDKs. The usual workflow for a finding, whether managed manually or through automation, is NEW, NOTIFIED, then RESOLVED or SUPPRESSED.

In this solution, we show you how to automatically set the workflow status to RESOLVED for all applicable findings when an EC2 instance, Amazon Simple Storage Service (Amazon S3) bucket, or an AWS Identity and Access Management (IAM) role is deleted. This event-driven solution uses Amazon EventBridge event patterns—which can be easily customized to meet your specific business needs—to invoke the resolution workflow on Delete or Terminate API calls. An EventBridge event bus is used to forward all Delete or Terminate API calls to your Security Hub delegated administrator account. Event patterns are used to filter for specific events and forward them to a target. With this solution, you filter for specific Delete and Terminate events, identified by the event name. The target for matching events is an AWS Lambda function. The invocation of this function includes context around the event which includes the metadata for the resource that was just deleted or terminated. This function queries the Security Hub GetFindings API for all findings for the resource with a status of NEW or NOTIFIED. The function then sets the workflow status to RESOLVED for all findings for the Amazon Resource Name (ARN) of the given resource by calling the BatchUpdateFindings Security Hub API.
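A condensed sketch of that Lambda logic is shown below. It is not the exact function from the solution; the filter values and note text follow the behavior described above, and pagination and ARN extraction are simplified.

    import boto3

    securityhub = boto3.client("securityhub")

    def resolve_findings_for_resource(resource_arn):
        """Set workflow status to RESOLVED for NEW/NOTIFIED findings on a deleted resource."""
        filters = {
            "ResourceId": [{"Value": resource_arn, "Comparison": "EQUALS"}],
            "WorkflowStatus": [
                {"Value": "NEW", "Comparison": "EQUALS"},
                {"Value": "NOTIFIED", "Comparison": "EQUALS"},
            ],
        }
        identifiers = []
        paginator = securityhub.get_paginator("get_findings")
        for page in paginator.paginate(Filters=filters):
            identifiers.extend(
                {"Id": f["Id"], "ProductArn": f["ProductArn"]} for f in page["Findings"]
            )

        # BatchUpdateFindings accepts up to 100 finding identifiers per call.
        for i in range(0, len(identifiers), 100):
            securityhub.batch_update_findings(
                FindingIdentifiers=identifiers[i:i + 100],
                Workflow={"Status": "RESOLVED"},
                Note={
                    "Text": "Resource deleted; finding resolved automatically.",
                    "UpdatedBy": "DeletedResourceFindingResolver",
                },
            )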

Solution architecture

Figure 1: Solution architecture overview

Figure 1 shows the deletion of an AWS resource in a Security Hub member account being forwarded to the EventBridge event bus in the Security Hub administrator account. The process flow is as follows:

  1. In a Security Hub member account, a user deletes or terminates a resource through the AWS Management Console, AWS CLI, or SDK.
  2. AWS CloudTrail logs the user activity and automatically forwards an event to EventBridge.
  3. An EventBridge event pattern filters for the delete or terminate API call, and generates an event.
  4. The event is forwarded to the event bus in the Security Hub administrator account.
  5. In the Security Hub administrator account, an event pattern is used to filter for all delete or terminate API calls.
  6. Matching events generate an EventBridge event.
  7. The target for this event is the Lambda function to resolve Security Hub findings for the recently deleted resource.
  8. The Lambda function generates a list of all findings for the recently deleted resource and updates the workflow status for each finding to RESOLVED in the Security Hub delegated administrator account.
  9. The workflow status propagates from the Security Hub delegated administrator account to the member accounts of Security Hub.

To deploy the solution

In the Security Hub administrator account, complete the following steps:

  1. In the following resource policy, replace <Region> with the AWS Region where the solution is deployed, <AccountID> with the Security Hub administrator account ID, and <OrgID> with the ID of the organization within your AWS Organizations implementation.
    {
      "Version": "2012-10-17",
      "Statement": [{
        "Sid": "allow_all_accounts_from_organization_to_put_events",
        "Effect": "Allow",
        "Principal": "*",
        "Action": "events:PutEvents",
        "Resource": "arn:aws:events:<Region>:<AccountID>:event-bus/default",
        "Condition": {
          "StringEquals": {
            "aws:PrincipalOrgID": "<OrgID>"
          }
        }
      }]
    }
    

  2. Add the edited resource policy to the default EventBridge event bus to allow all accounts in your organization to send delete events for IAM roles, EC2 instances, and S3 buckets to the default event bus in the Security Hub administrator account.

    Note: You can also choose to specify a list of accounts to receive events from. For more information about configuring a resource policy see Managing event bus permissions in Permissions for Amazon EventBridge event buses.

  3. Deploy the AWS CloudFormation template that creates the required resources.

    Launch Stack Button

In each Security Hub member account, deploy the CloudFormation template. You will need to specify the Security Hub administrator AWS account ID to deploy the stack.

Launch Stack Button

Tip: CloudFormation StackSets can be used to deploy stacks across all accounts in your organization. For more information, see Working with AWS CloudFormation StackSets.

Note: With CloudFormation StackSets, the template isn’t deployed in the StackSet administrator account by default. The CloudFormation stack must be deployed separately in the StackSet administrator account.

Note: Security Hub now supports cross-Region aggregation of findings. If you have Security Hub cross-Region aggregation enabled, the solution in this post will work for findings in all aggregated Regions.

Next steps

Understanding and fixing the root cause for Security Hub findings will improve your security posture and reduce the number of future findings. As a best practice, you should periodically analyze the findings for resources that have been automatically resolved by this solution to identify trends so that your team can investigate and fix root causes. You can use the filter below in the Security Hub console to view all findings automatically resolved by this solution:

To analyze findings

  1. Open the Security Hub console and select Findings.
  2. Check to see that Workflow status is RESOLVED and Note updated by is DeletedResourceFindingResolver.
  3. (Optional) You can also create a custom insight for these findings by adding Group by: ProductName to the filter.
  4. Select Create Insight as shown in Figure 2.
Figure 2: AWS Security Hub

Note: You can expand the solution to include other resource types based on your requirements, such as security groups, Amazon Relational Database Service (Amazon RDS) databases, and IAM users by updating the event pattern in the EventBridge rule and modifying the Lambda function code.

Conclusion

In this post, we showed how you can automatically resolve findings for deleted EC2, IAM, and S3 resources by using the Security Hub GetFindings and BatchUpdateFindings API actions. We showed you how to configure EventBridge patterns and rules to initiate a Lambda function through a centralized event bus to address these findings for resources across your organization.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the Security Hub forum. To start your 30-day free trial of Security Hub, visit AWS Security Hub.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Kris Normand

Kris is a Senior Security Consultant with AWS Professional Services. He partners with Chief Information Security Officers to lead their digital transformation efforts on AWS. When not working on security and compliance initiatives with customers, Kris enjoys traveling, hiking, and spending time with his family. He is also a veteran of the U.S Air Force.

Author

Cory Smith

Cory is a Senior Security Consultant with AWS Professional Services based in San Antonio, TX. He partners with customers to deliver key business outcomes in a secure and compliant manner within AWS. He’s passionate about solving complex technical problems with new and innovative solutions.

Author

Kafayat Adeyemi

Kafayat is a Senior Technical Account Manager at AWS based in Atlanta, GA. She is passionate about security and works with enterprise customers to build, deploy, and manage secure and scalable workloads on AWS. Outside of work, she loves to travel, bake, and spend time with her family.

Author

Justin Kontny

Justin is a Senior Security Consultant with the AWS Global Security Practice, a part of our Worldwide Professional Services Organization. He helps customers improve their security posture as they migrate their most sensitive workloads to AWS. Justin has a passion for detective controls and scaling security via automation.

ICYMI: Serverless Q4 2021

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/icymi-serverless-q4-2021/

Welcome to the 15th edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share all of the most recent product launches, feature enhancements, blog posts, webinars, Twitch live streams, and other interesting things that you might have missed!

Q4 calendar

In case you missed our last ICYMI, check out what happened last quarter here.

AWS Lambda

For developers using Amazon MSK as an event source, Lambda has expanded authentication options to include IAM, in addition to SASL/SCRAM. Lambda also now supports mutual TLS authentication for Amazon MSK and self-managed Kafka as an event source.

Lambda also launched features to make it easier to operate across AWS accounts. You can now invoke Lambda functions from Amazon SQS queues in different accounts. You must grant permission to the Lambda function’s execution role and have SQS grant cross-account permissions. For developers using container packaging for Lambda functions, Lambda also now supports pulling images from Amazon ECR in other AWS accounts. To learn about the permissions required, see this documentation.

The service now supports a partial batch response when using SQS as an event source for both standard and FIFO queues. When messages fail to process, Lambda marks the failed messages and allows reprocessing of only those messages. This helps to improve processing performance and may reduce compute costs.

Lambda launched content filtering options for functions using SQS, DynamoDB, and Kinesis as an event source. You can specify up to five filter criteria that are combined using OR logic. This uses the same content filtering language that’s used in Amazon EventBridge, and can dramatically reduce the number of downstream Lambda invocations.

Amazon EventBridge

Previously, you could consume Amazon S3 events in EventBridge via CloudTrail. Now, EventBridge receives events from the S3 service directly, making it easier to build serverless workflows triggered by activity in S3. You can use content filtering in rules to identify relevant events and forward these to 18 service targets, including AWS Lambda. You can also use event archive and replay, making it possible to reprocess events in testing, or in the event of an error.

AWS Step Functions

The AWS Batch console has added support for visualizing Step Functions workflows. This makes it easier to combine these services to orchestrate complex workflows over business-critical batch operations, such as data analysis or overnight processes.

Additionally, Amazon Athena has also added console support for visualizing Step Functions workflows. This can help when building distributed data processing pipelines, allowing Step Functions to orchestrate services such as AWS Glue, Amazon S3, or Amazon Kinesis Data Firehose.

Synchronous Express Workflows now supports AWS PrivateLink. This enables you to start these workflows privately from within your virtual private clouds (VPCs) without traversing the internet. To learn more about this feature, read the What’s New post.

Amazon SNS

Amazon SNS announced support for token-based authentication when sending push notifications to Apple devices. This creates a secure, stateless communication channel between SNS and the Apple Push Notification service (APNs).

SNS also launched the new PublishBatch API which enables developers to send up to 10 messages to SNS in a single request. This can reduce cost by up to 90%, since you need fewer API calls to publish the same number of messages to the service.
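For example, a batch publish with boto3 might look like the following sketch; the topic ARN and messages are placeholders.

    import boto3

    sns = boto3.client("sns")

    # Publish up to 10 messages in a single request.
    response = sns.publish_batch(
        TopicArn="arn:aws:sns:us-east-1:111122223333:orders",   # placeholder
        PublishBatchRequestEntries=[
            {"Id": str(i), "Message": f"order-{i} placed"} for i in range(10)
        ],
    )
    print(response["Successful"], response["Failed"])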

Amazon SQS

Amazon SQS released an enhanced DLQ management experience for standard queues. This allows you to redrive messages from a DLQ back to the source queue. This can be configured in the AWS Management Console, as shown here.

Amazon DynamoDB

The NoSQL Workbench for DynamoDB is a tool to simplify designing, visualizing, and querying DynamoDB tables. The tool now supports importing sample data from CSV files and exporting the results of queries.

DynamoDB announced the new Standard-Infrequent Access table class. Use this for tables that store infrequently accessed data to reduce your costs by up to 60%. You can switch to the new table class without an impact on performance or availability and without changing application code.

AWS Amplify

AWS Amplify now allows developers to override Amplify-generated IAM, Amazon Cognito, and S3 configurations. This makes it easier to customize the generated resources to best meet your application’s requirements. To learn more about the “amplify override auth” command, visit the feature’s documentation.

Similarly, you can also add custom AWS resources using the AWS Cloud Development Kit (CDK) or AWS CloudFormation. In another new feature, developers can then export Amplify backends as CDK stacks and incorporate them into their deployment pipelines.

AWS Amplify UI has launched a new Authenticator component for React, Angular, and Vue.js. Aside from the visual refresh, this provides the easiest way to incorporate social sign-in in your frontend applications with zero-configuration setup. It also includes more customization options and form capabilities.

AWS launched AWS Amplify Studio, which automatically translates designs made in Figma to React UI component code. This enables you to connect UI components visually to backend data, providing a unified interface that can accelerate development.

AWS AppSync

You can now use custom domain names for AWS AppSync GraphQL endpoints. This enables you to specify a custom domain for both GraphQL API and Realtime API, and have AWS Certificate Manager provide and manage the certificate.

To learn more, read the feature’s documentation page.

News from other services

Serverless blog posts

October

November

December

AWS re:Invent breakouts

AWS re:Invent was held in Las Vegas from November 29 to December 3, 2021. The Serverless DA team presented numerous breakouts, workshops and chalk talks. Rewatch all our breakout content:

Serverlesspresso

We also launched an interactive serverless application at re:Invent to help customers get caffeinated!

Serverlesspresso is a contactless, serverless order management system for a physical coffee bar. The architecture comprises several serverless apps that support an ordering process from a customer’s smartphone to a real espresso bar. The customer can check the virtual line, place an order, and receive a notification when their drink is ready for pickup.

Serverlesspresso booth

You can learn more about the architecture and download the code repo at https://serverlessland.com/reinvent2021/serverlesspresso. You can also see a video of the exhibit.

Videos

Serverless Land videos

Serverless Office Hours – Tues 10 AM PT

Weekly live virtual office hours. In each session we talk about a specific topic or technology related to serverless and open it up to helping you with your real serverless challenges and issues. Ask us anything you want about serverless technologies and applications.

YouTube: youtube.com/serverlessland
Twitch: twitch.tv/aws

October

November

December

Still looking for more?

The Serverless landing page has more information. The Lambda resources page contains case studies, webinars, whitepapers, customer stories, reference architectures, and even more Getting Started tutorials.

You can also follow the Serverless Developer Advocacy team on Twitter to see the latest news, follow conversations, and interact with the team.

Use AWS Step Functions to Monitor Services Choreography

Post Syndicated from Vito De Giosa original https://aws.amazon.com/blogs/architecture/use-aws-step-functions-to-monitor-services-choreography/

Organizations frequently need access to quick visual insight on the status of complex workflows. This involves collaboration across different systems. If your customer requires assistance on an order, you need an overview of the fulfillment process, including payment, inventory, dispatching, packaging, and delivery. If your products are expensive assets such as cars, you must track each item’s journey instantly.

Modern applications use event-driven architectures to manage the complexity of system integration at scale. These often use choreography for service collaboration. Instead of directly invoking systems to perform tasks, services interact by exchanging events through a centralized broker. Complex workflows are the result of actions each service initiates in response to events produced by other services. Services do not directly depend on each other. This increases flexibility, development speed, and resilience.

However, choreography can introduce two main challenges for the visibility of your workflow.

  1. It obfuscates the workflow definition. The sequence of events emitted by individual services implicitly defines the workflow. There is no formal statement that describes steps, permitted transitions, and possible failures.
  2. It might be harder to understand the status of workflow executions. Services act independently, based on events. You can implement distributed tracing to collect information related to a single execution across services. However, getting visual insights from traces may require custom applications. This increases time to market (TTM) and cost.

To address these challenges, we will show you how to use AWS Step Functions to model choreographies as state machines. The solution enables stakeholders to gain visual insights on workflow executions, identify failures, and troubleshoot directly from the AWS Management Console.

This GitHub repository provides a Quick Start and examples on how to model choreographies.

Modeling choreographies with Step Functions

Monitoring a choreography requires a formal representation of the distributed system behavior, such as state machines. State machines are mathematical models representing the behavior of systems through states and transitions. States model situations in which the system can operate. Transitions define which input causes a change from the current state to the next. They occur when a new event happens. Figure 1 shows a state machine modeling an order workflow.

Figure 1. Order workflow

The solution in this post uses Amazon States Language to describe a choreography as a Step Functions state machine. The state machine pauses, using Task states combined with a callback integration pattern. It then waits for the next event to be published on the broker. Choice states control transitions to the next state by inspecting event payloads. Figure 2 shows how the workflow in Figure 1 translates to a Step Functions state machine.

Figure 2. Order workflow translated into Step Functions state machine

Figure 3 shows the architecture for monitoring choreographies with Step Functions.

Figure 3. Choreography monitoring with AWS Step Functions

  1. Services involved in the choreography publish events to Amazon EventBridge. There are two configured rules. The first rule matches the first event of the choreography sequence, Order Placed in the example. The second rule matches any other event of the sequence. Event payloads contain a correlation id (order_id) to group them by workflow instance.
  2. The first rule invokes an AWS Lambda function, which starts a new execution of the choreography state machine. The correlation id is passed in the name parameter, so you can quickly identify an execution in the AWS Management Console.
  3. The state machine uses Task states with AWS SDK service integrations, to directly call Amazon DynamoDB. Tasks are configured with a callback pattern. They issue a token, which is stored in DynamoDB with the execution name. Then, the workflow pauses.
  4. A service publishes another event on the event bus.
  5. The second rule invokes another Lambda function with the event payload.
  6. The function uses the correlation id to retrieve the task token from DynamoDB.
  7. The function invokes the Step Functions SendTaskSuccess API, with the token and the event payload as parameters (see the sketch after this list).
  8. The state machine resumes the execution and uses Choice states to transition to the next state. If the choreography definition expects the received event payload, it selects the next state and the process will restart from Step # 3. The state machine transitions to a Fail state when it receives an unexpected event.
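The following sketch illustrates steps 6 and 7 as a Lambda handler. It is not the code from the referenced repository; the table name, key, and attribute names are assumptions.

    import json
    import boto3

    dynamodb = boto3.resource("dynamodb")
    stepfunctions = boto3.client("stepfunctions")

    # Placeholder name for the token store described above.
    TOKEN_TABLE = dynamodb.Table("choreography-task-tokens")

    def handler(event, context):
        """Resume the paused choreography state machine for this event's order."""
        order_id = event["detail"]["order_id"]

        # Step 6: look up the task token stored by the waiting Task state.
        item = TOKEN_TABLE.get_item(Key={"order_id": order_id})["Item"]

        # Step 7: resume the execution, passing the event payload as task output.
        stepfunctions.send_task_success(
            taskToken=item["task_token"],
            output=json.dumps(event["detail"]),
        )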

Increased visibility with Step Functions console

Modeling service choreographies as Step Functions Standard Workflows increases visibility with out-of-the-box features.

1. You can centrally track events produced by distributed components. Step Functions records full execution history for 90 days after the execution completes. You’ll be able to capture detailed information about the input and output of each state, including event payloads. Additionally, state machines integrate with Amazon CloudWatch to publish execution logs and metrics.

2. You can monitor choreographies visually. The Step Functions console displays a list of executions with information such as execution id, status, and start date (see Figure 4).

Figure 4. Step Functions workflow dashboard

After you’ve selected an execution, a graph inspector is displayed (see Figure 5). It shows states, transitions, and marks individual states with colors. This identifies at a glance, successful tasks, failures, and tasks that are still in progress.

Figure 5. Step Functions graph inspector

3. You can implement event-driven automation. Step Functions can capture execution status changes by emitting events directly to EventBridge (see Figure 6). Additionally, you can emit events by setting alarms on top of the metrics that Step Functions publishes to CloudWatch. You can respond to events by initiating corrective actions, sending notifications, or integrating with third-party solutions, such as issue tracking systems.

Figure 6. Automation with Step Functions, EventBridge, and CloudWatch alarms

Enabling access to AWS Step Functions console

Stakeholders need secure access to the Step Functions console. This requires mechanisms to authenticate users and authorize read-only access to specific Step Functions workflows.

AWS Single Sign-On authenticates users by directly managing identities or through federation. SSO supports federation with Active Directory and SAML 2.0 compliant external identity providers (IdP). Users gain access to Step Functions state machines by assigning a permission set, which is a collection of AWS Identity and Access Management (IAM) policies. Additionally, with permission sets, you can configure a relay state, which is a URL to redirect the user after successful authentication. You can authenticate the user through the selected identity provider and immediately show the AWS Step Functions console with the workflow state machine already displayed. Figure 7 shows this process.

Figure 7. Access to Step Functions state machine with AWS SSO

  1. The user logs in through the selected identity provider.
  2. The SSO user portal uses the SSO endpoint to send the response from the previous step. SSO uses AWS Security Token Service (STS) to get temporary security credentials on behalf of the user. It then creates a console sign-in URL using those credentials and the relay state. Finally, it sends the URL back as a redirect.
  3. The browser redirects the user to the Step Functions console.

When the identity provider does not support SAML 2.0, SSO is not a viable solution. In this case, you can create a URL with a sign-in token for users to securely access the AWS Management Console. This approach uses STS AssumeRole to get temporary security credentials. Then, it uses credentials to obtain a sign-in token from the AWS federation endpoint. Finally, it constructs a URL for the AWS Management Console, which includes the token. It then distributes this to users to grant access. This is similar to the SSO process. However, it requires custom development.
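A simplified sketch of that custom flow is shown below; the role ARN, issuer, and destination URL are placeholders, and error handling is omitted.

    import json
    import urllib.parse
    import boto3
    import requests  # third-party dependency used only for the HTTP calls

    FEDERATION_ENDPOINT = "https://signin.aws.amazon.com/federation"

    def build_console_url(role_arn, destination_url):
        """Return a sign-in URL that opens the Step Functions console page."""
        creds = boto3.client("sts").assume_role(
            RoleArn=role_arn, RoleSessionName="choreography-viewer"
        )["Credentials"]

        session = json.dumps({
            "sessionId": creds["AccessKeyId"],
            "sessionKey": creds["SecretAccessKey"],
            "sessionToken": creds["SessionToken"],
        })

        # Exchange the temporary credentials for a sign-in token.
        token = requests.get(FEDERATION_ENDPOINT, params={
            "Action": "getSigninToken", "Session": session,
        }).json()["SigninToken"]

        # Build the console URL, redirecting to the state machine page.
        return (FEDERATION_ENDPOINT + "?Action=login"
                + "&Issuer=example.com"
                + "&Destination=" + urllib.parse.quote_plus(destination_url)
                + "&SigninToken=" + token)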

Conclusion

This post shows how you can increase visibility on choreographed business processes using AWS Step Functions. The solution provides detailed visual insights directly from the AWS Management Console, without requiring custom UI development. This reduces TTM and cost.

To learn more:

Find Public IPs of Resources – Use AWS Config for Vulnerability Assessment

Post Syndicated from Gurkamal Deep Singh Rakhra original https://aws.amazon.com/blogs/architecture/find-public-ips-of-resources-use-aws-config-for-vulnerability-assessment/

Systems vulnerability management is a key component of your enterprise security program. Its goal is to remediate OS, software, and applications vulnerabilities. Scanning tools can help identify and classify these vulnerabilities to keep the environment secure and compliant.

Typically, vulnerability scanning tools operate from internal or external networks to discover and report vulnerabilities. For internal scanning, the tools use the private IPs of the target systems in scope. For external scans, the target systems' public IP addresses are used. It is important that security teams always maintain an accurate inventory of all deployed resources' IP addresses. This ensures a comprehensive, consistent, and effective vulnerability assessment.

This blog discusses a scalable, serverless, and automated approach to discover public IP addresses assigned to resources in a single or multi-account environment in AWS, using AWS Config.

Single account is when you have all your resources in a single AWS account. A multi-account environment refers to many accounts under the same AWS Organization.

Understanding scope of solution

You may have good visibility into the private IPs assigned to your resources: Amazon Elastic Compute Cloud (Amazon EC2), Amazon Elastic Kubernetes Service (EKS) clusters, Elastic Load Balancing (ELB), and Amazon Elastic Container Service (Amazon ECS). But it may require some effort to establish a complete view of the existing public IPs. And these IPs can change over time, as new systems join and exit the environment.

An elastic network interface is a logical networking component in a Virtual Private Cloud (VPC) that represents a virtual network card. The elastic network interface routes traffic to other destinations/resources. Usually, you have to make Describe* API calls for the specific resource with an elastic network interface to get information about its configuration and IP address. These resource-specific API calls may be throttled and can also result in higher costs. Additionally, if there are tens or hundreds of accounts, it becomes exponentially more difficult to get the information into a single inventory.

AWS Config enables you to assess, audit, and evaluate the configurations of your AWS resources. The advanced query feature provides a single query endpoint and language to get current resource state metadata for a single account and Region, or multiple accounts and Regions. You can use configuration aggregators to run the same queries from a central account across multiple accounts and AWS Regions.

AWS Config supports a subset of structured query language (SQL) SELECT syntax, which enables you to perform property-based queries and aggregations on the current configuration item (CI) data. Advanced query is available at no additional cost to AWS Config customers in all AWS Regions (except China Regions) and AWS GovCloud (US).

AWS Organizations helps you centrally govern your environment. Its integration with other AWS services lets you define central configurations, security mechanisms, audit requirements, and resource sharing across accounts in your organization.

Choosing scope of advanced queries in AWS Config

When running advanced queries in AWS Config, you must choose the scope of the query. The scope defines the accounts you want to run the query against and is configured when you create an aggregator.

Following are the three possible scopes when running advanced queries:

  1. Single account and single Region
  2. Multiple accounts and multiple Regions
  3. AWS Organization accounts

Single account and single Region

Figure 1. AWS Config workflow for single account and single Region

The use case shown in Figure 1 addresses the need of customers operating within a single account and single Region. With AWS Config enabled for the individual account, you will use AWS Config advanced query feature to run SQL queries. These will give you resource metadata about associated public IPs. You do not require an aggregator for single-account and single Region.

In Figure 1.1, the advanced query returned results from a single account and all Availability Zones within the Region in which the query was run.

Figure 1.1 Advanced query returning results for a single account and single Region

Query for reference

SELECT
  resourceId,
  resourceName,
  resourceType,
  configuration.association.publicIp,
  availabilityZone,
  awsRegion
WHERE
  resourceType='AWS::EC2::NetworkInterface'
  AND configuration.association.publicIp>'0.0.0.0'

This query fetches the properties of all elastic network interfaces. The WHERE condition lists the elastic network interfaces using the resourceType property and finds all public IPs greater than 0.0.0.0. This is because elastic network interfaces can exist with only a private IP, in which case no public IP is assigned. For a list of supported resourceType values, refer to supported resource types for AWS Config.

Multiple accounts and multiple Regions

Figure 2. AWS Config monitoring workflow for multiple account and multiple Regions. The figure shows EC2, EKS, and Amazon ECS, but it can be any AWS resource having a public elastic network interface.

AWS Config enables you to monitor configuration changes against multiple accounts and multiple Regions via an aggregator, see Figure 2. An aggregator is an AWS Config resource type that collects AWS Config data from multiple accounts and Regions. You can choose the aggregator scope when running advanced queries in AWS Config. Remember to authorize the aggregator accounts to collect AWS Config configuration and compliance data.

Figure 2.1 Advanced query returning results from multiple Regions (awsRegion column) as highlighted in the diagram

This use case applies when you have AWS resources in multiple accounts (or span multiple organizations) and multiple Regions. Figure 2.1 shows the query results being returned from multiple AWS Regions.

Accounts in an AWS Organization

Figure 3. The workflow of accounts in an AWS Organization being monitored by AWS Config. This figure shows EC2, EKS, and Amazon ECS but it can be any AWS resource having a public elastic network interface.

An aggregator also enables you to monitor all the accounts in your AWS Organization (see Figure 3). When this option is chosen, AWS Config enables you to run advanced queries against the configuration history of all the accounts in your AWS Organization. Remember that an aggregator only aggregates data from the accounts and Regions that are specified when the aggregator is created.

Figure 3.1 Advanced query returning results from all accounts (accountId column) under an AWS Organization

In Figure 3.1, the query runs against all accounts in an AWS Organization. This organization-wide scope is handled by the aggregator, which automatically accumulates data from all accounts under the specified AWS Organization.

Common architecture workflow for discovering public IPs

Figure 4. High-level architecture pattern for discovering public IPs

The workflow shown in Figure 4 starts with Amazon EventBridge invoking an AWS Lambda function on a schedule. You can configure an Amazon EventBridge schedule via rate or cron expressions, which define the frequency. The Lambda function hosts the code that calls the AWS Config API to run an advanced query. The advanced query checks for all elastic network interfaces in your account(s), because any public resource launched in your account is assigned an elastic network interface.

When the results are returned, they can be stored in Amazon S3. These result files can be timestamped (via naming or S3 versioning) to keep a history of public IPs used in your account. The result set can then be fed into, or accessed by, the vulnerability scanning tool of your choice.
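
A minimal sketch of such a Lambda function follows, assuming the target bucket name is provided in a BUCKET_NAME environment variable; error handling and the aggregator variant of the query call are omitted for brevity.

import boto3
import json
import os
from datetime import datetime, timezone

QUERY = (
    "SELECT resourceId, resourceType, configuration.association.publicIp, awsRegion "
    "WHERE resourceType='AWS::EC2::NetworkInterface' "
    "AND configuration.association.publicIp>'0.0.0.0'"
)

config = boto3.client('config')
s3 = boto3.client('s3')

def lambda_handler(event, context):
    # Collect all pages of advanced query results
    results, token = [], None
    while True:
        kwargs = {'Expression': QUERY, 'Limit': 100}
        if token:
            kwargs['NextToken'] = token
        response = config.select_resource_config(**kwargs)
        results.extend(json.loads(r) for r in response['Results'])
        token = response.get('NextToken')
        if not token:
            break

    # Store a timestamped snapshot in S3 for the scanning tool to pick up
    key = 'public-ips/{}.json'.format(datetime.now(timezone.utc).strftime('%Y-%m-%dT%H-%M-%S'))
    s3.put_object(Bucket=os.environ['BUCKET_NAME'], Key=key, Body=json.dumps(results))
    return {'count': len(results), 'key': key}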

Note: AWS Config advanced queries can also be used to query IPv6 addresses. You can use the “configuration.ipv6Addresses” AWS Config property to get IPv6 addresses. When querying IPv6 addresses, remove “configuration.association.publicIp > ‘0.0.0.0’” condition from the preceding sample queries. For more information on available AWS Config properties and data types, refer to GitHub.

Conclusion

In this blog, we demonstrated how to extract public IP information from resources deployed in your account(s) using AWS Config and AWS Config advanced query. We discussed how you can support your vulnerability scanning process by identifying public IPs in your account(s) that can be fed into your scanning tool. This solution is serverless, automated, and scalable, which removes the undifferentiated heavy lifting required to manage your resources.

Learn more about AWS Config best practices:

Modernized Database Queuing using Amazon SQS and AWS Services

Post Syndicated from Scott Wainner original https://aws.amazon.com/blogs/architecture/modernized-database-queuing-using-amazon-sqs-and-aws-services/

A queuing system is composed of producers and consumers. A producer enqueues messages (writes messages to a database) and a consumer dequeues messages (reads messages from the database). Business applications requiring asynchronous communications often use the relational database management system (RDBMS) as the default message storage mechanism. But increased message volume, complexity, and size compete with the inherent functionality of the database. The RDBMS becomes a bottleneck for message delivery, while also impacting other traditional enterprise uses of the database.

In this blog, we will show how you can mitigate the RDBMS performance constraints by using Amazon Simple Queue Service (Amazon SQS), while retaining the intrinsic value of the stored relational data.

Problems with legacy queuing methods

Commercial databases such as Oracle offer Advanced Queuing (AQ) mechanisms, while SQL Server supports Service Broker for queuing. The database acts as a message queue system when incoming messages are captured along with metadata. A message stored in a database is often processed multiple times using a sequence of message extraction, transformation, and loading (ETL). The message is then routed for distribution to a set of recipients based on logic that is often also stored in the database.

The repetitive manipulation of messages and iterative attempts at distributing pending messages may create a backlog that interferes with the primary function of the database. This backpressure can propagate to other systems that are trying to store and retrieve data from the database and cause a performance issue (see Figure 1).

Figure 1. A relational database serving as a message queue.

There are several scenarios where the database can become a bottleneck for message processing:

Message metadata. Messages consist of the payload (the content of the message) and metadata that describes the attributes of the message. The metadata often includes routing instructions, message disposition, message state, and payload attributes.

  • The message metadata may require iterative transformation during the message processing. This creates an inefficient sequence of read, transform, and write processes. This is especially inefficient if the message attributes undergo multiple transformations that must be reflected in the metadata. The iterative read/write process of metadata consumes the database IOPS, and forces the database to scale vertically (add more CPU and more memory).
  • A new paradigm emerges when message management processes exist outside of the database. Here, the metadata is manipulated without interacting with the database, except to write the final message disposition. Application logic can be applied through functions such as AWS Lambda to transform the message metadata.

Message large object (LOB). A message may contain a large binary object that must be stored in the payload.

  • Storing large binary objects in the RDBMS is expensive. Manipulating them consumes the throughput of the database with iterative read/write operations. If the LOB must be transformed, then it becomes wasteful to store the object in the database.
  • An alternative approach offers a more efficient message processing sequence. The large object is stored external to the database in universally addressable object storage, such as Amazon Simple Storage Service (Amazon S3), and only a pointer to the object is stored in the database. Smaller elements of the message can be read from or written to the database, while large objects can be manipulated more efficiently in object storage (see the sketch after this list).
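
As a rough sketch of that pattern, the helper below writes the large payload to Amazon S3 and returns the small pointer record that would be stored in the database; the bucket name is a placeholder.

import uuid
import boto3

s3 = boto3.client('s3')

def store_payload(payload_bytes, bucket='message-payload-bucket'):
    # Write the large object to S3; only the returned pointer goes into the RDBMS
    key = 'payloads/{}.bin'.format(uuid.uuid4())
    s3.put_object(Bucket=bucket, Key=key, Body=payload_bytes)
    return {'bucket': bucket, 'key': key, 'size': len(payload_bytes)}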

Message fan-out. A message can be loaded into the database and analyzed for routing, where the same message must be distributed to multiple recipients.

  • Messages that require multiple recipients may require a copy of the message replicated for each recipient. The replication creates multiple writes and reads from the database, which is inefficient.
  • A new method captures only the routing logic and target recipients in the database. The message replication then occurs outside of the database in distributed messaging systems, such as Amazon Simple Notification Service (Amazon SNS).

Message queuing. Messages are often kept in the database until they are successfully processed for delivery. If a message is read from the database and determined to be undeliverable, then the message is kept there until a later attempt is successful.

  • An inoperable message delivery process can create backpressure on the database where iterative message reads are processed for the same message with unsuccessful delivery. This creates a feedback loop causing even more unsuccessful work for the database.
  • Try a message queuing system such as Amazon MQ or Amazon SQS, which offloads the message queuing from the database. These services offer efficient message retry mechanisms, and reduce iterative reads from the database (see the sketch after this list).
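
A minimal sketch of that offload, assuming a queue named pending-messages already exists: the producer enqueues the message instead of inserting a pending row, and the consumer relies on the queue's visibility timeout for retries rather than re-reading undelivered rows from the database.

import boto3

sqs = boto3.client('sqs')
queue_url = sqs.get_queue_url(QueueName='pending-messages')['QueueUrl']

# Producer: enqueue instead of writing a pending row to the database
sqs.send_message(QueueUrl=queue_url, MessageBody='{"orderId": "1234", "action": "notify"}')

# Consumer: receive, process, and delete; unprocessed messages reappear automatically
messages = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=20)
for message in messages.get('Messages', []):
    # ... process the message body here ...
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message['ReceiptHandle'])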

Sequenced message delivery. Messages may require ordered delivery where the delivery sequence is crucial for maintaining application integrity.

  • The application may capture the message order within database tables, but the sorting function still consumes processing capabilities. The order sequence must be sorted and maintained for each attempted message delivery.
  • Message order can be maintained outside of the database using a queue system, such as Amazon SQS, with first-in/first-out (FIFO) delivery (see the sketch after this list).
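
The sketch below assumes a FIFO queue named ordered-messages.fifo already exists; messages that share a MessageGroupId are delivered in the order they were sent, so the database no longer has to sort pending deliveries.

import boto3

sqs = boto3.client('sqs')
fifo_url = sqs.get_queue_url(QueueName='ordered-messages.fifo')['QueueUrl']

# Messages with the same MessageGroupId are delivered in the order they are sent
for step in ['created', 'packed', 'shipped']:
    sqs.send_message(
        QueueUrl=fifo_url,
        MessageBody='{"orderId": "1234", "status": "%s"}' % step,
        MessageGroupId='order-1234',
        MessageDeduplicationId='order-1234-' + step,
    )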

Message scheduling. Messages may also be queued with a scheduled delivery attribute. These messages require an event-driven architecture that initiates scheduled message delivery.

  • The database often uses trigger mechanisms to initiate message delivery. Message delivery may require a synchronized point in time for delivery (many messages at once), which can cause a spike in work at the scheduled interval. This impacts the database performance with artificially induced peak load intervals.
  • Event signals can be generated in systems such as Amazon EventBridge, which can coordinate the transmission of messages.

Message disposition. Each message maintains a message disposition state that describes the delivery state.

  • The database is often used as a logging system for message transmission status. The message metadata is updated with the disposition of the message, while the message remains in the database as an artifact.
  • An optimized technique is available using Amazon CloudWatch as a record of message disposition.

Modernized queuing architecture

Decoupling message queuing from the database improves database availability and enables greater message queue scalability. It also provides a more cost-effective use of the database, and mitigates backpressure created when database performance is constrained by message management.

The modernized architecture uses loosely coupled services, such as Amazon S3, AWS Lambda, Amazon MQ, Amazon SQS, Amazon SNS, Amazon EventBridge, and Amazon CloudWatch. This loosely coupled architecture lets each of the functional components scale vertically and horizontally, independently of the other functions required for message queue management.

Figure 2 depicts a message queuing architecture that uses Amazon SQS for message queuing and AWS Lambda for message routing, transformation, and disposition management. An RDBMS is still leveraged to retain metadata profiles, routing logic, and message disposition. The ETL processes are handled by AWS Lambda, while large objects are stored in Amazon S3. Finally, message fan-out distribution is handled by Amazon SNS, and the queue state is monitored and managed by Amazon CloudWatch and Amazon EventBridge.

Figure 2. Modernized queuing architecture using Amazon SQS

Conclusion

In this blog, we show how queuing functionality can be migrated from the RDBMS while minimizing changes to the business application. The RDBMS continues to play a central role in sourcing the message metadata, running routing logic, and storing message disposition. However, AWS services such as Amazon SQS offload queue management tasks related to the messages. AWS Lambda performs message transformation, queues the message, and transmits the message with massive scale, fault tolerance, and efficient message distribution.

Read more about the diverse capabilities of AWS messaging services:

By using AWS services, the RDBMS is no longer a performance bottleneck in your business applications. This improves scalability, and provides resilient, fault-tolerant, and efficient message delivery.

Read our blog on modernization of common database functions:

Serverless Scheduling with Amazon EventBridge, AWS Lambda, and Amazon DynamoDB

Post Syndicated from Peter Grman original https://aws.amazon.com/blogs/architecture/serverless-scheduling-with-amazon-eventbridge-aws-lambda-and-amazon-dynamodb/

Many applications perform scheduled tasks. For instance, you might want to automatically publish an article at a given time, change prices for offers which were defined weeks in advance, or notify customers 8 hours before a flight. These might be one-off tasks, or recurring ones.

On Unix-like operating systems, you might have opted for the cron utility. There are also similar alternatives for many web application frameworks, as well as advanced libraries, to schedule future one-off tasks. In a single-server environment, this might seem like a simple solution. However, when you run dozens of instances of your application server, it gets harder to rely on those libraries to schedule tasks reliably at least once, without taking up too many resources. If you decide to build a serverless application, you need a new approach altogether.

This post shows how you can build a scalable serverless job scheduler. You can use this method to scale to thousands, or even millions, of distributed jobs per minute. Because you are using serverless technologies, the underlying infrastructure is fully managed by AWS and you only pay for what you use. You can use this solution as an addition to your existing applications, regardless if they already use serverless technologies.

Similarly to a cron job running on a single instance, this solution uses an Amazon EventBridge rule, which starts new events periodically on a schedule. For recurring jobs, you would use this capability to start specific actions directly. This will work if you have only a few dozen periodic tasks, whose execution cycle can be defined as a cron expression. However, remember that there are limits to how many rules can be defined per event bus, and rules with a scheduled expression can only be defined on the default event bus. This post describes a method to multiplex a single Amazon EventBridge rule via an AWS Lambda function and Amazon DynamoDB, to scale beyond thousands of jobs. While this example focuses on one-off tasks, you can use the same approach for recurring jobs as well.

Overview of solution

The following diagram shows the architecture of the serverless scheduling solution.

Figure 1 – Architecture diagram showing Serverless Scheduling with Amazon EventBridge, AWS Lambda, and Amazon DynamoDB

Amazon EventBridge with scheduled expressions periodically starts an AWS Lambda function. An Amazon DynamoDB table stores the future jobs. The Lambda function queries the table for due jobs and distributes them via Amazon EventBridge to the workers.

The following services are used:

Amazon EventBridge: to initiate the serverless scheduling solution. Amazon EventBridge is a serverless event bus that makes it easier to build event-driven applications at scale. It can also schedule events based on time intervals or cron expressions.

In this solution, you’ll use EventBridge for two things:

  1. to periodically start the AWS Lambda function, which checks for new jobs to be executed, and
  2. to distribute those jobs to the workers.

Here, you can control the granularity of your job executions. The fastest rate possible is once every minute. But if you don’t need a 1-minute precision, you can also opt for once every 5 minutes, or even once every hour. Remember that you cannot control at which second the event is started. It might be at the beginning of the minute, in the middle, or at the end.

AWS Lambda: to execute the scheduler logic. AWS Lambda is a serverless, event-driven compute service that lets you run code without provisioning or managing servers. The Lambda function queries the jobs from DynamoDB and distributes them via EventBridge. Based on your requirements, you can adjust this to use different mechanisms to notify the workers about the jobs, such as HTTP APIs, gRPC calls, or AWS services like Amazon Simple Notification Service (SNS) or Amazon Simple Queue Service (SQS).

Amazon DynamoDB: to store scheduled jobs. Amazon DynamoDB is a fully managed, serverless, key-value NoSQL database designed to run high-performance applications at any scale. Defining the right data model is important to be able to scale to thousands or even millions of scheduled and processed jobs per minute. The DynamoDB table in this solution has a partition key “pk” and a sort key “sk”. For the Lambda function to be able to query all due jobs quickly and efficiently, jobs must be partitioned. For this, they are grouped together based on their scheduled times in intervals of 5 minutes. This value is the partition key “pk”. How to calculate this value is explained in detail when you test the solution.

The sort key “sk” contains the precise execution time concatenated with a unique identifier, such as a job ID, because the combination of “pk” and “sk” must be unique. To schedule a job in this example, you write it manually into the DynamoDB table. In your production code, you can abstract the synchronous DynamoDB access by implementing it in a shared library or using Amazon API Gateway. You could also schedule jobs from a Lambda function reacting to events in your system.
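
As an illustration of that abstraction, the following sketch computes the “pk” and “sk” values described in this section and writes a job item with boto3; the table name and helper are assumptions for illustration, not part of the deployed stack.

import uuid
import boto3
from datetime import datetime, timedelta, timezone

table = boto3.resource('dynamodb').Table('JobsTable')  # table name assumed for illustration

def schedule_job(due_time_utc, detail, detail_type):
    # Partition key: the due time truncated to a 5-minute bucket, for example "j#2015-03-20T09:45"
    bucket_minute = due_time_utc.minute - due_time_utc.minute % 5
    pk = 'j#{}{:02d}'.format(due_time_utc.strftime('%Y-%m-%dT%H:'), bucket_minute)
    # Sort key: the precise due timestamp plus a unique suffix, so equal timestamps don't collide
    sk = '{}Z#{}'.format(due_time_utc.strftime('%Y-%m-%dT%H:%M:%S.%f')[:-3], uuid.uuid4())
    table.put_item(Item={'pk': pk, 'sk': sk, 'detail': detail, 'detail_type': detail_type})
    return pk, sk

# Example: schedule a reminder two minutes from now
schedule_job(datetime.now(timezone.utc) + timedelta(minutes=2),
             {'action': 'send-reminder', 'userId': 'example-user'}, 'job-reminder')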

Amazon EventBridge: to distribute the jobs. The Lambda function uses Amazon EventBridge as an example to distribute the jobs. The workers which should receive the jobs, must configure the corresponding rules upfront. For testing purposes, this solution comes with a rule which logs all events from the Lambda function into Amazon CloudWatch Logs.

Walkthrough

In this section, you will deploy the solution and test it.

Deploying the solution

To deploy it in your account:

1.  Select Launch Stack.

Launch Stack

2. Select the Region where you want to launch your serverless scheduler.

3. Define a name for your stack. Leave the parameters with the default values for now and select Next.

parameters table

4. At the bottom of the page, acknowledge the required Capabilities and select Create stack.

5.  Wait until the status of the stack is CREATE_COMPLETE, this can take a minute or two.

Testing the solution

In this section, you test the serverless scheduler. First, you’ll schedule a job for some time in the near future. Afterwards, you will check that the job was logged in CloudWatch Logs at the time it was scheduled.

1. In the AWS Management Console, navigate to the DynamoDB service and select the Items sub-menu on the left side, between Tables and PartiQL editor.

2. Select the JobsTable which you created via the CloudFormation Stack; it should be empty for now:

Jobstable

3. Select Create item. Make sure you switch to the JSON editor at the top, and disable View DynamoDB JSON. Now copy this item into the editor:

{
  "pk": "j#2015-03-20T09:45",
  "sk": "2015-03-20T09:46:47.123Z#564ade05-efda-4a2e-a7db-933ad3c89a83",
  "detail": {
    "action": "send-reminder",
    "userId": "16f3a019-e3a5-47ed-8c46-f668347503d1",
    "taskId": "6d2f710d-99d8-49d8-9f52-92a56d0c6b81",
    "params": {
      "can_skip": false,
      "reminder_volume": 0.5
    }
  },
  "detail_type": "job-reminder"
}

Create table DynamoDB table

This is a sample job definition. You will need to adjust it to be started a few minutes from now. For this, adjust the first two attributes: the partition key “pk” and the sort key “sk”. Start with “sk”; this is the UTC timestamp for the due date of the job in ISO 8601 format (YYYY-MM-DDTHH:MM:SS.sssZ), followed by a separator (“#”) and a unique identifier, to make sure that multiple jobs can have the same due timestamp.

Afterwards, adjust “pk”. The “pk” is the ISO 8601 timestamp from the “sk”, reduced to the date and the time in hours and minutes. The minutes of the partition key must be an integer multiple of 5. This value represents the grouping of the jobs, so they can be queried quickly and efficiently by the Lambda function. For instance, 2021-11-26T13:31:55.000Z is in the future for me, and the corresponding partition would be 2021-11-26T13:30.

Note: your local time zone might not be UTC. You can get the current UTC time on timeanddate.com.

You can find in the following table for every “sk” minute the corresponding “pk” minute:

SK and PK table

The corresponding python code would be:

f'{(sk_minutes - sk_minutes % 5):02d}'

Create table DynamoDB table

4. Now that you defined your event in the near future, you can optionally adjust the content of the “detail” and “detail_type” attributes. These are forwarded to EventBridge as “detail” and “detail-type” and should be used by your workers to understand which task they are supposed to perform. You can find more details on EventBridge event structure in our documentation. After you configured the job correctly, select Create item.

5. It is time to navigate to CloudWatch Log groups and wait for the item to be due and to show up in the debug logs.

CloudWatch log groups

For now, the log streams should be empty:

Log streams screenshot

After the item was due, you should see a new log stream with the item “detail” and “detail_type” attributes logged.

If you don’t see a new log stream with the item, check back in your DynamoDB table, if the “sk” is in the UTC time zone and the minutes of the “pk” are a multiple of 5. You can consult the table at the end of step 3, to check for the correct “pk” minutes based on your “sk” minutes.

Log events screenshot

You might notice that the timestamp of the message is within a minute after the job was scheduled. In my example, I scheduled the job for 2021-11-26T13:31:55.000Z and it was put into EventBridge at 2021-11-26T13:32:33Z. The delay comes from the Lambda function only starting once per minute. As I mentioned in the beginning, the function also isn’t started at second 00 but at a random second within that minute.

Exploring the Lambda function

Now, let’s have a look at the core logic. For this, navigate to AWS Lambda in the AWS Management console and open the SchedulerFunction.

AWS Lambda screenshot

In the function configuration, you can see that it is triggered by EventBridge via a scheduled expression at the rate, which was defined in the CloudFormation Stack.

Function Configuration in Cloudformation Stack

When you open the Code tab, you can see that it is less than 100 lines of python code. The main part is the lambda_handler function:

def lambda_handler(event, context):
    event_time_in_utc = event['time']
    previous_partition, current_partition = get_partitions(event_time_in_utc)

    previous_jobs = query_jobs(previous_partition, event_time_in_utc)
    current_jobs = query_jobs(current_partition, event_time_in_utc)
    all_jobs = previous_jobs + current_jobs

    print('dispatching {} jobs'.format(len(all_jobs)))

    put_all_jobs_into_event_bridge(all_jobs)
    delete_all_jobs(all_jobs)

    print('dispatched and deleted {} jobs'.format(len(all_jobs)))

The function starts by calculating the current and previous partitions. This is done to ensure that no jobs stay unprocessed in the old partition, when a new one starts. Afterwards, jobs from these partitions are queried up to the current time, so no future jobs will be fetched from the current partition. Lastly, all jobs are put into EventBridge and deleted from the table.

Instead of pushing the jobs into EventBridge, they could be started via HTTP(S), gRPC, or pushed into other AWS services, like Amazon Simple Notification Service (SNS) or Amazon Simple Queue Service (SQS). Also remember that the communication with other AWS services is synchronous and does not use batching options when putting jobs into EventBridge or deleting them from the DynamoDB table. This is to keep the function simpler and easier to understand. When you plan to distribute thousands of jobs per minute, you’d want to adjust this, to improve the throughput of the Lambda function.
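
One way to add batching, sketched here under the assumption that the rest of the function stays as shown above: EventBridge PutEvents accepts up to 10 entries per call, and the DynamoDB batch writer groups deletes into BatchWriteItem calls.

import json
import boto3

events = boto3.client('events')
table = boto3.resource('dynamodb').Table('JobsTable')  # table name assumed for illustration

def put_all_jobs_into_event_bridge(all_jobs):
    # PutEvents accepts at most 10 entries per request, so send the jobs in chunks
    for start in range(0, len(all_jobs), 10):
        entries = [{
            'Source': 'serverless-scheduler',  # illustrative source value
            'DetailType': job['detail_type'],
            'Detail': json.dumps(job['detail'], default=str),  # default=str handles DynamoDB Decimals
        } for job in all_jobs[start:start + 10]]
        events.put_events(Entries=entries)

def delete_all_jobs(all_jobs):
    # The batch writer buffers deletes into BatchWriteItem calls of up to 25 items
    with table.batch_writer() as batch:
        for job in all_jobs:
            batch.delete_item(Key={'pk': job['pk'], 'sk': job['sk']})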

Cleaning up

To avoid incurring future charges, delete the CloudFormation Stack and all resources you created.

Conclusion

In this post, you learned how to build a serverless scheduling solution. Using only serverless technologies which scale automatically, don’t require maintenance, and offer a pay as you go pricing model, this scheduler solution can be implemented for use cases with varying throughput requirements for their scheduled jobs. These could range from publishing articles at a scheduled time to notifying hundreds of passengers per minute about their upcoming flight.

You can adjust the Lambda function to distribute the jobs with a technology more fitting to your application, as well as to handle recurring tasks. The grouping interval of 5 minutes for the partition key can also be adjusted based on your throughput requirements. It’s important to note that for this solution to work, the interval by which the jobs are grouped must be longer than the rate at which the Lambda function is started.

Give it a try and let us know your thoughts in the comments!

New – Use Amazon S3 Event Notifications with Amazon EventBridge

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-use-amazon-s3-event-notifications-with-amazon-eventbridge/

We launched Amazon EventBridge in mid-2019 to make it easy for you to build powerful, event-driven applications at any scale. Since that launch, we have added several important features including a Schema Registry, the power to Archive and Replay Events, support for Cross-Region Event Bus Targets, and API Destinations to allow you to send events to any HTTP API. With support for a very long list of destinations and the ability to do pattern matching, filtering, and routing of events, EventBridge is an incredibly powerful and flexible architectural component.

S3 Event Notifications
Today we are making it even easier for you to use EventBridge to build applications that react quickly and efficiently to changes in your S3 objects. This is a new, “directly wired” model that is faster, more reliable, and more developer-friendly than ever. You no longer need to make additional copies of your objects or write specialized, single-purpose code to process events.

At this point you might be thinking that you already had the ability to react to changes in your S3 objects, and wondering what’s going on here. Back in 2014 we launched S3 Event Notifications to SNS Topics, SQS Queues, and Lambda functions. This was (and still is) a very powerful feature, but using it at enterprise-scale can require coordination between otherwise-independent teams and applications that share an interest in the same objects and events. Also, EventBridge can already extract S3 API calls from CloudTrail logs and use them to do pattern matching & filtering. Again, very powerful and great for many kinds of apps (with a focus on auditing & logging), but we always want to do even better.

Net-net, you can now configure S3 Event Notifications to directly deliver to EventBridge! This new model gives you several benefits including:

Advanced Filtering – You can filter on many additional metadata fields, including object size, key name, and time range. This is more efficient than using Lambda functions that need to make calls back to S3 to get additional metadata in order to make decisions on the proper course of action. S3 only publishes events that match a rule, so you save money by only paying for events that are of interest to you.

Multiple Destinations – You can route the same event notification to your choice of 18 AWS services including Step Functions, Kinesis Firehose, Kinesis Data Streams, and HTTP targets via API Destinations. This is a lot easier than creating your own fan-out mechanism, and will also help you to deal with those enterprise-scale situations where independent teams want to do their own event processing.

Fast, Reliable Invocation – Patterns are matched (and targets are invoked) quickly and directly. Because S3 provides at-least-once delivery of events to EventBridge, your applications will be more reliable.

You can also take advantage of other EventBridge features, including the ability to archive and then replay events. This allows you to reprocess events in case of an error or if you add a new target to an event bus.

Getting Started
I can get started in minutes. I start by enabling EventBridge notifications on one of my S3 buckets (jbarr-public in this case). I open the S3 Console, find my bucket, open the Properties tab, scroll down to Event notifications, and click Edit:

I select On, click Save changes, and I’m ready to roll:

Now I use the EventBridge Console to create a rule. I start, as usual, by entering a name and a description:

Then I define a pattern that matches the bucket and the events of interest:

One pattern can match one or more buckets and one or more events; the following events are supported:

  • Object Created
  • Object Deleted
  • Object Restore Initiated
  • Object Restore Completed
  • Object Restore Expired
  • Object Tags Added
  • Object Tags Deleted
  • Object ACL Updated
  • Object Storage Class Changed
  • Object Access Tier Changed

Then I choose the default event bus, and set the target to an SNS topic (BucketAction) which publishes the messages to my Amazon email address:

I click Create, and I am all set. To test it out, I simply upload some files to my bucket and await the messages:

The message contains all of the interesting and relevant information about the event, and (after some unquoting and formatting), looks like this:

{
    "version": "0",
    "id": "2d4eba74-fd51-3966-4bfa-b013c9da8ff1",
    "detail-type": "Object Created",
    "source": "aws.s3",
    "account": "348414629041",
    "time": "2021-11-13T00:00:59Z",
    "region": "us-east-1",
    "resources": [
        "arn:aws:s3:::jbarr-public"
    ],
    "detail": {
        "version": "0",
        "bucket": {
            "name": "jbarr-public"
        },
        "object": {
            "key": "eb_create_rule_mid_1.png",
            "size": 99797,
            "etag": "7a72374e1238761aca7778318b363232",
            "version-id": "a7diKodKIlW3mHIvhGvVphz5N_ZcL3RG",
            "sequencer": "00618F003B7286F496"
        },
        "request-id": "4Z2S00BKW2P1AQK8",
        "requester": "348414629041",
        "source-ip-address": "72.21.198.68",
        "reason": "PutObject"
    }
}

My initial event pattern was very simple, and matched only the bucket name. I can use content-based filtering to write more complex and more interesting patterns. For example, I could use numeric matching to set up a pattern that matches events for objects that are smaller than 1 megabyte:

{
    "source": [
        "aws.s3"
    ],
    "detail-type": [
        "Object Created",
        "Object Deleted",
        "Object Tags Added",
        "Object Tags Deleted"
    ],

    "detail": {
        "bucket": {
            "name": [
                "jbarr-public"
            ]
        },
        "object" : {
            "size": [{"numeric" :["<=", 1048576 ] }]
        }
    }
}

Or, I could use prefix matching to set up a pattern that looks for objects uploaded to a “subfolder” (which doesn’t really exist) of a bucket:

"object": {
  "key" : [{"prefix" : "uploads/"}]
  }]
}

You can use all of this in conjunction with all of the existing EventBridge features, including Archive/Replay. You can also access the CloudWatch metrics for each of your rules:

Available Now
This feature is available now and you can start using it today in all commercial AWS Regions. You pay $1 for every 1 million events that match a rule; check out the EventBridge Pricing page for more information.

Jeff;

Expanding cross-Region event routing with Amazon EventBridge

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/expanding-cross-region-event-routing-with-amazon-eventbridge/

This post is written by Stephen Liedig, Sr Serverless Specialist SA.

In April 2021, AWS announced a new feature for Amazon EventBridge that allows you to route events from any commercial AWS Region to US East (N. Virginia), US West (Oregon), and Europe (Ireland). From today, you can now route events between any AWS Regions, except AWS GovCloud (US) and China.

EventBridge enables developers to create event-driven applications by routing events between AWS services, integrated software as a service (SaaS) applications, and your own applications. This helps you produce loosely coupled, distributed, and maintainable architectures. With these new capabilities, you can now route events across Regions and accounts using the same model used to route events to existing targets.

Cross-Region event routing with Amazon EventBridge makes it easier for customers to develop multi-Region workloads to:

  • Centralize your AWS events into one Region for auditing and monitoring purposes, such as aggregating security events for compliance reasons in a single account.
  • Replicate events from source to destination Regions to help synchronize data in cross-Region data stores.
  • Invoke asynchronous workflows in a different Region from a source event. For example, you can load balance from a target Region by routing events to another Region.

A previous post shows how cross-Region routing works. This blog post expands on these concepts and discusses a common use case for cross-Region event delivery – event auditing. This example explores how you can manage resources using AWS CloudFormation and EventBridge resource policies.

Multi-Region event auditing example walkthrough

Compliance is an important part of building event-driven applications and reacting to any potential policy or security violations. Customers use EventBridge to route security events from applications and globally distributed infrastructure into a single account for analysis. In many cases, they share specific AWS CloudTrail events with security teams. Customers also audit events from their custom-built applications to monitor sensitive data usage.

In this scenario, a company has their base of operations located in Asia Pacific (Singapore) with applications distributed across US East (N. Virginia) and Europe (Frankfurt). The applications in US East (N. Virginia) and Europe (Frankfurt) are using EventBridge for their respective applications and services. The security team in Asia Pacific (Singapore) wants to analyze events from the applications and CloudTrail events for specific API calls to monitor infrastructure security.

Reference architecture

To create the rules to receive these events:

  1. Create a new set of rules directly on all the event buses across the global infrastructure. Alternatively, delegate the responsibility of managing security rules to distributed teams that manage the event bus resources.
  2. Provide the security team with the ability to manage rules centrally, and control the lifecycle of rules on the global infrastructure.

Allowing the security team to manage the resources centrally provides more scalability. It is more consistent with the design principle that event consumers own and manage the rules that define how they process events.

Deploying the example application

The following code snippets are shortened for brevity. The full source code of the solution is in the GitHub repository. The solution uses AWS Serverless Application Model (AWS SAM) for deployment. Clone the repo and navigate to the solution directory:

git clone https://github.com/aws-samples/amazon-eventbridge-resource-policy-samples
cd ./patterns/cross-region-cross-account-pattern/

To allow the security team to start receiving events from any of the cross-Region accounts:

1. Create a security event bus in the Asia Pacific (Singapore) Region with a rule that processes events from the respective event sources.

ap-southeast-1 architecture

For simplicity, this example uses an Amazon CloudWatch Logs target to visualize the events arriving from cross-Region accounts:

  SecurityEventBus:
    Type: AWS::Events::EventBus
    Properties:
      Name: !Ref SecurityEventBusName

  # This rule processes events coming in from cross-Region accounts
  SecurityAnalysisRule:
    Type: AWS::Events::Rule
    Properties:
      Name: SecurityAnalysisRule
      Description: Analyze events from cross-Region event buses
      EventBusName: !GetAtt SecurityEventBus.Arn
      EventPattern:
        source:
          - anything-but: com.company.security
      State: ENABLED
      RoleArn: !GetAtt WriteToCwlRole.Arn
      Targets:
        - Id: SendEventToSecurityAnalysisRule
          Arn: !Sub "arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:${SecurityAnalysisRuleTarget}"

In this example, you set the event pattern to process any event from a source that is not from the security team’s own domain. This allows you to process events from any account in any Region. You can filter this further as needed.

2. Set an event bus policy on each default and custom event bus that the security team must receive events from.

Event bus policy

This policy allows the security team to create rules to route events to its own security event bus in the Asia Pacific (Singapore) Region. The following policy defines a custom event bus in Account 2 in US East (N. Virginia) and an AWS::Events::EventBusPolicy that sets the Principal as the security team account.

This allows the security team to manage rules on the CustomEventBus:

  CustomEventBus:
    Type: AWS::Events::EventBus
    Properties:
      Name: !Ref EventBusName

  SecurityServiceRuleCreationStatement:
    Type: AWS::Events::EventBusPolicy
    Properties:
      EventBusName: !Ref CustomEventBus # If you omit this, the default event bus is used.
      StatementId: "AllowCrossRegionRulesForSecurityTeam"
      Statement:
        Effect: "Allow"
        Principal:
          AWS: !Sub "arn:aws:iam::${SecurityAccountNo}:root"
        Action:
          - "events:PutRule"
          - "events:DeleteRule"
          - "events:DescribeRule"
          - "events:DisableRule"
          - "events:EnableRule"
          - "events:PutTargets"
          - "events:RemoveTargets"
        Resource:
          - !Sub 'arn:aws:events:${AWS::Region}:${AWS::AccountId}:rule/${CustomEventBus.Name}/*'
        Condition:
          StringEqualsIfExists:
            "events:creatorAccount": "${aws:PrincipalAccount}"

3. With the policies set on the cross-Region accounts, now create the rules. Because you cannot create CloudFormation resources across Regions, you must define the rules in separate templates. This also gives the ability to expand to other Regions.

Once the template is deployed to the cross-Region accounts, use EventBridge resource policies to propagate rule definitions across accounts in the same Region. The security account must have permission to create CloudFormation resources in the cross-Region accounts to deploy the rule templates.

Resource policies

There are two parts to the rule templates. The first specifies a role that allows EventBridge to assume a role to send events to the target event bus in the security account:

  # This IAM role allows EventBridge to assume the permissions necessary to send events
  # from the source event buses to the destination event bus.
  SourceToDestinationEventBusRole:
    Type: "AWS::IAM::Role"
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - events.amazonaws.com
            Action:
              - "sts:AssumeRole"
      Path: /
      Policies:
        - PolicyName: PutEventsOnDestinationEventBus
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action: "events:PutEvents"
                Resource:
                  - !Ref SecurityEventBusArn

The second is the definition of the rule resource. This requires the Amazon Resource Name (ARN) of the event bus where you want to put the rule, the ARN of the target event bus in the security account, and a reference to the SourceToDestinationEventBusRole role:

  SecurityAuditRule2:
    Type: AWS::Events::Rule
    Properties:
      Name: SecurityAuditRuleAccount2
      Description: Audit rule for the Security team in Singapore
      EventBusName: !Ref EventBusArnAccount2 # ARN of the custom event bus in Account 2
      EventPattern:
        source:
          - com.company.marketing
      State: ENABLED
      Targets:
        - Id: SendEventToSecurityEventBusArn
          Arn: !Ref SecurityEventBusArn
          RoleArn: !GetAtt SourceToDestinationEventBusRole.Arn

You can use the AWS SAM CLI to deploy this:

sam deploy -t us-east-1-rules.yaml \
  --stack-name us-east-1-rules \
  --region us-east-1 \
  --profile default \
  --capabilities=CAPABILITY_IAM \
  --parameter-overrides SecurityEventBusArn="arn:aws:events:ap-southeast-1:111111111111:event-bus/SecurityEventBus" EventBusArnAccount1="arn:aws:events:us-east-1:111111111111:event-bus/default" EventBusArnAccount2="arn:aws:events:us-east-1:222222222222:event-bus/custom-eventbus-account-2"

Testing the example application

With the rules deployed across the Regions, you can test by sending events to the event bus in Account 2:

  1. Navigate to the applications/account_2 directory. Here you find an events.json file, which you use as input for the put-events API call.
  2. Run the following command using the AWS CLI. This sends messages to the event bus in us-east-1 which are routed to the security event bus in ap-southeast-1:
    aws events put-events \
     --region us-east-1 \
     --profile [NAMED PROFILE FOR ACCOUNT 2] \
     --entries file://events.json
    

    If you have run this successfully, you see:
    Entries:
    - EventId: a423b35e-3df0-e5dc-b854-db9c42144fa2
    - EventId: 5f22aea8-51ea-371f-7a5f-8300f1c93973
    - EventId: 7279fa46-11a6-7495-d7bb-436e391cfcab
    - EventId: b1e1ecc1-03f7-e3ef-9aa4-5ac3c8625cc7
    - EventId: b68cea94-28e2-bfb9-7b1f-9b2c5089f430
    - EventId: fc48a303-a1b2-bda8-8488-32daa5f809d8
    FailedEntryCount: 0

  3. Navigate to the Amazon CloudWatch console to see a collection of log entries with the events you published. The log group is /aws/events/SecurityAnalysisRule.

Congratulations, you have successfully sent your first events across accounts and Regions!

Conclusion

With cross-Region event routing in EventBridge, you can now route events to and from any AWS Region. This post explains how to manage and configure cross-Region event routing using CloudFormation and EventBridge resource policies to simplify rule propagation across your global event bus infrastructure. Finally, I walk through an example you can deploy to your AWS account.

For more serverless learning resources, visit Serverless Land.

Create a serverless event-driven workflow to ingest and process Microsoft data with AWS Glue and Amazon EventBridge

Post Syndicated from Venkata Sistla original https://aws.amazon.com/blogs/big-data/create-a-serverless-event-driven-workflow-to-ingest-and-process-microsoft-data-with-aws-glue-and-amazon-eventbridge/

Microsoft SharePoint is a document management system for storing files, organizing documents, and sharing and editing documents in collaboration with others. Your organization may want to ingest SharePoint data into your data lake, combine the SharePoint data with other data that’s available in the data lake, and use it for reporting and analytics purposes. AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development. AWS Glue provides all the capabilities needed for data integration so that you can start analyzing your data and putting it to use in minutes instead of months.

Organizations often manage their data on SharePoint in the form of files and lists, and you can use this data for easier discovery, better auditing, and compliance. SharePoint as a data source is not a typical relational database and the data is mostly semi structured, which is why it’s often difficult to join the SharePoint data with other relational data sources. This post shows how to ingest and process SharePoint lists and files with AWS Glue and Amazon EventBridge, which enables you to join other data that is available in your data lake. We use SharePoint REST APIs with a standard open data protocol (OData) syntax. OData advocates a standard way of implementing REST APIs that allows for SQL-like querying capabilities. OData helps you focus on your business logic while building RESTful APIs without having to worry about the various approaches to define request and response headers, query options, and so on.

AWS Glue event-driven workflows

Unlike a traditional relational database, SharePoint data may or may not change frequently, and it’s difficult to predict the frequency at which your SharePoint server generates new data, which makes it difficult to plan and schedule data processing pipelines efficiently. Running data processing frequently can be expensive, whereas scheduling pipelines to run infrequently can deliver cold data. Similarly, triggering pipelines from an external process can increase complexity, cost, and job startup time.

AWS Glue supports event-driven workflows, a capability that lets developers start AWS Glue workflows based on events delivered by EventBridge. The main reason to choose EventBridge in this architecture is because it allows you to process events, update the target tables, and make information available to consume in near-real time. Because frequency of data change in SharePoint is unpredictable, using EventBridge to capture events as they arrive enables you to run the data processing pipeline only when new data is available.

To get started, you simply create a new AWS Glue trigger of type EVENT and place it as the first trigger in your workflow. You can optionally specify a batching condition. Without event batching, the AWS Glue workflow is triggered every time an EventBridge rule matches, which may result in multiple concurrent workflows running. AWS Glue protects you by setting default limits that restrict the number of concurrent runs of a workflow. You can increase the required limits by opening a support case. Event batching allows you to configure the number of events to buffer or the maximum elapsed time before firing the particular trigger. When the batching condition is met, a workflow run is started. For example, you can trigger your workflow when 100 files are uploaded in Amazon Simple Storage Service (Amazon S3) or 5 minutes after the first upload. We recommend configuring event batching to avoid too many concurrent workflows, and optimize resource usage and cost.
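
As an illustration of that configuration, the following sketch creates an EVENT trigger with a batching condition using boto3. The trigger, workflow, and job names and the batching values are placeholders, and this assumes the workflow and job already exist.

import boto3

glue = boto3.client('glue')

# Start the workflow only after 100 matching events arrive, or 5 minutes
# after the first event, whichever comes first (values are illustrative).
glue.create_trigger(
    Name='s3-ingest-event-trigger',
    WorkflowName='sharepoint-data-processing-workflow',
    Type='EVENT',
    EventBatchingCondition={'BatchSize': 100, 'BatchWindow': 300},
    Actions=[{'JobName': 'convert-to-parquet-job'}],
)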

To illustrate this solution better, consider the following use case for a wine manufacturing and distribution company that operates across multiple countries. They currently host all their transactional system’s data on a data lake in Amazon S3. They also use SharePoint lists to capture feedback and comments on wine quality and composition from their suppliers and other stakeholders. The supply chain team wants to join their transactional data with the wine quality comments in SharePoint data to improve their wine quality and manage their production issues better. They want to capture those comments from the SharePoint server within an hour and publish that data to a wine quality dashboard in Amazon QuickSight. With an event-driven approach to ingest and process their SharePoint data, the supply chain team can consume the data in less than an hour.

Overview of solution

In this post, we walk through a solution to set up an AWS Glue job to ingest SharePoint lists and files into an S3 bucket and an AWS Glue workflow that listens to S3 PutObject data events captured by AWS CloudTrail. This workflow is configured with an event-based trigger to run when an AWS Glue ingest job adds new files into the S3 bucket. The following diagram illustrates the architecture.

To make it simple to deploy, we captured the entire solution in an AWS CloudFormation template that enables you to automatically ingest SharePoint data into Amazon S3. SharePoint uses ClientID and TenantID credentials for authentication and OAuth 2.0 for authorization.

The template helps you perform the following steps:

  1. Create an AWS Glue Python shell job to make the REST API call to the SharePoint server and ingest files or lists into Amazon S3.
  2. Create an AWS Glue workflow with a starting trigger of EVENT type.
  3. Configure CloudTrail to log data events, such as PutObject API calls to CloudTrail.
  4. Create a rule in EventBridge to forward the PutObject API events to AWS Glue when they’re emitted by CloudTrail.
  5. Add an AWS Glue event-driven workflow as a target to the EventBridge rule. The workflow gets triggered when the SharePoint ingest AWS Glue job adds new files to the S3 bucket.

Prerequisites

For this walkthrough, you should have the following prerequisites:

Configure SharePoint server authentication details

Before launching the CloudFormation stack, you need to set up your SharePoint server authentication details, namely, TenantID, Tenant, ClientID, ClientSecret, and the SharePoint URL in AWS Systems Manager Parameter Store of the account you’re deploying in. This makes sure that no authentication details are stored in the code and they’re fetched in real time from Parameter Store when the solution is running.

To create your AWS Systems Manager parameters, complete the following steps (or use the CLI sketch after the list):

  1. On the Systems Manager console, under Application Management in the navigation pane, choose Parameter Store.
    systems manager
  2. Choose Create Parameter.
  3. For Name, enter the parameter name /DataLake/GlueIngest/SharePoint/tenant.
  4. Leave the type as string.
  5. Enter your SharePoint tenant detail into the value field.
  6. Choose Create parameter.
  7. Repeat these steps to create the following parameters:
    1. /DataLake/GlueIngest/SharePoint/tenant
    2. /DataLake/GlueIngest/SharePoint/tenant_id
    3. /DataLake/GlueIngest/SharePoint/client_id/list
    4. /DataLake/GlueIngest/SharePoint/client_secret/list
    5. /DataLake/GlueIngest/SharePoint/client_id/file
    6. /DataLake/GlueIngest/SharePoint/client_secret/file
    7. /DataLake/GlueIngest/SharePoint/url/list
    8. /DataLake/GlueIngest/SharePoint/url/file
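
If you prefer to script this step, the same parameters can be created from the AWS CLI. The following is a sketch with placeholder values; repeat the pattern for the remaining parameters in the list above.

aws ssm put-parameter --name "/DataLake/GlueIngest/SharePoint/tenant" --type String --value "<your-tenant>"
aws ssm put-parameter --name "/DataLake/GlueIngest/SharePoint/tenant_id" --type String --value "<your-tenant-id>"
aws ssm put-parameter --name "/DataLake/GlueIngest/SharePoint/client_id/list" --type String --value "<client-id-for-lists>"
aws ssm put-parameter --name "/DataLake/GlueIngest/SharePoint/client_secret/list" --type String --value "<client-secret-for-lists>"
aws ssm put-parameter --name "/DataLake/GlueIngest/SharePoint/url/list" --type String --value "<sharepoint-list-url>"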

Deploy the solution with AWS CloudFormation

For a quick start of this solution, you can deploy the provided CloudFormation stack. This creates all the required resources in your account.

The CloudFormation template generates the following resources:

  • S3 bucket – Stores data, CloudTrail logs, job scripts, and any temporary files generated during the AWS Glue extract, transform, and load (ETL) job run.
  • CloudTrail trail with S3 data events enabled – Enables EventBridge to receive PutObject API call data in a specific bucket.
  • AWS Glue Job – A Python shell job that fetches the data from the SharePoint server.
  • AWS Glue workflow – A data processing pipeline that is comprised of a crawler, jobs, and triggers. This workflow converts uploaded data files into Apache Parquet format.
  • AWS Glue database – The AWS Glue Data Catalog database that holds the tables created in this walkthrough.
  • AWS Glue table – The Data Catalog table representing the Parquet files being converted by the workflow.
  • AWS Lambda function – The AWS Lambda function is used as an AWS CloudFormation custom resource to copy job scripts from an AWS Glue-managed GitHub repository and an AWS Big Data blog S3 bucket to your S3 bucket.
  • IAM roles and policies – We use the following AWS Identity and Access Management (IAM) roles:
    • LambdaExecutionRole – Runs the Lambda function that has permission to upload the job scripts to the S3 bucket.
    • GlueServiceRole – Runs the AWS Glue job that has permission to download the script, read data from the source, and write data to the destination after conversion.
    • EventBridgeGlueExecutionRole – Has permissions to invoke the NotifyEvent API for an AWS Glue workflow.
    • IngestGlueRole – Runs the AWS Glue job that has permission to ingest data into the S3 bucket.

To launch the CloudFormation stack, complete the following steps:

  1. Sign in to the AWS CloudFormation console.
  2. Choose Launch Stack:
  3. Choose Next.
  4. For pS3BucketName, enter the unique name of your new S3 bucket.
  5. Leave pWorkflowName and pDatabaseName as the default.

cloud formation 1

  1. For pDatasetName, enter the SharePoint list name or file name you want to ingest.
  2. Choose Next.

cloud formation 2

  1. On the next page, choose Next.
  2. Review the details on the final page and select I acknowledge that AWS CloudFormation might create IAM resources.
  3. Choose Create.

It takes a few minutes for the stack creation to complete; you can follow the progress on the Events tab.

You can run the ingest AWS Glue job either on a schedule or on demand. As the job successfully finishes and ingests data into the raw prefix of the S3 bucket, the AWS Glue workflow runs and transforms the ingested raw CSV files into Parquet files and loads them into the transformed prefix.

Review the EventBridge rule

The CloudFormation template created an EventBridge rule to forward S3 PutObject API events to AWS Glue. Let’s review the configuration of the EventBridge rule:

  1. On the EventBridge console, under Events, choose Rules.
  2. Choose the rule s3_file_upload_trigger_rule-<CloudFormation-stack-name>.
  3. Review the information in the Event pattern section.

event bridge

The event pattern shows that this rule is triggered when an S3 object is uploaded to s3://<bucket_name>/data/SharePoint/tablename_raw/. CloudTrail captures the PutObject API calls made and relays them as events to EventBridge.
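
For reference, an event pattern for CloudTrail-captured S3 data events generally takes the following shape. This is a sketch with a placeholder bucket name and prefix, not the exact pattern generated by the stack.

{
  "source": ["aws.s3"],
  "detail-type": ["AWS API Call via CloudTrail"],
  "detail": {
    "eventSource": ["s3.amazonaws.com"],
    "eventName": ["PutObject"],
    "requestParameters": {
      "bucketName": ["<bucket_name>"],
      "key": [{ "prefix": "data/SharePoint/tablename_raw/" }]
    }
  }
}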

  1. In the Targets section, you can verify that this EventBridge rule is configured with an AWS Glue workflow as a target.

event bridge target section

Run the ingest AWS Glue job and verify the AWS Glue workflow is triggered successfully

To test the workflow, we run the ingest-glue-job-SharePoint-file job using the following steps:

  1. On the AWS Glue console, select the ingest-glue-job-SharePoint-file job.

glue job

  1. On the Action menu, choose Run job.

glue job action menu

  1. Choose the History tab and wait until the job succeeds.

glue job history tab

You can now see the CSV files in the raw prefix of your S3 bucket.

csv file s3 location

Now the workflow should be triggered.

  1. On the AWS Glue console, validate that your workflow is in the RUNNING state.

glue workflow running status

  1. Choose the workflow to view the run details.
  2. On the History tab of the workflow, choose the current or most recent workflow run.
  3. Choose View run details.

glue workflow visual

When the workflow run status changes to Completed, let’s check the converted files in your S3 bucket.

  1. Switch to the Amazon S3 console, and navigate to your bucket.

You can see the Parquet files under s3://<bucket_name>/data/SharePoint/tablename_transformed/.

parquet file s3 location

Congratulations! Your workflow ran successfully based on S3 events triggered by uploading files to your bucket. You can verify everything works as expected by running a query against the generated table using Amazon Athena.

Sample wine dataset

Let’s analyze a sample red wine dataset. The following screenshot shows a SharePoint list that contains various readings that relate to the characteristics of the wine and an associated wine category. This is populated by various wine tasters from multiple countries.

redwine dataset

The following screenshot shows a supplier dataset from the data lake with wine categories ordered per supplier.

supplier dataset

We process the red wine dataset using this solution and use Athena to query the red wine data and supplier data where wine quality is greater than or equal to 7.

athena query and results

We can visualize the processed dataset using QuickSight.

Clean up

To avoid incurring unnecessary charges, you can use the AWS CloudFormation console to delete the stack that you deployed. This removes all the resources you created when deploying the solution.

Conclusion

Event-driven architectures provide access to near-real-time information and help you make business decisions on fresh data. In this post, we demonstrated how to ingest and process SharePoint data using AWS serverless services like AWS Glue and EventBridge. We saw how to configure a rule in EventBridge to forward events to AWS Glue. You can use this pattern for your analytical use cases, such as joining SharePoint data with other data in your lake to generate insights, or auditing SharePoint data and compliance requirements.


About the Author

Venkata Sistla is a Big Data & Analytics Consultant on the AWS Professional Services team. He specializes in building data processing capabilities and helping customers remove constraints that prevent them from leveraging their data to develop business insights.

Correlate security findings with AWS Security Hub and Amazon EventBridge

Post Syndicated from Marshall Jones original https://aws.amazon.com/blogs/security/correlate-security-findings-with-aws-security-hub-and-amazon-eventbridge/

In this blog post, we’ll walk you through deploying a solution to correlate specific AWS Security Hub findings from multiple AWS services that are related to a single AWS resource, which indicates an increased possibility that a security incident has happened.

AWS Security Hub ingests findings from multiple AWS services, including Amazon GuardDuty, Amazon Inspector, Amazon Macie, AWS Firewall Manager, AWS Identity and Access Management (IAM) Access Analyzer, and AWS Systems Manager Patch Manager. Findings from each service are normalized into the AWS Security Finding Format (ASFF), so that you can review findings in a standardized format and take action quickly. You can use AWS Security Hub to provide a single view of all security-related findings, where you can set up alerting, automatic remediation, and ingestion into third-party incident management systems for specific findings.

Although Security Hub can ingest a vast number of integrations and findings, it cannot create correlation rules like a Security Information and Event Management (SIEM) tool can. However, you can create such rules using EventBridge. It’s important to take a closer look when multiple AWS security services generate findings for a single resource, because this potentially indicates elevated risk. Depending on your environment, the initial number of findings in AWS Security Hub may be high, so you may need to prioritize which findings require immediate action. AWS Security Hub natively gives you the ability to filter findings by resource, account, and many other details. With the solution in this post, when one of these correlated sets of findings is detected, a new finding is created and pushed to AWS Security Hub by using the Security Hub BatchImportFindings API operation. You can then respond to these new security incident-oriented findings through ticketing, chat, or incident management systems.
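As a rough illustration of that last step, the following boto3 sketch imports a custom finding with the BatchImportFindings API. The field values are a minimal, hypothetical subset of the ASFF, not the exact payload the solution builds.

import boto3
from datetime import datetime, timezone

securityhub = boto3.client('securityhub')
now = datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%S.%fZ')

# Minimal, illustrative ASFF finding; identifiers and types are placeholders.
finding = {
    'SchemaVersion': '2018-10-08',
    'Id': 'correlated-finding/i-0123456789abcdef0',
    'ProductArn': 'arn:aws:securityhub:<REGION>:<AWS_ACCOUNT_ID>:product/<AWS_ACCOUNT_ID>/default',
    'GeneratorId': 'security-finding-correlation',
    'AwsAccountId': '<AWS_ACCOUNT_ID>',
    'Types': ['TTPs/Correlated Findings'],
    'CreatedAt': now,
    'UpdatedAt': now,
    'Severity': {'Label': 'CRITICAL'},
    'Title': 'Correlated security findings detected for a single resource',
    'Description': 'Multiple security services have generated findings for the same resource.',
    'Resources': [{'Type': 'AwsEc2Instance', 'Id': '<EC2_INSTANCE_ARN>'}]
}

securityhub.batch_import_findings(Findings=[finding])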

Prerequisites

This solution requires that you have AWS Security Hub enabled in your AWS account. In addition to AWS Security Hub, the following services must be enabled and integrated with AWS Security Hub:

Solution overview

In this solution, you will use a combination of AWS Security Hub, Amazon EventBridge, AWS Lambda, and Amazon DynamoDB to ingest and correlate specific findings that indicate a higher likelihood of a security incident. Each correlation is focused on multiple specific AWS security service findings for a single AWS resource.

The following list shows the correlated findings that are detected by this solution. The Description section for each finding correlation provides context for that correlation, the Remediation section provides general recommendations for remediation, and the Prevention/Detection section provides guidance to either prevent or detect one or more findings within the correlation. With the code provided, you can also add more correlations than those listed here by modifying the AWS Cloud Development Kit (CDK) code and AWS Lambda code. The Solution workflow section breaks down the flow of the solution. If you choose to implement automatic remediation, note that each finding correlation is created with the following AWS Security Finding Format (ASFF) fields:

- Severity: CRITICAL
- ProductArn: arn:aws:securityhub:<REGION>:<AWS_ACCOUNT_ID>:product/<AWS_ACCOUNT_ID>/default

These correlated findings are created as part of this solution:

  1. Any Amazon GuardDuty Backdoor findings and three critical common vulnerabilities and exposures (CVEs) from Amazon Inspector that are associated with the same Amazon Elastic Compute Cloud (Amazon EC2) instance.
    • Description: Amazon Inspector has found at least three critical CVEs on the EC2 instance. CVEs indicate that the EC2 instance is currently vulnerable or exposed. The EC2 instance is also performing backdoor activities. The combination of these two findings is a stronger indication of an elevated security incident.
    • Remediation: It’s recommended that you isolate the EC2 instance and follow standard protocol to triage the EC2 instance to verify if the instance has been compromised. If the instance has been compromised, follow your standard Incident Response process for post-instance compromise and forensics. Redeploy a backup of the EC2 instance by using an up-to-date hardened Amazon Machine Image (AMI) or apply all security-related patches to the redeployed EC2 instance.
    • Prevention/Detection: To mitigate or prevent an Amazon EC2 instance from missing critical security updates, consider using AWS Systems Manager Patch Manager to automate installing security-related patches for managed instances. Alternatively, you can provide developers with up-to-date hardened Amazon Machine Images (AMIs) by using Amazon EC2 Image Builder. For detection, you can set the AMI property called ‘DeprecationTime’ to indicate when the AMI will become outdated and respond accordingly.
  2. An Amazon Macie sensitive data finding and an Amazon GuardDuty S3 exfiltration finding for the same Amazon Simple Storage Service (Amazon S3) bucket.
    • Description: Amazon Macie has scanned an Amazon S3 bucket and found a possible match for sensitive data. Amazon GuardDuty has detected a possible exfiltration finding for the same Amazon S3 bucket. The combination of these findings indicates a higher risk security incident.
    • Remediation: It’s recommended that you review the source IP and/or IAM principal that is making the S3 object reads against the S3 bucket. If the source IP and/or IAM principal is not authorized to access sensitive data within the S3 bucket, follow your standard Incident Response process for S3 exfiltration. For example, you can restrict the IAM principal’s permissions, revoke existing credentials or unauthorized sessions, restrict access via the Amazon S3 bucket policy, or use the Amazon S3 Block Public Access feature.
    • Prevention/Detection: To mitigate or prevent exposure of sensitive data within Amazon S3, ensure that your Amazon S3 buckets use least-privilege bucket policies and are not publicly accessible. Alternatively, you can use the Amazon S3 Block Public Access feature. Review your AWS environment to make sure you are following Amazon S3 security best practices. For detection, you can use AWS Config to track and auto-remediate Amazon S3 buckets that do not have logging and encryption enabled or that are publicly accessible.
  3. An AWS Security Hub finding for an EC2 instance with a public IP address and an unrestricted VPC Security Group, an Amazon GuardDuty unusual network traffic behavior finding, and an Amazon GuardDuty brute force finding, all for the same EC2 instance.
    • Description: AWS Security Hub has detected an EC2 instance that has a public IP address attached and a VPC Security Group that allows traffic on ports other than 80 and 443. Amazon GuardDuty has also determined that the EC2 instance is the target of multiple brute force attempts and is communicating with a remote host on an unusual port that the EC2 instance has not previously used for network communication. The correlation of these lower-severity findings indicates a higher-severity security incident.
    • Remediation: It’s recommended that you isolate the EC2 instance and follow standard protocol to triage the EC2 instance to verify if the instance has been compromised. If the instance has been compromised, follow your standard Incident Response process for post-instance compromise and forensics.
    • Prevention/Detection: To mitigate or prevent these events from occurring within your AWS environment, determine whether the EC2 instance requires a public-facing IP address and review whether the VPC Security Group(s) have only the required rules configured. Review your AWS environment to make sure you are following Amazon EC2 best practices. For detection, consider implementing AWS Firewall Manager to continuously audit and limit VPC Security Groups.

The solution workflow, shown in Figure 1, is as follows:

  1. Security Hub ingests findings from integrated AWS security services.
  2. An EventBridge rule is invoked based on Security Hub findings from GuardDuty, Macie, Amazon Inspector, and Security Hub security standards.
  3. The EventBridge rule invokes a Lambda function to store the Security Hub finding, which is passed via EventBridge, in a DynamoDB table for further analysis.
  4. After the new findings are stored in DynamoDB, another Lambda function is invoked by using DynamoDB Streams. A time-to-live (TTL) attribute is set to delete finding entries that are older than 30 days.
  5. The second Lambda function looks at the resource associated with the new finding entry in the DynamoDB table. The Lambda function checks for specific Security Hub findings that are associated with the same resource (a rough sketch of this check follows Figure 1).

Figure 1: Architecture diagram describing the flow of the solution
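The following sketch illustrates the resource check in step 5. The table name, index, attribute names, and correlation logic are assumptions for illustration; the repository contains the actual implementation.

import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('SecurityHubFindings')  # hypothetical table name

def correlation_matched(resource_id, required_products):
    """Return True if stored findings for the resource cover every required product."""
    # Assumes a global secondary index keyed on the resource ID (hypothetical schema).
    resp = table.query(
        IndexName='resource-id-index',
        KeyConditionExpression=Key('ResourceId').eq(resource_id)
    )
    products_seen = {item.get('ProductName') for item in resp.get('Items', [])}
    return required_products.issubset(products_seen)

# Example: GuardDuty backdoor activity plus critical Inspector CVEs on the same EC2 instance.
if correlation_matched('<EC2_INSTANCE_ARN>', {'GuardDuty', 'Inspector'}):
    # A new CRITICAL finding would then be imported via BatchImportFindings (see the earlier sketch).
    pass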

Solution deployment

You can deploy the solution through either the AWS Management Console or the AWS Cloud Development Kit (AWS CDK).

To deploy the solution by using the AWS Management Console

In your account, launch the AWS CloudFormation template by choosing the following Launch Stack button. It will take approximately 10 minutes for the CloudFormation stack to complete.
Select the Launch Stack button to launch the template

To deploy the solution by using the AWS CDK

You can find the latest code in the aws-security GitHub repository where you can also contribute to the sample code. The following commands show how to deploy the solution by using the AWS CDK. First, the CDK initializes your environment and uploads the AWS Lambda assets to Amazon S3. Then, you can deploy the solution to your account. For <INSERT_AWS_ACCOUNT>, specify the account number, and for <INSERT_REGION>, specify the AWS Region that you want the solution deployed to.

cdk bootstrap aws://<INSERT_AWS_ACCOUNT>/<INSERT_REGION>

cdk deploy

Conclusion

In this blog post, we walked through a solution to use AWS services, including Amazon EventBridge, AWS Lambda, and Amazon DynamoDB, to correlate AWS Security Hub findings from multiple different AWS security services. The solution provides a framework to prioritize specific sets of findings that indicate a higher likelihood that a security incident has occurred, so that you can prioritize and improve your security response.

If you have feedback about this post, submit comments in the Comments section below. If you have any questions about this post, start a thread on the AWS Security Hub forum.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Marshall Jones

Marshall is a worldwide security specialist solutions architect at AWS. His background is in AWS consulting and security architecture, focused on a variety of security domains including edge, threat detection, and compliance. Today, he is focused on helping enterprise AWS customers adopt and operationalize AWS security services to increase security effectiveness and reduce risk.

Author

Jonathan Nguyen

Jonathan is a shared delivery team senior security consultant at AWS. His background is in AWS security, with a focus on threat detection and incident response. He helps enterprise customers develop a comprehensive AWS security strategy, deploy security solutions at scale, and train customers on AWS security best practices.

Building dynamic Amazon SNS subscriptions for auto scaling container workloads 

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/building-dynamic-amazon-sns-subscriptions-for-auto-scaling-container-workloads/

This post is written by Mithun Mallick, Senior Specialist Solutions Architect, App Integration.

Amazon Simple Notification Service (SNS) is a serverless publish/subscribe messaging service. It supports a push-based subscription model in which subscribers must register an endpoint to receive messages. Amazon Simple Queue Service (SQS) is one such endpoint, which is used by applications to receive messages published on an SNS topic.

With containerized applications, the container instances poll the queue and receive the messages. However, containerized applications can scale out for a variety of reasons. The creation of an SQS queue for each new container instance creates maintenance overhead for customers. You must also clean up the SNS-SQS subscription once the instance scales in.

This blog walks through a dynamic subscription solution, which automates the creation, subscription, and deletion of SQS queues for an Auto Scaling group of containers running in Amazon Elastic Container Service (ECS).

Overview

The solution is based on the use of events to achieve the dynamic subscription pattern. ECS uses the concept of tasks to create an instance of a container. You can find more details on ECS tasks in the ECS documentation.

This solution uses the events generated by ECS to manage the complete lifecycle of an SNS-SQS subscription. It uses the task ID as the name of the queue that is used by the ECS instance for pulling messages. More details on the ECS task ID can be found in the task documentation.

The solution also uses Amazon EventBridge to apply rules on ECS events and trigger an AWS Lambda function. The first rule detects the running state of an ECS task and triggers a Lambda function, which creates the SQS queue with the task ID as the queue name. It also grants permission to the queue and creates the SNS subscription on the topic.

As the container instance starts up, it can send a request to its metadata URL and retrieve the task ID. The task ID is used by the container instance to poll for messages. If the container instance terminates, ECS generates a task stopped event. This event matches a rule in Amazon EventBridge and triggers a Lambda function. The Lambda function retrieves the task ID, deletes the queue, and deletes the subscription from the SNS topic. The solution decouples the container instance from any overhead in maintaining queues, applying permissions, or managing subscriptions. The security permissions for all SNS-SQS management are handled by the Lambda functions.

This diagram shows the solution architecture:

Solution architecture

Events from ECS are sent to the default event bus. There are various events that are generated as part of the lifecycle of an ECS task. You can find more on the various ECS task states in the ECS task documentation. This solution uses ECS as the container orchestration service, but you can also use Amazon Elastic Kubernetes Service (EKS). For EKS, you must apply the rules for EKS task state events.

Walkthrough of the implementation

The code snippets are shortened for brevity. The full source code of the solution is in the GitHub repository. The solution uses AWS Serverless Application Model (AWS SAM) for deployment.

SNS topic

The SNS topic is used to send notifications to the ECS tasks. The following snippet from the AWS SAM template shows the definition of the SNS topic:

  SNSDynamicSubsTopic:
    Type: AWS::SNS::Topic
    Properties:
      TopicName: !Ref DynamicSubTopicName

Container instance

The container instance subscribes to the SNS topic using an SQS queue. The container image is a Java class that reads messages from an SQS queue and prints them in the logs. The following code shows some of the message processor implementation:

AmazonSQS sqs = AmazonSQSClientBuilder.defaultClient();
AmazonSQSResponder responder = AmazonSQSResponderClientBuilder.standard()
        .withAmazonSQS(sqs)
        .build();

SQSMessageConsumer consumer = SQSMessageConsumerBuilder.standard()
        .withAmazonSQS(responder.getAmazonSQS())
        .withQueueUrl(queue_url)
        .withConsumer(message -> {
            System.out.println("The message is " + message.getBody());
            sqs.deleteMessage(queue_url,message.getReceiptHandle());

        }).build();
consumer.start();

The queue_url is derived from the task ID of the ECS task, which is retrieved in the constructor of the class:

// The container environment exposes the task metadata endpoint (map holds the
// environment variables, for example from System.getenv())
String metaDataURL = map.get("ECS_CONTAINER_METADATA_URI_V4");

HttpGet request = new HttpGet(metaDataURL);
CloseableHttpResponse response = httpClient.execute(request);

HttpEntity entity = response.getEntity();
if (entity != null) {
    String result = EntityUtils.toString(entity);
    // Extract the task ARN from the metadata labels and keep only the task ID suffix
    String taskARN = JsonPath.read(result, "$['Labels']['com.amazonaws.ecs.task-arn']").toString();
    String[] arnTokens = taskARN.split("/");
    taskId = arnTokens[arnTokens.length-1];
    System.out.println("The task ID : "+taskId);
}

queue_url = sqs.getQueueUrl(taskId).getQueueUrl();

The queue URL is constructed from the task ID of the container. Each queue is dedicated to a single task, or instance of the container, running in ECS.

EventBridge rules

The following event pattern on the default event bus captures events that match the start of the container instance. The rule triggers a Lambda function:

      EventPattern:
        source:
          - aws.ecs
        detail-type:
          - "ECS Task State Change"
        detail:
          desiredStatus:
            - "RUNNING"
          lastStatus:  
            - "RUNNING"

The start rule routes events to a Lambda function that creates a queue with the name as the task ID. It creates the subscription to the SNS topic and grants permission on the queue to receive messages from the topic.

This event pattern matches STOPPED events of the container task. It also triggers a Lambda function to delete the queue and the associated subscription:

      EventPattern:
        source:
          - aws.ecs
        detail-type:
          - "ECS Task State Change"
        detail:
          desiredStatus:
            - "STOPPED"
          lastStatus:  
            - "STOPPED"

Lambda functions

There are two Lambda functions that perform the queue creation, subscription, authorization, and deletion.

The SNS-SQS-Subscription-Service

The following code creates the queue based on the task ID, applies policies, and subscribes it to the topic. It also stores the subscription ARN in an Amazon DynamoDB table:

# get the task id from the event
taskArn = event['detail']['taskArn']
taskArnTokens = taskArn.split('/')
taskId = taskArnTokens[len(taskArnTokens)-1]

# The queue name is the ECS task ID (queue_name = taskId; full derivation elided for brevity)
create_queue_resp = sqs_client.create_queue(QueueName=queue_name)

# queue_arn and topic_arn are resolved elsewhere (see the policy sketch after this snippet)
response = sns.subscribe(TopicArn=topic_arn, Protocol="sqs", Endpoint=queue_arn)
subscription_arn = response['SubscriptionArn']

ddbresponse = dynamodb.update_item(
    TableName=SQS_CONTAINER_MAPPING_TABLE,
    Key={
        'id': {
            'S' : taskId.strip()
        }
    },
    AttributeUpdates={
        'SubscriptionArn':{
            'Value': {
                'S': subscription_arn
            }
        }
    },
    ReturnValues="UPDATED_NEW"
)
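The policy step (“applies policies”) is elided from the snippet above. A minimal sketch of that grant, assuming the same boto3 clients and variables, could look like the following; it allows the SNS topic to deliver messages to the new queue.

import json

queue_url = create_queue_resp['QueueUrl']
queue_arn = sqs_client.get_queue_attributes(
    QueueUrl=queue_url,
    AttributeNames=['QueueArn']
)['Attributes']['QueueArn']

# Allow the SNS topic to send messages to this task's queue
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "sns.amazonaws.com"},
        "Action": "sqs:SendMessage",
        "Resource": queue_arn,
        "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}}
    }]
}

sqs_client.set_queue_attributes(
    QueueUrl=queue_url,
    Attributes={'Policy': json.dumps(policy)}
)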

The cleanup service

The cleanup function is triggered when the container instance is stopped. It fetches the subscription ARN from the DynamoDB table based on the taskId. It deletes the subscription from the topic and deletes the queue. You can modify this code to include any other cleanup actions or trigger a workflow. The main part of the function code is:

taskId = taskArnTokens[len(taskArnTokens)-1]

ddbresponse = dynamodb.get_item(TableName=SQS_CONTAINER_MAPPING_TABLE,Key={'id': { 'S' : taskId}})
# subscription_arn is read from the returned item and queue_url is looked up
# from the task ID (both elided here for brevity)
snsresp = sns.unsubscribe(SubscriptionArn=subscription_arn)

queuedelresp = sqs_client.delete_queue(QueueUrl=queue_url)

Conclusion

This blog shows an event-driven approach to handling dynamic SNS subscription requirements. It relies on ECS service events to trigger the appropriate Lambda functions. These create the subscription queue, subscribe it to a topic, and delete it once the container instance is terminated.

The approach also allows the container application logic to focus only on consuming and processing the messages from the queue. It does not need any additional permissions to subscribe or unsubscribe from the topic or apply any additional permissions on the queue. Although the solution has been presented using ECS as the container orchestration service, it can be applied for EKS by using its service events.

For more serverless learning resources, visit Serverless Land.

ICYMI: Serverless Q3 2021

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/icymi-serverless-q3-2021/

Welcome to the 15th edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share all of the most recent product launches, feature enhancements, blog posts, webinars, Twitch live streams, and other interesting things that you might have missed!

Q3 calendar

In case you missed our last ICYMI, check out what happened last quarter here.

AWS Lambda

You can now choose next-generation AWS Graviton2 processors in your Lambda functions. This Arm-based processor architecture can provide up to 19% better performance at 20% lower cost. You can configure functions to use Graviton2 in the AWS Management Console, API, CloudFormation, and CDK. We recommend using the AWS Lambda Power Tuning tool to see how your functions compare and to determine the price improvement you may see.

All Lambda runtimes built on Amazon Linux 2 support Graviton2, with the exception of versions approaching end-of-support. The AWS Free Tier for Lambda includes functions powered by both x86 and Arm-based architectures.

Create Lambda function with new arm64 option

You can also use the Python 3.9 runtime to develop Lambda functions. You can choose this runtime version in the AWS Management Console, AWS CLI, or AWS Serverless Application Model (AWS SAM). Version 3.9 includes a range of new features and performance improvements.

Lambda now supports Amazon MQ for RabbitMQ as an event source. This makes it easier to develop serverless applications that are triggered by messages in a RabbitMQ queue. This integration does not require a consumer application to monitor queues for updates. The connectivity with the Amazon MQ message broker is managed by the Lambda service.

Lambda has added support for up to 10 GB of memory and 6 vCPU cores in AWS GovCloud (US) Regions and in the Middle East (Bahrain), Asia Pacific (Osaka), and Asia Pacific (Hong Kong) Regions.

AWS Step Functions

Step Functions now integrates with the AWS SDK, supporting over 200 AWS services and 9,000 API actions. You can call services directly from the Amazon States Language definition in the resource field of the task state. This allows you to work with services like DynamoDB, AWS Glue Jobs, or Amazon Textract directly from a Step Functions state machine. To learn more, see the SDK integration tutorial.

AWS Amplify

The Amplify Admin UI now supports importing existing Amazon Cognito user pools and identity pools. This allows you to configure multi-platform apps to use the same user pools with different client IDs.

Amplify CLI now enables command hooks, allowing you to run custom scripts in the lifecycle of CLI commands. You can create bash scripts that run before, during, or after CLI commands. Amplify CLI has also added support for storing environment variables and secrets used by Lambda functions.

Amplify Geo is in developer preview and helps developers provide location-aware features to their frontend web and mobile applications. This uses the Amazon Location Service to provide map UI components.

Amazon EventBridge

The EventBridge schema registry now supports discovery of cross-account events. When the schema registry is enabled on a bus, it now generates schemas for events originating from another account. This helps organize and find events in multi-account applications.

Amazon DynamoDB

DynamoDB console

The new DynamoDB console experience is now the default for viewing and managing DynamoDB tables. This makes it easier to manage tables from the navigation pane and also provides a new dedicated Items page. There is also contextual guidance and step-by-step assistance to help you perform common tasks more quickly.

API Gateway

API Gateway can now authenticate clients using certificate-based mutual TLS. Previously, this feature only supported AWS Certificate Manager (ACM). Now, customers can use a server certificate issued by a third-party certificate authority or ACM Private CA. Read more about using mutual TLS authentication with API Gateway.

The Serverless Developer Advocacy team built the Amazon API Gateway CORS Configurator to help you configure cross-origin resource sharing (CORS) for REST and HTTP APIs. Fill in the information specific to your API and the AWS SAM configuration is generated for you.

Serverless blog posts

July

August

September

Tech Talks & Events

We hold AWS Online Tech Talks covering serverless topics throughout the year. These are listed in the Serverless section of the AWS Online Tech Talks page. We also regularly deliver talks at conferences and events around the world, speak on podcasts, and record videos you can find to learn in bite-sized chunks.

Here are some from Q3:

Videos

Serverless Land

Serverless Office Hours – Tues 10 AM PT

Weekly live virtual office hours. In each session, we talk about a specific topic or technology related to serverless and then open the floor to help you with your real serverless challenges and issues. Ask us anything you want about serverless technologies and applications.

July

August

September

DynamoDB Office Hours

Are you an Amazon DynamoDB customer with a technical question you need answered? If so, join us for weekly Office Hours on the AWS Twitch channel, led by Rick Houlihan, AWS principal technologist and Amazon DynamoDB expert. See upcoming and previous shows.

Still looking for more?

The Serverless landing page has more information. The Lambda resources page contains case studies, webinars, whitepapers, customer stories, reference architectures, and even more Getting Started tutorials.

You can also follow the Serverless Developer Advocacy team on Twitter to see the latest news, follow conversations, and interact with the team.

Detect Adversary Behavior in Milliseconds with CrowdStrike and Amazon EventBridge

Post Syndicated from Joby Bett original https://aws.amazon.com/blogs/architecture/detect-adversary-behavior-in-seconds-with-crowdstrike-and-amazon-eventbridge/

By integrating Amazon EventBridge with Falcon Horizon, CrowdStrike has developed a real-time, cloud-based solution that allows you to detect threats in less than a second. This solution uses AWS CloudTrail and EventBridge. CloudTrail allows governance, compliance, operational auditing, and risk auditing of your AWS account. EventBridge is a serverless event bus that makes it easier to build event-driven applications at scale.

In this blog post, we’ll cover the challenges presented by using traditional log file-based security monitoring. We will also discuss how CrowdStrike used EventBridge to create an innovative, real-time cloud security solution that enables high-speed, event-driven alerts that detect malicious actors in milliseconds.

Challenges of log file-based security monitoring

Being able to detect malicious actors in your environment is necessary to stay secure in the cloud. With the growing volume, velocity, and variety of cloud logs, log file-based monitoring makes it difficult to reveal adverse behaviors in time to stop breaches.

When an attack is in progress, a security operations center (SOC) analyst has an average of one minute to detect the threat, ten minutes to understand it, and one hour to contain it. If you cannot meet this 1/10/60 minute rule, you risk a costly breach that can move laterally and spread exponentially across the cloud estate.

Let’s look at a real-life scenario. When a malicious actor attempts a ransom attack that targets high-value data in an Amazon Simple Storage Service (Amazon S3) bucket, it can involve activities in various parts of the cloud services in a brief time window.

These example activities can involve:

  • AWS Identity and Access Management (IAM): account enumeration, disabling multi-factor authentication (MFA), account hijacking, privilege escalation, etc.
  • Amazon Elastic Compute Cloud (Amazon EC2): instance profile privilege escalation, file exchange tool installs, etc.
  • Amazon S3: bucket and object enumeration; impair bucket encryption and versioning; bucket policy manipulation; getObject, putObject, and deleteObject APIs, etc.

With siloed log file-based monitoring, detecting, understanding, and containing a ransom attack while still meeting the 1/10/60 rule is difficult. This is because log files are written in batches, and files are typically only created every 5 minutes. Once the log file is written, it still needs to be fetched and processed. This means that you lose the ability to dynamically correlate disparate activities.

To summarize, top-level challenges of log file-based monitoring are:

  1. Lag time between the breach and the detection
  2. Inability to correlate disparate activities to reveal sophisticated attack patterns
  3. Frequent false positive alarms that obscure true positives
  4. High operational cost of log file synchronizations and reprocessing
  5. Log analysis tool maintenance for fast growing log volume

Security and compliance is a shared responsibility between AWS and the customer. We protect the infrastructure that runs all of the services offered in the AWS Cloud. For abstracted services, such as Amazon S3, we operate the infrastructure layer, the operating system, and platforms, and customers access the endpoints to store and retrieve data.

You are responsible for managing your data (including encryption options), classifying assets, and using IAM tools to apply the appropriate permissions.

Indicators of attack by CrowdStrike with Amazon EventBridge

In real-world cloud breach scenarios, timeliness of observation, detection, and remediation is critical. CrowdStrike Falcon Horizon IOA is built on an event-driven architecture based on EventBridge and operates at a velocity that can outpace attackers.

CrowdStrike Falcon Horizon IOA performs the following core actions:

  • Observe: EventBridge streams CloudTrail log files across accounts to the CrowdStrike platform as activity occurs. Parallelism is enabled via event bus rules, which enables CrowdStrike to avoid the five-minute lag in fetching the log files and dynamically correlate disparate activities. The CrowdStrike platform observes end-to-end activities from AWS services and infrastructure hosted in the accounts protected by CrowdStrike.
  • Detect: Falcon Horizon invokes indicators of attack (IOA) detection algorithms that reveal adversarial or anomalous activities from the log file streams. It correlates new and historical events in real time while enriching the events with CrowdStrike threat intelligence data. Each IOA is prioritized with the likelihood of activity being malicious via scoring and mapped to the MITRE ATT&CK framework.
  • Remediate: The detected IOA is presented with remediation steps. Depending on the score, applying the remediations quickly can be critical before the attack spreads.
  • Prevent: Unremediated insecure configurations are revealed via indicators of misconfiguration (IOM) in Falcon Horizon. Applying the remediation steps from IOM can prevent future breaches.

Key differentiators of IOA from Falcon Horizon are:

  • Observability of wider attack surfaces with heterogeneous event sources
  • Detection of sophisticated tactics, techniques, and procedures (TTPs) with dynamic event correlation
  • Event enrichment with threat intelligence that aids prioritization and reduces alert fatigue
  • Low latency between malicious activity occurrence and corresponding detection
  • Insight into attacks for each adversarial event from MITRE ATT&CK framework

High-level architecture

Event-driven architectures provide advantages for integrating varied systems over legacy log file-based approaches. For securing cloud attack surfaces against the ever-evolving TTPs, a robust event-driven architecture at scale is a key differentiator.

CrowdStrike maximizes the advantages of event-driven architecture by integrating with EventBridge, as shown in Figure 1. EventBridge allows observing CloudTrail logs in event streams. It also simplifies log centralization from a number of accounts with its direct source-to-target integration across accounts, as follows:

  1. CrowdStrike hosts an EventBridge with central event buses that consume the stream of CloudTrail log events from a multitude of customer AWS accounts.
  2. Within customer accounts, EventBridge rules listen to the local CloudTrail and stream each activity as an event to the centralized EventBridge hosted by CrowdStrike (a sketch of such a cross-account rule follows Figure 1).
  3. CrowdStrike’s event-driven platform detects adversarial behaviors from the event streams in real time. The detection is performed against incoming events in conjunction with historical events. The context that comes from connecting new and historical events minimizes false positives and improves alert efficacy.
  4. Events are enriched with CrowdStrike threat intelligence data that provides additional insight of the attack to SOC analysts and incident responders.

Figure 1. CrowdStrike Falcon Horizon IOA architecture
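A cross-account rule like the one in step 2 can be expressed in a few API calls. The following boto3 sketch is illustrative rather than CrowdStrike’s actual configuration: the rule name, central event bus ARN, and IAM role are placeholders.

import json
import boto3

events = boto3.client('events')

# Match management API calls recorded by CloudTrail in this account (illustrative pattern).
events.put_rule(
    Name='forward-cloudtrail-to-central-bus',  # placeholder rule name
    EventPattern=json.dumps({"detail-type": ["AWS API Call via CloudTrail"]}),
    State='ENABLED'
)

# Forward matching events to the centralized event bus in the monitoring account.
events.put_targets(
    Rule='forward-cloudtrail-to-central-bus',
    Targets=[{
        'Id': 'central-event-bus',
        'Arn': 'arn:aws:events:<REGION>:<CENTRAL_ACCOUNT_ID>:event-bus/<BUS_NAME>',
        'RoleArn': 'arn:aws:iam::<AWS_ACCOUNT_ID>:role/<EVENTBRIDGE_FORWARD_ROLE>'
    }]
)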

As data is received by the centralized EventBridge, CrowdStrike relies on the unique customer ID and AWS Region in each event to provide integrity and isolation.

EventBridge allows relatively hassle-free customer onboarding by using cross-account rules to transfer customer CloudTrail data into one common event bus, which can then be used to filter and selectively forward the data into the Falcon Horizon platform for analysis and mitigation.

Conclusion

As your organization’s cloud footprint grows, visibility into end-to-end activities in a timely manner is critical for maintaining a safe environment for your business to operate. EventBridge allows event-driven monitoring of CloudTrail logs at scale.

CrowdStrike Falcon Horizon IOA, powered by EventBridge, observes end-to-end cloud activities at high speeds at scale. Paired with targeted detection algorithms from in-house threat detection experts and threat intelligence data, Falcon Horizon IOA combats emerging threats against the cloud control plane with its cutting-edge event-driven architecture.

Related information