Tag Archives: Technical How-to

How Amazon CodeGuru Security helps you effectively balance security and velocity

2023-07-19 Leo da Silva

Post Syndicated from Leo da Silva original https://aws.amazon.com/blogs/security/how_amazon_codeguru_security_helps_effectively_balance_security_and_velocity/

Software development is a well-established process—developers write code, review it, build artifacts, and deploy the application. They then monitor the application using data to improve the code. This process is often repeated many times over. As Amazon Web Services (AWS) customers embrace modern software development practices, they sometimes face challenges with the use of third-party code security tools, such as an overwhelming number of findings, high rates of false positives among those findings, and the logistics of tracking open issues across code versions.

Customers tell us they need help to identify the top risks in their application code as it is being built and to receive actionable recommendations to mitigate these risks. In this blog post, we demonstrate how the new Amazon CodeGuru Security service and its fully managed, machine learning (ML)-powered code security analysis capabilities provide intelligent recommendations to improve code security and quality. Amazon CodeGuru Security enhances the overall security posture of applications that are deployed in your environment while reducing the time to deploy in production.

Amazon CodeGuru Security is a managed static application security tool (SAST) service that is also available through Amazon CodeGuru Reviewer, Amazon CodeWhisperer security scanning, the Amazon SageMaker Studio CodeGuru extension, and Amazon Inspector code scanning.

Solution overview

In this blog post, we introduce you to the features and capabilities of Amazon CodeGuru Security. Amazon CodeGuru Security helps you focus on security risks that are relevant to your environment, along with contextually relevant remediation suggestions (provided as code diffs). Integration, centralization, and scalability of the service are facilitated by using an API-based design, plus bug tracking to automatically detect code fixes and close findings without user intervention. Amazon CodeGuru Security currently supports applications that are written in Python, Java, and JavaScript, along with associated artifacts like scripts, configuration, and documentation files.

Created to improve the security posture of applications that were built for the cloud, Amazon CodeGuru Security rules are developed in partnership with Amazon application security teams, applying learnings and adhering to best practices that govern the development of Amazon internal systems and services.

Amazon CodeGuru Security offers multiple integration points:

Integrated development environment (IDE) scans supported with the use of the Amazon CodeWhisperer extension on the local development environment (check supported languages and IDE versions)
The Amazon SageMaker Studio extension
AWS CLI integrations with your existing build and test pipeline processes
Security monitoring of AWS Lambda functions with the use of Amazon Inspector code scanning for Lambda

In Figure 1, you can see one of the proposed architecture patterns that supports the integration of Amazon CodeGuru Security into your existing application deployment pipeline. In this scenario, developers write application code and get it committed into Amazon CodeCommit. This event causes AWS CodeBuild to start building the application and the static security code analysis of the application code, using a pre-build hook. The code and build artifacts are copied to a local Amazon S3 bucket within your account, and Amazon CodeGuru Security scans the application assets.

Figure 1: Example of CodeGuru Security integration with deployment pipeline

Amazon CodeGuru detection engine

At the core of the CodeGuru detector design is the idea of user action in response to findings. Detectors flag security risks or quality issues with a high degree of precision, such that action can be taken directly to remediate the finding. With this goal in mind, we have designed the Guru Query Language (GQL) toolkit. GQL enables precise expression of scenario-centric micro-analyzers that check specific properties (for example, misuse of a particular Java cryptography library or API) through a wide range of analysis constructs (more than 200 at the time of publication).

Among these constructs are capabilities such as type inference (determining the precise types of variables and fields), inter-procedural analysis (analyzing across function boundaries), and advanced taint tracking capabilities, where untrusted data (from taint sources) is tracked through the application to determine whether it reaches security-sensitive operations (known as taint sinks) without being sanitized.

By using GQL, the rule author can combine constructs as building blocks to precisely match the vulnerable patterns that are being targeted. As an example, you can specify taint sources and sinks in a contextual way so that only data read from remote (as opposed to local) files is considered untrusted.

We benchmark detectors against ever-growing datasets, and improve them based on feedback from our partner security teams and customers, as well as metrics that we collect. Detectors are subjected to a rigorous quality control process. Starting from the detector specification, we work closely with subject matter experts (SMEs) to make sure that the suggestions cover the most important application surfaces and are not overly defensive in the warnings they raise. Moving from specification to implementation, detections are reviewed and sampled from shadow runs on live codebases with the same SMEs as well as internal CodeGuru users. If detectors meet an internal performance bar, they are launched internally at AWS. After they are launched, the detectors are monitored by using weekly metrics. A detector graduates into the commercial CodeGuru service only if it meets a high quality bar for several weeks.

Amazon CodeGuru Security uses a detection engine to find security issues in the application code that is scanned. The engine uses a Detector Library, which is a resource that contains detailed information about the CodeGuru security and code quality detectors, to help you build secure and efficient applications. Each detection page within the Detector Library contains descriptions, compliant and non-compliant example code snippets, severities, and additional information that helps you mitigate risks (such as Common Weakness Enumeration (CWE) numbers). The materials presented in the Amazon CodeGuru Detector Library are intended to be a high-level summary of the service’s capabilities, but might not be inclusive of all detectors or their functionality.

Bug Fix Tracking and code fixes

With user action as the ultimate goal, an important metric to us is whether code fixes are made in response to our recommendations. As such, AWS has designed a novel Bug Fix Tracking (BFT) algorithm, whose key functionality is to relate CodeGuru findings across revisions of a given codebase or application. If, for example, CodeGuru reports misuse of a cryptographic API on version V1 of codebase C, then BFT detects whether that misuse issue is still present when version V2 of C is scanned.

Tracking bugs and bug fixes are nontrivial. Code can be refactored into different locations within a file, and sometimes also into different files. In addition, syntax may be adjusted in ways that are orthogonal to fixing an issue (for example, if variables are renamed). The CodeGuru BFT algorithm constructs a bi-partite graph to relate a pair of findings across revisions, or otherwise declare a finding as either closed (no match in V2) or new (no match in V1).

Figure 2 shows the process that is used by BFT in tracking application bugs. After the application version being scanned is identified and the bug detection verification starts, BFT updates the database with its findings, validating the existing issues with findings uncovered in version N-1.

Figure 2: Overview of the Bug Fix Tracking algorithm

The algorithm is staged, starting from the simple case of 1:1 correspondence between findings, through cases where findings might have drifted to a new location but are otherwise the same. For the final, most complex scenario of fuzzy matching, we use advanced hashing techniques to establish the mapping.

BFT provides a metric that guides our own rule development and tuning process on an ongoing basis. Data about BFT findings is available to our customers through the CodeGuru Security API. With gathered data about fixes, security engineers and leaders can measure exposure to security risks, quantify the lifetime of high and critical security issues, monitor burn rate for security issues, and form other insights from the raw data.

Actionable recommendations and concrete remediation

To align with our goal of encouraging user action in response to our recommendations, we’ve added a feature powered by automated reasoning for including concrete remediation advice as part of CodeGuru recommendations. This comes in the form of a code diff, which you can apply mechanically by using standard utilities like patch.

The screenshot in Figure 3 shows how this functionality creates an important bridge between security engineers and software engineers—the former have the necessary security expertise, while the latter are often responsible for carrying out the code fix. Recommendations that are accompanied by concrete fix suggestions can cut through multiple correspondences, alignment issues, and validation cycles, which can help accelerate remediation.

Figure 3: Example of recommendation showing difference between compliant and non-compliant code

To enable the reasoning illustrated in Figure 3, where the data reaching the addObject call goes through sanitization in the form of an HtmlUtils::htmlEscape call, the underlying algorithm performs several steps. First, a formal representation of the code, known as its Abstract Syntax Tree (AST), is constructed. The AST is then visited by one or more transformation “recipes,” whose goal is to manipulate the program such that the vulnerability is mitigated.

Code transformation is done in a contextual manner, so that syntax (for example, variable names) and formatting (for example, indentation levels) are preserved. To verify that the transformation is valid, the algorithm further runs post-processing checks on the resulting code structure and syntax.

An important refinement of the remediation capability is that Amazon CodeGuru Security performs pre-analysis ahead of running the security scan to classify code artifacts into application- versus library-dependencies. It’s more feasible to take action on a recommendation for code owned by you, compared to code in a third-party library. The classification algorithm has been trained on hundreds of thousands of open-source libraries to disassemble code artifacts, including bundling application and library content in the same file, and focus downstream analysis on the most pertinent scanning surfaces.

Critical security issues have been shown to sometimes take hundreds of days to address (as discussed in this study). Internal studies that look at use of CodeGuru have seen a steep drop in time to fix issues thanks to concrete fix suggestions, which is value that the service excited to share with you.

Conclusion

Amazon CodeGuru Security is a static application security testing (SAST) tool that combines ML and automated reasoning to identify security issues in your code. Amazon CodeGuru detection capabilities that use GQL (Guru Query Language), Bug Fix Tracking (BFT), and efficacy mechanisms and AppSec expertise can help you precisely identify code security issues with a low rate of false positives. High signal-to-noise ratio is a key enabler in integrating SAST into the daily work of security engineers and software developers.

In addition, Amazon CodeGuru Security provides thorough fix recommendations, which your development teams can use to improve the overall time to remediate application security issues. At the same time, the recommendations can help you to implement security best practices based on an ML model that was trained on millions of lines of code and vulnerability assessments performed within Amazon. Get started with Amazon CodeGuru Security.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Automate secure access to Amazon MWAA environments using existing OpenID Connect single-sign-on authentication and authorization

2023-07-18 Ajay Vohra

Post Syndicated from Ajay Vohra original https://aws.amazon.com/blogs/big-data/automate-secure-access-to-amazon-mwaa-environments-using-existing-openid-connect-single-sign-on-authentication-and-authorization/

Customers use Amazon Managed Workflows for Apache Airflow (Amazon MWAA) to run Apache Airflow at scale in the cloud. They want to use their existing login solutions developed using OpenID Connect (OIDC) providers with Amazon MWAA; this allows them to provide a uniform authentication and single sign-on (SSO) experience using their adopted identity providers (IdP) across AWS services. For ease of use for end-users of Amazon MWAA, organizations configure a custom domain endpoint to their Apache Airflow UI endpoint. For teams operating and managing multiple Amazon MWAA environments, securing and customizing each environment is a repetitive but necessary task. Automation through infrastructure as code (IaC) can alleviate this heavy lifting to achieve consistency at scale.

This post describes how you can integrate your organization’s existing OIDC-based IdPs with Amazon MWAA to grant secure access to your existing Amazon MWAA environments. Furthermore, you can use the solution to provision new Amazon MWAA environments with the built-in OIDC-based IdP integrations. This approach allows you to securely provide access to your new or existing Amazon MWAA environments without requiring AWS credentials for end-users.

Overview of Amazon MWAA environments

Managing multiple user names and passwords can be difficult—this is where SSO authentication and authorization comes in. OIDC is a widely used standard for SSO, and it’s possible to use OIDC SSO authentication and authorization to access Apache Airflow UI across multiple Amazon MWAA environments.

When you provision an Amazon MWAA environment, you can choose public or private Apache Airflow UI access mode. Private access mode is typically used by customers that require restricting access from only within their virtual private cloud (VPC). When you use public access mode, the access to the Apache Airflow UI is available from the internet, in the same way as an AWS Management Console page. Internet access is needed when access is required outside of a corporate network.

Regardless of the access mode, authorization to the Apache Airflow UI in Amazon MWAA is integrated with AWS Identity and Access Management (IAM). All requests made to the Apache Airflow UI need to have valid AWS session credentials with an assumed IAM role that has permissions to access the corresponding Apache Airflow environment. For more details on the permissions policies needed to access the Apache Airflow UI, refer to Apache Airflow UI access policy: AmazonMWAAWebServerAccess.

Different user personas such as developers, data scientists, system operators, or architects in your organization may need access to the Apache Airflow UI. In some organizations, not all employees have access to the AWS console. It’s fairly common that employees who don’t have AWS credentials may also need access to the Apache Airflow UI that Amazon MWAA exposes.

In addition, many organizations have multiple Amazon MWAA environments. It’s common to have an Amazon MWAA environment setup per application or team. Each of these Amazon MWAA environments can be run in different deployment environments like development, staging, and production. For large organizations, you can easily envision a scenario where there is a need to manage multiple Amazon MWAA environments. Organizations need to provide secure access to all of their Amazon MWAA environments using their existing OIDC provider.

Solution Overview

The solution architecture integrates an existing OIDC provider to provide authentication for accessing the Amazon MWAA Apache Airflow UI. This allows users to log in to the Apache Airflow UI using their OIDC credentials. From a system perspective, this means that Amazon MWAA can integrate with an existing OIDC provider rather than having to create and manage an isolated user authentication and authorization through IAM internally.

The solution architecture relies on an Application Load Balancer (ALB) setup with a fully qualified domain name (FQDN) with public (internet) or private access. This ALB provides SSO access to multiple Amazon MWAA environments. The user-agent (web browser) call flow for accessing an Apache Airflow UI console to the target Amazon MWAA environment includes the following steps:

The user-agent resolves the ALB domain name from the Domain Name System (DNS) resolver.
The user-agent sends a login request to the ALB path /aws_mwaa/aws-console-sso with a set of query parameters populated. The request uses the required parameters mwaa_env and rbac_role as placeholders for the target Amazon MWAA environment and the Apache Airflow role-based access control (RBAC) role, respectively.
Once it receives the request, the ALB redirects the user-agent to the OIDC IdP authentication endpoint. The user-agent authenticates with the OIDC IdP with the existing user name and password.
If user authentication is successful, the OIDC IdP redirects the user-agent back to the configured ALB with a redirect_url with the authorization code included in the URL.
The ALB uses the authorization code received to obtain the access_token and OpenID JWT token with openid email scope from the OIDC IdP. It then forwards the login request to the Amazon MWAA authenticator AWS Lambda function with the JWT token included in the request header in the x-amzn-oidc-data parameter.
The Lambda function verifies the JWT token found in the request header using ALB public keys. The function subsequently authorizes the authenticated user for the requested mwaa_env and rbac_role stored in an Amazon DynamoDB table. The use of DynamoDB for authorization here is optional; the Lambda code function is_allowed can be customized to use other authorization mechanisms.
The Amazon MWAA authenticator Lambda function redirects the user-agent to the Apache Airflow UI console in the requested Amazon MWAA environment with the login token in the redirect URL. Additionally, the function provides the logout functionality.

Amazon MWAA public network access mode

For the Amazon MWAA environments configured with public access mode, the user agent uses public routing over the internet to connect to the ALB hosted in a public subnet.

The following diagram illustrates the solution architecture with a numbered call flow sequence for internet network reachability.

Amazon MWAA private network access mode

For Amazon MWAA environments configured with private access mode, the user agent uses private routing over a dedicated AWS Direct Connect or AWS Client VPN to connect to the ALB hosted in a private subnet.

The following diagram shows the solution architecture for Client VPN network reachability.

Automation through infrastructure as code

To make setting up this solution easier, we have released a pre-built solution that automates the tasks involved. The solution has been built using the AWS Cloud Development Kit (AWS CDK) using the Python programming language. The solution is available in our GitHub repository and helps you achieve the following:

Set up a secure ALB to provide OIDC-based SSO to your existing Amazon MWAA environment with default Apache Airflow Admin role-based access.
Create new Amazon MWAA environments along with an ALB and an authenticator Lambda function that provides OIDC-based SSO support. With the customization provided, you can define the number of Amazon MWAA environments to create. Additionally, you can customize the type of Amazon MWAA environments created, including defining the hosting VPC configuration, environment name, Apache Airflow UI access mode, environment class, auto scaling, and logging configurations.

The solution offers a number of customization options, which can be specified in the cdk.context.json file. Follow the setup instructions to complete the integration to your existing Amazon MWAA environments or create new Amazon MWAA environments with SSO enabled. The setup process creates an ALB with an HTTPS listener that provides the user access endpoint. You have the option to define the type of ALB that you need. You can define whether your ALB will be public facing (internet accessible) or private facing (only accessible within the VPC). It is recommended to use a private ALB with your new or existing Amazon MWAA environments configured using private UI access mode.

The following sections describe the specific implementation steps and customization options for each use case.

Prerequisites

Before you continue with the installation steps, make sure you have completed all prerequisites and run the setup-venv script as outlined within the README.md file of the GitHub repository.

Integrate to a single existing Amazon MWAA environment

If you’re integrating with a single existing Amazon MWAA environment, follow the guides in the Quick start section. You must specify the same ALB VPC as that of your existing Amazon MWAA VPC. You can specify the default Apache Airflow RBAC role that all users will assume. The ALB with an HTTPS listener is configured within your existing Amazon MWAA VPC.

Integrate to multiple existing Amazon MWAA environments

To connect to multiple existing Amazon MWAA environments, specify only the Amazon MWAA environment name in the JSON file. The setup process will create a new VPC with subnets hosting the ALB and the listener. You must define the CIDR range for this ALB VPC such that it doesn’t overlap with the VPC CIDR range of your existing Amazon MWAA VPCs.

When the setup steps are complete, implement the post-deployment configuration steps. This includes adding the ALB CNAME record to the Amazon Route 53 DNS domain.

For integrating with Amazon MWAA environments configured using private access mode, there are additional steps that need to be configured. These include configuring VPC peering and subnet routes between the new ALB VPC and the existing Amazon MWAA VPC. Additionally, you need to configure network connectivity from your user-agent to the private ALB endpoint resolved by your DNS domain.

Create new Amazon MWAA environments

You can configure the new Amazon MWAA environments you want to provision through this solution. The cdk.context.json file defines a dictionary entry in the MwaaEnvironments array. Configure the details that you need for each of the Amazon MWAA environments. The setup process creates an ALB VPC, ALB with an HTTPS listener, Lambda authorizer function, DynamoDB table, and respective Amazon MWAA VPCs and Amazon MWAA environments in them. Furthermore, it creates the VPC peering connection between the ALB VPC and the Amazon MWAA VPC.

If you want to create Amazon MWAA environments with private access mode, the ALB VPC CIDR range specified must not overlap with the Amazon MWAA VPC CIDR range. This is required for the automatic peering connection to succeed. It can take between 20–30 minutes for each Amazon MWAA environment to finish creating.

When the environment creation processes are complete, run the post-deployment configuration steps. One of the steps here is to add authorization records to the created DynamoDB table for your users. You need to define the Apache Airflow rbac_role for each of your end-users, which the Lambda authorizer function matches to provide the requisite access.

Verify access

Once you’ve completed with the post-deployment steps, you can log in to the URL using your ALB FQDN. For example, If your ALB FQDN is alb-sso-mwaa.example.com, you can log in to your target Amazon MWAA environment, named Env1, assuming a specific Apache Airflow RBAC role (such as Admin), using the following URL: https://alb-sso-mwaa.example.com/aws_mwaa/aws-console-sso?mwaa_env=Env1&rbac_role=Admin. For the Amazon MWAA environments that this solution created, you need to have appropriate Apache Airflow rbac_role entries in your DynamoDB table.

The solution also provides a logout feature. To log out from an Apache Airflow console, use the normal Apache Airflow console logout. To log out from the ALB, you can, for example, use the URL https://alb-sso-mwaa.example.com/logout.

Clean up

Follow the readme documented steps in the section Destroy CDK stacks in the GitHub repo, which shows how to clean up the artifacts created via the AWS CDK deployments. Remember to revert any manual configurations, like VPC peering connections, that you might have made after the deployments.

Conclusion

This post provided a solution to integrate your organization’s OIDC-based IdPs with Amazon MWAA to grant secure access to multiple Amazon MWAA environments. We walked through the solution that solves this problem using infrastructure as code. This solution allows different end-user personas in your organization to access the Amazon MWAA Apache Airflow UI using OIDC SSO.

To use the solution for your own environments, refer to Application load balancer single-sign-on for Amazon MWAA. For additional code examples on Amazon MWAA, refer to Amazon MWAA code examples.

About the Authors

Ajay Vohra is a Principal Prototyping Architect specializing in perception machine learning for autonomous vehicle development. Prior to Amazon, Ajay worked in the area of massively parallel grid-computing for financial risk modeling.

Jaswanth Kumar is a customer-obsessed Cloud Application Architect at AWS in NY. Jaswanth excels in application refactoring and migration, with expertise in containers and serverless solutions, coupled with a Masters Degree in Applied Computer Science.

Aneel Murari is a Sr. Serverless Specialist Solution Architect at AWS based in the Washington, D.C. area. He has over 18 years of software development and architecture experience and holds a graduate degree in Computer Science. Aneel helps AWS customers orchestrate their workflows on Amazon Managed Apache Airflow (MWAA) in a secure, cost effective and performance optimized manner.

Parnab Basak is a Solutions Architect and a Serverless Specialist at AWS. He specializes in creating new solutions that are cloud native using modern software development practices like serverless, DevOps, and analytics. Parnab works closely in the analytics and integration services space helping customers adopt AWS services for their workflow orchestration needs.

How to scan EC2 AMIs using Amazon Inspector

2023-07-18 Luke Notley

Post Syndicated from Luke Notley original https://aws.amazon.com/blogs/security/how-to-scan-ec2-amis-using-amazon-inspector/

Amazon Inspector is an automated vulnerability management service that continually scans Amazon Web Services (AWS) workloads for software vulnerabilities and unintended network exposure. Amazon Inspector supports vulnerability reporting and deep inspection of Amazon Elastic Compute Cloud (Amazon EC2) instances, container images stored in Amazon Elastic Container Registry (Amazon ECR), and AWS Lambda functions. Operating system and programming language support is extensive, ranging from Bottlerocket to Windows Server.

Many customers use Amazon EC2 Auto Scaling groups as part of their resilience and scaling architecture for their workloads. With Auto Scaling groups, you can scale and deploy rapidly by using Amazon Machine Images (AMIs). However, AMIs within your environment can quickly become outdated as new vulnerabilities are discovered. A security best practice is to perform routine vulnerability assessments of your AMIs to identify whether newfound vulnerabilities apply to them. If you identify a vulnerability, you can update the AMI with the appropriate security patches, test the AMI in lower environments, and deploy the updated AMI in your environment. At this time, Amazon Inspector only supports scanning of running EC2 instances.

In this blog post, we’ll share a solution that you can use with Amazon EventBridge, AWS Lambda, AWS Step Functions, Amazon Simple Notification Service (Amazon SNS), and Amazon Simple Storage Service (Amazon S3) to scan AMIs and generate Amazon Inspector finding reports to help ensure that your AMIs are scanned for known vulnerabilities and updated prior to deployment. Then, we will show you how to periodically scan selected EC2 AMIs based on a tagging strategy, and take automated actions.

Prerequisites

The solution provided in this post has a number of items that you will need to review and address before you deploy the solution:

Make sure that the AMI to be scanned by Amazon Inspector is based from one of the operating systems that AWS supports for EC2 scanning.
To successfully complete a scan, Amazon Inspector requires the EC2 instance to be a managed instance in AWS Systems Manager that has the Systems Manager Agent installed and running, and has an attached AWS Identity and Access Management (IAM) instance profile that allows Systems Manager to manage the instance. For more information, see Scanning Amazon EC2 instances with Amazon Inspector.
If you use customer managed keys to encrypt Amazon Elastic Block Store (Amazon EBS) volumes and you have a default EC2 configuration set to encrypt EBS volumes, you will need to configure additional key policy permissions. For the customer managed key that encrypts EBS volumes, add the following example policy statement to the key policy. Make sure to replace <111122223333> with your own AWS account ID.
```
{
            "Sid": "Allow use of the key by AMI Scanner State Machine",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam:: <111122223333>:role/service-role/AMIScanner-Statemachine-role"
            },
            "Action": [
                "kms:Encrypt",
                "kms:Decrypt",
                "kms:ReEncrypt*",
                "kms:GenerateDataKey*",
                "kms:DescribeKey"
            ],
            "Resource": "*"
        },
```
If you don’t add this additional policy, the Step Functions state machine won’t allow the EC2 instances to launch. For more information, see Key policy sections that allow access to the customer managed key.
The solution in this blog post requires that you activate Amazon Inspector in your AWS account. If you haven’t activated Amazon Inspector yet, learn more about the free trial and pricing, and follow the steps in the Amazon Inspector documentation to set up the service and start monitoring your account. Alternatively, you can activate Amazon Inspector by using the AWS Command Line Interface (AWS CLI) and this GitHub example.

Solution overview and architecture

In this solution, you will use the follow AWS services and features:

Task orchestration
- AWS Step Functions state machine workflows are used in this solution to verify that conditions are successfully validated before moving to the next task. This helps ensure that the Amazon Inspector scanning of the temporary instance launched in the first state machine is completed before the second state machine starts. This can help reduce the overall cost of the solution and can help prevent the first state machine from reaching state transition limitations.
- Lambda functions handle the logic for retrieving AMIs to be scanned, launching temporary instances, creating Amazon EventBridge rules, tagging AMIs, and exporting Amazon Inspector reports to Amazon S3.
AMI tagging
- To use this solution, you need to tag the AMIs that Amazon Inspector will scan, because a Lambda function will use these tags to start the solution orchestration. For this post, we use the tag InspectorScan with a value of true. With AMI tagging, you can configure automated processes as part of your deployment pipelines to implement the tagging.
Storage of exported Amazon Inspector findings
- Amazon S3 helps you store the exported Amazon Inspector findings report and use them in a standardized format for multiple use cases across AWS services, or use Amazon Athena to query the reports, which we will cover later in the post. Each scanned report is stored in the S3 bucket and is named in the form AMI-NAME/guid.JSON or AMI-NAME/guid.CSV, depending on the export format that you specify.
- You can also use S3 event notifications to alert different operational teams that there are Amazon Inspector scan results that require review.
Encryption of Amazon Inspector findings reports
- AWS Key Management Service (AWS KMS) is used to encrypt the findings report. The AWS KMS key used must be a customer managed, symmetric KMS encryption key, and importantly, the key must be in the same AWS Region as the S3 bucket that you configured to store the report. The solution in this post creates a new KMS key, as well as a key policy that is configured to grant permissions for Amazon Inspector to use the key.
Event tracking and scheduling
- This solution uses an Amazon EventBridge rule to listen for completed Amazon Inspector scan events for each temporary EC2 instance launch. When the EventBridge rule finds a matched event, the rule passes the required parameters and invokes the second Step Functions state machine. The event pattern used in this solution uses the following format:
```
	{
			"source": ["aws.inspector2"],
			"detail-type": ["Inspector2 Scan"],
			"resources": ["i-abcdef01234567890"]
		}
```
- You can schedule the AMI scanning by using an EventBridge rule that invokes a Lambda function that runs on a schedule. The Lambda function uses a cron expression to occur weekly. You can configure this parameter according to your requirements. Initially, this rule will be disabled to allow you to configure and enable the rule at a later stage.
- Amazon SNS sends notifications during the AMI scanning solution process. From the SNS topic, you can configure different subscriptions, depending on your preferred use case and environment. An example of a subscription could be a shared mailbox email address for the security team or incident ticketing system.

Figure 1 shows the solution architecture.

Figure 1: Amazon Inspector scanning of an AMI

The high-level workflow of the solution is as follows:

You can use EventBridge to create a scheduled rule to invoke a Lambda function. You can set the rule for daily, weekly, or monthly, depending on your use case.
The Lambda function searches for AMIs with the appropriate tags and passes these as parameters to the Step Functions workflow.
The first Step Functions state machine is invoked for each AMI to be scanned.
The first Step Functions workflow deploys a temporary EC2 instance from the AMI that is defined.
A Lambda function is invoked to create an EventBridge rule.
An EventBridge rule is created to listen for the successful Amazon Inspector scanned event of the temporary EC2 instance.
A Lambda function is invoked to tag the EC2 instance.
The temporary EC2 instance is tagged, showing Amazon Inspector that scanning is in progress.
The first Step Functions workflow sends a notification to an SNS topic.
The EventBridge rule parses the required parameters and invokes the second Step Functions state machine.
A Lambda function is invoked to generate an Amazon Inspector report and export the findings to an S3 bucket.
The scanned Amazon Inspector AMI results are saved to an S3 bucket.
The Step Functions workflow terminates the temporary EC2 instance that can reduce cost and clean up the process.
A Lambda function is invoked to delete the temporary EventBridge rule.
The temporary EventBridge rule and targets are deleted.
A Lambda function is invoked to tag the AMI.
The scanned AMI is updated with tagging metadata.
The second Step Functions workflow sends a final notification to an SNS topic.

Deploy the solution

The solution will be deployed with the scheduled rule in Amazon EventBridge disabled to allow you to create your tagging strategy and to familiarize yourself with the solution. Later in this post, we’ll cover how to enable the Amazon EventBridge scheduled rule.

Step 1: Deploy the CloudFormation template

For this next step, make sure that you deploy the CloudFormation template provided for multi-AMI scanning in the AWS account and Region where you want to test this solution.

To deploy the CloudFormation template

Choose the following Launch Stack button to launch a CloudFormation stack in your account. Note that the stack will launch in the N. Virginia (us-east-1) Region. To deploy this solution into other AWS Regions, download the solution’s CloudFormation template, modify it, and deploy it to the selected Region.

Make sure that you configure the following parameters in the CloudFormation template so that it deploys successfully:
- AMITagName — The AMI tag name to check if the AMI should be scanned by Amazon Inspector.
- AMITagValue — The AMI tag value to check if the AMI should be scanned by Amazon Inspector.
- InspectorReportFormat — The report format, which can be either CSV or JSON.
- InstanceSubnetID — The subnet ID to launch the temporary EC2 instance into.
- InstanceType — The instance type to deploy the AMI to for temporary scanning purposes.
- KmsKeyAdministratorRole — The existing IAM role that needs to have administrator access to the KMS key created for the solution. This key provides access to encrypt and decrypt the Amazon Inspector report.
- S3ReportBucketName — The name of the S3 bucket to be created.
- SnsTopic — The name of the new SNS topic to be created. This name defines the SNS topic that notifications are published to.
Review the stack name and the parameters for the template.
On the Quick create stack screen, scroll to the bottom and select I acknowledge that AWS CloudFormation might create IAM resources.
Choose Create stack. The deployment of the CloudFormation stack will take 3–4 minutes.

After the CloudFormation stack has deployed successfully, you can use the deployed solution.

Step 2: Manually run the first Step Functions workflow

The first Step Functions state machine requires parameters to be passed in; the SingleAMI Lambda function accomplishes this. You can start the Lambda function by creating a test event and passing the correct JSON text and parameters. The following parameters are available in the output section of the CloudFormation stack that the solution deployed:

AmiId — The ID of the AMI to be used for deploying the EC2 instance. This is the EC2 AMI to be scanned.
EC2InstanceProfile — The Amazon Resource Name (ARN) of the EC2 instance profile that the CloudFormation stack created.
InstanceType — The type of EC2 instance to use for deployment.
KmsKeyName — The ARN of the KMS key to be used for encrypting and decrypting the Amazon Inspector report that the CloudFormation stack created.
S3Bucket — The name of the S3 bucket to which the Amazon Inspector reports will be exported. The S3 bucket was created previously by the CloudFormation stack.
S3ReportFormat — The report format that Amazon Inspector will use to export the findings report; either the JSON or the CSV format is valid.
SnsTopc — The ARN of the SNS topic to which notifications will be sent. This SNS topic was created previously by the CloudFormation stack.
StateMachineArn — The ARN of the first Step Functions state machine, which the Lambda function will run first.
SubnetId — The ID of the VPC subnet to which the EC2 instance will be attached and launched into. This is a required parameter and could be a subnet that is created specifically for this scanning purpose.

The following is an example parameter configuration and JSON that you can use to run the Lambda function. Make sure to replace each <user input placeholder> with your own information.

{
"AmiId" : "<AMI-ABCDEF01234567890>",
"Ec2InstanceProfile" : "arn:aws:iam::<111122223333>:instance-profile/Ec2InstanceLaunchRole",
"InstanceType" : "t3.medium",
"KMSKeyName" : "arn:aws:kms:region-name:<111122223333>:key/<a1b2c3d4-5678-90ab-cdef-EXAMPLE11111>",
"S3Bucket" : "<DOC-EXAMPLE-BUCKET-111122223333>",
"S3ReportFormat" : "CSV",
"SnsTopic" : "arn:aws:sns:region-name-2:<111122223333>:InspectorScanner",
"StateMachine": "arn:aws:states:region-name:<111122223333>:stateMachine:AMIScanner-Part1-LaunchEC2",
"SubnetId" : "<SUBNET-ABCDEF01234567890>"
}

After the first state machine is finished, the EventBridge rule listens for the successful Amazon Inspector scan event. An SNS notification is sent, similar to the following.


 {"AWS Inspector AMI Scan status":"EC2 instance","For AMI":"ami-abcdef01234567890","Temporarily launched AMI using instance":"i-abcdef01234567890"}

After Amazon Inspector has finished scanning the EC2 instance, and the second state machine completes successfully, the Amazon Inspector finding report appears in the S3 bucket and notifications appear on the SNS topic that was created. The following is an example of an SNS notification.


 {"AWS Inspector AMI Scan completed":"Successfully","For AMI":"ami-abcdef01234567890","AWS Inspector report located at S3 Bucket":"DOC-EXAMPLE-BUCKET-111122223333","Temporarily launched AMI using instance":"i-abcdef01234567890"}

Enable scheduled scanning

You can enable the EventBridge scheduled rule to handle multiple AMIs and automatic scheduling. The scheduled rule invokes a Lambda function on a scheduled basis that identifies AMIs with the appropriate tags and passes parameters to the Step Functions workflow.

To enable the rule

In the EventBridge rules console, navigate to AMIScanner-ScheduledSolutionTask, and choose Enable.

Figure 2: Enable Amazon EventBridge scheduled rule

Extend the solution

With Amazon Athena, you can run SQL queries on raw data that is stored in S3 buckets. The Amazon Inspector reports are exported to S3, and you can query the data and create tables by using AWS Glue crawlers. To make sure that AWS Glue can crawl the S3 data, you need to add the role that you create for AWS Glue to the AWS KMS key permissions, so that AWS Glue can decrypt the S3 data. The following is an example policy JSON that you can update. Make sure to replace the AWS account ID <111122223333> and S3 bucket name <DOC-EXAMPLE-BUCKET-111122223333> with your own information.

{
"Sid": "Allow the AWS Glue crawler usage of the KMS key",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::<111122223333>:role/service-role/AWSGlueServiceRole-S3InspectorReports"
},
"Action": [
"kms:Decrypt",
"kms:GenerateDataKey*"
],
"Resource": "arn:aws:s3:::<DOC-EXAMPLE-BUCKET-111122223333>"
},

After an AWS Glue Data Catalog has been built, you can run the crawler on a scheduled basis to help keep the catalog up to date with the latest Amazon Inspector findings as they are exported into the S3 bucket.

Using Amazon Athena, you can run queries against the Amazon Inspector reports to generate output data that is relevant to your environment. For example, to list the AMIs that are affected by high-severity findings, you can run the following SQL query. Make sure to replace <DOC-EXAMPLE-BUCKET-111122223333> with your own information.

SELECT DISTINCT partition_0 from "<DOC-EXAMPLE-BUCKET-111122223333>" where severity='HIGH'

With the results, you can use AWS Systems Manager to update the relevant AMIs to include the latest patches and update the launch template used in your Auto Scaling groups.

To further extend this solution, you can also use Amazon QuickSight to visualize the data by connecting to the AWS Glue table and producing dashboards for consumption.

Conclusion

By performing security assessments of your AMIs on a regular basis, you can gain greater visibility and control over the security of your EC2 instances that are created from those AMIs. In this blog post, you learned how to set up AMI vulnerability assessments, and how the results of these continuous vulnerability assessments can help you keep your environment up to date with security patches. For additional hands-on walkthroughs for Amazon Inspector, see Amazon Inspector workshops. You can find the code for this blog post in the inspector-ami-scanning-solution GitHub repository.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Optimize AWS Config for AWS Security Hub to effectively manage your cloud security posture

2023-07-17 Nicholas Jaeger

Post Syndicated from Nicholas Jaeger original https://aws.amazon.com/blogs/security/optimize-aws-config-for-aws-security-hub-to-effectively-manage-your-cloud-security-posture/

AWS Security Hub is a cloud security posture management service that performs security best practice checks, aggregates security findings from Amazon Web Services (AWS) and third-party security services, and enables automated remediation. Most of the checks Security Hub performs on AWS resources happen as soon as there is a configuration change, giving you nearly immediate visibility of non-compliant resources in your environment, compared to checks that run on a periodic basis. This near real-time finding and reporting of non-compliant resources helps you to quickly respond to infrastructure misconfigurations and reduce risk. Security Hub offers these continuous security checks through its integration with the AWS Config configuration recorder.

By default, AWS Config enables recording for more than 300 resource types in your account. Today, Security Hub has controls that cover approximately 60 of those resource types. If you’re using AWS Config only for Security Hub, you can optimize the configuration of the configuration recorder to track only the resources you need, helping to reduce the costs related to monitoring those resources in AWS Config and the amount of data produced, stored, and analyzed by AWS Config. This blog post walks you through how to set up and optimize the AWS Config recorder when it is used for controls in Security Hub.

Using AWS Config and Security Hub for continuous security checks

When you enable Security Hub, you’re alerted to first enable resource recording in AWS Config, as shown in Figure 1. AWS Config continually assesses, audits, and evaluates the configurations and relationships of your resources on AWS, on premises, and in other cloud environments. Security Hub uses this capability to perform change-initiated security checks. Security Hub checks that use periodic rules don’t depend on the AWS Config recorder. You must enable AWS Config resource recording for all the accounts and in all AWS Regions where you plan to enable Security Hub standards and controls. AWS Config charges for the configuration items that are recorded, separately from Security Hub.

Figure 1: Security Hub alerts you to first enable resource recording in AWS Config

When you get started with AWS Config, you’re prompted to set up the configuration recorder, as shown in Figure 2. AWS Config uses the configuration recorder to detect changes in your resource configurations and capture these changes as configuration items. Using the AWS Config configuration recorder not only allows for continuous security checks, it also minimizes the need to query for the configurations of the individual services, saving your service API quotas for other use cases. By default, the configuration recorder records the supported resources in the Region where the recorder is running.

Note: While AWS Config supports the configuration recording of more than 300 resource types, some Regions support only a subset of those resource types. To learn more, see Supported Resource Types and Resource Coverage by Region Availability.

Figure 2: Default AWS Config settings

Optimizing AWS Config for Security Hub

Recording global resources as well as current and future resources in AWS Config is more than what is necessary to enable Security Hub controls. If you’re using the configuration recorder only for Security Hub controls, and you want to cost optimize your use of AWS Config or reduce the amount of data produced, stored, and analyzed by AWS Config, you only need to record the configurations of approximately 60 resource types, as described in AWS Config resources required to generate control findings.

Set up AWS Config, optimized for Security Hub

We’ve created an AWS CloudFormation template that you can use to set up AWS Config to record only what’s needed for Security Hub. You can download the template from GitHub.

This template can be used in any Region that supports AWS Config (see AWS Services by Region). Although resource coverage varies by Region (Resource Coverage by Region Availability), you can still use this template in every Region. If a resource type is supported by AWS Config in at least one Region, you can enable the recording of that resource type in all Regions supported by AWS Config. For the Regions that don’t support the specified resource type, the recorder will be enabled but will not record any configuration items until AWS Config supports the resource type in the Region.

Security Hub regularly releases new controls that might rely on recording additional resource types in AWS Config. When you use this template, you can subscribe to Security Hub announcements with Amazon Simple Notification Service (SNS) to get information about newly released controls that might require you to update the resource types recorded by AWS Config (and listed in the CloudFormation template). The CloudFormation template receives periodic updates in GitHub, but you should validate that it’s up to date before using it. You can also use AWS CloudFormation StackSets to deploy, update, or delete the template across multiple accounts and Regions with a single operation. If you don’t enable the recording of all resources in AWS Config, the Security Hub control, Config.1 AWS Config should be enabled, will fail. If you take this approach, you have the option to disable the Config.1 Security Hub control or suppress its findings using the automation rules feature in Security Hub.

Customizing for your use cases

You can modify the CloudFormation template depending on your use cases for AWS Config and Security Hub. If your use case for AWS Config extends beyond your use of Security Hub controls, consider what additional resource types you will need to record the configurations of for your use case. For example, AWS Firewall Manager, AWS Backup, AWS Control Tower, AWS Marketplace, and AWS Trusted Advisor require AWS Config recording. Additionally, if you use other features of AWS Config, such as custom rules that depend on recording specific resource types, you can add these resource types in the CloudFormation script. You can see the results of AWS Config rule evaluations as findings in Security Hub.

Another customization example is related to the AWS Config configuration timeline. By default, resources evaluated by Security Hub controls include links to the associated AWS Config rule and configuration timeline in AWS Config for that resource, as shown in Figure 3.

Figure 3: Link from Security Hub control to the configuration timeline for the resource in AWS Config

The AWS Config configuration timeline, as illustrated in Figure 4, shows you the history of compliance changes for the resource, but it requires the AWS::Config::ResourceCompliance resource type to be recorded. If you need to track changes in compliance for resources and use the configuration timeline in AWS Config, you must add the AWS::Config::ResourceCompliance resource type to the CloudFormation template provided in the preceding section. In this case, Security Hub may change the compliance of the Security Hub managed AWS Config rules, which are recorded as configuration items for the AWS::Config::ResourceCompliance resource type, incurring additional AWS Config recorder charges.

Figure 4: Config resource timeline

Summary

You can use the CloudFormation template provided in this post to optimize the AWS Config configuration recorder for Security Hub to reduce your AWS Config costs and to reduce the amount of data produced, stored, and analyzed by AWS Config. Alternatively, you can run AWS Config with the default settings or use the AWS Config console or scripts to further customize your configuration to fit your use case. Visit Getting started with AWS Security Hub to learn more about managing your security alerts.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Build a serverless log analytics pipeline using Amazon OpenSearch Ingestion with managed Amazon OpenSearch Service

2023-07-17 Hajer Bouafif

Post Syndicated from Hajer Bouafif original https://aws.amazon.com/blogs/big-data/build-a-serverless-log-analytics-pipeline-using-amazon-opensearch-ingestion-with-managed-amazon-opensearch-service/

In this post, we show how to build a log ingestion pipeline using the new Amazon OpenSearch Ingestion, a fully managed data collector that delivers real-time log and trace data to Amazon OpenSearch Service domains. OpenSearch Ingestion is powered by the open-source data collector Data Prepper. Data Prepper is part of the open-source OpenSearch project. With OpenSearch Ingestion, you can filter, enrich, transform, and deliver your data for downstream analysis and visualization. OpenSearch Ingestion is serverless, so you don’t need to worry about scaling your infrastructure, operating your ingestion fleet, and patching or updating the software.

For a comprehensive overview of OpenSearch Ingestion, visit Amazon OpenSearch Ingestion, and for more information about the Data Prepper open-source project, visit Data Prepper.

In this post, we explore the logging infrastructure for a fictitious company, AnyCompany. We explore the components of the end-to-end solution and then show how to configure OpenSearch Ingestion’s main parameters and how the logs come in and out of OpenSearch Ingestion.

Solution overview

Consider a scenario in which AnyCompany collects Apache web logs. They use OpenSearch Service to monitor web access and identify possible root causes to error logs of type 4xx and 5xx. The following architecture diagram outlines the use of every component used in the log analytics pipeline: Fluent Bit collects and forwards logs; OpenSearch Ingestion processes, routes, and ingests logs; and OpenSearch Service analyzes the logs.

The workflow contains the following stages:

Generate and collect – Fluent Bit collects the generated logs and forwards them to OpenSearch Ingestion. In this post, you create fake logs that Fluent Bit forwards to OpenSearch Ingestion. Check the list of supported clients to review the required configuration for each client supported by OpenSearch Ingestion.
Process and ingest – OpenSearch Ingestion filters the logs based on response value, processes the logs using a grok processor, and applies conditional routing to ingest the error logs to an OpenSearch Service index.
Store and analyze – We can analyze the Apache httpd error logs using OpenSearch Dashboards.

Prerequisites

To implement this solution, make sure you have the following prerequisites:

An AWS account
An active AWS Cloud9 environment on an Amazon Elastic Compute Cloud (Amazon EC2) instance
Pre-installed Docker compose in the AWS Cloud9 environment
AWS managed temporary credentials disabled in AWS Cloud9
An active OpenSearch Service domain

Configure OpenSearch Ingestion

First, you define the appropriate AWS Identity and Access Management (IAM) permissions to write to and from OpenSearch Ingestion. Then you set up the pipeline configuration in the OpenSearch Ingestion. Let’s explore each step in more detail.

Configure IAM permissions

OpenSearch Ingestion works with IAM to secure communications into and out of OpenSearch Ingestion. You need two roles, authenticated using AWS Signature V4 (SigV4) signed requests. The originating entity requires permissions to write to OpenSearch Ingestion. OpenSearch Ingestion requires permissions to write to your OpenSearch Service domain. Finally, you must create an access policy using OpenSearch Service’s fine-grained access control, which allows OpenSearch Ingestion to create indexes and write to them in your domain.

The following diagram illustrates the IAM permissions to allow OpenSearch Ingestion to write to an OpenSearch Service domain. Refer to Setting up roles and users in Amazon OpenSearch Ingestion to get more details on roles and permissions required to use OpenSearch Ingestion.

In the demo, you use the AWS Cloud9 EC2 instance profile’s credentials to sign requests sent to OpenSearch Ingestion. You use Fluent Bit to fetch the credentials and assume the role you pass in the aws_role_arn you configure later.

Create an ingestion role (called IngestionRole) to allow Fluent Bit to ingest the logs into your pipeline.

Create a trust relationship to allow Fluent Bit to assume the ingestion role, as shown in the following code. Fluent Bit attempts to fetch the credentials in the following order. In configuring the access policy for this role, you grant permission for the osis:Ingest.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "{your-account-id}"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}

Create a pipeline role (called PipelineRole) with a trust relationship for OpenSearch Ingestion to assume that role. The domain-level access policy of the OpenSearch domain grants the pipeline role access to the domain.

Finally, configure your domain’s security plugin to enable OpenSearch Ingestion’s assumed role to create indexes and write data to the domain.

In this demo, the OpenSearch Service domain uses fine-grained access control for authentication, so you need to map the OpenSearch Ingestion pipeline role to the OpenSearch backend role all_access. For instructions, refer to Step 2: Include the pipeline role in the domain access policy page.

Create the pipeline in OpenSearch Ingestion

To create an OpenSearch Ingestion pipeline, complete the following steps:

On the OpenSearch Service console, choose Pipelines in the navigation pane.
Choose Create pipeline.
For Pipeline name, enter a name.

Input the minimum and maximum Ingestion OpenSearch Compute Units (Ingestion OCUs). In this example, we use the default pipeline capacity settings of minimum 1 Ingestion OCU and maximum 4 Ingestion OCUs.

Each OCU is a combination of approximately 8 GB of memory and 2 vCPUs that can handle an estimated 8 GiB per hour. OpenSearch Ingestion supports up to 96 OCUs, and it automatically scales up and down based on your ingest workload demand.

In the Pipeline configuration section, configure Data Prepper to process your data by choosing the appropriate blueprint configuration template on the Configuration blueprints menu. For this post, we choose AWS-LogAggregationWithConditionalRouting.

The OpenSearch Ingestion pipeline configuration consists of four sections:

Source – This is the input component of a pipeline. It defines the mechanism through which a pipeline consumes records. In this post, you use the http_source plugin and provide the Fluent Bit output URI value within the path attribute.
Processors – This represents an intermediate processing to filter, transform, and enrich your input data. Refer to Supported plugins for more details on the list of operations supported in OpenSearch Ingestion. In this post, we use the grok processor COMMONAPACHELOG, which matches input logs against the common Apache log pattern and makes it easy to query in OpenSearch Service.
Sink – This is the output component of a pipeline. It defines one or more destinations to which a pipeline publishes records. In this post, you define an OpenSearch Service domain and index as sink.
Route – This is the part of a processor that allows the pipeline to route the data into different sinks based on specific conditions. In this example, you create four routes based in the response field value of the log. If the response field value of the log line matches 2xx or 3xx, the log is sent to the OpenSearch Service index aggregated_2xx_3xx. If the response field value matches 4xx, the log is sent to the index aggregated_4xx. If the response field value matches 5xx, the log is sent to the index aggregated_5xx.

Update the blueprint based on your use case. The following code shows an example of the pipeline configuration YAML file:

version: "2"
log-aggregate-pipeline:
  source:
    http:
      # Provide the FluentBit output URI value.
      path: "/log/ingest"
  processor:
    - date:
        from_time_received: true
        destination: "@timestamp"
    - grok:
        match:
          log: [ "%{COMMONAPACHELOG_DATATYPED}" ]
  route:
    - 2xx_status: "/response >= 200 and /response < 300"
    - 3xx_status: "/response >= 300 and /response < 400"
    - 4xx_status: "/response >= 400 and /response < 500"
    - 5xx_status: "/response >= 500 and /response < 600"
  sink:
    - opensearch:
        # Provide an AWS OpenSearch Service domain endpoint
        hosts: [ "{your-domain-endpoint}" ]
        aws:
          # Provide a Role ARN with access to the domain. This role should have a trust relationship with osis-pipelines.amazonaws.com
          sts_role_arn: "arn:aws:iam::{your-account-id}:role/PipelineRole"
          # Provide the region of the domain.
          region: "{AWS_Region}"
        index: "aggregated_2xx_3xx"
        routes:
          - 2xx_status
          - 3xx_status
    - opensearch:
        # Provide an AWS OpenSearch Service domain endpoint
        hosts: [ "{your-domain-endpoint}"  ]
        aws:
          # Provide a Role ARN with access to the domain. This role should have a trust relationship with osis-pipelines.amazonaws.com
          sts_role_arn: "arn:aws:iam::{your-account-id}:role/PipelineRole"
          # Provide the region of the domain.
          region: "{AWS_Region}"
        index: "aggregated_4xx"
        routes:
          - 4xx_status
    - opensearch:
        # Provide an AWS OpenSearch Service domain endpoint
        hosts: [ "{your-domain-endpoint}"  ]
        aws:
          # Provide a Role ARN with access to the domain. This role should have a trust relationship with osis-pipelines.amazonaws.com
          sts_role_arn: "arn:aws:iam::{your-account-id}:role/PipelineRole"
          # Provide the region of the domain.
          region: "{AWS_Region}"
        index: "aggregated_5xx"
        routes:
          - 5xx_status

Provide the relevant values for your domain endpoint, account ID, and Region related to your configuration.

Check the health of your configuration setup by choosing Validate pipeline when you finish the update.

When designing a production workload, deploy your pipeline within a VPC. For instructions, refer to Securing Amazon OpenSearch Ingestion pipelines within a VPC.

For this post, select Public access under Network.

In the Log publishing options section, select Publish to CloudWatch logs and Create new group.

OpenSearch Ingestion uses the log levels of INFO, WARN, ERROR, and FATAL. Enabling log publishing helps you monitor your pipelines in production.

Choose Next and Create pipeline.
Select the pipeline and choose View details to see the progress of the pipeline creation.

Wait until the status changes to Active to start using the pipeline.

Send logs to the OpenSearch Ingestion pipeline

To start sending logs to the OpenSearch Ingestion pipeline, complete the following steps:

On the AWS Cloud9 console, create a Fluent Bit configuration file and update the following attributes:
- Host – Enter the ingestion URL of your OpenSearch Ingestion pipeline.
- aws_service – Enter osis.
- aws_role_arn – Enter the ARN of the IAM role IngestionRole.

The following code shows an example of the Fluent-bit.conf file:

[SERVICE]
    parsers_file          ./parsers.conf
    
[INPUT]
    name                  tail
    refresh_interval      5
    path                  /var/log/*.log
    read_from_head        true
[FILTER]
    Name parser
    Key_Name log
    Parser apache
[OUTPUT]
    Name http
    Match *
    Host {Ingestion URL}
    Port 443
    URI /log/ingest
    format json
    aws_auth true
    aws_region {AWS_region}
    aws_role_arn arn:aws:iam::{your-account-id}:role/IngestionRole
    aws_service osis
    Log_Level trace
    tls On

In the AWS Cloud9 environment, create a docker-compose YAML file to deploy Fluent Bit and Flog containers:

version: '3'
services:
  fluent-bit:
    container_name: fluent-bit
    image: docker.io/amazon/aws-for-fluent-bit
    volumes:
      - ./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf
      - ./apache-logs:/var/log
  flog:
    container_name: flog
    image: mingrammer/flog
    command: flog -t log -f apache_common -o web/log/test.log -w -n 100000 -d 1ms -p 1000
    volumes:
      - ./apache-logs:/web/log

Before you start the Docker containers, you need to update the IAM EC2 instance role in AWS Cloud9 so it can sign the requests sent to OpenSearch Ingestion.

For demo purposes, create an IAM service-linked role and choose EC2 under Use case to allow the AWS Cloud9 EC2 instance to call OpenSearch Ingestion on your behalf.
Add the OpenSearch Ingestion policy, which is the same policy you used with IngestionRole.
Add the AdministratorAccess permission policy to the role as well.

Your role definition should look like the following screenshot.

After you create the role, go back to AWS Cloud9, select your demo environment, and choose View details.
On the EC2 instance tab, choose Manage EC2 instance to view the details of the EC2 instance attached to your AWS Cloud9 environment.

On the Amazon EC2 console, replace the IAM role of your AWS Cloud9 EC2 instance with the new role.
Open a terminal in AWS Cloud9 and run the command docker-compose up.

Check the output in the terminal—if everything is working correctly, you get status 200.

Fluent Bit collects logs from the /var/log repository in the container and pushes the data to the OpenSearch Ingestion pipeline.

Open OpenSearch Dashboards, navigate to Dev Tools, and run the command GET _cat/indices to validate that the data has been delivered by OpenSearch Ingestion to your OpenSearch Service domain.

You should see the three indexes created: aggregated_2xx_3xx, aggregated_4xx, and aggregated_5xx.

Now you can focus on analyzing your log data and reinvent your business without having to worry about any operational overhead regarding your ingestion pipeline.

Best practices for monitoring

You can monitor the Amazon CloudWatch metrics made available to you to maintain the right performance and availability of your pipeline. Check the list of available pipeline metrics related to the source, buffer, processor, and sink plugins.

Navigate to the Metrics tab for your specific OpenSearch Ingestion pipeline to explore the graphs available to each metric, as shown in the following screenshot.

In your production workloads, make sure to configure the following CloudWatch alarms to notify you when the pipeline metrics breach a specific threshold so you can promptly remediate each issue.

Managing cost

While OpenSearch Ingestion automatically provisions and scales the OCUs for your spiky workloads, you only pay for the compute resources actively used by your pipeline to ingest, process, and route data. Therefore, setting up a maximum capacity of Ingestion OCUs allows you to handle your workload peak demand while controlling cost.

For production workloads, make sure to configure a minimum of 2 Ingestion OCUs to ensure 99.9% availability for the ingestion pipeline. Check the sizing recommendations and learn how OpenSearch Ingestion responds to workload spikes.

Clean up

Make sure you clean up unwanted AWS resources created during this post in order to prevent additional billing for these resources. Follow these steps to clean up your AWS account:

On the AWS Cloud9 console, choose Environments in the navigation pane.
Select the environment you want to delete and choose Delete.
On the OpenSearch Service console, choose Domains under Managed clusters in the navigation pane.
Select the domain you want to delete and choose Delete.
Select Pipelines under Ingestion in the navigation pane.
Select the pipeline you want to delete and on the Actions menu, choose Delete.

Conclusion

In this post, you learned how to create a serverless ingestion pipeline to deliver Apache access logs to an OpenSearch Service domain using OpenSearch Ingestion. You learned the IAM permissions required to start using OpenSearch Ingestion and how to use a pipeline blueprint instead of creating a pipeline configuration from scratch.

You used Fluent Bit to collect and forward Apache logs, and used OpenSearch Ingestion to process and conditionally route the log data to different indexes in OpenSearch Service. For more examples about writing to OpenSearch Ingestion pipelines, refer to Sending data to Amazon OpenSearch Ingestion pipelines.

Finally, the post provided you with recommendations and best practices to deploy OpenSearch Ingestion pipelines in a production environment while controlling cost.

Follow this post to build your serverless log analytics pipeline, and refer to Top strategies for high volume tracing with Amazon OpenSearch Ingestion to learn more about high volume tracing with OpenSearch Ingestion.

About the authors

Hajer Bouafif is an Analytics Specialist Solutions Architect at Amazon Web Services. She focuses on OpenSearch Service and helps customers design and build well-architected analytics workloads in diverse industries. Hajer enjoys spending time outdoors and discovering new cultures.

Francisco Losada is an Analytics Specialist Solutions Architect based out of Madrid, Spain. He works with customers across EMEA to architect, implement, and evolve analytics solutions at AWS. He advocates for OpenSearch, the open-source search and analytics suite, and supports the community by sharing code samples, writing content, and speaking at conferences. In his spare time, Francisco enjoys playing tennis and running.

Muthu Pitchaimani is a Search Specialist with Amazon OpenSearch Service. He builds large-scale search applications and solutions. Muthu is interested in the topics of networking and security, and is based out of Austin, Texas.

Mixing AWS Graviton with x86 CPUs to optimize cost and resiliency using Amazon EKS

2023-07-13 Macey Neff

Post Syndicated from Macey Neff original https://aws.amazon.com/blogs/compute/mixing-aws-graviton-with-x86-cpus-to-optimize-cost-and-resilience-using-amazon-eks/

This post is written by Yahav Biran, Principal SA, and Yuval Dovrat, Israel Head Compute SA.

This post shows you how to integrate AWS Graviton-based Amazon EC2 instances into an existing Amazon Elastic Kubernetes Service (Amazon EKS) environment running on x86-based Amazon EC2 instances. Customers use mixed-CPU architectures to enable their application to utilize a wide selection of Amazon EC2 instance types and improve overall application resilience. In order to successfully run a mixed-CPU application, it is strongly recommended that you test application performance in a test environment before running production applications on Graviton-based instances. You can follow AWS’ transition guide to learn more about porting your application to AWS Graviton.

This example shows how you can use KEDA for controlling application capacity across CPU types in EKS. KEDA will trigger a deployment based on the application’s response latency as measured by the Application Load Balancer (ALB). To simplify resource provisioning, Karpenter, an open-source Kubernetes node provisioning software, and AWS Load Balancer Controller, are shown as well.

Solution Overview

There are two solutions that this post covers to test a mixed-CPU application. The first configuration (shown in Figure 1 below) is the “A/B Configuration”. It uses an Application Load Balancer (ALB)-based Ingress to control traffic flowing to x86-based and Graviton-based node pools. You use this configuration to gradually migrate a live application from x86-based instances to Graviton-based instances, while validating the response time with Amazon CloudWatch.

Figure 1, config 1: A/B Configuration

In the second configuration, the “Karpenter Controlled Configuration” (shown in Figure 2 below as Config 2), Karpenter automatically controls the instance blend. Karpenter is configured to use weighted provisioners with values that prioritize AWS Graviton-based Amazon EC2 instances over x86-based Amazon EC2 instances.

Figure 2, config II: Karpenter Controlled Configuration, with Weighting provisioners topology

It is recommended that you start with the “A/B” configuration to measure the response time of live requests. Once your workload is validated on Graviton-based instances, you can build the second configuration to simplify the deployment configuration and increase resiliency. This enables your application to automatically utilize x86-based instances if needed, for example, during an unplanned large-scale event.

You can find the step-by-step guide on GitHub to help you to examine and try the example app deployment described in this post. The following provides an overview of the step-by-step guide.

Code Migration to AWS Graviton

The first step is migrating your code from x86-based instances to Graviton-based instances. AWS has multiple resources to help you migrate your code. These include AWS Graviton Fast Start Program, AWS Graviton Technical Guide GitHub Repository, AWS Graviton Transition Guide, and Porting Advisor for Graviton.

After making any required changes, you might need to recompile your application for the Arm64 architecture. This is necessary if your application is written in a language that compiles to machine code, such as Golang and C/C++, or if you need to rebuild native-code libraries for interpreted/JIT compiled languages such as the Python/C API or Java Native Interface (JNI).

To allow your containerized application to run on both x86 and Graviton-based nodes, you must build OCI images for both the x86 and Arm64 architectures, push them to your image repository (such as Amazon ECR), and stitch them together by creating and pushing an OCI multi-architecture manifest list. You can find an overview of these steps in this AWS blog post. You can also find the AWS Cloud Development Kit (CDK) construct on GitHub to help get you started.

To simplify the process, you can use a Linux distribution package manager that supports cross-platform packages and avoid platform-specific software package names in the Linux distribution wherever possible. For example, use:

RUN pip install httpd

instead of:

ARG ARCH=aarch64 or amd64
RUN yum install httpd.${ARCH}

This blog post shows you how to automate multi-arch OCI image building in greater depth.

Application Deployment

Config 1 – A/B controlled topology

This topology allows you to migrate to Graviton while validating the application’s response time (approximately 300ms) on both x86 and Graviton-based instances. As shown in Figure 1, this design has a single Listener that forwards incoming requests to two Target Groups. One Target Group is associated with Graviton-based instances, while the other Target Group is associated with x86-based instances. The traffic ratio associated with each target group is defined in the Ingress configuration.

Here are the steps to create Config 1:

Create two KEDA ScaledObjects that scale the number of pods based on the latency metric (AWS/ApplicationELB-TargetResponseTime) that matches the target group (triggers.metadata.dimensionValue). Declare the maximum acceptable latency in targetMetricValue:0.3.
Below is the Graviton deployment scaledObject (spec.scaleTargetRef), note the comments that denote the value of the x86 deployment scaledObject

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
…
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: armsimplemultiarchapp #amdsimplemultiarchapp
…
  triggers:                 
    - type: aws-cloudwatch
      metadata:
        namespace: "AWS/ApplicationELB"
        dimensionName: "LoadBalancer"
        dimensionValue: "app/simplemultiarchapp/xxxxxx"
        metricName: "TargetResponseTime"
        targetMetricValue: "0.3"

Once the topology has been created, add Amazon CloudWatch Container Insights to measure CPU, network throughput, and instance performance.
To simplify testing and control for potential performance differences in instance generations, create two dedicated Karpenter provisioners and Kubernetes Deployments (replica sets) and specify the instance generation, CPU count, and CPU architecture for each one. This example uses c7g (Graviton3) and c6i (Intel) . You will remove these constraints in the next topology to allow more allocation flexibility.

The x86-based instances Karpenter provisioner:

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: x86provisioner
spec:
  requirements:
  - key: karpenter.k8s.aws/instance-generation
    operator: In
    values:
    - "6"
  - key: karpenter.k8s.aws/instance-cpu
    operator: In 
    values:
    - "2"
  - key: kubernetes.io/arch
    operator: In
    values:
    - amd64

The Graviton-based instances Karpenter provisioner:

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: arm64provisioner
spec:
  requirements:
  - key: karpenter.k8s.aws/instance-generation
    operator: In
    values:
    - "7"
  - key: karpenter.k8s.aws/instance-cpu
    operator: In
    values:
    - "2"
  - key: kubernetes.io/arch
    operator: In
    values:
    - arm64

Create two Kubernetes Deployment resources—one per CPU architecture—that use nodeSelector to schedule one Deployment on Graviton-based instances, and another Deployment on x86-based instances. Similarly, create two NodePort Service resources, where each Service points to its architecture-specific ReplicaSet.
Create an Application Load Balancer using the AWS Load Balancer Controller to distribute incoming requests among the different pods. Control the traffic routing in the ingress by adding an ingress.kubernetes.io/actions.weighted-routing annotation. You can adjust the weight in the example below to meet your needs. This migration example started with a 100%-to-0% x86-to-Graviton ratio, adjusting over time by 10% increments until it reached a 0%-to-100% x86-to-Graviton ratio.

…
alb.ingress.kubernetes.io/actions.weighted-routing: | 
{
…
  "targetGroups":[
    {
      "serviceName":"armsimplemultiarchapp-svc",
      "servicePort":"80","weight":50
    },
    {
      "serviceName":"amdsimplemultiarchapp-svc",
      "servicePort":"80","weight":50}]
    }
 }

spec:
  ingressClassName: alb
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: weighted-routing

You can simulate live user requests to an example application ALB endpoint. Amazon CloudWatch populates ALB Target Group request/second metrics, dimensioned by HTTP response code, to help assess the application throughput and CPU usage.

During the simulation, you will need to verify the following:

Both Graviton-based instances and x86-based instances pods process a variable amount of traffic.
The application response time (p99) meets the performance requirements (300ms).

The orange (Graviton) and blue (x86) curves of HTTP 2xx responses (figure 4) show the application throughput (HTTP requests/seconds) for each CPU architecture during the migration.

Figure 3 HTTP 2XX per CPU architecture

Figure 4 shows an example of application response time during the transition from x86-based instances to Graviton-based instances. The latency associated with each instance family grows and shrinks as the live request simulation changes the load on the application. In this example, the latency on x86 instances (until 07:00) grew up to 300ms because most of the request load was directed at to x86-based pods. It began to converge at around 08:00 when more pods were powered by Graviton-based instances. Finally, after 15:00, the request load was processed by Graviton-based instances entirely.

Figure 4: Target Response Time p99

Config 2 – Karpenter Controlled Configuration

After fully testing the application on Graviton-based EC2 instances, you are ready to simplify the deployment topology with weighted provisioners while preserving the ability to launch x86-based instances as needed.

Here are the steps to create Config 2:

Reuse the CPU-based provisioners from the previous topology, but assign a higher .spec.weight to Graviton-based instances provisioner. The x86 provisioner is still deployed in case x86-based instances are required. The karpenter.k8s.aws/instance-family can be expanded beyond those set in Config 1 or excluded by switching the operator to NotIn.

The x86-based Amazon EC2 instances Karpenter provisioner:

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: x86provisioner
spec:
  requirements:
  - key: kubernetes.io/arch
    operator: In
    values: [amd64]

The Graviton-based Amazon EC2 instances Karpenter provisioner:

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: priority-arm64provisioner
spec:
  weight: 10
  requirements:
  - key: kubernetes.io/arch
    operator: In
    values: [arm64]

Next, merge the two Kubernetes deployments into one deployment similar to the original before migration (i.e., no specific nodeSelector that points to a CPU-specific provisioner).

The two services are also combined into a single Kubernetes service and the actions.weighted-routing annotation is removed from the ingress resources:

spec:
  rules:
    - http:
        paths:
          - path: /app
            pathType: Prefix
            backend:
              service:
                name: simplemultiarchapp-svc

Unite the two KEDA ScaledObject resources from the first configuration and point them to a single deployment, e.g., simplemultiarchapp. The new KEDA ScaledObject will be:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: simplemultiarchapp-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: simplemultiarchapp
…

Figure 5 Config 2 – Weighting provisioners results

A synthetic limit on Graviton CPU capacity is set to illustrate the scaling to x86_64 CPUs (Provisioner.limits.resources.cpu). The total application throughput (figure 6) is shown by aarch64_200 (blue) and x86_64_200 (orange). Mixing CPUs did not impact the target response time (Figure 6). Karpenter behaved as expected: prioritizing Graviton-based instances, and bursting to x86-based Amazon EC2 instances when CPU limits were crossed.

Figure 6 Config 2 -HTTP response time p99 with mixed-CPU provisioner

Conclusion

The use of a mixed-CPU architecture enables your application to utilize a wide selection of Amazon EC2 instance types and improves your applications’ resilience while meeting your service-level objectives. Application metrics can be used to control the migration with AWS ALB Ingress, Karpenter, and KEDA. Moreover, AWS Graviton-based Amazon EC2 instances can deliver up to 40% better price performance than x86-based Amazon EC2 instances. Learn more about this example on GitHub and more announcements about Gravtion.

Directing ML-powered Operational Insights from Amazon DevOps Guru to your Datadog event stream

2023-07-13 Bineesh Ravindran

Post Syndicated from Bineesh Ravindran original https://aws.amazon.com/blogs/devops/directing_ml-powered_operational_insights_from_amazon_devops_guru_to_your_datadog_event_stream/

Amazon DevOps Guru is a fully managed AIOps service that uses machine learning (ML) to quickly identify when applications are behaving outside of their normal operating patterns and generates insights from its findings. These insights generated by DevOps Guru can be used to alert on-call teams to react to anomalies for business mission critical workloads. If you are already utilizing Datadog to automate infrastructure monitoring, application performance monitoring, and log management for real-time observability of your entire technology stack, then this blog is for you.

You might already be using Datadog for a consolidated view of your Datadog Events interface to search, analyze and filter events from many different sources in one place. Datadog Events are records of notable changes relevant for managing and troubleshooting IT Operations, such as code, deployments, service health, configuration changes and monitoring alerts.

Wherever DevOps Guru detects operational events in your AWS environment that could lead to outages, it generates insights and recommendations. These insights/recommendations are then pushed to a user specific Datadog endpoint using Datadog events API. Customers can then create dashboards, incidents, alarms or take corrective automated actions based on these insights and recommendations in Datadog.

Datadog collects and unifies all of the data streaming from these complex environments, with a 1-click integration for pulling in metrics and tags from over 90 AWS services. Companies can deploy the Datadog Agent directly on their hosts and compute instances to collect metrics with greater granularity—down to one-second resolution. And with Datadog’s out-of-the-box integration dashboards, companies get not only a high-level view into the health of their infrastructure and applications but also deeper visibility into individual services such as AWS Lambda and Amazon EKS.

This blogpost will show you how to utilize Amazon DevOps guru with Datadog to get real time insights and recommendations on their AWS Infrastructure. We will demonstrate how an insight generated by Amazon DevOps Guru for an anomaly can automatically be pushed to Datadog’s event streams which can then be used to create dashboards, create alarms and alerts to take corrective actions.

Solution Overview

When an Amazon DevOps Guru insight is created, an Amazon EventBridge rule is used to capture the insight as an event and routed to an AWS Lambda Function target. The lambda function interacts with Datadog using a REST API to push corresponding DevOps Guru events captured by Amazon EventBridge

The EventBridge rule can be customized to capture all DevOps Guru insights or narrowed down to specific insights. In this blog, we will be capturing all DevOps Guru insights and will be performing actions on Datadog for the below DevOps Guru events:

DevOps Guru New Insight Open
DevOps Guru New Anomaly Association
DevOps Guru Insight Severity Upgraded
DevOps Guru New Recommendation Created
DevOps Guru Insight Closed

Figure 1: Amazon DevOps Guru Integration with Datadog with Amazon EventBridge and AWS.

Solution Implementation Steps

Pre-requisites

Before you deploy the solution, complete the following steps.

- Datadog Account Setup: We will be connecting your AWS Account with Datadog. If you do not have a Datadog account, you can request a free trial developer instance through Datadog.
- Datadog Credentials: Gather the credentials of Datadog keys that will be used to connect with AWS. Follow the steps below to create an API Key and Application Key
  Add an API key or client token
  1. 1. To add a Datadog API key or client token:
    2. Navigate to Organization settings, then click the API keys or Client Tokens
    3. Click the New Key or New Client Token button, depending on which you’re creating.
    4. Enter a name for your key or token.
    5. Click Create API key or Create Client Token.
    6. Note down the newly generated API Key value. We will need this in later steps
    7. Figure 2: Create new API Key.
  Add application keys
  - To add a Datadog application key, navigate to Organization Settings > Application Keys.If you have the permission to create application keys, click New Key.Note down the newly generated Application Key. We will need this in later steps

Add Application Key and API Key to AWS Secrets Manager : Secrets Manager enables you to replace hardcoded credentials in your code, including passwords, with an API call to Secrets Manager to retrieve the secret programmatically. This helps ensure the secret can’t be compromised by someone examining your code,because the secret no longer exists in the code.
Follow below steps to create a new secret in AWS Secrets Manager.

Open the Secrets Manager console at https://console.aws.amazon.com/secretsmanager/
Choose Store a new secret.
On the Choose secret type page, do the following:
1. For Secret type, choose other type of secret.
2. In Key/value pairs, either enter your secret in Key/value
  pairs

Figure 3: Create new secret in Secret Manager.

Click next and enter “DatadogSecretManager” as the secret name followed by Review and Finish

Figure 4: Configure secret in Secret Manager.

- - Enable DevOps Guru for your applications by following these steps or you can follow this blog to deploy a sample serverless application that can be used to generate DevOps Guru insights for anomalies detected in the application.
  - AWS Cloud9 is recommended to create an environment as AWS Serverless Application Model (SAM) CLI and AWS Command Line Interface (CLI) are pre-installed and can be accessed from a bash terminal.
  - Install and set up SAM CLI – Install the SAM CLI
  - Download and set up Java. The version should be matching to the runtime that you defined in the SAM template. yaml Serverless function configuration – Install the Java SE Development Kit 11
  - Maven – Install Maven

Option 1: Deploy Datadog Connector App from AWS Serverless Repository

The DevOps Guru Datadog Connector application is available on the AWS Serverless Application Repository which is a managed repository for serverless applications. The application is packaged with an AWS Serverless Application Model (SAM) template, definition of the AWS resources used and the link to the source code. Follow the steps below to quickly deploy this serverless application in your AWS account

- - Login to the AWS management console of the account to which you plan to deploy this solution.
  - Go to the DevOps Guru Datadog Connector application in the AWS Serverless Repository and click on “Deploy”.
  - The Lambda application deployment screen will be displayed where you can enter the Datadog Application name
    
    Figure 5: DevOps Guru Datadog connector.
    
    Figure 6: Serverless Application DevOps Guru Datadog connector.
  - After successful deployment the AWS Lambda Application page will display the “Create complete” status for the serverlessrepo-DevOps-Guru-Datadog-Connector application. The CloudFormation template creates four resources,
    1. Lambda function which has the logic to integrate to the Datadog
    2. Event Bridge rule for the DevOps Guru Insights
    3. Lambda permission
    4. IAM role
  - Now skip Option 2 and follow the steps in the “Test the Solution” section to trigger some DevOps Guru insights/recommendations and validate that the events are created and updated in Datadog.

Option 2: Build and Deploy sample Datadog Connector App using AWS SAM Command Line Interface

As you have seen above, you can directly deploy the sample serverless application form the Serverless Repository with one click deployment. Alternatively, you can choose to clone the GitHub source repository and deploy using the SAM CLI from your terminal.

The Serverless Application Model Command Line Interface (SAM CLI) is an extension of the AWS CLI that adds functionality for building and testing serverless applications. The CLI provides commands that enable you to verify that AWS SAM template files are written according to the specification, invoke Lambda functions locally, step-through debug Lambda functions, package and deploy serverless applications to the AWS Cloud, and so on. For details about how to use the AWS SAM CLI, including the full AWS SAM CLI Command Reference, see AWS SAM reference – AWS Serverless Application Model.

Before you proceed, make sure you have completed the pre-requisites section in the beginning which should set up the AWS SAM CLI, Maven and Java on your local terminal. You also need to install and set up Docker to run your functions in an Amazon Linux environment that matches Lambda.

Clone the source code from the github repo

git clone https://github.com/aws-samples/amazon-devops-guru-connector-datadog.git

Build the sample application using SAM CLI

$cd DatadogFunctions

$sam build
Building codeuri: $\amazon-devops-guru-connector-datadog\DatadogFunctions\Functions runtime: java11 metadata: {} architecture: x86_64 functions: Functions
Running JavaMavenWorkflow:CopySource
Running JavaMavenWorkflow:MavenBuild
Running JavaMavenWorkflow:MavenCopyDependency
Running JavaMavenWorkflow:MavenCopyArtifacts

Build Succeeded

Built Artifacts  : .aws-sam\build
Built Template   : .aws-sam\build\template.yaml

Commands you can use next
=========================
[*] Validate SAM template: sam validate
[*] Invoke Function: sam local invoke
[*] Test Function in the Cloud: sam sync --stack-name {{stack-name}} --watch
[*] Deploy: sam deploy --guided

This command will build the source of your application by installing dependencies defined in Functions/pom.xml, create a deployment package and saves it in the. aws-sam/build folder.

Deploy the sample application using SAM CLI

$sam deploy --guided

This command will package and deploy your application to AWS, with a series of prompts that you should respond to as shown below:

- - Stack Name: The name of the stack to deploy to CloudFormation. This should be unique to your account and region, and a good starting point would be something matching your project name.
  - AWS Region: The AWS region you want to deploy your application to.
  - Confirm changes before deploy: If set to yes, any change sets will be shown to you before execution for manual review. If set to no, the AWS SAM CLI will automatically deploy application changes.
  - Allow SAM CLI IAM role creation:Many AWS SAM templates, including this example, create AWS IAM roles required for the AWS Lambda function(s) included to access AWS services. By default, these are scoped down to minimum required permissions. To deploy an AWS CloudFormation stack which creates or modifies IAM roles, the CAPABILITY_IAM value for capabilities must be provided. If permission isn’t provided through this prompt, to deploy this example you must explicitly pass --capabilities CAPABILITY_IAM to the sam deploy command.
  - Disable rollback [y/N]: If set to Y, preserves the state of previously provisioned resources when an operation fails.
  - Save arguments to configuration file (samconfig.toml): If set to yes, your choices will be saved to a configuration file inside the project, so that in the future you can just re-run sam deploy without parameters to deploy changes to your application.

After you enter your parameters, you should see something like this if you have provided Y to view and confirm ChangeSets. Proceed here by providing ‘Y’ for deploying the resources.

Initiating deployment
=====================

        Uploading to sam-app-datadog/0c2b93e71210af97a8c57710d0463c8b.template  1797 / 1797  (100.00%)


Waiting for changeset to be created..

CloudFormation stack changeset
---------------------------------------------------------------------------------------------------------------------
Operation                     LogicalResourceId             ResourceType                  Replacement
---------------------------------------------------------------------------------------------------------------------
+ Add                         FunctionsDevOpsGuruPermissi   AWS::Lambda::Permission       N/A
                              on
+ Add                         FunctionsDevOpsGuru           AWS::Events::Rule             N/A
+ Add                         FunctionsRole                 AWS::IAM::Role                N/A
+ Add                         Functions                     AWS::Lambda::Function         N/A
---------------------------------------------------------------------------------------------------------------------


Changeset created successfully. arn:aws:cloudformation:us-east-1:867001007349:changeSet/samcli-deploy1680640852/bdc3039b-cdb7-4d7a-a3a0-ed9372f3cf9a


Previewing CloudFormation changeset before deployment
======================================================
Deploy this changeset? [y/N]: y

2023-04-04 15:41:06 - Waiting for stack create/update to complete

CloudFormation events from stack operations (refresh every 5.0 seconds)
---------------------------------------------------------------------------------------------------------------------
ResourceStatus                ResourceType                  LogicalResourceId             ResourceStatusReason
---------------------------------------------------------------------------------------------------------------------
CREATE_IN_PROGRESS            AWS::IAM::Role                FunctionsRole                 -
CREATE_IN_PROGRESS            AWS::IAM::Role                FunctionsRole                 Resource creation Initiated
CREATE_COMPLETE               AWS::IAM::Role                FunctionsRole                 -
CREATE_IN_PROGRESS            AWS::Lambda::Function         Functions                     -
CREATE_IN_PROGRESS            AWS::Lambda::Function         Functions                     Resource creation Initiated
CREATE_COMPLETE               AWS::Lambda::Function         Functions                     -
CREATE_IN_PROGRESS            AWS::Events::Rule             FunctionsDevOpsGuru           -
CREATE_IN_PROGRESS            AWS::Events::Rule             FunctionsDevOpsGuru           Resource creation Initiated
CREATE_COMPLETE               AWS::Events::Rule             FunctionsDevOpsGuru           -
CREATE_IN_PROGRESS            AWS::Lambda::Permission       FunctionsDevOpsGuruPermissi   -
                                                            on
CREATE_IN_PROGRESS            AWS::Lambda::Permission       FunctionsDevOpsGuruPermissi   Resource creation Initiated
                                                            on
CREATE_COMPLETE               AWS::Lambda::Permission       FunctionsDevOpsGuruPermissi   -
                                                            on
CREATE_COMPLETE               AWS::CloudFormation::Stack    sam-app-datadog               -
---------------------------------------------------------------------------------------------------------------------


Successfully created/updated stack - sam-app-datadog in us-east-1

Once the deployment succeeds, you should be able to see the successful creation of your resources. Also, you can find your Lambda, IAM Role and EventBridge Rule in the CloudFormation stack output values.

You can also choose to test and debug your function locally with sample events using the SAM CLI local functionality.Test a single function by invoking it directly with a test event. An event is a JSON document that represents the input that the function receives from the event source. Refer the Invoking Lambda functions locally – AWS Serverless Application Model link here for more details.

$ sam local invoke Functions -e ‘event/event.json’

Once you are done with the above steps, move on to “Test the Solution” section below to trigger some DevOps Guru insights and validate that the events are created and pushed to Datadog.

Test the Solution

To test the solution, we will simulate a DevOps Guru Insight. You can also simulate an insight by following the steps in this blog. After an anomaly is detected in the application, DevOps Guru creates an insight as shown below

Figure 7: DevOps Guru insight for DynamoDB

For the DevOps Guru insight shown above, a corresponding event is automatically created and pushed to Datadog as shown below. In addition to the events creation, any new anomalies and recommendations from DevOps Guru is also associated with the events

Figure 8 : DevOps Guru Insight pushed to Datadog event stream.

Cleaning Up

To delete the sample application that you created, In your Cloud 9 environment open a new terminal. Now type in the AWS CLI command below and pass the stack name you provided in the deploy step

aws cloudformation delete-stack --stack-name <Stack Name>

Alternatively ,you could also use the AWS CloudFormation Console to delete the stack

Conclusion

This article highlights how Amazon DevOps Guru monitors resources within a specific region of your AWS account, automatically detecting operational issues, predicting potential resource exhaustion, identifying probable causes, and recommending remediation actions. It describes a bespoke solution enabling integration of DevOps Guru insights with Datadog, enhancing management and oversight of AWS services. This solution aids customers using Datadog to bolster operational efficiencies, delivering customized insights, real-time alerts, and management capabilities directly from DevOps Guru, offering a unified interface to swiftly restore services and systems.

To start gaining operational insights on your AWS Infrastructure with Datadog head over to Amazon DevOps Guru documentation page.

About the authors:

How to enforce multi-party approval for creating Matter-compliant certificate authorities

2023-07-12 Ram Ramani

Post Syndicated from Ram Ramani original https://aws.amazon.com/blogs/security/how-to-enforce-multi-party-approval-for-creating-matter-compliant-certificate-authorities/

Customers who build smart home devices using the Matter protocol from the Connectivity Standards Alliance (CSA) need to create and maintain digital certificates, called device attestation certificates (DACs), to allow their devices to interoperate with devices from other vendors. DACs must be issued by a Matter device attestation certificate authority (CA). The CSA mandates multi-party approval for creating Matter compliant CAs. You can use AWS Private CA to create your Matter device attestation CA, which will store two important certificates: the product attestation authority (PAA) and product attestation intermediate (PAI) certificate. The PAA is the root CA that signs the PAI; the PAI is the intermediate CA that issues DACs. In this blog post, we will show how to implement multi-party approval for creation of these two certificates within AWS Private CA.

The CSA allows the use of delegated service providers (DSP) to provide you with public key infrastructure (PKI) services to create your Matter device attestation CA. You can use AWS Private CA as a DSP to create a Matter device attestation CA to issue DACs.

You should carefully plan to implement and demonstrate compliance with the Matter PKI Certificate Policy (CP) requirements when you issue Matter certificates by using the CA infrastructure services provided by AWS Private CA. Matter PKI CP is not just a technical standard; it also covers people, processes, and technology. For information about the requirements to comply with the Matter PKI CP and a reference list of acronyms, see the Matter PKI Compliance Guide. In this blog post, we address one of the Matter requirements for technical security controls for implementing multi-party approval for the creation of PAA and PAI certificates

Note: The solution presented in this post uses AWS Systems Manager Change Manager, a capability of AWS Systems Manager, for demonstrating multi-party approval as required by the Matter CP for the creation of the PAA and PAI. Additionally, the solution also uses AWS Systems Manager documents (SSM documents), which contain the code to automate the creation of PAA and PAI DAC certificates.

Implementing multi-party approval: Personas and IAM roles

For the process of achieving the multi-party approval required for Matter compliance, we will use the following human personas:

Jane Doe and Paulo Santos as the two approvers responsible for signing off on the creation of PAA and PAI.
Shirley Rodriguez as the persona responsible for setting up the prerequisite infrastructure and creating the change template that governs the multi-party approval process and specifying the human personas who are authorized to approve change requests.
Richard Roe as the persona responsible for reviewing and approving change template changes made by Shirley Rodriguez, to verify the separation of duties.

AWS offers support for identity federation to enable federated single sign-on (SSO). This allows users to sign into the AWS Management Console or call AWS API operations by using the credentials of an IAM role. To establish a secure authentication and authorization model, we highly recommend that you map the identities of the human personas to IAM roles.

As a prerequisite, Shirley Rodriguez will create the following AWS Identity and Access Management (IAM) roles that support the multi-party approval operations:

TmpltReview-Role — Richard Roe will assume this role to review and approve changes to the change template that is used to run the SSM document to create the matter CAs.
CreatePAA-Role and CreatePAI-Role — Clone the solution GitHub repository and create the roles by using the policies from the repository:
- CreatePAA-Role — This role is assumed by the AWS Systems Manager service to create the PAA.
- CreatePAI-Role — This role is assumed by the AWS Systems Manager service to create the PAI.
MatterCA-Admin-1 and MatterCA-Admin-2 — Jane Doe will use the MatterCA-Admin-1 role, while Paulo Santos will use the MatterCA-Admin-2 role. These individuals will serve as the two approvers for the multi-party approval process.

Note: It’s important that one person cannot approve an action by themselves. If a person is allowed to assume the MatterCA-Admin-1 role, they must not be allowed to assume the MatterCA-Admin-2 role also. If the same person can assume both roles, then that person can bypass the requirement for two different people to approve.

To create the IAM roles

Create IAM roles MatterCA-Admin-1 and MatterCA-Admin-2, and attach the following AWS-managed policies:
- AmazonSSMAutomationApproverAccess
- AmazonSSMReadOnlyAccess
You should configure the trust relationship to allow Jane Doe to use the Matter-CA-Admin-1-Role and Paulo Santos to use the Matter-CA-Admin-2-Role for the multi-party approval process. This is intended to restrict Jane Doe and Paulo Santos from assuming each other’s roles. Use the following policy as a guide, and make sure to replace <AccountNumber> and <Role_Name> with your own information, depending on the federated identity models that you have chosen.
```
{

    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "aws:PrincipalARN":"arn:aws:iam::<AccountNumber>:role/<Role-Name>"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
```

Create the IAM role TmpltReview-Role, and attach the following policies.

AmazonSSMReadOnlyAccess
Attach the following custom inline policy to enable review and approval of the change template.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "TemplateReviewer",
            "Effect": "Allow",
            "Action": "ssm:UpdateDocumentMetadata",
            "Resource": "*"
        }
    ]
}

Modify the trust relationship to allow only Richard Roe to use the role, as shown in the following policy. Make sure to replace <AccountNumber> and <Role-Name> with your own information.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS":"aws:PrincipalARN":"arn:aws:iam::<AccountNumber>:role/<Role-Name>"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}

Create the IAM role CreatePAA-Role, which will be used by the AWS Systems Manager change template to run the SSM document to create PAA.

Attach the following inline policy to the CreatePAA-Role.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "acm-pca:ImportCertificateAuthorityCertificate",
                "acm-pca:IssueCertificate",
                "acm-pca:CreateCertificateAuthority",
                "acm-pca:GetCertificate",
                "acm-pca:GetCertificateAuthorityCsr",
                "acm-pca:DescribeCertificateAuthority"
            ],
            "Resource": "*"
        }
    ]
}

Modify the trust relationship for CreatePAA-Role to allow only the AWS Systems Manager service to assume this role, as shown following.

{


    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "ssm.amazonaws.com"
            },
            "Action": "sts:AssumeRole",
            "Condition": {}
        }
    ]

Create the IAM role CreatePAI-Role, which will be used by the change template to run the SSM document to create the PAI certificate.

Attach the following policy as an inline policy on the CreatePAI-Role.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "acm-pca:ImportCertificateAuthorityCertificate",
                "acm-pca:CreateCertificateAuthority",
                "acm-pca:GetCertificateAuthorityCsr",
                "acm-pca:DescribeCertificateAuthority"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "acm-pca:GetCertificateAuthorityCertificate",
                "acm-pca:GetCertificate",
                "acm-pca:IssueCertificate"
            ],
            "Resource": “*”
        }
    ]
}

Modify the trust relationship for CreatePAI-Role to allow only AWS Systems Manager to assume this role, as shown following.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "ssm.amazonaws.com"
            },
            "Action": "sts:AssumeRole",
            "Condition": {}
        }
    ]
}

Preventive security controls recommended for this solution

We recommend that you apply the following security controls for this solution:

Dedicate an AWS account to this solution – It is important that the only users who can perform actions on the PAA and PAI are the users in this account. By deploying these items in a dedicated AWS account, you limit the number of users who might have elevated privileges, but don’t have cause to use those privileges here.
SCPs (service control policies) – The IAM policies in this solution do not prevent someone with privileges, such as an administrator, from bypassing your expected controls and approving usage of the CA. SCPs, if they are applied by using AWS Organizations, can restrict the creation of CAs (certificate authorities) exclusively to CreatePAA-Role and CreatePAI-Role.
The following example SCP will enforce this type of restriction. Make sure to replace <AccountNumber> with your own information. With a strong SCP, even the root account will not be able to perform these operations or change these security controls.
```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "RestrictCACreation",
            "Effect": "Deny",
            "Action": ["acm-pca:CreateCertificateAuthority"],
            "Resource": "*",
            "Condition": {
                "StringNotLike": {
                    "aws:PrincipalARN": [
                        "arn:aws:iam::<AccountNumber>:role/CreatePAA-Role",
                        "arn:aws:iam::<AccountNumber>:role/CreatePAI-Role"
                    ]
                }
            }
        }
    ]
 }
```

AWS Systems Manager configuration

Shirley Rodriguez will download the following sample Systems Manager (SSM) documents from our GitHub repository and perform the listed steps in this section.

The content in these yaml files will be used in the next steps to create SSM documents.

Create the SSM document

The first step is to create the SSM document that automates resource creation of the PAA and PAI in AWS Private CA.

To create the SSM document

Open the Systems Manager console.
In the left navigation pane, under Shared Resources, choose Documents.
Choose the Owned by me tab, choose Create document, and from the dropdown list, choose Automation.

Figure 1: Create the automation document
Under Create automation, choose the Editor tab, and then choose Edit.

Figure 2: Automation document editor
Copy the sample automation code from the file CreatePAA.yaml that you downloaded from the GitHub repository, and paste it into the editor.
For Name, enter CreatePAA, and then choose Create automation.
To check that the CreatePAA document was created successfully, choose the Owned by me tab. You should see the Systems Manager document, as shown in Figure 3.

Figure 3: Successful creation of the CreatePAA document
Repeat the preceding steps for creating the PAI. Make sure that you paste the code from the file CreatePAI.yaml into the editor and enter the name CreatePAI to create the PAI CA.
To check that the CreatePAI document was created successfully, choose the Owned by me tab. You should see the CreatePAI Systems Manager document, as shown in Figure 4.

Figure 4: Successful creation of the PAA and PAI documents

You’ve now completed the creation of an SSM document that contains the automation code to create certificate authorities PAA and PAI. The next step is to create Change Manager templates, which will use the SSM document and apply multi-party approval before the creation of the PAA and PAI.

Create the Change Manager templates

Shirley Rodriguez will next create two change templates that run the SSM documents: one for the PAA and one for the PAI.

To create the change templates

Open the Systems Manager console.
In the left navigation pane, under Change Management, choose Change Manager.
On the Change Manager page, in the top right, choose Create template.
For Name, enter CreatePAATemplate.
In the Builder section, add a description (optional), and for Runbook, search and select CreatePAA. Keep the defaults for the other selections.

Figure 5: Select the runbook CreatePAA in the change template
Scroll down to the Change request approvals section and choose Add approval level. This is where you configure multi-party approval for the change template.
Because there are two approvers, for Number of approvers required at this level, choose 1 from the dropdown.
Choose Add approver, choose Template specified approvers, and then select the MatterCA-Admin-1. Then choose Add another approval level for the second approver.

Figure 6: Add first level approver for the template
Choose Template specified approvers, and then select the MatterCA-Admin-2 role for multi-party approval. These roles can now approve the change request.

Figure 7: Add second level approver for the template.
Keep the defaults for the rest of the options, and at the bottom of the screen, choose Save and preview.
On the preview screen, review the configurations, and then on the top right, choose Submit for review. This pushes the template to be reviewed by template reviewer Richard Roe. In the Templates tab, the template status shows as Pending review.

Figure 8: Template with a status of pending review
Repeat the preceding steps to create the PAI change template. Make sure to name it CreatePAITemplate, and at step 5, for Runbook, select the CreatePAI document.

Figure 9: Both templates ready for review

You’ve successfully created two change templates, CreatePAATemplate and CreatePAITemplate, that generate a change request that contains an SSM document with automation code for building the PAA and PAI. These change requests are configured with multi-party approval before running the SSM document. However, before you can proceed with running the change template, it must undergo review and approval by the template reviewer Richard Roe.

Review and approve the Change Manager templates

First you need to make sure that TmpltReview-Role is added as a reviewer and approver of change templates. Shirley Rodriguez will follow the steps in this section to add TmpltReview-Role as change template reviewer.

To add the change template reviewer

Follow the instructions in the Systems Manager documentation to configure the IAM role TmpltReview-Role to review and approve the change template. Figure 10 shows how this setup looks in the Systems Manager console.

Figure 10: The template reviewer role in Settings

Now you have TmpltReview-Role added as a reviewer. Change templates that are created or modified will now need to be reviewed and approved by this role. Richard Roe will use the role TmpltReview-Role for governance of change templates, to make sure changes made by Shirley Rodriguez are in alignment with the organization’s compliance needs for Matter.
Richard Roe will follow the steps in the Systems Manager documentation for reviewing and approving change templates, to approve CreatePAATemplate and CreatePAITemplate. After the change template is approved, its status changes to Approved, and it’s ready to be run.

Figure 11: Change template approval details

You now have the change templates CreatePAATemplate and CreatePAITemplate in approved status.

Create the PAA and PAI with multi-party approval for Matter compliance

Up to this point, these instructions have described one-time configurations of AWS Systems Manager to set up the IAM roles, SSM documents, and change templates that are required to enforce multi-party approval. Now you are ready to use these change templates to create the PAA and PAI and perform multi-party approval.

Shirley Rodriguez will generate change requests that require approval from Jane Doe and Paulo Santos. This manual approval process will then run the SSM documents to create the PAA and PAI.

Create a change request for the PAA

Perform the following steps to create a change request for the PAA.

To create a change request for the PAA

Open the Systems Manager console.
In the left navigation pane, choose Change Manager, and then choose Create request.
Search for and select CreatePAATemplate, and then choose Next.
For Name, enter the name CreatePAA_ChangeRequest.
(Optional) For Change request information, provide additional information about the change request.
For Workflow start time, choose Run the operation as soon as possible after approval to run the change immediately after the request is approved.
For Change request approvals, validate that the list of First-level approvals includes the change request approvers MatterCA-Admin-1 and MatterCA-Admin-2, which you configured previously in the section Create Change Manager template. Then choose Next.

Figure 12: Change request approvers
For Automation assume role, select the IAM role CreatePAA_Role for creating the PAA.

Figure 13: Change request role
For Runbook parameters, enter the PAA certificate details for CommonName, Organization, VendorId, and ValidityInYears, and then choose Next.
Review the change request content and then, at the bottom of the screen, choose Submit for approval. Optionally, you can set up an SNS topic to notify the approvers.

You have successfully created a change request that is currently awaiting approval from Jane Doe and Paulo Santos. Let’s now move on to the approval steps.

Multi-party approval: Approve the change request for the PAA

Each of the approvers should now follow the steps in this section for approval. Jane Doe will use the IAM role MatterCA-Admin-1, while Paulo Santos will need to use the IAM role MatterCA-Admin-2.

To approve the change request for the PAA

Open the Systems Manager console and do the following.
1. In the navigation pane, choose Change Manager.
2. Choose the Approvals tab, select the CreatePAA change request, and then choose Approve.
Figure 14: Change request approval

After Jane Doe and Paulo Santos each follow these steps to approve the change request, the change request will run and will complete with status “Success,” and the PAA will be created in AWS Private CA.
Check that the status of the change request is Success, as shown in Figure 15.

Figure 15: The change request ran successfully

Validate that the PAA is created in AWS Private CA

Next, you need to validate that the PAA was created successfully and copy its Amazon Resource Name (ARN) to use for PAI creation.

To validate the creation of the PAA and retrieve its ARN

In the AWS Private CA console, choose the PAA CA that you created in the previous step.
Copy the ARN of the PAA root CA. You will use PAA CA ARN when you set up the PAI change request.

Figure 16: ARN of the PAA root CA PAA

After successfully completing these steps, you have created the certificate authority PAA by using AWS Private CA with multi-party approval. You can now proceed to the next section, where we will demonstrate how to use this PAA to issue a CA for PAI.

Create a change request for the PAI

Perform the following steps to create a change request for the PAI.

Note: Creation of a valid PAA is a prerequisite for creating the PAI in the following steps.

To create a change request for the PAI

After the PAA is created successfully, you can complete the creation of the PAI by repeating the same steps that you did in the Create a change request for the PAA section, but with the following changes:
1. At step 3, make sure that you search for and select CreatePAITemplate.
  
  Figure 17: CreatePAITemplate template
2. At step 4, for Name, enter CreatePAI_ChangeRequest.
3. At step 8, for Automation assume role, select the IAM role CreatePAI_Role.
  
  Figure 18: Change request IAM role selection
4. At step 9, for Runbook parameters, enter the PAA CA ARN that you made note of earlier, along with the CommonName, Organization, VendorId, ProductId, and ValidityInYears for the PAI, and then choose Next.
Multi-party approval: Approve the change request for the PAI

Each of the approvers should now follow the steps in this section for approval for the PAI. Jane Doe will need to use IAM role MatterCA-Admin-1, and Paulo Santos will need to use IAM role MatterCA-Admin-2.

To approve the change request for the PAI
1. Open the Systems Manager console and do the following:
  1. In the navigation pane, choose Change Manager.
  2. Choose the Approvals tab, select the CreatePAI change request, and choose Approve.
  After both approvers (Jane Doe and Paulo Santos) approve the change request, the change request will run, and the PAA will be created inside AWS Private CA.
2. Check that the status of the change request shows Success, as shown in Figure 19.
  
  Figure 19: The change requests for the PAA and PAI have run successfully
3. In the AWS Private CA console, verify that the PAA and PAI have been created successfully, as shown in Figure 20.
  
  Figure 20: PAA and PAI in AWS Private CA
After successfully completing these steps, you have created the certificate authority PAI by using AWS Private CA with multi-party approval. You can now issue DAC certificates by using this PAI. Next, we will show how to validate the logs to confirm multi-party approval.

Demonstrate compliance with multi-party approval requirements of the Matter CP by using the change manager timeline

To keep audit records that show that multi-party approval was used to create the PAA and PAI to issue DACs, you can use the Change Manager timeline.

To retrieve the change manager timeline report
1. Open the Systems Manager console.
2. In the left navigation pane, choose Change Manager.
3. Choose Requests, and then select either the CreatePAA_ChangeRequest or the CreatePAI_ChangeRequest change request.
4. Select the Timeline tab. Figure 21 shows an example of a timeline with complete runbook steps for PAA creation. It also shows the two approvers, Jane Doe and Paulo Santos, approving the change request. You can use this information to demonstrate multi-party approval.
  
  Figure 21: Audit trail for approval and creation of the PAA
  
  Likewise, Figure 22 shows an example of a timeline with complete runbook steps for creating the PAI by using multi-party approval.
  
  Figure 22: Audit trail for approval and creation of the PAI
Conclusion

In this post, you learned how to use AWS Private CA to facilitate the creation of Matter CAs in compliance with the Matter PKI CP. By using AWS Systems Manager, you can effectively fulfill the technical security control outlined in the Matter PKI CP for implementing multi-party approval for the creation of PAA and PAI certificates.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Private Certificate Authority re:Post or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Ram Ramani

Ram is a Principal Security Solutions Architect at AWS with deep expertise in data protection and privacy. Ram is currently helping customers accelerate their Matter compliance needs using AWS services.

Pravin Nair

Pravin is a seasoned Senior Security Solution Architect focused on data protection and privacy. Specializing in encryption, infrastructure security, and privacy, he assists customers in developing secure and scalable solutions that align with their business requirements. Pravin’s expertise helps to provide optimal data protection while addressing evolving security challenges.

Lukas Rash

Lukas is a Software Engineer in the AWS Cryptography organization. He is passionate about building robust cloud services to help customers improve the security of their systems. He specializes in building software to help customers implement their public key infrastructures.

Consolidating controls in Security Hub: The new controls view and consolidated findings

2023-07-12 Emmanuel Isimah

Post Syndicated from Emmanuel Isimah original https://aws.amazon.com/blogs/security/consolidating-controls-in-security-hub-the-new-controls-view-and-consolidated-findings/

In this blog post, we focus on two recently released features of AWS Security Hub: the consolidated controls view and consolidated control findings. You can use these features to manage controls across standards and to consolidate findings, which can help you significantly reduce finding noise and administrative overhead.

Security Hub is a cloud security posture management service that you can use to apply security best practice controls, such as “EC2 instances should not have a public IP address.” With Security Hub, you can check that your environment is properly configured and that your existing configurations don’t pose a security risk. Security Hub has more than 200 controls that cover more than 30 AWS services, such as Amazon Elastic Compute Cloud (Amazon EC2), Amazon Simple Storage Service (Amazon S3), and AWS Lambda. In addition, Security Hub has integrations with more than 85 partner products. Security Hub can centralize findings across your AWS accounts and AWS Regions into a single delegated administrator account in your aggregation Region of choice, creating a single pane of glass to view findings. This can help you to triage, investigate, and respond to findings in a simpler way and improve your security posture.

The Security Hub controls are grouped into the following security standards:

AWS Foundational Security Best Practices (FSBP) standard, which has been curated by AWS security specialists
Payment Card Industry Data Security Standard (PCI DSS) version 3.2.1
Center for Internet Security (CIS) AWS Foundations Benchmark versions 1.2.0 and 1.4.0
National Institute of Standards and Technology (NIST) Special Publication 800-53 Rev. 5
Service Managed Standard for AWS Control Tower

With the new features — consolidated controls view and consolidated control findings—you can now do the following:

Enable or disable controls across standards in a single action. Previously, if you wanted to maintain the same enablement status of controls between standards, you had to take the same action across multiple standards (up to six times!).
If you choose to turn on consolidated control findings, you will receive only a single finding for a security check, even if the security check is enabled across several standards. This reduces the number of findings and helps you focus on the most important misconfigured resources in your AWS environment. It allows you to apply actions and updates (such as suppressing the finding or changing its severity) one time rather than having to do so multiple times across non-consolidated findings.

Overview of new features

Now we’ll discuss some of the details of how you can use the two new features to streamline the management of controls.

The new consolidated controls view

On the new Controls page, now available in the Security Hub console as shown in Figure 1, you can view and configure security controls across standards from one central location.

Figure 1: Security Hub Controls page

Before this release, controls had to be managed within the context of individual security standards. Even if the same control was part of multiple standards, the control had different IDs in each of them. With this recent release, Security Hub now assigns controls a unique security control ID across standards, so that it’s simpler for you to reference the controls and view their findings. Following the current naming convention of the AWS FSBP standard, the consolidated control IDs start with the relevant service in scope for the control. In fact, whenever possible, the new consolidated control ID is the same as the previous FSBP control ID.

For example, before this release, control IAM.6 in FSBP was also referenced as 1.14 in CIS 1.2, and 1.6 in CIS 1.4, PCI.IAM.4, and CT.IAM.6. After the release, the control is now referenced as IAM.6 in the Security Hub standards. This change does not affect the pre-existing API calls for Security Hub, such as UpdateStandardsControl, where you can still provide the previous StandardControlARN in order to make the call.

By using the new Controls view, you can understand the status of controls across your system, view control findings, and prioritize next steps without context switching. The following information is available on the Controls page of the Security Hub console:

An overall security score, which is based on the proportion of passed controls to the total number of enabled controls.
A breakdown of security checks across controls, with the percentage of failed security checks highlighted. Because many controls can contain multiple security checks and multiple findings, this value might be different from the security score, which considers controls as a single object. You can use this metric, as well as your security score, to monitor your progress as you work to remediate findings.
A list of controls that are categorized into different tabs based on enablement and compliance status. If you are an administrator of an organization within Security Hub, the enablement and compliance status will reflect the aggregate status of the entire organization. In your finding aggregation Region, the status will also be aggregated across linked Regions.

From the controls page, you can select a control to view its details (including its title and the standards it belongs to), and view and act on the findings generated by the control.

Security Hub also offers new API operations that match the capabilities of the controls page. Unlike the pre-existing API operations, these new API operations use the consolidated control IDs (also known as security control IDs) to provide a way to know and manage the relationship between controls and standards. You can use these API operations to manage each Security Hub control across standards, to make sure that the status of controls in the standards is aligned. The new API operations include the following:

BatchGetSecurityControls — Given a list of security control IDs, returns the full definition of those controls.
BatchGetStandardsControlAssociations — Given a list of security control IDs and standards, returns whether each control is turned on in the relevant standard or not.
BatchUpdateStandardsControlAssociations — Allows you to update the enablement status of controls across standards. You can use this API operation to align the status of a security control across standards. For more information, see Enabling and disabling controls in all standards.
ListSecurityControlDefinitions — Returns the list of security controls that are a part of a specific standard.
ListStandardsControlAssociations –— Given a security control ID, returns the standards that the control is a part of.

We also provide an example script that makes use of these API calls and applies them across accounts and Regions so that your configuration is consistent. You can use our script to enable or disable Security Hub controls across your various accounts or Regions.

Consolidating control findings between standards

Before we released the consolidated control findings feature, Security Hub generated separate findings per standard for each related control. Now, you can turn on consolidated control findings, and after doing so, Security Hub will produce a single finding per security check, even when the underlying control is shared across multiple standards. Having a single finding per check across standards will help you investigate, update, and remediate failed findings more quickly, while also reducing finding noise.

As an example, we can look at control CloudTrail.2, which is shared between standards supported by Security Hub. Before you turn on this capability, you might potentially receive up to six findings for each security check generated by this control—with one finding for each security standard. After you turn on consolidated control findings, these older findings will be archived and Security Hub will generate one finding per security check in this control, regardless of how many security standards you have enabled. For an example of how the standard-specific findings compare to the new consolidated finding, see Sample control findings. The following is an example of a consolidated finding for the CloudTrial.2 control; we’ve highlighted the part that shows this finding is shared across standards.

{
  "SchemaVersion": "2018-10-08",
  "Id": "arn:aws:securityhub:us-east-2:123456789012:security-control/CloudTrail.2/finding/a1b2c3d4-5678-90ab-cdef-EXAMPLE11111",
  "ProductArn": "arn:aws:securityhub:us-east-2::product/aws/securityhub",
  "ProductName": "Security Hub",
  "CompanyName": "AWS",
  "Region": "us-east-2",
  "GeneratorId": "security-control/CloudTrail.2",
  "AwsAccountId": "123456789012",
  "Types": [
    "Software and Configuration Checks/Industry and Regulatory Standards"
  ],
  "FirstObservedAt": "2022-10-06T02:18:23.076Z",
  "LastObservedAt": "2022-10-28T16:10:06.956Z",
  "CreatedAt": "2022-10-06T02:18:23.076Z",
  "UpdatedAt": "2022-10-28T16:10:00.093Z",
  "Severity": {
    "Label": "MEDIUM",
    "Normalized": "40",
    "Original": "MEDIUM"
  },
  "Title": "CloudTrail should have encryption at-rest enabled",
  "Description": "This AWS control checks whether AWS CloudTrail is configured to use the server-side encryption (SSE) AWS Key Management Service (AWS KMS) customer master key (CMK) encryption. The check will pass if the KmsKeyId is defined.",
  "Remediation": {
    "Recommendation": {
      "Text": "For directions on how to correct this issue, consult the AWS Security Hub controls documentation.",
      "Url": "https://docs.aws.amazon.com/console/securityhub/CloudTrail.2/remediation"
    }
  },
  "ProductFields": {
    "RelatedAWSResources:0/name": "securityhub-cloud-trail-encryption-enabled-fe95bf3f",
    "RelatedAWSResources:0/type": "AWS::Config::ConfigRule",
    "aws/securityhub/ProductName": "Security Hub",
    "aws/securityhub/CompanyName": "AWS",
    "Resources:0/Id": "arn:aws:cloudtrail:us-east-2:123456789012:trail/AWSMacieTrail-DO-NOT-EDIT",
    "aws/securityhub/FindingId": "arn:aws:securityhub:us-east-2::product/aws/securityhub/arn:aws:securityhub:us-east-2:123456789012:security-control/CloudTrail.2/finding/a1b2c3d4-5678-90ab-cdef-EXAMPLE11111"
  }
  "Resources": [
    {
      "Type": "AwsCloudTrailTrail",
      "Id": "arn:aws:cloudtrail:us-east-2:123456789012:trail/AWSMacieTrail-DO-NOT-EDIT",
      "Partition": "aws",
      "Region": "us-east-2"
    }
  ],
  "Compliance": {
    "Status": "FAILED",
    "RelatedRequirements": [
        "PCI DSS v3.2.1/3.4",
        "CIS AWS Foundations Benchmark v1.2.0/2.7",
        "CIS AWS Foundations Benchmark v1.4.0/3.7"
    ],
    "SecurityControlId": "CloudTrail.2",
    "AssociatedStandards": [
  { "StandardsId": "standards/aws-foundational-security-best-practices/v/1.0.0"},
  { "StandardsId": "standards/pci-dss/v/3.2.1"},
  { "StandardsId": "ruleset/cis-aws-foundations-benchmark/v/1.2.0"},
  { "StandardsId": "standards/cis-aws-foundations-benchmark/v/1.4.0"},
  { "StandardsId": "standards/service-managed-aws-control-tower/v/1.0.0"},
  ]
  },
  "WorkflowState": "NEW",
  "Workflow": {
    "Status": "NEW"
  },
  "RecordState": "ACTIVE",
  "FindingProviderFields": {
    "Severity": {
      "Label": "MEDIUM",
      "Normalized": "40",
      "Original": "MEDIUM"
    },
    "Types": [
      "Software and Configuration Checks/Industry and Regulatory Standards"
    ]
  }
}

To turn on consolidated control findings

Open the Security Hub console.
In the left navigation pane, choose Settings, and then choose the General tab.
Under Controls, turn on Consolidated control findings, and then choose Save.

Figure 2: Turn on consolidated control findings

If you are using the Security Hub integration with AWS Organizations or have invited member accounts through a manual invitation process, consolidated control findings can only be turned on by the administrator account. When this action is taken in the administrator account, the action will also be reflected in each member account in the current Region. It can take up to 18 hours for Security Hub to archive existing standard-specific findings and generate the new, standard-agnostic, findings.

You can also enable consolidated control findings by using the API (calling the UpdateSecurityHubConfiguration API with the ControlFindingGenerator parameter equal to SECURITY_CONTROL), or by using the AWS CLI (running the update-security-hub-configuration command with control-finding-generator equal to SECURITY_CONTROL), as in the following example.

aws securityhub ‐‐region <Region of choice> update-security-hub-configuration ‐‐control-finding-generator SECURITY_CONTROL

Much like the console behavior, if you have an organizational setup in Security Hub, this API action can only be taken by the administrator, and it will be reflected in each member account in the same Region.

What to expect when you enable consolidated control findings

To allow for these new capabilities to be launched, changes to the AWS Security Finding Format (ASFF) are required. This format is used by Security Hub for findings it generates from its controls or ingests from external providers. When you turn on finding consolidation, Security Hub will archive old standard-specific findings and generate standard-agnostic findings instead. This action will only affect control findings that Security Hub generates, and it will not affect findings ingested from partner products. However, in Security Hub findings, turning on consolidated control findings might cause some updates that you previously made to findings to be archived. Despite this one-time change, after the migration is complete (it can take up to 18 hours), you will be able to update finding fields in a single action and the updates will apply across standards, without the need to make multiple updates.

One field affected by the new capabilities is the Workflow field, which provides information about the status of the investigation into a finding. Manipulating this field can also update the overall compliance status of the control that the finding is related to. For example, if you have a control with one failed finding (and the rest have passed), and the failed finding comes from a resource for which you’d like to make an exception, you can decide to suppress that failed finding by updating the Workflow field. If you suppress failed findings in a control, its compliance status can change to pass.

Before turning on consolidated control findings, if you want to maintain an aligned compliance status in controls that belong to multiple standards, you have to update the Workflow status of findings in each standard. After turning on finding consolidation, you will only have to update the Workflow status once, and the suppression will be applied across standards, helping you to reduce the number of steps needed to suppress the same findings across standards.

As mentioned earlier, when you turn on this new capability, some updates made to the previous, standard-specific findings will be archived and will not be included in the new consolidated control findings generated by Security Hub. In the case of the Workflow status, the new consolidated findings will be created with a value of NEW (for failed findings) or RESOLVED (for new findings) in the Workflow field. However, after you have onboarded to the new finding format, you can update the value of the Workflow field, as well as other fields, and this value will be maintained without requiring you to make continuous updates. For the full list of fields that can be affected by the migration to the consolidated finding format, see Consolidated control findings – ASFF changes. Before you turn on finding consolidation, we suggest that you check if your custom automations refer to those affected fields. If they do, you can update your automations and test them by using the Sample control findings in the documentation.

Conclusion

This blog post covers new Security Hub features that make it simpler for you to manage controls across standards. With the new consolidated control findings feature, you can focus on the most relevant findings and reduce noise, which is why we recommend that you review the new feature and its associated changes and turn it on at your earliest convenience.

If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this blog post, start a new thread on the Security Hub forum or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Level up your React app with Amazon QuickSight: How to embed your dashboard for anonymous access

2023-07-11 Adrianna Kurzela

Post Syndicated from Adrianna Kurzela original https://aws.amazon.com/blogs/big-data/level-up-your-react-app-with-amazon-quicksight-how-to-embed-your-dashboard-for-anonymous-access/

Using embedded analytics from Amazon QuickSight can simplify the process of equipping your application with functional visualizations without any complex development. There are multiple ways to embed QuickSight dashboards into application. In this post, we look at how it can be done using React and the Amazon QuickSight Embedding SDK.

Dashboard consumers often don’t have a user assigned to their AWS account and therefore lack access to the dashboard. To enable them to consume data, the dashboard needs to be accessible for anonymous users. Let’s look at the steps required to enable an unauthenticated user to view your QuickSight dashboard in your React application.

Solution overview

Our solution uses the following key services:

Amazon API Gateway
AWS Identity and Access Management (IAM)
AWS Lambda
Amazon QuickSight (with session capacity pricing enabled)

After loading the web page on the browser, the browser makes a call to API Gateway, which invokes a Lambda function that calls the QuickSight API to generate a dashboard URL for an anonymous user. The Lambda function needs to assume an IAM role with the required permissions. The following diagram shows an overview of the architecture.

Prerequisites

You must have the following prerequisites:

An AWS account
A QuickSight account with session capacity pricing enabled
QuickSight dashboards (for setup instructions, refer to Tutorial: Create an Amazon QuickSight dashboard using sample data)
A sample React application (to get started, refer to Start a New React Project)

Set up permissions for unauthenticated viewers

In your account, create an IAM policy that your application will assume on behalf of the viewer:

On the IAM console, choose Policies in the navigation pane.
Choose Create policy.
On the JSON tab, enter the following policy code:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "quicksight:GenerateEmbedUrlForAnonymousUser"
            ],
            "Resource": [
                "arn:aws:quicksight:*:*:namespace/default",
				"arn:aws:quicksight:*:*:dashboard/<YOUR_DASHBOARD_ID1>",
				"arn:aws:quicksight:*:*:dashboard/<YOUR_DASHBOARD_ID2>"
            ],
            "Effect": "Allow"
        },
        {
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": "*",
            "Effect": "Allow"
        }
    ]
}

Make sure to change the value of <YOUR_DASHBOARD_ID> to the value of the dashboard ID. Note this ID to use in a later step as well.

For the second statement object with logs, permissions are optional. It allows you to create a log group with the specified name, create a log stream for the specified log group, and upload a batch of log events to the specified log stream.

In this policy, we allow the user to perform the GenerateEmbedUrlForAnonymousUser action on the dashboard ID within the list of dashboard IDs inserted in the placeholder.

Enter a name for your policy (for example, AnonymousEmbedPolicy) and choose Create policy.

Next, we create a role and attach this policy to the role.

Choose Roles in the navigation pane, then choose Create role.

Identity and access console

Choose Lambda for the trusted entity.
Search for and select AnonymousEmbedPolicy, then choose Next.
Enter a name for your role, such as AnonymousEmbedRole.
Make sure the policy name is included in the Add permissions section.
Finish creating your role.

You have just created the AnonymousEmbedRole execution role. You can now move to the next step.

Generate an anonymous embed URL Lambda function

In this step, we create a Lambda function that interacts with QuickSight to generate an embed URL for an anonymous user. Our domain needs to be allowed. There are two ways to achieve the integration of Amazon QuickSight:

By adding the URL to the list of allowed domains in the Amazon QuickSight admin console (explained later in [Optional] Add your domain in QuickSight section).
[Recommended] By adding the embed URL request during runtime in the API call. Option 1 is recommended when you need to persist the allowed domains. Otherwise, the domains will be removed after 30 minutes, which is equivalent to the session duration. For other use cases, it is recommended to use the second option (described and implemented below).

On the Lambda console, create a new function.

Select Author from scratch.
For Function name, enter a name, such as AnonymousEmbedFunction.
For Runtime¸ choose Python 3.9.
For Execution role¸ choose Use an existing role.
Choose the role AnonymousEmbedRole.
Choose Create function.
On the function details page, navigate to the Code tab and enter the following code:


import json, boto3, os, re, base64

def lambda_handler(event, context):
print(event)
try:
def getQuickSightDashboardUrl(awsAccountId, dashboardIdList, dashboardRegion, event):
#Create QuickSight client
quickSight = boto3.client('quicksight', region_name=dashboardRegion);

#Construct dashboardArnList from dashboardIdList
dashboardArnList=[ 'arn:aws:quicksight:'+dashboardRegion+':'+awsAccountId+':dashboard/'+dashboardId for dashboardId in dashboardIdList]
#Generate Anonymous Embed url
response = quickSight.generate_embed_url_for_anonymous_user(
AwsAccountId = awsAccountId,
Namespace = 'default',
ExperienceConfiguration = {'Dashboard':{'InitialDashboardId': dashboardIdList[0]}},
AuthorizedResourceArns = dashboardArnList,
SessionLifetimeInMinutes = 60,
AllowedDomains = ['http://localhost:3000']
)
return response

#Get AWS Account Id
awsAccountId = context.invoked_function_arn.split(':')[4]

#Read in the environment variables
dashboardIdList = re.sub(' ','',os.environ['DashboardIdList']).split(',')
dashboardNameList = os.environ['DashboardNameList'].split(',')
dashboardRegion = os.environ['DashboardRegion']

response={}

response = getQuickSightDashboardUrl(awsAccountId, dashboardIdList, dashboardRegion, event)

return {'statusCode':200,
'headers': {"Access-Control-Allow-Origin": "http://localhost:3000",
"Content-Type":"text/plain"},
'body':json.dumps(response)
}

except Exception as e: #catch all
return {'statusCode':400,
'headers': {"Access-Control-Allow-Origin": "http://localhost:3000",
"Content-Type":"text/plain"},
'body':json.dumps('Error: ' + str(e))
}

If you don’t use localhost, replace http://localhost:3000 in the returns with the hostname of your application. To move to production, don’t forget to replace http://localhost:3000 with your domain.

On the Configuration tab, under General configuration, choose Edit.
Increase the timeout from 3 seconds to 30 seconds, then choose Save.
Under Environment variables, choose Edit.
Add the following variables:
1. Add DashboardIdList and list your dashboard IDs.
2. Add DashboardRegion and enter the Region of your dashboard.
Choose Save.

Your configuration should look similar to the following screenshot.

On the Code tab, choose Deploy to deploy the function.

Environment variables console

Set up API Gateway to invoke the Lambda function

To set up API Gateway to invoke the function you created, complete the following steps:

On the API Gateway console, navigate to the REST API section and choose Build.
Under Create new API, select New API.
For API name, enter a name (for example, QuicksightAnonymousEmbed).
Choose Create API.

API gateway console

On the Actions menu, choose Create resource.
For Resource name, enter a name (for example, anonymous-embed).

Now, let’s create a method.

Choose the anonymous-embed resource and on the Actions menu, choose Create method.
Choose GET under the resource name.
For Integration type, select Lambda.
Select Use Lambda Proxy Integration.
For Lambda function, enter the name of the function you created.
Choose Save, then choose OK.

API gateway console

Now we’re ready to deploy the API.

On the Actions menu, choose Deploy API.
For Deployment stage, select New stage.
Enter a name for your stage, such as embed.
Choose Deploy.

[Optional] Add your domain in QuickSight

If you added Allowed domains in Generate an anonymous embed URL Lambda function part, feel free to move to Turn on capacity pricing section.

To add your domain to the allowed domains in QuickSight, complete the following steps:

On the QuickSight console, choose the user menu, then choose Manage QuickSight.

Quicksight dropdown menu

Choose Domains and Embedding in the navigation pane.
For Domain, enter your domain (http://localhost:<PortNumber>).

Make sure to replace <PortNumber> to match your local setup.

Choose Add.

Quicksight admin console

Make sure to replace the localhost domain with the one you will use after testing.

Turn on capacity pricing

If you don’t have session capacity pricing enabled, follow the steps in this section. It’s mandatory to have this function enabled to proceed further.

Capacity pricing allows QuickSight customers to purchase reader sessions in bulk without having to provision individual readers in QuickSight. Capacity pricing is ideal for embedded applications or large-scale business intelligence (BI) deployments. For more information, visit Amazon QuickSight Pricing.

To turn on capacity pricing, complete the following steps:

On the Manage QuickSight page, choose Your Subscriptions in the navigation pane.
In the Capacity pricing section, select Get monthly subscription.
Choose Confirm subscription.

To learn more about capacity pricing, see New in Amazon QuickSight – session capacity pricing for large scale deployments, embedding in public websites, and developer portal for embedded analytics.

Set up your React application

To set up your React application, complete the following steps:

In your React project folder, go to your root directory and run npm i amazon-quicksight-embedding-sdk to install the amazon-quicksight-embedding-sdk package.
In your App.js file, replace the following:
1. Replace YOUR_API_GATEWAY_INVOKE_URL/RESOURCE_NAME with your API Gateway invoke URL and your resource name (for example, https://xxxxxxxx.execute-api.xx-xxx-x.amazonaws.com/embed/anonymous-embed).
2. Replace YOUR_DASHBOARD1_ID with the first dashboardId from your DashboardIdList. This is the dashboard that will be shown on the initial render.
3. Replace YOUR_DASHBOARD2_ID with the second dashboardId from your DashboardIdList.

The following code snippet shows an example of the App.js file in your React project. The code is a React component that embeds a QuickSight dashboard based on the selected dashboard ID. The code contains the following key components:

State hooks – Two state hooks are defined using the useState() hook from React:
- dashboard – Holds the currently selected dashboard ID.
- quickSightEmbedding – Holds the QuickSight embedding object returned by the embedDashboard() function.
Ref hook – A ref hook is defined using the useRef() hook from React. It’s used to hold a reference to the DOM element where the QuickSight dashboard will be embedded.
useEffect() hook – The useEffect() hook is used to trigger the embedding of the QuickSight dashboard whenever the selected dashboard ID changes. It first fetches the dashboard URL for the selected ID from the QuickSight API using the fetch() method. After it retrieves the URL, it calls the embed() function with the URL as the argument.
Change handler – The changeDashboard() function is a simple event handler that updates the dashboard state whenever the user selects a different dashboard from the drop-down menu. As soon as new dashboard ID is set, the useEffect hook is triggered.
10-millisecond timeout – The purpose of using the timeout is to introduce a small delay of 10 milliseconds before making the API call. This delay can be useful in scenarios where you want to avoid immediate API calls or prevent excessive requests when the component renders frequently. The timeout gives the component some time to settle before initiating the API request. Because we’re building the application in development mode, the timeout helps avoid errors caused by the double run of useEffect within StrictMode. For more information, refer to Updates to Strict Mode.

See the following code:

import './App.css';
import * as React from 'react';
import { useEffect, useRef, useState } from 'react';
import { createEmbeddingContext } from 'amazon-quicksight-embedding-sdk';

 function App() {
  const dashboardRef = useRef([]);
  const [dashboardId, setDashboardId] = useState('YOUR_DASHBOARD1_ID');
  const [embeddedDashboard, setEmbeddedDashboard] = useState(null);
  const [dashboardUrl, setDashboardUrl] = useState(null);
  const [embeddingContext, setEmbeddingContext] = useState(null);

  useEffect(() => {
    const timeout = setTimeout(() => {
      fetch("YOUR_API_GATEWAY_INVOKE_URL/RESOURCE_NAME"
      ).then((response) => response.json()
      ).then((response) => {
        setDashboardUrl(response.EmbedUrl)
      })
    }, 10);
    return () => clearTimeout(timeout);
  }, []);

  const createContext = async () => {
    const context = await createEmbeddingContext();
    setEmbeddingContext(context);
  }

  useEffect(() => {
    if (dashboardUrl) { createContext() }
  }, [dashboardUrl])

  useEffect(() => {
    if (embeddingContext) { embed(); }
  }, [embeddingContext])

  const embed = async () => {

    const options = {
      url: dashboardUrl,
      container: dashboardRef.current,
      height: "500px",
      width: "600px",
    };

    const newEmbeddedDashboard = await embeddingContext.embedDashboard(options);
    setEmbeddedDashboard(newEmbeddedDashboard);
  };

  useEffect(() => {
    if (embeddedDashboard) {
      embeddedDashboard.navigateToDashboard(dashboardId, {})
    }
  }, [dashboardId])

  const changeDashboard = async (e) => {
    const dashboardId = e.target.value
    setDashboardId(dashboardId)
  }

  return (
    <>
      <header>
        <h1>Embedded <i>QuickSight</i>: Build Powerful Dashboards in React</h1>
      </header>
      <main>
        <p>Welcome to the QuickSight dashboard embedding sample page</p>
        <p>Please pick a dashboard you want to render</p>
        <select id='dashboard' value={dashboardId} onChange={changeDashboard}>
          <option value="YOUR_DASHBOARD1_ID">YOUR_DASHBOARD1_NAME</option>
          <option value="YOUR_DASHBOARD2_ID">YOUR_DASHBOARD2_NAME</option>
        </select>
        <div ref={dashboardRef} />
      </main>
    </>
  );
};

export default App

Next, replace the contents of your App.css file, which is used to style and layout your web page, with the content from the following code snippet:

body {
  background-color: #ffffff;
  font-family: Arial, sans-serif;
  margin: 0;
  padding: 0;
}

header {
  background-color: #f1f1f1;
  padding: 20px;
  text-align: center;
}

h1 {
  margin: 0;
}

main {
  margin: 20px;
  text-align: center;
}

p {
  margin-bottom: 20px;
}

a {
  color: #000000;
  text-decoration: none;
}

a:hover {
  text-decoration: underline;
}

i {
  color: orange;
  font-style: normal;
}

Now it’s time to test your app. Start your application by running npm start in your terminal. The following screenshots show examples of your app as well as the dashboards it can display.

Example application page with the visualisation

Conclusion

In this post, we showed you how to embed a QuickSight dashboard into a React application using the AWS SDK. Sharing your dashboard with anonymous users allows them to access your dashboard without granting them access to your AWS account. There are also other ways to share your dashboard anonymously, such as using 1-click public embedding.

Join the Quicksight Community to ask, answer and learn with others and explore additional resources.

About the Author

author headshot

Adrianna is a Solutions Architect at AWS Global Financial Services. Having been a part of Amazon since August 2018, she has had the chance to be involved both in the operations as well as the cloud business of the company. Currently, she builds software assets which demonstrate innovative use of AWS services, tailored to a specific customer use cases. On a daily basis, she actively engages with various aspects of technology, but her true passion lies in combination of web development and analytics.

Access Amazon OpenSearch Serverless collections using a VPC endpoint

2023-07-11 Raj Ramasubbu

Post Syndicated from Raj Ramasubbu original https://aws.amazon.com/blogs/big-data/access-amazon-opensearch-serverless-collections-using-a-vpc-endpoint/

Amazon OpenSearch Serverless helps you index, analyze, and search your logs and data using OpenSearch APIs and dashboards. The OpenSearch Serverless collection is a group of indexes. API and dashboard clients can access the collections from public networks or one or more VPCs. For VPC access to collections and dashboards, you can create VPC endpoints. In this post, we demonstrate how you can create and use VPC endpoints and OpenSearch Serverless network policies to control access to your collections and OpenSearch dashboards from multiple network locations.

The demo in this post uses an AWS Lambda-based client in a VPC to ingest data into a collection via a VPC endpoint and a browser in a public network accessing the same collection.

Solution overview

To illustrate how you can ingest data into an OpenSearch Serverless collection from within a VPC, we use a Lambda function. We use a VPC-hosted Lambda function to create an index in an OpenSearch Serverless collection and add documents to the index using a VPC endpoint. We then use a publicly accessible OpenSearch Serverless dashboard to see the documents ingested from Lambda function.

The following sections detail the steps to ingest data into the collection using Lambda and access the OpenSearch Serverless dashboard.

Prerequisites

This setup assumes that you have already created a VPC with private subnets.

Ingest data using Lambda and access the OpenSearch Serverless dashboard

To set up your solution, complete the following steps:

On the OpenSearch Service console, create a private connection between your VPC and OpenSearch Serverless using a VPC endpoint. Use the private subnets and a security group from your VPC.
Create an OpenSearch collection using the VPC endpoint created in the previous step.
Create a network policy to enable VPC access to the OpenSearch endpoint so the Lambda function can ingest documents to the collection. You should also enable public access to the OpenSearch dashboard endpoint so we can see the documents ingested.
After you create the collection, create a data access policy to grant ingestion access to the Lambda function’s AWS Identity and Access Management (IAM) role.
Additionally, grant read access to the dashboard user’s IAM role.
Add IAM permissions to the Lambda function’s IAM role and the dashboard user’s IAM role for the OpenSearch Serverless collection.
Create a Lambda function in the same VPC and subnet that we used for the OpenSearch endpoint (see the following code). This function creates an index called sitcoms-eighties in the OpenSearch Serverless collection and adds a sample document to the index:

import datetime
import time
from opensearchpy import OpenSearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth
import boto3

host = '<Insert-OpenSearch-Serverless-Endpoint>'
region = 'us-east-1'
service = 'aoss'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service,session_token=credentials.token)

# Build the OpenSearch client
client = OpenSearch(
    hosts=[{'host': host, 'port': 443}],
    http_auth=awsauth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
    timeout=300
)

def lambda_handler (event, context):

    # Create index
    response = client.indices.create('sitcoms-eighties')
    print('\nCreating index:')
    print(response)
    time.sleep(5)
    
    dt = datetime.datetime.now()
    # Add a document to the index.
    response = client.index(
        index='sitcoms-eighties',
        body={
            'title': 'Seinfeld',
            'creator': 'Larry David',
            'year': 1989,
            'createtime': dt
        },
        id='1',
    )
    print('\nDocument added:')
    print(response)

Run the Lambda function, and you should see the output as shown in the following screenshot.
You can now see the documents from this index through your publicly accessible OpenSearch Dashboards URL.
Create the index pattern in OpenSearch Dashboards, and then you can see the documents as shown in the following screenshot.

Use a VPC DNS resolver from your network

A client in your VPN network can connect to the collection or dashboards over a VPC endpoint. The client needs to find the VPC endpoint’s IP address using an Amazon Route 53 inbound resolver endpoint. To learn more about Route 53 inbound resolver endpoints, refer to Resolving DNS queries between VPCs and your network. The following diagram shows a sample setup.

The flow for this architecture is as follows:

The DNS query for the OpenSearch Serverless client is routed to a locally configured on-premises DNS server.
The on-premises DNS as configured performs conditional forwarding for the zone us-east-1.aoss.amazonaws.com to a Route 53 inbound resolver endpoint IP address. You must replace your Region name in the preceding zone name.
The inbound resolver endpoint performs DNS resolution by forwarding the query to the private hosted zone that was created along with the OpenSearch Serverless VPC endpoint.
The IP addresses returned by the DNS query are the private IP addresses of the interface VPC endpoint, which allow your on-premises host to establish private connectivity over AWS Site-to-Site VPN.
The interface endpoint is a collection of one or more elastic network interfaces with a private IP address in your account that serves as an entry point for traffic going to an OpenSearch Serverless endpoint.

Summary

OpenSearch Serverless allows you to set up and control access to the service using VPC endpoints and network policies. In this post, we explored how to access an OpenSearch Serverless collection API and dashboard from within a VPC, on premises, and public networks. If you have any questions or suggestions, please write to us in the comments section.

About the Authors

Raj Ramasubbu is a Senior Analytics Specialist Solutions Architect focused on big data and analytics and AI/ML with Amazon Web Services. He helps customers architect and build highly scalable, performant, and secure cloud-based solutions on AWS. Raj provided technical expertise and leadership in building data engineering, big data analytics, business intelligence, and data science solutions for over 18 years prior to joining AWS. He helped customers in various industry verticals like healthcare, medical devices, life science, retail, asset management, car insurance, residential REIT, agriculture, title insurance, supply chain, document management, and real estate.

Vivek Kansal works with the Amazon OpenSearch team. In his role as Principal Software Engineer, he uses his experience in the areas of security, policy engines, cloud-native solutions, and networking to help secure customer data in OpenSearch Service and OpenSearch Serverless in an evolving threat landscape.

Automated Code Review on Pull Requests using AWS CodeCommit and AWS CodeBuild

2023-07-10 Verinder Singh

Post Syndicated from Verinder Singh original https://aws.amazon.com/blogs/devops/automated-code-review-on-pull-requests-using-aws-codecommit-and-aws-codebuild/

Pull Requests play a critical part in the software development process. They ensure that a developer’s proposed code changes are reviewed by relevant parties before code is merged into the main codebase. This is a standard procedure that is followed across the globe in different organisations today. However, pull requests often require code reviewers to read through a great deal of code and manually check it against quality and security standards. These manual reviews can lead to problematic code being merged into the main codebase if the reviewer overlooks any problems.

To help solve this problem, we recommend using Amazon CodeGuru Reviewer to assist in the review process. CodeGuru Reviewer identifies critical defects and deviation from best practices in your code. It provides recommendations to remediate its findings as comments in your pull requests, helping reviewers miss fewer problems that may have otherwise made into production. You can easily integrate your repositories in AWS CodeCommit with Amazon CodeGuru Reviewer following these steps.

The purpose of this post isn’t, however, to show you CodeGuru Reviewer. Instead, our aim is to help you achieve automated code reviews with your pull requests if you already have a code scanning tool and need to continue using it. In this post, we will show you step-by-step how to add automation to the pull request review process using your code scanning tool with AWS CodeCommit (as source code repository) and AWS CodeBuild (to automatically review code using your code reviewer). After following this guide, you should be able to give developers automatic feedback on their code changes and augment manual code reviews so fewer problems make it into your main codebase.

Solution Overview

The solution comprises of the following components:

AWS CodeCommit: AWS service to host private Git repositories.
Amazon EventBridge: AWS service to receive pullRequestCreated and pullRequestSourceBranchUpdated events and trigger Amazon EventBridge rule.
AWS CodeBuild: AWS service to perform code review and send the result to AWS CodeCommit repository as pull request comment.

The following diagram illustrates the architecture:

Figure 1: This architecture diagram illustrates the workflow where developer raises a Pull Request and receives automated feedback on the code changes using AWS CodeCommit, AWS CodeBuild and Amazon EventBridge rule

Figure 1. Architecture Diagram of the proposed solution in the blog

Developer raises a pull request against the main branch of the source code repository in AWS CodeCommit.
The pullRequestCreated event is received by the default event bus.
The default event bus triggers the Amazon EventBridge rule which is configured to be triggered on pullRequestCreated and pullRequestSourceBranchUpdated events.
The EventBridge rule triggers AWS CodeBuild project.
The AWS CodeBuild project runs the code quality check using customer’s choice of tool and sends the results back to the pull request as comments. Based on the result, the AWS CodeBuild project approves or rejects the pull request automatically.

Walkthrough

The following steps provide a high-level overview of the walkthrough:

Create a source code repository in AWS CodeCommit.
Create and associate an approval rule template.
Create AWS CodeBuild project to run the code quality check and post the result as pull request comment.
Create an Amazon EventBridge rule that reacts to AWS CodeCommit pullRequestCreated and pullRequestSourceBranchUpdated events for the repository created in step 1 and set its target to AWS CodeBuild project created in step 3.
Create a feature branch, add a new file and raise a pull request.
Verify the pull request with the code review feedback in comment section.

1. Create a source code repository in AWS CodeCommit

Create an empty test repository in AWS CodeCommit by following these steps. Once the repository is created you can add files to your repository following these steps. If you create or upload the first file for your repository in the console, a branch is created for you named main. This branch is the default branch for your repository. If you are using a Git client instead, consider configuring your Git client to use main as the name for the initial branch. This blog post assumes the default branch is named as main.

2. Create and associate an approval rule template

Create an AWS CodeCommit approval rule template and associate it with the code repository created in step 1 following these steps.

3. Create AWS CodeBuild project to run the code quality check and post the result as pull request comment

This blog post is based on the assumption that the source code repository has JavaScript code in it, so it uses jshint as a code analysis tool to review the code quality of those files. However, users can choose a different tool as per their use case and choice of programming language.

Create an AWS CodeBuild project from AWS Management Console following these steps and using below configuration:

Source: Choose the AWS CodeCommit repository created in step 1 as the source provider.
Environment: Select the latest version of AWS managed image with operating system of your choice. Choose New service role option to create the service IAM role with default permissions.
Buildspec: Use below build specification. Replace <NODEJS_VERSION> with the latest supported nodejs runtime version for the image selected in previous step. Replace <REPOSITORY_NAME> with the repository name created in step 1. The below spec installs the jshint package, creates a jshint config file with a few sample rules, runs it against the source code in the pull request commit, posts the result as comment to the pull request page and based on the results, approves or rejects the pull request automatically.

version: 0.2
phases:
  install:
    runtime-versions:
      nodejs: <NODEJS_VERSION>
    commands:
      - npm install jshint --global
  build:
    commands:
      - echo \{\"esversion\":6,\"eqeqeq\":true,\"quotmark\":\"single\"\} > .jshintrc
      - CODE_QUALITY_RESULT="$(echo \`\`\`) $(jshint .)"; EXITCODE=$?
      - aws codecommit post-comment-for-pull-request --pull-request-id $PULL_REQUEST_ID --repository-name <REPOSITORY_NAME> --content "$CODE_QUALITY_RESULT" --before-commit-id $DESTINATION_COMMIT_ID --after-commit-id $SOURCE_COMMIT_ID --region $AWS_REGION	
      - |
        if [ $EXITCODE -ne 0 ]
        then
          PR_STATUS='REVOKE'
        else
          PR_STATUS='APPROVE'
        fi
      - REVISION_ID=$(aws codecommit get-pull-request --pull-request-id $PULL_REQUEST_ID | jq -r '.pullRequest.revisionId')
      - aws codecommit update-pull-request-approval-state --pull-request-id $PULL_REQUEST_ID --revision-id $REVISION_ID --approval-state $PR_STATUS --region $AWS_REGION

Once the AWS CodeBuild project has been created successfully, modify its IAM service role by following the below steps:

Choose the CodeBuild project’s Build details tab.
Choose the Service role link under the Environment section which should navigate you to the CodeBuild’s IAM service role in IAM console.
Expand the default customer managed policy and choose Edit.
Add the following actions to the existing codecommit actions:

"codecommit:CreatePullRequestApprovalRule",
"codecommit:GetPullRequest",
"codecommit:PostCommentForPullRequest",
"codecommit:UpdatePullRequestApprovalState"

Choose Next.
On the Review screen, choose Save changes.

4. Create an Amazon EventBridge rule that reacts to AWS CodeCommit pullRequestCreated and pullRequestSourceBranchUpdated events for the repository created in step 1 and set its target to AWS CodeBuild project created in step 3

Follow these steps to create an Amazon EventBridge rule that gets triggered whenever a pull request is created or updated using the following event pattern. Replace the <REGION>, <ACCOUNT_ID> and <REPOSITORY_NAME> placeholders with the actual values. Select target of the event rule as AWS CodeBuild project created in step 3.

Event Pattern

{
    "detail-type": ["CodeCommit Pull Request State Change"],
    "resources": ["arn:aws:codecommit:<REGION>:<ACCOUNT_ID>:<REPOSITORY_NAME>"],
    "source": ["aws.codecommit"],
    "detail": {
      "isMerged": ["False"],
      "pullRequestStatus": ["Open"],
      "repositoryNames": ["<REPOSITORY_NAME>"],
      "destinationReference": ["refs/heads/main"],
      "event": ["pullRequestCreated", "pullRequestSourceBranchUpdated"]
    },
    "account": ["<ACCOUNT_ID>"]
  }

Follow these steps to configure the target input using the below input path and input template.

Input transformer – Input path

{
    "detail-destinationCommit": "$.detail.destinationCommit",
    "detail-pullRequestId": "$.detail.pullRequestId",
    "detail-sourceCommit": "$.detail.sourceCommit"
}

Input transformer – Input template

{
    "sourceVersion": <detail-sourceCommit>,
    "environmentVariablesOverride": [
        {
            "name": "DESTINATION_COMMIT_ID",
            "type": "PLAINTEXT",
            "value": <detail-destinationCommit>
        },
        {
            "name": "SOURCE_COMMIT_ID",
            "type": "PLAINTEXT",
            "value": <detail-sourceCommit>
        },
        {
            "name": "PULL_REQUEST_ID",
            "type": "PLAINTEXT",
            "value": <detail-pullRequestId>
        }
    ]
}

5. Create a feature branch, add a new file and raise a pull request

Create a feature branch following these steps. Push a new file called “index.js” to the root of the repository with the below content.

function greet(dayofweek) {
  if (dayofweek == "Saturday" || dayofweek == "Sunday") {
    console.log("Have a great weekend");
  } else {
    console.log("Have a great day at work");
  }
}

Now raise a pull request using the feature branch as source and main branch as destination following these steps.

6. Verify the pull request with the code review feedback in comment section

As soon as the pull request is created, the AWS CodeBuild project created in step 3 above will be triggered which will run the code quality check and post the results as a pull request comment. Navigate to the AWS CodeCommit repository pull request page in AWS Management Console and check under the Activity tab to confirm the automated code review result being displayed as the latest comment.

The pull request comment submitted by AWS CodeBuild highlights 6 errors in the JavaScript code. The errors on lines first and third are based on the jshint rule “eqeqeq”. It recommends to use strict equality operator (“===”) instead of the loose equality operator (“==”) to avoid type coercion. The errors on lines second, fourth and fifth are based on jshint rule “quotmark” which recommends to use single quotes with strings instead of double quotes for better readability. These jshint rules are defined in AWS CodeBuild project’s buildspec in step 3 above.

Figure 2: The image shows the AWS CodeCommit pull request's Activity tab with code review results automatically posted by the automated code reviewer

Figure 2. Pull Request comments updated with automated code review results.

Conclusion

In this blog post we’ve shown how using AWS CodeCommit and AWS CodeBuild services customers can automate their pull request review process by utilising Amazon EventBridge events and using their own choice of code quality tool. This simple solution also makes it easier for the human reviewers by providing them with automated code quality results as input and enabling them to focus their code review more on business logic code changes rather than static code quality issues.

About the authors

Backtesting index rebalancing arbitrage with Amazon EMR and Apache Iceberg

2023-07-03 Guy Bachar

Post Syndicated from Guy Bachar original https://aws.amazon.com/blogs/big-data/backtesting-index-rebalancing-arbitrage-with-amazon-emr-and-apache-iceberg/

Backtesting is a process used in quantitative finance to evaluate trading strategies using historical data. This helps traders determine the potential profitability of a strategy and identify any risks associated with it, enabling them to optimize it for better performance.

Index rebalancing arbitrage takes advantage of short-term price discrepancies resulting from ETF managers’ efforts to minimize index tracking error. Major market indexes, such as S&P 500, are subject to periodic inclusions and exclusions for reasons beyond the scope of this post (for an example, refer to CoStar Group, Invitation Homes Set to Join S&P 500; Others to Join S&P 100, S&P MidCap 400, and S&P SmallCap 600). The arbitrage trade looks to profit from going long on stocks added to an index and shorting the ones that are removed, with the aim of generating profit from these price differences.

In this post, we look into the process of using backtesting to evaluate the performance of an index arbitrage profitability strategy. We specifically explore how Amazon EMR and the newly developed Apache Iceberg branching and tagging feature can address the challenge of look-ahead bias in backtesting. This will enable a more accurate evaluation of the performance of the index arbitrage profitability strategy.

Terminology

Let’s first discuss some of the terminology used in this post:

Research data lake on Amazon S3 – A data lake is a large, centralized repository that allows you to manage all your structured and unstructured data at any scale. Amazon Simple Storage Service (Amazon S3) is a popular cloud-based object storage service that can be used as the foundation for building a data lake.
Apache Iceberg – Apache Iceberg is an open-source table format that is designed to provide efficient, scalable, and secure access to large datasets. It provides features such as ACID transactions on top of Amazon S3-based data lakes, schema evolution, partition evolution, and data versioning. With scalable metadata indexing, Apache Iceberg is able to deliver performant queries to a variety of engines such as Spark and Athena by reducing planning time.
Look–ahead bias – This is a common challenge in backtesting, which occurs when future information is inadvertently included in historical data used to test a trading strategy, leading to overly optimistic results.
Iceberg tags – The Iceberg branching and tagging feature allows users to tag specific snapshots of their data tables with meaningful labels using SQL syntax or the Iceberg library, which correspond to specific events notable to internal investment teams. This, combined with Iceberg’s time travel functionality, ensures that accurate data enters the research pipeline and guards it from hard-to-detect problems such as look-ahead bias.

Testing scope

For our testing purposes, consider the following example, in which a change to the S&P Dow Jones Indices is announced on September 2, 2022, becomes effective on September 19, 2022, and doesn’t become observable in the ETF holdings data that we will be using in the experiment until September 30, 2022. We use Iceberg tags to label market data snapshots to avoid look-ahead bias in the research data lake, which will enable us to test various trade entry and exit scenarios and assess the respective profitability of each.

Experiment

As part of our experiment, we utilize a paid, third-party data provider API to identify SPY ETF holdings changes and construct a portfolio. Our model portfolio will buy stocks that are added to the index, known as going long, and will sell an equivalent amount of stocks removed from the index, known as going short.

We will test short-term holding periods, such as 1 day and 1, 2, 3, or 4 weeks, because we assume that the rebalancing effect is very short-lived and new information, such as macroeconomics, will drive performance beyond the studied time horizons. Lastly, we simulate different entry points for this trade:

Market open the day after announcement day (AD+1)
Market close of effective date (ED0)
Market open the day after ETF holdings registered the change (MD+1)

Research data lake

To run our experiment, we have have used the following research data lake environment.

As shown in the architecture diagram, the research data lake is built on Amazon S3 and managed using Apache Iceberg, which is an open table format bringing the reliability and simplicity of relational database management service (RDBMS) tables to data lakes. To avoid look-ahead bias in backtesting, it’s essential to create snapshots of the data at different points in time. However, managing and organizing these snapshots can be challenging, especially when dealing with a large volume of data.

This is where the tagging feature in Apache Iceberg comes in handy. With tagging, researchers can create differently named snapshots of market data and track changes over time. For example, they can create a snapshot of the data at the end of each trading day and tag it with the date and any relevant market conditions.

By using tags to organize the snapshots, researchers can easily query and analyze the data based on specific market conditions or events, without having to worry about the specific dates of the data. This can be particularly helpful when conducting research that is not time-sensitive or when looking for trends over long periods of time.

Furthermore, the tagging feature can also help with other aspects of data management, such as data retention for GDPR compliance, and maintaining lineages of the table via different branches. Researchers can use Apache Iceberg tagging to ensure the integrity and accuracy of their data while also simplifying data management.

Prerequisites

To follow along with this walkthrough, you must have the following:

An AWS account with an IAM role that has sufficient access to provision the required resources.
To comply with licensing considerations, we cannot provide a sample of the ETF constituents data. Therefore, it must be purchased separately for the dataset onboarding purposes.

Solution overview

To set up and test this experiment, we complete the following high-level steps:

Create an S3 bucket.
Load the dataset into Amazon S3. For this post, the ETF data referred to was obtained via API call through a third-party provider, but you can also consider the following options:
1. You can use the following prescriptive guidance, which describes how to automate data ingestion from various data providers into a data lake in Amazon S3 using AWS Data Exchange.
2. You can also utilize AWS Data Exchange to select from a range of third-party dataset providers. It simplifies the usage of data files, tables, and APIs for your specific needs.
3. Lastly, you can also refer to the following post on how to use AWS Data Exchange for Amazon S3 to access data from a provider bucket: Analyzing impact of regulatory reform on the stock market using AWS and Refinitiv data.
Create an EMR cluster. You can use this Getting Started with EMR tutorial or we used CDK to deploy an EMR on EKS environment with a custom managed endpoint.
Create an EMR notebook using EMR Studio. For our testing environment, we used a custom build Docker image, which contains Iceberg v1.3. For instructions on attaching a cluster to a Workspace, refer to Attach a cluster to a Workspace.
Configure a Spark session. You can follow along via the following sample notebook.
Create an Iceberg table and load the test data from Amazon S3 into the table.
Tag this data to preserve a snapshot of it.
Perform updates to our test data and tag the updated dataset.
Run simulated backtesting on our test data to find the most profitable entry point for a trade.

Create the experiment environment

We can get up and running with Iceberg by creating a table via Spark SQL from an existing view, as shown in the following code:

spark.sql("""
CREATE TABLE glue_catalog.quant.etf_holdings 
USING iceberg OPTIONS ('format-version'='2') 
LOCATION 's3://substitute_your_bucket/etf_holdings/' 
AS SELECT * FROM 2022Q1
""")
spark.sql("""
SELECT symbol, date, acceptanceTime, status
FROM glue_catalog.quant.etf_holdings
""").show()

+------+----------+-------------------+------+
|symbol|      date|     acceptanceTime|status|
+------+----------+-------------------+------+
|   HON|2022-03-31|2022-05-27 13:54:03|   new|
|   DFS|2022-03-31|2022-05-27 13:54:03|   new|
|   FMC|2022-03-31|2022-05-27 13:54:03|   new|
|  NDSN|2022-03-31|2022-05-27 13:54:03|   new|
|   CRL|2022-03-31|2022-05-27 13:54:03|   new|
|  EPAM|2022-03-31|2022-05-27 13:54:03|   new|
|  CSCO|2022-03-31|2022-05-27 13:54:03|   new|
|   ALB|2022-03-31|2022-05-27 13:54:03|   new|
|   AIZ|2022-03-31|2022-05-27 13:54:03|   new|
|   CRM|2022-03-31|2022-05-27 13:54:03|   new|
|  PENN|2022-03-31|2022-05-27 13:54:03|   new|
|  INTU|2022-03-31|2022-05-27 13:54:03|   new|
|   DOW|2022-03-31|2022-05-27 13:54:03|   new|
|   LHX|2022-03-31|2022-05-27 13:54:03|   new|
|   BLK|2022-03-31|2022-05-27 13:54:03|   new|
|  ZBRA|2022-03-31|2022-05-27 13:54:03|   new|
|   UPS|2022-03-31|2022-05-27 13:54:03|   new|
|    DG|2022-03-31|2022-05-27 13:54:03|   new|
|  DISH|2022-03-31|2022-05-27 13:54:03|   new|
|      |2022-03-31|2022-05-27 13:54:03|   new|
+------+----------+-------------------+------+

Now that we’ve created an Iceberg table, we can use it for investment research. One of the key features of Iceberg is its support for scalable data versioning. This means that we can easily track changes to our data and roll back to previous versions without making additional copies. Because this data gets updated periodically, we want to be able to create named snapshots of the data so that quant traders have easy access to consistent snapshots of data that have their own retention policy. In this case, let’s tag the dataset to indicate that it represents the ETF holdings data as of Q1 2022:

spark.sql("""
ALTER TABLE glue_catalog.quant.etf_holdings CREATE TAG Q1_2022
""")

As we move forward in time and new data becomes available by Q3, we may need to update existing datasets to reflect these changes. In the following example, we first use an UPDATE statement to mark the stocks as expired in the existing ETF holdings dataset. Then we use the MERGE INTO statement based on matching conditions such as ISIN code. If a match is not found between the existing dataset and the new dataset, the new data will be inserted as new records in the table and status code will be set to ‘new’ for these records. Similarly, if the existing dataset has stocks that are not present in the new dataset, those records will remain expired with a status code of ‘expired’. Finally, for records where a match is found, the data in the existing dataset will be updated with the data from the new dataset, and record will have an unchanged status code. With Iceberg’s support for efficient data versioning and transactional consistency, we can be confident that our data updates will be applied correctly and without data corruption.

spark.sql("""
UPDATE glue_catalog.quant.etf_holdings
SET status = 'expired'
""")
spark.sql("""
MERGE INTO glue_catalog.quant.etf_holdings t
USING (SELECT * FROM 2022Q3) s
ON t.isin = s.isin
WHEN MATCHED THEN
    UPDATE SET t.acceptanceTime = s.acceptanceTime,
               t.date = s.date,
               t.balance = s.balance,
               t.valUsd = s.valUsd,
               t.pctVal = s.pctVal,
               t.status = "unchanged"
WHEN NOT MATCHED THEN INSERT *
""")

Because we now have a new version of the data, we use Iceberg tagging to provide isolation for each new version of data. In this case, we tag this as Q3_2022 and allow quant traders and other users to work on this snapshot of the data without being affected by ongoing updates to the pipeline:

spark.sql("""
ALTER TABLE glue_catalog.quant.etf_holdings CREATE TAG Q3_2022""")

This makes it very easy to see which stocks are being added and deleted. We can use Iceberg’s time travel feature to read the data at a given quarterly tag. First, let’s look at which stocks are added to the index; these are the rows that are in the Q3 snapshot but not in the Q1 snapshot. Then we will look at which stocks are removed; these are the rows that are in the Q1 snapshot but not in the Q3 snapshot:

spark.sql("""
SELECT symbol, isin, acceptanceTime, date 
FROM glue_catalog.quant.etf_holdings 
AS OF ‘Q3_2022’ EXCEPT 
SELECT symbol, isin, acceptanceTime, date 
FROM glue_catalog.quant.etf_holdings 
AS OF ‘Q1_2022’
""").show()

+------+------------+-------------------+----------+
|symbol|        isin|     acceptanceTime|      date|
+------+------------+-------------------+----------+
|   CPT|US1331311027|2022-11-28 15:50:55|2022-09-30|
|  CSGP|US22160N1090|2022-11-28 15:50:55|2022-09-30|
|  EMBC|US29082K1051|2022-11-28 15:50:55|2022-09-30|
|  INVH|US46187W1071|2022-11-28 15:50:55|2022-09-30|
|     J|US46982L1089|2022-11-28 15:50:55|2022-09-30|
|   KDP|US49271V1008|2022-11-28 15:50:55|2022-09-30|
|    ON|US6821891057|2022-11-28 15:50:55|2022-09-30|
|  VICI|US9256521090|2022-11-28 15:50:55|2022-09-30|
|   WBD|US9344231041|2022-11-28 15:50:55|2022-09-30|
+------+------------+-------------------+----------+

spark.sql("""
SELECT symbol, isin, acceptanceTime, date 
FROM glue_catalog.quant.etf_holdings 
AS OF ‘Q1_2022’ EXCEPT 
SELECT symbol, isin, acceptanceTime, date 
FROM glue_catalog.quant.etf_holdings 
AS OF ‘Q3_2022’
""").show()

+------+------------+-------------------+----------+
|symbol|        isin|     acceptanceTime|      date|
+------+------------+-------------------+----------+
|  PENN|US7075691094|2022-05-27 13:54:03|2022-03-31|
|    UA|US9043112062|2022-05-27 13:54:03|2022-03-31|
|   UAA|US9043111072|2022-05-27 13:54:03|2022-03-31|
|   LTP|US7127041058|2022-05-27 13:54:03|2022-03-31|
| DISCA|US25470F1049|2022-05-27 13:54:03|2022-03-31|
|  CERN|US1567821046|2022-05-27 13:54:03|2022-03-31|
|  IPGP|US44980X1090|2022-05-27 13:54:03|2022-03-31|
|      |US25470F3029|2022-05-27 13:54:03|2022-03-31|
|     J|US4698141078|2022-05-27 13:54:03|2022-03-31|
|   PVH|US6936561009|2022-05-27 13:54:03|2022-03-31|
+------+------------+-------------------+----------+

Now we use the delta obtained in the preceding code to backtest the following strategy. As part of the index rebalancing arbitrage process, we’re going to long stocks that are added to the index and short stocks that are removed from the index, and we’ll test this strategy for both the effective date and announcement date. As a proof of concept from the two different lists, we picked PVH and PENN as removed stocks, and CSGP and INVH as added stocks.

To follow along with the examples below, you will need to use the notebook provided in the Quant Research example GitHub repository.

Cumulative Returns comparison

import numpy as np
import vectorbt as vbt

def backtest(entry_point='2022-09-02', exit_point='2022-10-31'):
    open_position = (historical_prices_pd.index == entry_point)
    close_position = (historical_prices_pd.index == exit_point)

    CASH = 100000
    COMMPERC = 0.000

    symbol_cols = pd.Index(['PENN', 'PVH', 'INVH', 'CSGP'], name='symbol')
    order_size = pd.DataFrame(index=historical_prices_pd.index, columns=symbol_cols)
    order_size['PENN'] = np.nan
    order_size['PVH'] = np.nan
    order_size['INVH'] = np.nan
    order_size['CSGP'] = np.nan

    #short
    order_size.loc[open_position, 'PENN'] = -10
    order_size.loc[close_position, 'PENN'] = 0

    order_size.loc[open_position, 'PVH'] = -10
    order_size.loc[close_position, 'PVH'] = 0

    #long
    order_size.loc[open_position, 'INVH'] = 10
    order_size.loc[close_position, 'INVH'] = 0

    order_size.loc[open_position, 'CSGP'] = 10
    order_size.loc[close_position, 'CSGP'] = 0

    # Execute at the next bar
    order_size = order_size.vbt.fshift(1)

    portfolio = vbt.Portfolio.from_orders(
            historical_close_prices,  # current close as reference price
            size=order_size,  
            price=historical_open_prices,  # current open as execution price
            size_type='targetpercent', 
            val_price=historical_close_prices.vbt.fshift(1),  # previous close as group valuation price
            init_cash=CASH,
            allow_partial=False,
            fees=COMMPERC,
            direction='both',
            cash_sharing=True,  # share capital between assets in the same group
            group_by=True,  # all columns belong to the same group
            call_seq='auto',  # sell before buying
            freq='d'  # index frequency for annualization
    )
    return portfolio

portfolio = backtest('2022-09-02', '2022-10-31')

portfolio.orders.records_readable.head(20)

The following table represent the portfolio orders records:

Order Id	Column	Timestamp	Size	Price	Side
0	(PENN, PENN)	2022-09-06	31948.881789	31.66	Sell
1	(PVH, PVH)	2022-09-06	18321.729571	55.15	Sell
2	(INVH, INVH)	2022-09-06	27419.797094	38.20	Buy
3	(CSGP, CSGP)	2022-09-06	14106.361969	75.00	Buy
4	(CSGP, CSGP)	2022-11-01	14106.361969	83.70	Sell
5	(INVH, INVH)	2022-11-01	27419.797094	31.94	Sell
6	(PVH, PVH)	2022-11-01	18321.729571	52.95	Buy
7	(PENN, PENN)	2022-11-01	31948.881789	34.09	Buy

Experimentation findings

The following table shows Sharpe Ratios for various holding periods and two different trade entry points: announcement and effective dates.

Experimentation findings

The data suggests that the effective date is the most profitable entry point across most holding periods, whereas the announcement date is an effective entry point for short-term holding periods (5 calendar days, 2 business days). Because the results are obtained from testing a single event, this is not statistically significant to accept or reject a hypothesis that index rebalancing events can be used to generate consistent alpha. The infrastructure we used for our testing can be used to run the same experiment required to do hypothesis testing at scale, but index constituents data is not readily available.

Conclusion

In this post, we demonstrated how the use of backtesting and the Apache Iceberg tagging feature can provide valuable insights into the performance of index arbitrage profitability strategies. By using a scalable Amazon EMR on Amazon EKS stack, researchers can easily handle the entire investment research lifecycle, from data collection to backtesting. Additionally, the Iceberg tagging feature can help address the challenge of look-ahead bias, while also providing benefits such as data retention control for GDPR compliance and maintaining lineage of the table via different branches. The experiment findings demonstrate the effectiveness of this approach in evaluating the performance of index arbitrage strategies and can serve as a useful guide for researchers in the finance industry.

About the Authors

Boris Litvin is Principal Solution Architect, responsible for financial services industry innovation. He is a former Quant and FinTech founder, and is passionate about systematic investing.

Guy Bachar is a Solutions Architect at AWS, based in New York. He accompanies greenfield customers and helps them get started on their cloud journey with AWS. He is passionate about identity, security, and unified communications.

Noam Ouaknine is a Technical Account Manager at AWS, and is based in Florida. He helps enterprise customers develop and achieve their long-term strategy through technical guidance and proactive planning.

Sercan Karaoglu is Senior Solutions Architect, specialized in capital markets. He is a former data engineer and passionate about quantitative investment research.

Jack Ye is a software engineer in the Athena Data Lake and Storage team. He is an Apache Iceberg Committer and PMC member.

Amogh Jahagirdar is a Software Engineer in the Athena Data Lake team. He is an Apache Iceberg Committer.

How to Grant Another SES Account or User Permission To Send Emails

2023-06-30 bajavani

Post Syndicated from bajavani original https://aws.amazon.com/blogs/messaging-and-targeting/how-to-grant-another-ses-account-or-user-permission-to-send-emails/

Amazon Simple Email Service (Amazon SES) is a bulk and transactional email sending service for businesses and developers. To send emails from a particular email address through SES, users have to verify ownership of the email address, the domain used by the email address, or a parent domain of the domain used by the email address. This is referred to as an identity and is treated as a user-owned resource by SES.

For example, to send an email from [email protected], the user must verify ownership of the email address [email protected], the subdomain mail.example.com, or the domain example.com. Only identity owners are allowed to send emails from email addresses covered by their identities.

Why use the sending authorization feature in email?

This post will show you how you can grant another account or user to send emails from the identity that you own . By using sending authorization , you can authorize other users to send emails from the identities that you own using their Amazon SES accounts . In this blog post I’d like to walk you through how to setup sending authorization and addressing common concerns regarding the same.

With sending authorization, you can verify the identity under a single account and then grant the other accounts/users permission to send emails from that verified identity.

Let’s look at the below use case :

For example, if you’re a business owner who has collaborated with a email marketing company to send emails from your domain but you would like that only the domain you own should be verified in your account whereas , the email sending, and the monitoring of those emails ( bounce/complaint/delivery notifications for the emails) should be taken care by the email marketing company itself.

With sending authorization, the business owner can verify the identity in their SES account and provide the necessary permissions to the user of the email marketing company in order to send emails using their domain .

Before we proceed further , there are two important terms shared below which you should know that are used throughout the blog:

Delegate Sender : The user that will be using the verified identity from another account to send email.

Identity Owner : The account where the identity is verified . A policy is attached to an identity to specify who may send for that identity and under which conditions. You can refer the SES developer guide to know more

Overview of solution

If you want to enable a delegate sender to send on your behalf, you create a sending authorization policy and associate the policy to your identity by using the Amazon SES console or the Amazon SES API.
When the delegate sender attempts to send an email through Amazon SES on your behalf, the delegate sender passes the ARN of your identity in the request or in the header of the email as you can see from the Figure 1 shared below. Figure 1 shows the architecture of the sending authorization process.

Figure 1: High Level Overview of Sending Authorization Process

3. When Amazon SES receives the request to send the email, it checks your identity’s policy (if present) to determine if you have authorized the delegate sender to send on the identity’s behalf. If the delegate sender is authorized, Amazon SES accepts the email; otherwise, Amazon SES returns an error message. The error message is similar to error message :“ AccessDenied: User is not authorized to perform ses sendemail”

Walkthrough

In this section, you’ll learn the steps needed to setup email sending authorization:

Create a IAM user in Delegate Sender Account with the necessary email sending permissions.You can read more about the necessary email sending permission in our developer guide
Verify Identity in Identity Owner Account which will be used by the Delegate Sender account later to send email.
Set up Identity policy to authorize the Delegate Sender Account to send emails using an email address or domain (an identity) owned by Identity Owner Account. The below steps illustrates how you can setup the identity policy .
1. In order to add the identity policy , go to the Verified-identities screen of the SES console, select the verified identity you wish to authorize for the delegate sender to send on your behalf.
2. Choose the verified identity’s Authorization tab. Please refer the below screenshot for reference :

Choose the verified identity's Authorization tab

You can use both policy generator or create a custom policy .

In the Authorization policies pane, if you wish to use the policy generator to create the policy then you can select Use policy generator from the drop-down. You can create the sending authorization policy depending on your use case . The below screenshot demonstrates the policy generator view :

policy generator view

You can also create the policy using the option “create custom policy ” . Please see the below screenshot for reference for a sample policy :

Add the identity policy to the verified identity in Identity owner account . Check the sample policy below for reference :

{
“Version”: “2008-10-17”,
“Statement”: [
{
“Sid”: “stmt1532578375047”,
“Effect”: “Allow”,
“Principal”: {
“AWS”: “<write ARN of user belonging to Delegate sender account>”
},
“Action”: [
“ses:SendEmail”,
“ses:SendRawEmail”
],
“Resource”: “<write ARN of the identity verified in Identity owner Account >”
}
]
}

Note: Please make sure to write the ARN’s for the Principal and the Resource in the above given sample policy.

3.Click on Apply policy after you have reviewed the authorization policy.

You can use the policy generator to create a sending authorization policy or use Amazon SES API or console to create a custom policy . This policy can also restrict usage based on different conditions . A condition is any restriction about the permission in the statement. A key is the specific characteristic that’s the basis for access restriction .

For more information , you can refer Sending-authorization-policy-examples.

4. Send email from Account B using the source ARN of the identity of Account A .
Here we will be sending emails using the send-email api command using AWS CLI . When you send an email using the Amazon SES API, you specify the content of the message, and Amazon SES assembles a MIME email for you.

This blogpost assumes that you have installed and configured AWS CLI on your terminal. For more information on Installing or updating the latest version of the AWS CLI, refer this link.

aws ses send-email –source-arn “arn:aws:ses:us-east-1:XXXXXXXXX:identity/example.com” –from [email protected] –to [email protected] –text “This is for those who cannot read HTML.” –html “<h1>Hello World</h1><p>This is a pretty mail with HTML formatting</p>” –subject “Hello World”

Replace the From address , To address and source ARN (identity ARN from identity owner account) in the above command.

Once the email request is sent to SES , SES will acknowledge it with a Message ID. This Message ID is a string of characters that uniquely identifies the request and looks something like this: “000001271b15238a-fd3ae762-2563-11df-8cd4-6d4e828a9ae8-000000” .

If you are using SMTP interface for delegate sending, you have to add the authorisation policy in the SMTP user and include the X-SES-SOURCE-ARN, X-SES-FROM-ARN, and X-SES-RETURN-PATH-ARN headers in your message. Pass these headers after you issue the DATA command in the SMTP conversation.

Notifications in case of email sending authorization

If you authorize a delegate sender to send email on your behalf, Amazon SES counts all bounces or complaints that those emails generate toward the delegate sender’s bounce and complaint limits, rather than the identity owner. However, if your IP address appears on third-party anti-spam, DNS-based Blackhole Lists (DNSBLs) as a result of messages sent by a delegate sender, the reputation of your identities may be damaged. For this reason, if you’re an identity owner, you should set up email feedback forwarding for all your identities, including those that you’ve authorized for delegate sending.

For setting up notifications for Identity owner , refer the steps mentioned in the SES developer guide

Delegate senders can and should set up their own bounce and complaint notifications for the identities that you have authorized them to use. They can set up event publishing to to publish bounce and complaint events to an Amazon SNS topic or a Kinesis Data Firehose stream.

Note : If neither the identity owner nor the delegate sender sets up a method of sending notifications for bounce and complaint events, or if the sender doesn’t apply the configuration set that uses the event publishing rule, then Amazon SES automatically sends event notifications by email to the address in the Return-Path field of the email (or the address in the Source field, if you didn’t specify a Return-Path address), even if you disabled email feedback forwarding

Cleaning up resources:

To remove the resources created by this solution:

You can delete the verified identities from Idenitity owner account if you no longer wish to send emails from that verified identity. You can check the SES developer guide for steps for deleting the verified identity .

Frequently Asked Questions

Q.1 If my delegate sender account is in sandbox, can I send emails from the delegate sender account to non-verified addresses ?

Sanbox Restriction : If delegate sender account is in sandbox mode then you need to submit a limit increase case to move the Delegate sender account out of Sandbox mode to “get rid of the Sandbox limitations“. The AWS account of the delegate sender has to be removed from the sandbox before it can be used to send email to non-verified addresses.

If delegate sender account is in sandbox mode, you will face the following error while email sending to unverified identities :

An error occurred (MessageRejected) when calling the SendEmail operation: Email address is not verified. The following identities failed the check in region US-EAST-1 [email protected]

However , you can sent email to verified identities successfully from the delegate sender account in case of sandbox access .

Q2. Is it necessary to have production access in identity owner account ?
It is not necessary to have the Identity owner account to have production access for using Sending authorization.

Q.3 Will the delegate sender account or the identity owner get charged for the emails sent using sending authorization ?

Billing : Emails sent from the delegate sender account are billed to delegate sender account .

Reputation and sending quota : Cross-account emails count against the delegate’s sending limits, so the delegate is responsible for applying for any sending limit increases they might need. Similarly, delegated emails get charged to the delegate’s account, and any bounces and complaints count against the delegate’s reputation.

Region : The delegate sender must send the emails from the AWS Region in which the identity owner’s identity is verified.

Conclusion:

By using Sending Authorization, identity owners will be able to grant delegate senders the permission to send emails through their own verified identities in SES. With the sending authorization feature, you will have complete control over your identities so that you can change or revoke permissions at any time.

Migrate from Amazon Kinesis Data Analytics for SQL Applications to Amazon Kinesis Data Analytics Studio

2023-06-29 Nicholas Tunney

Post Syndicated from Nicholas Tunney original https://aws.amazon.com/blogs/big-data/migrate-from-amazon-kinesis-data-analytics-for-sql-applications-to-amazon-kinesis-data-analytics-studio/

Amazon Kinesis Data Analytics makes it easy to transform and analyze streaming data in real time.

In this post, we discuss why AWS recommends moving from Kinesis Data Analytics for SQL Applications to Amazon Kinesis Data Analytics for Apache Flink to take advantage of Apache Flink’s advanced streaming capabilities. We also show how to use Kinesis Data Analytics Studio to test and tune your analysis before deploying your migrated applications. If you don’t have any Kinesis Data Analytics for SQL applications, this post still provides a background on many of the use cases you’ll see in your data analytics career and how Amazon Data Analytics services can help you achieve your objectives.

Kinesis Data Analytics for Apache Flink is a fully managed Apache Flink service. You only need to upload your application JAR or executable, and AWS will manage the infrastructure and Flink job orchestration. To make things simpler, Kinesis Data Analytics Studio is a notebook environment that uses Apache Flink and allows you to query data streams and develop SQL queries or proof of concept workloads before scaling your application to production in minutes.

We recommend that you use Kinesis Data Analytics for Apache Flink or Kinesis Data Analytics Studio over Kinesis Data Analytics for SQL. This is because Kinesis Data Analytics for Apache Flink and Kinesis Data Analytics Studio offer advanced data stream processing features, including exactly-once processing semantics, event time windows, extensibility using user-defined functions (UDFs) and custom integrations, imperative language support, durable application state, horizontal scaling, support for multiple data sources, and more. These are critical for ensuring accuracy, completeness, consistency, and reliability of data stream processing and are not available with Kinesis Data Analytics for SQL.

Solution overview

For our use case, we use several AWS services to stream, ingest, transform, and analyze sample automotive sensor data in real time using Kinesis Data Analytics Studio. Kinesis Data Analytics Studio allows us to create a notebook, which is a web-based development environment. With notebooks, you get a simple interactive development experience combined with the advanced capabilities provided by Apache Flink. Kinesis Data Analytics Studio uses Apache Zeppelin as the notebook, and uses Apache Flink as the stream processing engine. Kinesis Data Analytics Studio notebooks seamlessly combine these technologies to make advanced analytics on data streams accessible to developers of all skill sets. Notebooks are provisioned quickly and provide a way for you to instantly view and analyze your streaming data. Apache Zeppelin provides your Studio notebooks with a complete suite of analytics tools, including the following:

Data visualization
Exporting data to files
Controlling the output format for easier analysis
Ability to turn the notebook into a scalable, production application

Unlike Kinesis Data Analytics for SQL Applications, Kinesis Data Analytics for Apache Flink adds the following SQL support:

Joining stream data between multiple Kinesis data streams, or between a Kinesis data stream and an Amazon Managed Streaming for Apache Kafka (Amazon MSK) topic
Real-time visualization of transformed data in a data stream
Using Python scripts or Scala programs within the same application
Changing offsets of the streaming layer

Another benefit of Kinesis Data Analytics for Apache Flink is the improved scalability of the solution once deployed, because you can scale the underlying resources to meet demand. In Kinesis Data Analytics for SQL Applications, scaling is performed by adding more pumps to persuade the application into adding more resources.

In our solution, we create a notebook to access automotive sensor data, enrich the data, and send the enriched output from the Kinesis Data Analytics Studio notebook to an Amazon Kinesis Data Firehose delivery stream for delivery to an Amazon Simple Storage Service (Amazon S3) data lake. This pipeline could further be used to send data to Amazon OpenSearch Service or other targets for additional processing and visualization.

Kinesis Data Analytics for SQL Applications vs. Kinesis Data Analytics for Apache Flink

In our example, we perform the following actions on the streaming data:

Connect to an Amazon Kinesis Data Streams data stream.
View the stream data.
Transform and enrich the data.
Manipulate the data with Python.
Restream the data to a Firehose delivery stream.

To compare Kinesis Data Analytics for SQL Applications with Kinesis Data Analytics for Apache Flink, let’s first discuss how Kinesis Data Analytics for SQL Applications works.

At the root of a Kinesis Data Analytics for SQL application is the concept of an in-application stream. You can think of the in-application stream as a table that holds the streaming data so you can perform actions on it. The in-application stream is mapped to a streaming source such as a Kinesis data stream. To get data into the in-application stream, first set up a source in the management console for your Kinesis Data Analytics for SQL application. Then, create a pump that reads data from the source stream and places it into the table. The pump query runs continuously and feeds the source data into the in-application stream. You can create multiple pumps from multiple sources to feed the in-application stream. Queries are then run on the in-application stream, and results can be interpreted or sent to other destinations for further processing or storage.

The following SQL demonstrates setting up an in-application stream and pump:

CREATE OR REPLACE STREAM "TEMPSTREAM" ( 
   "column1" BIGINT NOT NULL, 
   "column2" INTEGER, 
   "column3" VARCHAR(64));

CREATE OR REPLACE PUMP "SAMPLEPUMP" AS 
INSERT INTO "TEMPSTREAM" ("column1", 
                          "column2", 
                          "column3") 
SELECT STREAM inputcolumn1, 
      inputcolumn2, 
      inputcolumn3
FROM "INPUTSTREAM";

Data can be read from the in-application stream using a SQL SELECT query:

SELECT *
FROM "TEMPSTREAM"

When creating the same setup in Kinesis Data Analytics Studio, you use the underlying Apache Flink environment to connect to the streaming source, and create the data stream in one statement using a connector. The following example shows connecting to the same source we used before, but using Apache Flink:

CREATE TABLE `MY_TABLE` ( 
   "column1" BIGINT NOT NULL, 
   "column2" INTEGER, 
   "column3" VARCHAR(64)
) WITH (
   'connector' = 'kinesis',
   'stream' = sample-kinesis-stream',
   'aws.region' = 'aws-kinesis-region',
   'scan.stream.initpos' = 'LATEST',
   'format' = 'json'
 );

MY_TABLE is now a data stream that will continually receive the data from our sample Kinesis data stream. It can be queried using a SQL SELECT statement:

SELECT column1, 
       column2, 
       column3
FROM MY_TABLE;

Although Kinesis Data Analytics for SQL Applications use a subset of the SQL:2008 standard with extensions to enable operations on streaming data, Apache Flink’s SQL support is based on Apache Calcite, which implements the SQL standard.

It’s also important to mention that Kinesis Data Analytics Studio supports PyFlink and Scala alongside SQL within the same notebook. This allows you to perform complex, programmatic methods on your streaming data that aren’t possible with SQL.

Prerequisites

During this exercise, we set up various AWS resources and perform analytics queries. To follow along, you need an AWS account with administrator access. If you don’t already have an AWS account with administrator access, create one now. The services outlined in this post may incur charges to your AWS account. Make sure to follow the cleanup instructions at the end of this post.

Configure streaming data

In the streaming domain, we’re often tasked with exploring, transforming, and enriching data coming from Internet of Things (IoT) sensors. To generate the real-time sensor data, we employ the AWS IoT Device Simulator. This simulator runs within your AWS account and provides a web interface that lets users launch fleets of virtually connected devices from a user-defined template and then simulate them to publish data at regular intervals to AWS IoT Core. This means we can build a virtual fleet of devices to generate sample data for this exercise.

We deploy the IoT Device Simulator using the following Amazon CloudFront template. It handles creating all the necessary resources in your account.

On the Specify stack details page, assign a name to your solution stack.
Under Parameters, review the parameters for this solution template and modify them as necessary.
For User email, enter a valid email to receive a link and password to log in to the IoT Device Simulator UI.
Choose Next.
On the Configure stack options page, choose Next.
On the Review page, review and confirm the settings. Select the check boxes acknowledging that the template creates AWS Identity and Access Management (IAM) resources.
Choose Create stack.

The stack takes about 10 minutes to install.

When you receive your invitation email, choose the CloudFront link and log in to the IoT Device Simulator using the credentials provided in the email.

The solution contains a prebuilt automotive demo that we can use to begin delivering sensor data quickly to AWS.

On the Device Type page, choose Create Device Type.
Choose Automotive Demo.
The payload is auto populated. Enter a name for your device, and enter automotive-topic as the topic.
Choose Save.

Now we create a simulation.

On the Simulations page, choose Create Simulation.
For Simulation type, choose Automotive Demo.
For Select a device type, choose the demo device you created.
For Data transmission interval and Data transmission duration, enter your desired values.

You can enter any values you like, but use at least 10 devices transmitting every 10 seconds. You’ll want to set your data transmission duration to a few minutes, or you’ll need to restart your simulation several times during the lab.

Choose Save.

Now we can run the simulation.

On the Simulations page, select the desired simulation, and choose Start simulations.

Alternatively, choose View next to the simulation you want to run, then choose Start to run the simulation.

To view the simulation, choose View next to the simulation you want to view.

If the simulation is running, you can view a map with the locations of the devices, and up to 100 of the most recent messages sent to the IoT topic.

We can now check to ensure our simulator is sending the sensor data to AWS IoT Core.

Navigate to the AWS IoT Core console.

Make sure you’re in the same Region you deployed your IoT Device Simulator.

In the navigation pane, choose MQTT Test Client.
Enter the topic filter automotive-topic and choose Subscribe.

As long as you have your simulation running, the messages being sent to the IoT topic will be displayed.

Finally, we can set a rule to route the IoT messages to a Kinesis data stream. This stream will provide our source data for the Kinesis Data Analytics Studio notebook.

On the AWS IoT Core console, choose Message Routing and Rules.
Enter a name for the rule, such as automotive_route_kinesis, then choose Next.
Provide the following SQL statement. This SQL will select all message columns from the automotive-topic the IoT Device Simulator is publishing:

SELECT timestamp, trip_id, VIN, brake, steeringWheelAngle, torqueAtTransmission, engineSpeed, vehicleSpeed, acceleration, parkingBrakeStatus, brakePedalStatus, transmissionGearPosition, gearLeverPosition, odometer, ignitionStatus, fuelLevel, fuelConsumedSinceRestart, oilTemp, location 
FROM 'automotive-topic' WHERE 1=1

Choose Next.
Under Rule Actions, select Kinesis Stream as the source.
Choose Create New Kinesis Stream.

This opens a new window.

For Data stream name, enter automotive-data.

We use a provisioned stream for this exercise.

Choose Create Data Stream.

You may now close this window and return to the AWS IoT Core console.

Choose the refresh button next to Stream name, and choose the automotive-data stream.
Choose Create new role and name the role automotive-role.
Choose Next.
Review the rule properties, and choose Create.

The rule begins routing data immediately.

Set up Kinesis Data Analytics Studio

Now that we have our data streaming through AWS IoT Core and into a Kinesis data stream, we can create our Kinesis Data Analytics Studio notebook.

On the Amazon Kinesis console, choose Analytics applications in the navigation pane.
On the Studio tab, choose Create Studio notebook.
Leave Quick create with sample code selected.
Name the notebook automotive-data-notebook.
Choose Create to create a new AWS Glue database in a new window.
Choose Add database.
Name the database automotive-notebook-glue.
Choose Create.
Return to the Create Studio notebook section.
Choose refresh and choose your new AWS Glue database.
Choose Create Studio notebook.
To start the Studio notebook, choose Run and confirm.
Once the notebook is running, choose the notebook and choose Open in Apache Zeppelin.
Choose Import note.
Choose Add from URL.
Enter the following URL: https://aws-blogs-artifacts-public.s3.amazonaws.com/artifacts/BDB-2461/auto-notebook.ipynb.
Choose Import Note.
Open the new note.

Perform stream analysis

In a Kinesis Data Analytics for SQL application, we add our streaming course via the management console, and then define an in-application stream and pump to stream data from our Kinesis data stream. The in-application stream functions as a table to hold the data and make it available for us to query. The pump takes the data from our source and streams it to our in-application stream. Queries may then be run against the in-application stream using SQL, just as we’d query any SQL table. See the following code:

CREATE OR REPLACE STREAM "AUTOSTREAM" ( 
    `trip_id` CHAR(36),
    `VIN` CHAR(17),
    `brake` FLOAT,
    `steeringWheelAngle` FLOAT,
    `torqueAtTransmission` FLOAT,
    `engineSpeed` FLOAT,
    `vehicleSpeed` FLOAT,
    `acceleration` FLOAT,
    `parkingBrakeStatus` BOOLEAN,
    `brakePedalStatus` BOOLEAN,
    `transmissionGearPosition` VARCHAR(10),
    `gearLeverPosition` VARCHAR(10),
    `odometer` FLOAT,
    `ignitionStatus` VARCHAR(4),
    `fuelLevel` FLOAT,
    `fuelConsumedSinceRestart` FLOAT,
    `oilTemp` FLOAT,
    `location` VARCHAR(100),
    `timestamp` TIMESTAMP(3));

CREATE OR REPLACE PUMP "MYPUMP" AS 
INSERT INTO "AUTOSTREAM" ("trip_id",
    "VIN",
    "brake",
    "steeringWheelAngle",
    "torqueAtTransmission",
    "engineSpeed",
    "vehicleSpeed",
    "acceleration",
    "parkingBrakeStatus",
    "brakePedalStatus",
    "transmissionGearPosition",
    "gearLeverPosition",
    "odometer",
    "ignitionStatus",
    "fuelLevel",
    "fuelConsumedSinceRestart",
    "oilTemp",
    "location",
    "timestamp")
SELECT VIN,
    brake,
    steeringWheelAngle,
    torqueAtTransmission,
    engineSpeed,
    vehicleSpeed,
    acceleration,
    parkingBrakeStatus,
    brakePedalStatus,
    transmissionGearPosition,
    gearLeverPosition,
    odometer,
    ignitionStatus,
    fuelLevel,
    fuelConsumedSinceRestart,
    oilTemp,
    location,
    timestamp
FROM "INPUT_STREAM"

To migrate an in-application stream and pump from our Kinesis Data Analytics for SQL application to Kinesis Data Analytics Studio, we convert this into a single CREATE statement by removing the pump definition and defining a kinesis connector. The first paragraph in the Zeppelin notebook sets up a connector that is presented as a table. We can define columns for all items in the incoming message, or a subset.

Run the statement, and a success result is output in your notebook. We can now query this table using SQL, or we can perform programmatic operations with this data using PyFlink or Scala.

Before performing real-time analytics on the streaming data, let’s look at how the data is currently formatted. To do this, we run a simple Flink SQL query on the table we just created. The SQL used in our streaming application is identical to what is used in a SQL application.

Note that if you don’t see records after a few seconds, make sure that your IoT Device Simulator is still running.

If you’re also running the Kinesis Data Analytics for SQL code, you may see a slightly different result set. This is another key differentiator in Kinesis Data Analytics for Apache Flink, because the latter has the concept of exactly once delivery. If this application is deployed to production and is restarted or if scaling actions occur, Kinesis Data Analytics for Apache Flink ensures you only receive each message once, whereas in a Kinesis Data Analytics for SQL application, you need to further process the incoming stream to ensure you ignore repeat messages that could affect your results.

You can stop the current paragraph by choosing the pause icon. You may see an error displayed in your notebook when you stop the query, but it can be ignored. It’s just letting you know that the process was canceled.

Flink SQL implements the SQL standard, and provides an easy way to perform calculations on the stream data just like you would when querying a database table. A common task while enriching data is to create a new field to store a calculation or conversion (such as from Fahrenheit to Celsius), or create new data to provide simpler queries or improved visualizations downstream. Run the next paragraph to see how we can add a Boolean value named accelerating, which we can easily use in our sink to know if an automobile was currently accelerating at the time the sensor was read. The process here doesn’t differ between Kinesis Data Analytics for SQL and Kinesis Data Analytics for Apache Flink.

You can stop the paragraph from running when you have inspected the new column, comparing our new Boolean value to the FLOAT acceleration column.

Data being sent from a sensor is usually compact to improve latency and performance. Being able to enrich the data stream with external data and enrich the stream, such as additional vehicle information or current weather data, can be very useful. In this example, let’s assume we want to bring in data currently stored in a CSV in Amazon S3, and add a column named color that reflects the current engine speed band.

Apache Flink SQL provides several source connectors for AWS services and other sources. Creating a new table like we did in our first paragraph but instead using the filesystem connector permits Flink to directly connect to Amazon S3 and read our source data. Previously in Kinesis Data Analytics for SQL Applications, you couldn’t add new references inline. Instead, you defined S3 reference data and added it to your application configuration, which you could then use as a reference in a SQL JOIN.

NOTE: If you are not using the us-east-1 region, you can download the csv and place the object your own S3 bucket. Reference the csv file as s3a://<bucket-name>/<key-name>

Building on the last query, the next paragraph performs a SQL JOIN on our current data and the new lookup source table we created.

Now that we have an enriched data stream, we restream this data. In a real-world scenario, we have many choices on what to do with our data, such as sending the data to an S3 data lake, another Kinesis data stream for further analysis, or storing the data in OpenSearch Service for visualization. For simplicity, we send the data to Kinesis Data Firehose, which streams the data into an S3 bucket acting as our data lake.

Kinesis Data Firehose can stream data to Amazon S3, OpenSearch Service, Amazon Redshift data warehouses, and Splunk in just a few clicks.

Create the Kinesis Data Firehose delivery stream

To create our delivery stream, complete the following steps:

On the Kinesis Data Firehose console, choose Create delivery stream.
Choose Direct PUT for the stream source and Amazon S3 as the target.
Name your delivery stream automotive-firehose.
Under Destination settings, create a new bucket or use an existing bucket.
Take note of the S3 bucket URL.
Choose Create delivery stream.

The stream takes a few seconds to create.

Return to the Kinesis Data Analytics console and choose Streaming applications.
On the Studio tab, and choose your Studio notebook.
Choose the link under IAM role.
In the IAM window, choose Add permissions and Attach policies.
Search for and select AmazonKinesisFullAccess and CloudWatchFullAccess, then choose Attach policy.
You may return to your Zeppelin notebook.

Stream data into Kinesis Data Firehose

As of Apache Flink v1.15, creating the connector to the Firehose delivery stream works similar to creating a connector to any Kinesis data stream. Note that there are two differences: the connector is firehose, and the stream attribute becomes delivery-stream.

After the connector is created, we can write to the connector like any SQL table.

To validate that we’re getting data through the delivery stream, open the Amazon S3 console and confirm you see files being created. Open the file to inspect the new data.

In Kinesis Data Analytics for SQL Applications, we would have created a new destination in the SQL application dashboard. To migrate an existing destination, you add a SQL statement to your notebook that defines the new destination right in the code. You can continue to write to the new destination as you would have with an INSERT while referencing the new table name.

Time data

Another common operation you can perform in Kinesis Data Analytics Studio notebooks is aggregation over a window of time. This sort of data can be used to send to another Kinesis data stream to identify anomalies, send alerts, or be stored for further processing. The next paragraph contains a SQL query that uses a tumbling window and aggregates total fuel consumed for the automotive fleet for 30-second periods. Like our last example, we could connect to another data stream and insert this data for further analysis.

Scala and PyFlink

There are times when a function you’d perform on your data stream is better written in a programming language instead of SQL, for both simplicity and maintenance. Some examples include complex calculations that SQL functions don’t support natively, certain string manipulations, the splitting of data into multiple streams, and interacting with other AWS services (such as text translation or sentiment analysis). Kinesis Data Analytics for Apache Flink has the ability to use multiple Flink interpreters within the Zeppelin notebook, which is not available in Kinesis Data Analytics for SQL Applications.

If you have been paying close attention to our data, you’ll see that the location field is a JSON string. In Kinesis Data Analytics for SQL, we could use string functions and define a SQL function and break apart the JSON string. This is a fragile approach depending on the stability of the message data, but this could be improved with several SQL functions. The syntax for creating a function in Kinesis Data Analytics for SQL follows this pattern:

CREATE FUNCTION ''<function_name>'' ( ''<parameter_list>'' )
    RETURNS ''<data type>''
    LANGUAGE SQL
    [ SPECIFIC ''<specific_function_name>''  | [NOT] DETERMINISTIC ]
    CONTAINS SQL
    [ READS SQL DATA ]
    [ MODIFIES SQL DATA ]
    [ RETURNS NULL ON NULL INPUT | CALLED ON NULL INPUT ]  
  RETURN ''<SQL-defined function body>''

In Kinesis Data Analytics for Apache Flink, AWS recently upgraded the Apache Flink environment to v1.15, which extends Apache Flink SQL’s table SQL to add JSON functions that are similar to JSON Path syntax. This allows us to query the JSON string directly in our SQL. See the following code:

%flink.ssql(type=update)
SELECT JSON_STRING(location, ‘$.latitude) AS latitude,
JSON_STRING(location, ‘$.longitude) AS longitude
FROM my_table

Alternatively, and required prior to Apache Flink v1.15, we can use Scala or PyFlink in our notebook to convert the field and restream the data. Both languages provide robust JSON string handling.

The following PyFlink code defines two user-defined functions, which extract the latitude and longitude from the location field of our message. These UDFs can then be invoked from using Flink SQL. We reference the environment variable st_env. PyFlink creates six variables for you in your Zeppelin notebook. Zeppelin also exposes a context for you as the variable z.

Errors can also happen when messages contain unexpected data. Kinesis Data Analytics for SQL Applications provides an in-application error stream. These errors can then be processed separately and restreamed or dropped. With PyFlink in Kinesis Data Analytics Streaming applications, you can write complex error-handling strategies and immediately recover and continue processing the data. When the JSON string is passed into the UDF, it may be malformed, incomplete, or empty. By catching the error in the UDF, Python will always return a value even if an error would have occurred.

The following sample code shows another PyFlink snippet that performs a division calculation on two fields. If a division-by-zero error is encountered, it provides a default value so the stream can continue processing the message.

%flink.pyflink
@udf(input_types=[DataTypes.BIGINT()], result_type=DataTypes.BIGINT())
def DivideByZero(price):    
	try:        
		price / 0        
	except:        
		return -1
st_env.register_function("DivideByZero", DivideByZero)

Next steps

Building a pipeline as we’ve done in this post gives us the base for testing additional services in AWS. I encourage you to continue your streaming analytics learning before tearing down the streams you created. Consider the following:

Publish your Kinesis Data Analytics Studio notebook as an application with durable state.
Use your Kinesis Data Firehose delivery stream to write directly to OpenSearch Service.
Use OpenSearch Dashboards to visualize your streaming data.
Review the Migrating to Kinesis Data Analytics for Apache Flink: Examples docs for side-by-side translations of common SQL-based Kinesis Data Analytics application queries to Kinesis Data Analytics Studio.

Clean up

To clean up the services created in this exercise, complete the following steps:

Navigate to the CloudFormation Console and delete the IoT Device Simulator stack.
On the AWS IoT Core console, choose Message Routing and Rules, and delete the rule automotive_route_kinesis.
Delete the Kinesis data stream automotive-data in the Kinesis Data Stream console.
Remove the IAM role automotive-role in the IAM Console.
In the AWS Glue console, delete the automotive-notebook-glue database.
Delete the Kinesis Data Analytics Studio notebook automotive-data-notebook.
Delete the Firehose delivery stream automotive-firehose.

Conclusion

Thanks for following along with this tutorial on Kinesis Data Analytics Studio. If you’re currently using a legacy Kinesis Data Analytics Studio SQL application, I recommend you reach out to your AWS technical account manager or Solutions Architect and discuss migrating to Kinesis Data Analytics Studio. You can continue your learning path in our Amazon Kinesis Data Streams Developer Guide, and access our code samples on GitHub.

About the Author

Nicholas Tunney is a Partner Solutions Architect for Worldwide Public Sector at AWS. He works with global SI partners to develop architectures on AWS for clients in the government, nonprofit healthcare, utility, and education sectors.

How to list over 1000 email addresses from account-level suppression list

2023-06-29 vmgaddam

Post Syndicated from vmgaddam original https://aws.amazon.com/blogs/messaging-and-targeting/how-to-list-over-1000-email-addresses-from-account-level-suppression-list/

Overview of solution

Amazon Simple Email Service (SES) offers an account-level suppression list, which assists customers in avoiding sending emails to addresses that have previously resulted in bounce or complaint events. This feature is designed to protect the sender’s reputation and enhance message delivery rates. There are various types of suppression lists available, including the global suppression List, account-level suppression list, and configuration set-level suppression. The account-level suppression list is owned and managed by the customer, providing them with control over their list and account reputation. Additionally, customers can utilize the configuration set-level suppression feature for more precise control over suppression list management, which overrides the account-level suppression list.

Maintaining a healthy sender reputation with email providers (such as Gmail, Yahoo, or Hotmail) increases the probability of emails reaching recipients’ inboxes instead of being marked as spam. One effective approach to uphold sender reputation involves refraining from sending emails to invalid email addresses and disinterested recipients.

The account-level suppression list can be managed using Amazon SES console or AWS CLI which provides an easy way to manage addresses including bulk actions to add or remove addresses.

Currently, If the account-level suppression list contains more than 1000 records, we need to use NextToken to obtain a complete list of email addresses in a paginated manner. If the email address you are looking for is not within the first 1000 records of the response, you won’t be able to obtain the information from the account-level suppression list with one single command. To list all the email addresses within the account-level suppression, we use Amazon SES ListSuppressedDestinations API. This API allows you to fetch the NextToken and pass it to a follow-up request in order to retrieve another page of results.

The code below creates a loop that makes multiple requests, in each iteration, the next token is replaced, aiding in retrieving all email addresses that have been added to the account-level suppression list.

Prerequisite

The code below can be used to run in your local machine or using AWS CloudShell As part of this blog spot, we will be using AWS CloudShell to fetch the list.

Note: Python 3 and Python 2 are both ready to use in the shell environment. Python 3 is now considered the default version of the programming language (support for Python 2 ended in January 2020).

1) An active AWS account.
2) User logged in to AWS management console must have “ses:ListSuppressedDestinations” permissions.

Walkthrough

Sign in to AWS management console and select the region where you are using Amazon SES
Launch AWS CloudShell
Save the code specified below as a file in your local environment. Example: List_Account_Level.py
Click Actions and Upload File (List_Account_Level.py)

Upload File to AWS CloudShell

5. Run Python code.

Python3 List_Account_Level.py >> Email_Addresses_List.json

6. The file Email_Addresses_List.json will be saved in current directory
7. To download the file – Click Actions and Download File providing File name Email_Addresses_List.json

Download File from AWS CloudShell

List the Email addresses in your Amazon SES account suppression list added to recent bounce or complaint event using Python.

We used the ListSuppressedDestinations operation in the SES API v2 to create a list with all the email addresses that are on your account-level suppression list for your account including bounces and complaints.

Note: SES account-level suppression list applies to your AWS account in the current AWS Region.

import boto3
from datetime import datetime
import json

def showTimestamp(results):
    updated_results = []
    for eachAddress in results:
        updated_address = eachAddress.copy()
        updated_address['LastUpdateTime'] = eachAddress['LastUpdateTime'].strftime("%m/%d/%Y, %H:%M:%S")
        updated_results.append(updated_address)
    return updated_results

def get_resources_from(supression_details):
    results = supression_details['SuppressedDestinationSummaries']
    next_token = supression_details.get('NextToken', None)
    return results, next_token

def main():
    client = boto3.client('sesv2')
    next_token = ''  # Variable to hold the pagination token
    results = []   # List for the entire resource collection
    # Call the `list_suppressed_destinations` method in a loop

    while next_token is not None:
        if next_token:
            suppression_response = client.list_suppressed_destinations(
                PageSize=1000,
                NextToken=next_token
            )
        else:
            suppression_response = client.list_suppressed_destinations(
                PageSize=1000
            )
        current_batch, next_token = get_resources_from(suppression_response)
        results += current_batch

    results = showTimestamp(results)

    print(json.dumps(results, indent=2, sort_keys=False))

if __name__ == "__main__":
    main()

Sample Response

Returns all of the email addresses and the output resembles the following example:

[{
    "EmailAddress": "[email protected]",
    "Reason": "BOUNCE",
    "LastUpdateTime": "04/30/2021, 15:43:01"
}, {
    "EmailAddress": "[email protected]",
    "Reason": "BOUNCE",
    "LastUpdateTime": "04/30/2021, 15:43:01"
}, {
    "EmailAddress": "[email protected]",
    "Reason": "BOUNCE",
    "LastUpdateTime": "04/30/2021, 15:43:01"
}, {
    "EmailAddress": "[email protected]",
    "Reason": "BOUNCE",
    "LastUpdateTime": "04/30/2021, 15:43:00"
}, {
    "EmailAddress": "[email protected]",
    "Reason": "COMPLAINT",
    "LastUpdateTime": "06/22/2023, 12:59:31"
}]

Cleaning up

The response file Email_Addresses_List.json will contain the list of all the email addresses on your account-level suppression list even if there are more than 1000 records. Please free to delete files that were created as part of the process if you no longer need them.

Conclusion

In this blog post, we explained listing of all email addresses if the account-level suppression list contains more than 1000 records using AWS CouldShell. Having complete list of email addresses will help you identify email addresses you are looking for and that are not included in the first 1000 records of the response. You can validate email address and determine who can receive email that can be removed from the account-level suppression list. This protect the sender reputations and improving delivery rates.

Follow-up

https://docs.aws.amazon.com/ses/latest/dg/sending-email-suppression-list.html
https://repost.aws/knowledge-center/ses-remove-email-from-suppresion-list

About the Author

Venkata Manoj Gaddam is Cloud Support Engineer II at AWS and Service Matter Expert in Amazon Simple Email Service (SES) and Amazon Simple Storage Service (S3). Along with Amazon SES and S3, he is AWS Snow Family enthusiast. In his free time, he enjoys hanging out with friends and traveling.

Centralize near-real-time governance through alerts on Amazon Redshift data warehouses for sensitive queries

2023-06-29 Rajdip Chaudhuri

Post Syndicated from Rajdip Chaudhuri original https://aws.amazon.com/blogs/big-data/centralize-near-real-time-governance-through-alerts-on-amazon-redshift-data-warehouses-for-sensitive-queries/

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud that delivers powerful and secure insights on all your data with the best price-performance. With Amazon Redshift, you can analyze your data to derive holistic insights about your business and your customers. In many organizations, one or multiple Amazon Redshift data warehouses run daily for data and analytics purposes. Therefore, over time, multiple Data Definition Language (DDL) or Data Control Language (DCL) queries, such as CREATE, ALTER, DROP, GRANT, or REVOKE SQL queries, are run on the Amazon Redshift data warehouse, which are sensitive in nature because they could lead to dropping tables or deleting data, causing disruptions or outages. Tracking such user queries as part of the centralized governance of the data warehouse helps stakeholders understand potential risks and take prompt action to mitigate them following the operational excellence pillar of the AWS Data Analytics Lens. Therefore, for a robust governance mechanism, it’s crucial to alert or notify the database and security administrators on the kind of sensitive queries that are run on the data warehouse, so that prompt remediation actions can be taken if needed.

To address this, in this post we show you how you can automate near-real-time notifications over a Slack channel when certain queries are run on the data warehouse. We also create a simple governance dashboard using a combination of Amazon DynamoDB, Amazon Athena, and Amazon QuickSight.

Solution overview

An Amazon Redshift data warehouse logs information about connections and user activities taking place in databases, which helps monitor the database for security and troubleshooting purposes. These logs can be stored in Amazon Simple Storage Service (Amazon S3) buckets or Amazon CloudWatch. Amazon Redshift logs information in the following log files, and this solution is based on using an Amazon Redshift audit log to CloudWatch as a destination:

Connection log – Logs authentication attempts, connections, and disconnections
User log – Logs information about changes to database user definitions
User activity log – Logs each query before it’s run on the database

The following diagram illustrates the solution architecture.

Solution Architecture

The solution workflow consists of the following steps:

Audit logging is enabled in each Amazon Redshift data warehouse to capture the user activity log in CloudWatch.
Subscription filters on CloudWatch capture the required DDL and DCL commands by providing filter criteria.
The subscription filter triggers an AWS Lambda function for pattern matching.
The Lambda function processes the event data and sends the notification over a Slack channel using a webhook.
The Lambda function stores the data in a DynamoDB table over which a simple dashboard is built using Athena and QuickSight.

Prerequisites

Before starting the implementation, make sure the following requirements are met:

You have an AWS account.
The AWS Region used for this post is us-east-1. However, this solution is relevant in any other Region where the necessary AWS services are available.
Permissions to create Slack a workspace.

Create and configure an Amazon Redshift cluster

To set up your cluster, complete the following steps:

Create a provisioned Amazon Redshift data warehouse.

For this post, we use three Amazon Redshift data warehouses: demo-cluster-ou1, demo-cluster-ou2, and demo-cluster-ou3. In this post, all the Amazon Redshift data warehouses are provisioned clusters. However, the same solution applies for Amazon Redshift Serverless.

To enable audit logging with CloudWatch as the log delivery destination, open an Amazon Redshift cluster and go to the Properties tab.
On the Edit menu, choose Edit audit logging.

Redshift edit audit logging

Select Turn on under Configure audit logging.
Select CloudWatch for Log export type.
Select all three options for User log, Connection log, and User activity log.
Choose Save changes.

Create a parameter group for the clusters with enable_user_activity_logging set as true for each of the clusters.
Modify the cluster to attach the new parameter group to the Amazon Redshift cluster.

For this post, we create three custom parameter groups: custom-param-grp-1, custom-param-grp-2, and custom-param-grp-3 for three clusters.

Note, if you enable only the audit logging feature, but not the associated parameter, the database audit logs log information for only the connection log and user log, but not for the user activity log.

On the CloudWatch console, choose Log groups under Logs in the navigation pane.
Search for /aws/redshift/cluster/demo.

This will show all the log groups created for the Amazon Redshift clusters.

Create a DynamoDB audit table

To create your audit table, complete the following steps:

On the DynamoDB console, choose Tables in the navigation pane.
Choose Create table.
For Table name, enter demo_redshift_audit_logs.
For Partition key, enter partKey with the data type as String.

Keep the table settings as default.
Choose Create table.

Create Slack resources

Slack Incoming Webhooks expect a JSON request with a message string corresponding to a "text" key. They also support message customization, such as adding a user name and icon, or overriding the webhook’s default channel. For more information, see Sending messages using Incoming Webhooks on the Slack website.

The following resources are created for this post:

A Slack workspace named demo_rc
A channel named #blog-demo in the newly created Slack workspace
A new Slack app in the Slack workspace named demo_redshift_ntfn (using the From Scratch option)
Note down the Incoming Webhook URL, which will be used in this post for sending the notifications

Create an IAM role and policy

In this section, we create an AWS Identity and Access Management (IAM) policy that will be attached to an IAM role. The role is then used to grant a Lambda function access to a DynamoDB table. The policy also includes permissions to allow the Lambda function to write log files to Amazon CloudWatch Logs.

On the IAM console, choose Policies in navigation pane.
Choose Create policy.
In the Create policy section, choose the JSON tab and enter the following IAM policy. Make sure you replace your AWS account ID in the policy (replace XXXXXXXX with your AWS account ID).

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadWriteTable",
            "Effect": "Allow",
            "Action": [
                "dynamodb:BatchGetItem",
                "dynamodb:BatchWriteItem",
                "dynamodb:PutItem",
                "dynamodb:GetItem",
                "dynamodb:Scan",
                "dynamodb:Query",
                "dynamodb:UpdateItem",
                "logs:PutLogEvents"
            ],
            "Resource": [
                "arn:aws:dynamodb:us-east-1:XXXXXXXX:table/demo_redshift_audit_logs",
                "arn:aws:logs:*:XXXXXXXX:log-group:*:log-stream:*"
            ]
        },
        {
            "Sid": "WriteLogStreamsAndGroups",
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogStream",
                "logs:CreateLogGroup"
            ],
            "Resource": "arn:aws:logs:*:XXXXXXXX:log-group:*"
        }
    ]
}

Choose Next: Tags, then choose Next: Review.
Provide the policy name demo_post_policy and choose Create policy.

To apply demo_post_policy to a Lambda function, you first have to attach the policy to an IAM role.

On the IAM console, choose Roles in the navigation pane.
Choose Create role.
Select AWS service and then select Lambda.
Choose Next.

On the Add permissions page, search for demo_post_policy.
Select demo_post_policy from the list of returned search results, then choose Next.

On the Review page, enter demo_post_role for the role and an appropriate description, then choose Create role.

Create a Lambda function

We create a Lambda function with Python 3.9. In the following code, replace the slack_hook parameter with the Slack webhook you copied earlier:

import gzip
import base64
import json
import boto3
import uuid
import re
import urllib3

http = urllib3.PoolManager()
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table("demo_redshift_audit_logs")
slack_hook = "https://hooks.slack.com/services/xxxxxxx"

def exe_wrapper(data):
    cluster_name = (data['logStream'])
    for event in data['logEvents']:
        message = event['message']
        reg = re.match(r"'(?P<ts>\d{4}-\d\d-\d\dT\d\d:\d\d:\d\dZ).*?\bdb=(?P<db>\S*).*?\buser=(?P<user>\S*).*?LOG:\s+(?P<query>.*?);?$", message)
        if reg is not None:
            filter = reg.groupdict()
            ts = filter['ts']
            db = filter['db']
            user = filter['user']
            query = filter['query']
            query_type = ' '.join((query.split(" "))[0 : 2]).upper()
            object = query.split(" ")[2]
            put_dynamodb(ts,cluster_name,db,user,query,query_type,object,message)
            slack_api(cluster_name,db,user,query,query_type,object)
            
def put_dynamodb(timestamp,cluster,database,user,sql,query_type,object,event):
    table.put_item(Item = {
        'partKey': str(uuid.uuid4()),
        'redshiftCluster': cluster,
        'sqlTimestamp' : timestamp,
        'databaseName' : database,
        'userName': user,
        'sqlQuery': sql,
        'queryType' : query_type,
        'objectName': object,
        'rawData': event
        })
        
def slack_api(cluster,database,user,sql,query_type,object):
    payload = {
	'channel': '#blog-demo',
	'username': 'demo_redshift_ntfn',
	'blocks': [{
			'type': 'section',
			'text': {
				'type': 'mrkdwn',
				'text': 'Detected *{}* command\n *Affected Object*: `{}`'.format(query_type, object)
			}
		},
		{
			'type': 'divider'
		},
		{
			'type': 'section',
			'fields': [{
					'type': 'mrkdwn',
					'text': ':desktop_computer: *Cluster Name:*\n{}'.format(cluster)
				},
				{
					'type': 'mrkdwn',
					'text': ':label: *Query Type:*\n{}'.format(query_type)
				},
				{
					'type': 'mrkdwn',
					'text': ':card_index_dividers: *Database Name:*\n{}'.format(database)
				},
				{
					'type': 'mrkdwn',
					'text': ':technologist: *User Name:*\n{}'.format(user)
				}
			]
		},
		{
			'type': 'section',
			'text': {
				'type': 'mrkdwn',
				'text': ':page_facing_up: *SQL Query*\n ```{}```'.format(sql)
			}
		}
	]
	}
    encoded_msg = json.dumps(payload).encode('utf-8')
    resp = http.request('POST',slack_hook, body=encoded_msg)
    print(encoded_msg) 

def lambda_handler(event, context):
    print(f'Logging Event: {event}')
    print(f"Awslog: {event['awslogs']}")
    encoded_zipped_data = event['awslogs']['data']
    print(f'data: {encoded_zipped_data}')
    print(f'type: {type(encoded_zipped_data)}')
    zipped_data = base64.b64decode(encoded_zipped_data)
    data = json.loads(gzip.decompress(zipped_data))
    exe_wrapper(data)

Create your function with the following steps:

On the Lambda console, choose Create function.
Select Author from scratch and for Function name, enter demo_function.
For Runtime, choose Python 3.9.
For Execution role, select Use an existing role and choose demo_post_role as the IAM role.
Choose Create function.

On the Code tab, enter the preceding Lambda function and replace the Slack webhook URL.
Choose Deploy.

Create a CloudWatch subscription filter

We need to create the CloudWatch subscription filter on the useractivitylog log group created by the Amazon Redshift clusters.

On the CloudWatch console, navigate to the log group /aws/redshift/cluster/demo-cluster-ou1/useractivitylog.
On the Subscription filters tab, on the Create menu, choose Create Lambda subscription filter.

Choose demo_function as the Lambda function.
For Log format, choose Other.
Provide the subscription filter pattern as ?create ?alter ?drop ?grant ?revoke.
Provide the filter name as Sensitive Queries demo-cluster-ou1.
Test the filter by selecting the actual log stream. If it has any queries with a match pattern, then you can see some results. For testing, use the following pattern and choose Test pattern.

'2023-04-02T04:18:43Z UTC [ db=dev user=awsuser pid=100 userid=100 xid=100 ]' LOG: alter table my_table alter column string type varchar(16);
'2023-04-02T04:06:08Z UTC [ db=dev user=awsuser pid=100 userid=100 xid=200 ]' LOG: create user rs_user with password '***';

Choose Start streaming.

Repeat the same steps for /aws/redshift/cluster/demo-cluster-ou2/useractivitylog and /aws/redshift/cluster/demo-cluster-ou3/useractivitylog by giving unique subscription filter names.
Complete the preceding steps to create a second subscription filter for each of the Amazon Redshift data warehouses with the filter pattern ?CREATE ?ALTER ?DROP ?GRANT ?REVOKE, ensuring uppercase SQL commands are also captured through this solution.

Test the solution

In this section, we test the solution in the three Amazon Redshift clusters that we created in the previous steps and check for the notifications of the commands on the Slack channel as per the CloudWatch subscription filters as well as data getting ingested in the DynamoDB table. We use the following commands to test the solution; however, this is not restricted to these commands only. You can check with other DDL commands as per the filter criteria in your Amazon Redshift cluster.

create schema sales;
create schema marketing;
create table dev.public.demo_test_table_1  (id int, string varchar(10));
create table dev.public.demo_test_table_2  (empid int, empname varchar(100));
alter table dev.public.category alter column catdesc type varchar(65);
drop table dev.public.demo_test_table_1;
drop table dev.public.demo_test_table_2;

In the Slack channel, details of the notifications look like the following screenshot.

To get the results in DynamoDB, complete the following steps:

On the DynamoDB console, choose Explore items under Tables in the navigation pane.
In the Tables pane, select demo_redshift_audit_logs.
Select Scan and Run to get the results in the table.

Athena federation over the DynamoDB table

The Athena DynamoDB connector enables Athena to communicate with DynamoDB so that you can query your tables with SQL. As part of the prerequisites for this, deploy the connector to your AWS account using the Athena console or the AWS Serverless Application Repository. For more details, refer to Deploying a data source connector or Using the AWS Serverless Application Repository to deploy a data source connector. For this post, we use the Athena console.

On the Athena console, under Administration in the navigation pane, choose Data sources.
Choose Create data source.

Select the data source as Amazon DynamoDB, then choose Next.

For Data source name, enter dynamo_db.
For Lambda function, choose Create Lambda function to open a new window with the Lambda console.

Under Application settings, enter the following information:
- For Application name, enter AthenaDynamoDBConnector.
- For SpillBucket, enter the name of an S3 bucket.
- For AthenaCatalogName, enter dynamo.
- For DisableSpillEncryption, enter false.
- For LambdaMemory, enter 3008.
- For LambdaTimeout, enter 900.
- For SpillPrefix, enter athena-spill-dynamo.

Select I acknowledge that this app creates custom IAM roles and choose Deploy.
Wait for the function to deploy, then return to the Athena window and choose the refresh icon next to Lambda function.
Select the newly deployed Lambda function and choose Next.

Review the information and choose Create data source.
Navigate back to the query editor, then choose dynamo_db for Data source and default for Database.
Run the following query in the editor to check the sample data:

SELECT partkey,
       redshiftcluster,
       databasename,
       objectname,
       username,
       querytype,
       sqltimestamp,
       sqlquery,
       rawdata
FROM dynamo_db.default.demo_redshift_audit_logs limit 10;

Visualize the data in QuickSight

In this section, we create a simple governance dashboard in QuickSight using Athena in direct query mode to query the record set, which is persistently stored in a DynamoDB table.

Sign up for QuickSight on the QuickSight console.
Select Amazon Athena as a resource.
Choose Lambda and select the Lambda function created for DynamoDB federation.

Create a new dataset in QuickSight with Athena as the source.
Provide the name of the data source name as demo_blog.
Choose dynamo_db for Catalog, default for Database, and demo_redshift_audit_logs for Table.
Choose Edit/Preview data.

Choose String in the sqlTimestamp column and choose Date.

In the dialog box that appears, enter the data format yyyy-MM-dd'T'HH:mm:ssZZ.
Choose Validate and Update.

Choose PUBLISH & VISUALIZE.
Choose Interactive sheet and choose CREATE.

This will take you to the visualization page to create the analysis on QuickSight.

Create a governance dashboard with the appropriate visualization type.

Refer to the Amazon QuickSight learning videos in QuickSight community for basic to advanced level of authoring. The following screenshot is a sample visualization created on this data.

Clean up

Clean up your resources with the following steps:

Conclusion

Amazon Redshift is a powerful, fully managed data warehouse that can offer significantly increased performance and lower cost in the cloud. In this post, we discussed a pattern to implement a governance mechanism to identify and notify sensitive DDL/DCL queries on an Amazon Redshift data warehouse, and created a quick dashboard to enable the DBA and security team to take timely and prompt action as required. Additionally, you can extend this solution to include DDL commands used for Amazon Redshift data sharing across clusters.

Operational excellence is a critical part of the overall data governance on creating a modern data architecture, as it’s a great enabler to drive our customers’ business. Ideally, any data governance implementation is a combination of people, process, and technology that organizations use to ensure the quality and security of their data throughout its lifecycle. Use these instructions to set up your automated notification mechanism as sensitive queries are detected as well as create a quick dashboard on QuickSight to track the activities over time.

About the Authors

Rajdip Chaudhuri is a Senior Solutions Architect with Amazon Web Services specializing in data and analytics. He enjoys working with AWS customers and partners on data and analytics requirements. In his spare time, he enjoys soccer and movies.

Dhiraj Thakur is a Solutions Architect with Amazon Web Services. He works with AWS customers and partners to provide guidance on enterprise cloud adoption, migration, and strategy. He is passionate about technology and enjoys building and experimenting in the analytics and AI/ML space.

How to verify an email address in SES which does not have an inbox

2023-06-29 ajibho

Post Syndicated from ajibho original https://aws.amazon.com/blogs/messaging-and-targeting/how-to-verify-an-email-address-in-ses-which-does-not-have-an-inbox/

Overview of solution

Amazon Simple Email Service (Amazon SES) is an email platform that provides a straightforward and cost-effective solution for sending and receiving emails using your own email addresses and domains.

One of the most common use cases for using separate verified from email address is in online retails/e-commerce platforms. Online/e-commerce platform need to send emails to their customers where the from address should look like “[email protected]. In these cases, the From addresses like [email protected] does not have inbox setup for receiving emails. Using the following solution, you can avoid setting up an inbox for the email identity while still verifying the email address for sending and receiving.

In order to send emails from SES using email/domain identity, we need to have the From email identity or domain verified in Amazon SES in a supported region. When verifying a domain,you have the option to use Easy DKIM or Bring Your Own DKIM(BYOD). For verifying an email address, you need to create an identity in Amazon SES for the respective region. Once the required email address identity is created, you will receive a verification link in your inbox. To successfully verify the email address, simply open the link in your browser. In this case, you would need to have inbox setup for email address to receive the verification link from [email protected].

Verifying a domain in Amazon SES allows you to send emails from any identity associated with that domain. For example, if you create and verify a domain identity called example.com, you don’t need to create separate subdomain identities for a.example.com, a.b.example.com, nor separate email address identities for [email protected], [email protected], and so on. Therefore, the settings for the domain remain the same for all From addresses and you cannot separate you sending activity. You can use this solution to verify the From address without setting up an inbox and differentiate sending activity and tracking based on settings. The benefits of having different email settings from the domain are mentioned below.

Benefits of verifying the email separately for the same domain:

1) When you verify the email along with your domain, you can keep the settings different for the two Identities. You can setup different Configuration sets, notifications and dedicated IP pools for the verified email. This separation enables you to manage domain and email settings independently.
2) You can have two separate emails for sending transaction ([email protected]) and Marketing emails ([email protected]). After assigning different configuration sets, you can monitor the bounces and complaints separately for the sender. A best practice here would be separating the Transactional and Marketing in sub domains. Having both types in the same domain can adversely affect the reputation for your domain, and reduce deliverability of your transactional emails.
3) Using different dedicated IP pools, you can separate the sending IPs for Marketing and transaction or any other emails. Thus, your IP reputation for one use case is not affected by any other emails.

Prerequisite

1) An active AWS account.
2) Administrative Access to the Amazon SES Console and Amazon Simple Storage Service(S3) console.
3) A verified identity (Domain) with an MX record for the domain pointing to a Receiving Endpoint in one of the following region in Amazon SES.

Region Name	Region	Receiving Endpoint
US East (N. Virginia)	us-east-1	inbound-smtp.us-east-1.amazonaws.com
US West (Oregon)	us-west-2	inbound-smtp.us-west-2.amazonaws.com
Europe (Ireland)	eu-west-1	inbound-smtp.eu-west-1.amazonaws.com

Solution walkthrough

In order to verify the email in SES, we need to verify the link send from Amazon SES in the email inbox. We will setup receiving rule set and add S3 bucket with required permissions to store emails from Amazon SES in S3 bucket. After receiving the email in S3 bucket, download the email to get the verification link. Open the verification link in a browser to complete the process.

Step 1 : How to setup SES Email Receiving Ruleset for S3 bucket

1) Open the Amazon SES console.
2) In the navigation pane, under Configuration, choose Email Receiving.
Email Receiving Rule set

3) To create a new rule set, choose Create a Rule Set, enter a rule set name, and then choose Create a Rule Set.
Note: If you create a new rule set, select the rule set, and then choose Set as Active Rule Set. Only one of your receipt rule sets can be the active rule set at any given time.

4) Choose Active Rule Set and Choose Create Rule.

Active Ruleset

5) Enter a unique rule name. If your use case requires TLS or spam and virus scanning, then choose Require TLS or Enable spam and virus scanning. To make this an active rule, select the Enabled checkbox. Choose Next.
Receiving Rule Setting

6) To receive emails for specific verified domain, click Add new recipient condition and enter the domain/email address. You can leave it blank and it will store for all the verified domain addresses with receiving setup.
Add recipient condition

7) Choose Add new action, and then choose Deliver to S3 bucket
Action Deliver to S3 bucket

8) Click on Create S3 bucket

9) Enter a unique S3 bucket name and click on ‘Create Bucket’
Note: S3 Bucket policy will be added automatically.
Provide Unique S3 bucket name

(Optional) Choose Message encryption for Amazon SES to use an Amazon Key Management Server (Amazon KMS) key to encrypt your emails.
(Optional) For SNS topic, select an Amazon Simple Notification Service (Amazon SNS) topic to notify you when Amazon SES delivers an email to the S3 bucket.
Add Action in Receiving rule set

10) Click Next and Create Rule.
Review and Create Ruleset

Step 2: Verifying email address in Amazon SES using S3

The following procedure shows you how to verify Email address in Amazon SES.
1) Open the Amazon SES console.
2) In the navigation pane, under Configuration, choose Verified identities.
3) Choose Create identity.
Create Verified Identity

4) Under Identity details, choose Email address as the identity type you want to create.
5) For Email address, enter the email address that you want to use. The email address must be an address that’s able to receive mail and that you have access to.
(Optional) If you want to Assign a default configuration set, select the check box.
6) To create your email address identity, choose Create identity. After it’s created, you should receive a verification email within five minutes from [email protected].

Create Verified identity and Enter
7) Open the Amazon S3 console.
Go to S3 bucket

8) Open the S3 Bucket that you configured to store the Amazon SES emails. Verify that the bucket contains the test email that you sent. It can take a few minutes for the test email to appear.
Select the Received Email in S3 bucket

9) Select the email/object received in S3 bucket. Click Download.
Download the received email/object

10) Open the Downloaded file in Notepad and copy the verification link under the Subject. Paste the link in your Browser and confirm it.
Open the Downloaded email in Notepad

11) Once the link is confirmed, you can check in SES console and confirm under verified identities that your email address is in verified Status.
Browser link after pasting the verification link

Verified Identity confirmation in SES console

Cleaning up:

You should have successfully verified email address in Amazon SES using S3 bucket. To avoid incurring any extra charges, remember to delete any resources created manually if you no longer need them for monitoring.

Steps for removing the resources:

1) Delete all the created/verified Identities.
2) Delete data regarding Amazon SES receiving Rules.
3) Delete data regarding Amazon S3 bucket.

Conclusion:

In this blog post, we explained the benefits of verifying a separate email address for the verified domain without setting up an inbox. Having separate identities for different use cases helps in efficient management of bounces, complaints, and delivery. You can setup different IP pools using configuration set for different use cases.

Follow-up:

https://aws.amazon.com/blogs/messaging-and-targeting/manage-incoming-emails-with-ses/
https://docs.aws.amazon.com/ses/latest/dg/receiving-email.html
https://repost.aws/knowledge-center/ses-receive-inbound-emails

About the Author

Ajinkya Bhoite is Cloud Support Engineer II in AWS and Service Matter Expert in Amazon Simple Email Service(SES). Along with Amazon SES, he is an Amazon S3 enthusiast. He loves helping customers in solving issues related to SES and S3 in their environment. He loves reading, writing and running but not in the same order. He has a fictional novel published on Amazon Kindle by the name Shiva Stone: Hampi’s Hidden treasure.

How to investigate what happened to the email that was sent via SES but was never received in recipient inbox

2023-06-27 singhzwz

Post Syndicated from singhzwz original https://aws.amazon.com/blogs/messaging-and-targeting/how-to-investigate-what-happened-to-the-email-that-was-sent-via-ses-but-was-never-received-in-recipient-inbox/

How to investigate what happened to the email that was sent via SES but was never received in recipient inbox

Amazon Simple Email Service (SES) is an email platform that provides an easy, cost-effective way for you to send and receive email using your own email addresses and domains.

You can use SES for two major use cases. The first use case is marketing emails, which can include things such as special offers, newsletters, and product updates. The second use case is transactional emails, which can include things such as order confirmation, One Time Password (OTP), and registration confirmation.

One common issue we hear from our customers is that emails are not delivered to the recipient’s inbox. There are three areas in the flow where this could happen:

Emails were not sent successfully by your email application
Emails get dropped within SES after being received from your email application
Emails are dropped after leaving SES

In this blog post, we will explore each of these three scenarios.

Prerequisites:

To investigate each scenario, you need the following:

Understand How email sending works in Amazon SES
Enable verbose logging on your email application
Amazon SES – Set up notifications for bounces and complaints

Scenario 1: Emails were not sent successfully by your email application

The first step in the process is for your application to contact SES to pass it the message. This is one place where problems could occur.
Remember that for each message it successfully enqueues, SES returns a message ID that you can log on your side. An example message ID looks like 01000174f87e0fb4-81b97243-56b0-4cd1-90c7-d6987a00fdc1-000000.

Here is an example of Sendmail logs for successfully relaying the email to SES.

Jun 22 15:29:42 ip-172-12-12-12.ec2.internal sendmail[323257]: 35ABCDEF123123: to=<[email protected]>, delay=00:00:01, xdelay=00:00:01, mailer=relay, pri=30123, relay=email-smtp.us-east-1.amazonaws.com. [3.95.123.123], dsn=2.0.0, stat=Sent (Ok 01000174f87e0fb4-81b97243-56b0-4cd1-90c7-d6987a00fdc1-000000)

To verify that your requests are successful, you can look for the message IDs. If SES never accepted your message, you can be sure it won’t be delivered! If you are able to log a message ID, it means that your email safely arrived at SES.

Actions to troubleshoot client side issue:

1. In order to test that the problem lies between your application and SES, you can try to send a test email from the SES console to an email address and domain where you are experiencing the issue. If that test email arrives on time, it’s a pretty good indicator that the error is occurring before your email even reaches SES.
2. Examine the logs at application side to confirm if the application has successfully passed the email to SES. A success response looks like the following example:

250 Ok 0000012345678e09-123a4cdc-b56c-78dd-b90e-d123be456789-000000

Some common errors include the following:

* Email address is not verified. The following identities failed the check in region region: identity1, identity2, identity3
* Account is paused
* Daily message quota exceeded
* Maximum sending rate exceeded
* Maximum SigV2 SMTP sending rate exceeded

Another common type of error is a connection timeout error. Here is an example of a Java exception that shows a connection timeout error to an SES endpoint:

Caused: com.sun.mail.util.MailConnectException: Couldn't connect to host, port: email-smtp.ap-southeast-2.amazonaws.com , 465; timeout 60000; nested exception is: java.net .ConnectException: Connection timed out (Connection timed out) at com.sun.mail.smtp.SMTPTransport.openServer(SMTPTransport.java:2209 ) at com.sun.mail.smtp.SMTPTransport.protocolConnect(SMTPTransport.java:740 )

The way that you observe this error message depends on the way that you call SES.

If you call the SES API directly, the action to interact with SES will return an error like MessageRejected or one of the errors specified in the Common Errors topic of the Amazon Simple Email Service API Reference.

If you call SES through its SMTP interface, the way that you experience the error depends on the application. Some applications might display a specific error message, and others might not. For a list of SMTP response codes that SES returns, see SMTP response codes returned by Amazon SES.

Scenario 2: Emails get dropped within SES after being received from your email application

Once SES accepts the message and provides a message ID, the next step is to process it and hand it over to the recipient ISP or the recipient mailbox provider. However, in the following scenarios SES can drop the email after receiving it:

Rendering Failure: This can happen when you use an email template to send email. Upon accepting the email from your email application, SES validates that the template data you send includes the required variables in the template. If the template data contains non-compliant variables or is missing variables, SES is unable to construct the email and drops the email. This is called a Rendering Failure.
Suppression list: If a recipient is on an SES suppression list, SES will drop the email to protect the sender reputation. There are two types of suppression lists: Account-level suppression list and global suppression list. To learn more about the two types of suppression lists, please refer to Managing lists and subscriptions in Amazon Simple Email Service.
Reject Event: SES accepted the email, but determined that it contained a virus and didn’t attempt to deliver it to the recipient’s mail server. SES drops the email and generates a Reject event.

Actions to troubleshoot such issues:

To identify the root cause of the three preceding scenarios, you need to set up Amazon SES event publishing. Please refer to the prerequisite section to set this up.

Scenario 1: Rendering Failure

Here is an example event publishing record for Rendering Failure

{
"eventType":"Rendering Failure",
"mail":{
"timestamp":"2018-01-22T18:43:06.197Z",
"source":"[email protected]",
"sourceArn":"arn:aws:ses:us-east-1:123456789012:identity/[email protected]",
"sendingAccountId":"123456789012",
"messageId":"EXAMPLE7c191be45-e9aedb9a-02f9-4d12-a87d-dd0099a07f8a-000000",
"destination":[
"[email protected]"
],
"headersTruncated":false,
"tags":{
"ses:configuration-set":[
"ConfigSet"
]
}
},
"failure":{
"errorMessage":"Attribute 'attributeName' is not present in the rendering data.",
"templateName":"MyTemplate"
}
}

In this example, the error message tells you that there is a missing variable attributeName in the template data. To learn more about troubleshooting rendering failure, please refer to Why are emails failing to deliver when I send Amazon SES emails using the SendTemplatedEmail operation?

Scenario 2: Suppression list

SES drops the email if the recipient is on an account-level suppression list or a global suppression list. The following is an example error message for recipient being on a global suppression list:

"diagnosticCode": "Amazon SES has suppressed sending to this address because it has a recent history of bouncing as an invalid address. For more information about how to remove an address from the suppression list, see the Amazon SES Developer Guide: http://docs.aws.amazon.com/ses/latest/DeveloperGuide/remove-from-suppressionlist.html"

The example demonstrates that the email get bounced because the recipient is on the global suppression list. We recommend enabling SES account-level suppression list because the account-level suppression list supersedes the global suppression list.

If the recipient is on your account-level suppression list, you can remove individual email addresses from your Amazon SES account-level suppression list. The following is an example error message for recipient being on a account-level suppression list:

"diagnosticCode": "Amazon SES did not send the message to this address because it is on the suppression list for your account. For more information about removing addresses from the suppression list, see the Amazon SES Developer Guide at https://docs.aws.amazon.com/ses/latest/DeveloperGuide/sending-email-suppression-list.html"

Note: The suppression list feature is in place to avoid emails being sent to recipients who are known to produce bounce or complaint for your emails in past. Enabling this feature protects your reputation as a sender. Only remove the recipient when you are assured that recipients in concern will no longer produce bounce or complaint for your emails.

Scenario 3: Reject Event

SES rejects the email when it scans the email and determines that it contains a virus. The following is an example of a Reject event publishing record.

{
"eventType": "Reject",
"mail": {
"timestamp": "2016-10-14T17:38:15.211Z",
"source": "[email protected]",
"sourceArn": "arn:aws:ses:us-east-1:123456789012:identity/[email protected]",
"sendingAccountId": "123456789012",
"messageId": "EXAMPLE7c191be45-e9aedb9a-02f9-4d12-a87d-dd0099a07f8a-000000",
"destination": [
"[email protected]"
],
"headersTruncated": false,
"headers": [{
"name": "From",
"value": "[email protected]"
},
{
"name": "To",
"value": "[email protected]"
},
{
"name": "Subject",
"value": "Message sent from Amazon SES"
},
{
"name": "MIME-Version",
"value": "1.0"
},
{
"name": "Content-Type",
"value": "multipart/mixed; boundary=\"qMm9M+Fa2AknHoGS\""
},
{
"name": "X-SES-MESSAGE-TAGS",
"value": "myCustomTag1=myCustomTagValue1, myCustomTag2=myCustomTagValue2"
}
],
"commonHeaders": {
"from": [
"[email protected]"
],
"to": [
"[email protected]"
],
"messageId": "EXAMPLE7c191be45-e9aedb9a-02f9-4d12-a87d-dd0099a07f8a-000000",
"subject": "Message sent from Amazon SES"
},
"tags": {
" ses:configuration-set": [
"ConfigSet"
],
"ses:source-ip": [
"192.0.2.0"
],
"ses:from-domain": [
"example.com"
],
"ses:caller-identity": [
"ses_user"
],
"myCustomTag1": [
"myCustomTagValue1"
],
"myCustomTag2": [
"myCustomTagValue2"
]
}
},
"reject": {
"reason": "Bad content"
}
}

To fix this issue, please examine the email and make sure there is no virus in the email. You can test your email content by sending a test email to Amazon SES mailbox simulator.

Scenario 3: Emails are dropped after leaving SES
We sometimes encounter this when we successfully deliver the email to a recipient ISP, only to have the recipient ISP hold back the email and not deliver to the recipient’s inbox. This can happen for a variety of reasons, for example the recipient ISP’s anti-spam filters, or because the sender has not yet built sufficient reputation due to it being a new domain or IP and the recipient does not yet trust the sender.

Actions to troubleshoot such issues:

To identify the root cause of the issue, you need to set up Amazon SNS notification. Please refer to the prerequisite section to set this up.

When the recipient ISP accepts the email, it returns a 250 ok status code to SES. SES includes the detailed status code and response message from the recipient ISP in the SNS notification.The following is an example SNS notification for a success delivery event.

{
"notificationType":"Delivery",
"mail":{
"timestamp":"2016-01-27T14:59:38.237Z",
"messageId":"0000014644fe5ef6-9a483358-9170-4cb4-a269-f5dcdf415321-000000",
"source":"[email protected]",
"sourceArn": "arn:aws:ses:us-east-1:888888888888:identity/example.com",
"sourceIp": "127.0.3.0",
"sendingAccountId":"123456789012",
"callerIdentity": "IAM_user_or_role_name",
"destination":[
"[email protected]"
],
"headersTruncated":false,
"headers":[
{
"name":"From",
"value":"\"John Doe\" <[email protected]>"
},
{
"name":"To",
"value":"\"Jane Doe\" <[email protected]>"
},
{
"name":"Message-ID",
"value":"custom-message-ID"
},
{
"name":"Subject",
"value":"Hello"
},
{
"name":"Content-Type",
"value":"text/plain; charset=\"UTF-8\""
},
{
"name":"Content-Transfer-Encoding",
"value":"base64"
},
{
"name":"Date",
"value":"Wed, 27 Jan 2016 14:58:45 +0000"
}
],
"commonHeaders":{
"from":[
"John Doe <[email protected]>"
],
"date":"Wed, 27 Jan 2016 14:58:45 +0000",
"to":[
"Jane Doe <[email protected]>"
],
"messageId":"custom-message-ID",
"subject":"Hello"
}
},
"delivery":{
"timestamp":"2016-01-27T14:59:38.237Z",
"recipients":["[email protected]"],
"processingTimeMillis":546,
"reportingMTA":"a8-70.smtp-out.amazonses.com",
"smtpResponse":"250 ok: Message 64111812 accepted",
"remoteMtaIp":"127.0.2.0"
}
}

In this example, the SMTP response is "250 ok:Message 64111812 accepted", this confirms the recipient ISP accepted the email. At this point, the email is with the recipient ISP and SES no longer has control over the email.

ISPs use different mechanisms and algorithms to filter emails to place them in either the recipient’s inbox folder or spam folder. In such case, you will need to reach out to the recipient ISP to understand why they accepted the email but did not deliver it to the recipient inbox. Since their servers decide the placement of email, they can give a clear explanation of the cause of the issue and steps to remediate the issue.

The recipient ISP may reject the email. When this happens, the recipient ISP returns a 4xx or 5xx error code to SES. This is also called a bounce event.

There are two kinds of bounce events: Soft bounce and hard bounce. Soft bounce refers to a temporary email delivery failure, this kind of error usually can be retried. Hard bounce refers to a permanent delivery failure, and it cannot be retried. The Simple Mail Transfer Protocol (SMTP) Enhanced Status Codes Registry lists the possible error codes and reasons for email delivery failure. Generally speaking, 4xx error codes are soft bounces, 5xx error codes are hard bounces.

When the recipient ISP bounces the email, SES includes the detailed status code and response message from the recipient ISP in the SNS notification.

The following is an example SNS notification for a bounce event.

{
"notificationType":"Bounce",
"bounce":{
"bounceType":"Permanent",
"reportingMTA":"dns; email.example.com",
"bouncedRecipients":[
{
"emailAddress":"[email protected]",
"status":"5.1.1",
"action":"failed",
"diagnosticCode":"smtp; 550 5.1.1 <[email protected]> User not found"
}
],
"bounceSubType":"General",
"timestamp":"2016-01-27T14:59:38.237Z",
"feedbackId":"00000138111222aa-33322211-cccc-cccc-cccc-ddddaaaa068a-000000",
"remoteMtaIp":"127.0.2.0"
},
"mail":{
"timestamp":"2016-01-27T14:59:38.237Z",
"source":"[email protected]",
"sourceArn": "arn:aws:ses:us-east-1:888888888888:identity/example.com",
"sourceIp": "127.0.3.0",
"sendingAccountId":"123456789012",
"callerIdentity": "IAM_user_or_role_name",
"messageId":"00000138111222aa-33322211-cccc-cccc-cccc-ddddaaaa0680-000000",
"destination":[
"[email protected]",
"[email protected]",
"[email protected]"],
"headersTruncated":false,
"headers":[
{
"name":"From",
"value":"\"John Doe\" <[email protected]>"
},
{
"name":"To",
"value":"\"Jane Doe\" <[email protected]>, \"Mary Doe\" <[email protected]>, \"Richard Doe\" <[email protected]>"
},
{
"name":"Message-ID",
"value":"custom-message-ID"
},
{
"name":"Subject",
"value":"Hello"
},
{
"name":"Content-Type",
"value":"text/plain; charset=\"UTF-8\""
},
{
"name":"Content-Transfer-Encoding",
"value":"base64"
},
{
"name":"Date",
"value":"Wed, 27 Jan 2016 14:05:45 +0000"
}
],
"commonHeaders":{
"from":[
"John Doe <[email protected]>"
],
"date":"Wed, 27 Jan 2016 14:05:45 +0000",
"to":[
"Jane Doe <[email protected]>, Mary Doe <[email protected]>, Richard Doe <[email protected]>"
],
"messageId":"custom-message-ID",
"subject":"Hello"
}
}
}

In the preceding example, the recipient ISP permanently bounced the email. This is concluded from the "bounceType":"Permanent" field. Additionally, the recipient ISP provides the reason they bounced the email in the “diagnosticCode” field, and the reason is "smtp; 550 5.1.1 <[email protected]> User not found". In this case, [email protected] isn’t a valid inbox, the error can’t be resolved by retrying the request. The sender should check the spelling of the email address and confirm with the recipient that this address has a valid inbox before retry sending to this address again. You can also contact the recipient ISP to understand the reasons of the error and steps to remediate the issue when the information provided by the "diagnosticCode" isn’t sufficient or clear.

Another event that can happen is “delivery delay”. Such event happen when the email couldn’t be delivered to the recipient’s mail server because a temporary issue occurred. Delivery delays can occur, for example, when the recipient’s inbox is full, or when the receiving email server experiences a transient issue. These are usually soft bounces in which case SES is programmed to retry such deliveries several times over the next 10 hours. If you have enabled SES event publishing for this event you can review the “delivery delay” object section of SES event record to understand the cause of delay and take corrective actions accordingly. As an example, if you reach out to the recipient ISP and the recipient ISP informs you about maintenance on their side that is causing the issue then you can refrain sending to that ISP till the maintenance completes.

Conclusion:
Now that you know the 3 areas in the email sending flow where errors can occur you can begin to troubleshoot whether your problems are at your email application side, within SES, or at the recipient ISP. Make sure that you have proper logging and that you are receiving the events that provide you the detail you need to determine where your errors are occurring. Errors happening prior to the email being sent out from SES are usually solvable, while challenges in deliverability once an email has been sent from SES require more due diligence and experimentation in your content to determine why your emails are being delivered to spam, or not at all.

Feel free to post any comments or questions for us in the AWS Forum.

References:
[1]https://docs.aws.amazon.com/ses/latest/dg/send-email-concepts-process.html
[2]https://docs.aws.amazon.com/ses/latest/dg/monitor-sending-activity.html
[3]https://docs.aws.amazon.com/ses/latest/dg/monitor-using-event-publishing.html#event-publishing-how-works

Harmonize data using AWS Glue and AWS Lake Formation FindMatches ML to build a customer 360 view

2023-06-26 Nishchai JM

Post Syndicated from Nishchai JM original https://aws.amazon.com/blogs/big-data/harmonize-data-using-aws-glue-and-aws-lake-formation-findmatches-ml-to-build-a-customer-360-view/

In today’s digital world, data is generated by a large number of disparate sources and growing at an exponential rate. Companies are faced with the daunting task of ingesting all this data, cleansing it, and using it to provide outstanding customer experience.

Typically, companies ingest data from multiple sources into their data lake to derive valuable insights from the data. These sources are often related but use different naming conventions, which will prolong cleansing, slowing down the data processing and analytics cycle. This problem particularly impacts companies trying to build accurate, unified customer 360 profiles. There are customer records in this data that are semantic duplicates, that is, they represent the same user entity, but have different labels or values. It’s commonly referred to as a data harmonization or deduplication problem. The underlying schemas were implemented independently and don’t adhere to common keys that can be used for joins to deduplicate records using deterministic techniques. This has led to so-called fuzzy deduplication techniques to address the problem. These techniques utilize various machine learning (ML) based approaches.

In this post, we look at how we can use AWS Glue and the AWS Lake Formation ML transform FindMatches to harmonize (deduplicate) customer data coming from different sources to get a complete customer profile to be able to provide better customer experience. We use Amazon Neptune to visualize the customer data before and after the merge and harmonization.

Overview of solution

In this post, we go through the various steps to apply ML-based fuzzy matching to harmonize customer data across two different datasets for auto and property insurance. These datasets are synthetically generated and represent a common problem for entity records stored in multiple, disparate data sources with their own lineage that appear similar and semantically represent the same entity but don’t have matching keys (or keys that work consistently) for deterministic, rule-based matching. The following diagram shows our solution architecture.

We use an AWS Glue job to transform the auto insurance and property insurance customer source data to create a merged dataset containing fields that are common to both datasets (identifiers) that a human expert (data steward) would use to determine semantic matches. The merged dataset is then used to deduplicate customer records using an AWS Glue ML transform to create a harmonized dataset. We use Neptune to visualize the customer data before and after the merge and harmonization to see how the transform FindMacthes can bring all related customer data together to get a complete customer 360 view.

To demonstrate the solution, we use two separate data sources: one for property insurance customers and another for auto insurance customers, as illustrated in the following diagram.

The data is stored in an Amazon Simple Storage Service (Amazon S3) bucket, labeled as Raw Property and Auto Insurance data in the following architecture diagram. The diagram also describes detailed steps to process the raw insurance data into harmonized insurance data to avoid duplicates and build logical relations with related property and auto insurance data for the same customer.

The workflow includes the following steps:

Catalog the raw property and auto insurance data, using an AWS Glue crawler, as tables in the AWS Glue Data Catalog.
Transform raw insurance data into CSV format acceptable to Neptune Bulk Loader, using an AWS Glue extract, transform, and load (ETL) job.
When the data is in CSV format, use an Amazon SageMaker Jupyter notebook to run a PySpark script to load the raw data into Neptune and visualize it in a Jupyter notebook.
Run an AWS Glue ETL job to merge the raw property and auto insurance data into one dataset and catalog the merged dataset. This dataset will have duplicates and no relations are built between the auto and property insurance data.
Create and train an AWS Glue ML transform to harmonize the merged data to remove duplicates and build relations between the related data.
Run the AWS Glue ML transform job. The job also catalogs the harmonized data in the Data Catalog and transforms the harmonized insurance data into CSV format acceptable to Neptune Bulk Loader.
When the data is in CSV format, use a Jupyter notebook to run a PySpark script to load the harmonized data into Neptune and visualize it in a Jupyter notebook.

Prerequisites

To follow along with this walkthrough, you must have an AWS account. Your account should have permission to provision and run an AWS CloudFormation script to deploy the AWS services mentioned in the architecture diagram of the solution.

Provision required resources using AWS CloudFormation:

To launch the CloudFormation stack that configures the required resources for this solution in your AWS account, complete the following steps:

Follow the prompts on the AWS CloudFormation console to create the stack.
When the launch is complete, navigate to the Outputs tab of the launched stack and note all the key-value pairs of the resources provisioned by the stack.

Verify the raw data and script files S3 bucket

On the CloudFormation stack’s Outputs tab, choose the value for S3BucketName. The S3 bucket name should be cloud360-s3bucketstack-xxxxxxxxxxxxxxxxxxxxxxxx and should contain folders similar to the following screenshot.

The following are some important folders:

auto_property_inputs – Contains raw auto and property data
merged_auto_property – Contains the merged data for auto and property insurance
output – Contains the delimited files (separate subdirectories)

Catalog the raw data

To help walk through the solution, the CloudFormation stack created and ran an AWS Glue crawler to catalog the property and auto insurance data. To learn more about creating and running AWS Glue crawlers, refer to Working with crawlers on the AWS Glue console. You should see the following tables created by the crawler in the c360_workshop_db AWS Glue database:

source_auto_address – Contains address data of customers with auto insurance
source_auto_customer – Contains auto insurance details of customers
source_auto_vehicles – Contains vehicle details of customers
source_property_addresses – Contains address data of customers with property insurance
source_property_customers – Contains property insurance details of customers

You can review the data using Amazon Athena. For more information about using Athena to query an AWS Glue table, refer to Running SQL queries using Amazon Athena. For example, you can run the following SQL query:

SELECT * FROM "c360_workshop_db"."source_auto_address" limit 10;

Convert the raw data into CSV files for Neptune

The CloudFormation stack created and ran the AWS Glue ETL job prep_neptune_data to convert the raw data into CSV format acceptable to Neptune Bulk Loader. To learn more about building an AWS Glue ETL job using AWS Glue Studio and to review the job created for this solution, refer to Creating ETL jobs with AWS Glue Studio.

Verify the completion of job run by navigating to the Runs tab and checking the status of most recent run.

Verify the CSV files created by the AWS Glue job in the S3 bucket under the output folder.

Load and visualize the raw data in Neptune

This section uses SageMaker Jupyter notebooks to load, query, explore, and visualize the raw property and auto insurance data in Neptune. Jupyter notebooks are web-based interactive platforms. We use Python scripts to analyze the data in a Jupyter notebook. A Jupyter notebook with the required Python scripts has already been provisioned by the CloudFormation stack.

Start Jupyter Notebook.
Choose the Neptune folder on the Files tab.

Under the Customer360 folder, open the notebook explore_raw_insurance_data.ipynb.

Run Steps 1–5 in the notebook to analyze and visualize the raw insurance data.

The rest of the instructions are inside the notebook itself. The following is a summary of the tasks for each step in the notebook:

Step 1: Retrieve Config – Run this cell to run the commands to connect to Neptune for Bulk Loader.
Step 2: Load Source Auto Data – Load the auto insurance data into Neptune as vertices and edges.
Step 3: Load Source Property Data – Load the property insurance data into Neptune as vertices and edges.
Step 4: UI Configuration – This block sets up the UI config and provides UI hints.
Step 5: Explore entire graph – The first block builds and displays a graph for all customers with more than four coverages of auto or property insurance policies. The second block displays the graph for four different records for a customer with the name James.

These are all records for the same customer, but because they’re not linked in any way, they appear as different customer records. The AWS Glue FindMatches ML transform job will identify these records as customer James, and the records provide complete visibility on all policies owned by James. The Neptune graph looks like the following example. The vertex covers represents the coverage of auto or property insurance by the owner (James in this case) and the vertex locatedAt represents the address of the property or vehicle that is covered by the owner’s insurance.

Merge the raw data and crawl the merged dataset

The CloudFormation stack created and ran the AWS Glue ETL job merge_auto_property to merge the raw property and auto insurance data into one dataset and catalog the resultant dataset in the Data Catalog. The AWS Glue ETL job does the following transforms on the raw data and merges the transformed data into one dataset:

Changes the following fields on the source table source_auto_customer:

1. Changes policyid to id and data type to string
2. Changes fname to first_name
3. Changes lname to last_name
4. Changes work to company
5. Changes dob to date_of_birth
6. Changes phone to home_phone
7. Drops the fields birthdate, priority, policysince, and createddate

Changes the following fields on the source_property_customers:

1. Changes customer_id to id and data type to string
2. Changes social to ssn
3. Drops the fields job, email, industry, city, state, zipcode, netnew, sales_rounded, sales_decimal, priority, and industry2

After converting the unique ID field in each table to string type and renaming it to id, the AWS Glue job appends the suffix -auto to all id fields in the source_auto_customer table and the suffix -property to all id fields in the source_propery_customer table before copying all the data from both tables into the merged_auto_property table.

Verify the new table created by the job in the Data Catalog and review the merged dataset using Athena using below Athena SQL query:

SELECT * FROM "c360_workshop_db"."merged_auto_property" limit 10

For more information about how to review the data in the merged_auto_property table, refer to Running SQL queries using Amazon Athena.

Create, teach, and tune the Lake Formation ML transform

The merged AWS Glue job created a Data Catalog called merged_auto_property. Preview the table in Athena Query Editor and download the dataset as a CSV from the Athena console. You can open the CSV file for quick comparison of duplicates.

The rows with IDs 11376-property and 11377-property are mostly same except for the last two digits of their SSN, but these are mostly human errors. The fuzzy matches are easy to spot by a human expert or data steward with domain knowledge of how this data was generated, cleansed, and processed in the various source systems. Although a human expert can identify those duplicates on a small dataset, it becomes tedious when dealing with thousands of records. The AWS Glue ML transform builds on this intuition and provides an easy-to-use ML-based algorithm to automatically apply this approach to large datasets efficiently.

Create the FindMatches ML transform

On the AWS Glue console, expand Data Integration and ETL in the navigation pane.
Under Data classification tools, choose Record Matching.

This will open the ML transforms page.

Choose Create transform.
For Name, enter c360-ml-transform.
For Existing IAM role, choose GlueServiceRoleLab.
For Worker type, choose G.2X (Recommended).
For Number of workers, enter 10.
For Glue version, choose as Spark 2.4 (Glue Version 2.0).
Keep the other values as default and choose Next.

For Database, choose c360_workshop_db.
For Table, choose merged_auto_property.
For Primary key, select id.
Choose Next.

In the Choose tuning options section, you can tune performance and cost metrics available for the ML transform. We stay with the default trade-offs for a balanced approach.

We have specified these values to achieve balanced results. If needed, you can adjust these values later by selecting the transform and using the Tune menu.

Review the values and choose Create ML transform.

The ML transform is now created with the status Needs training.

Teach the transform to identify the duplicates

In this step, we teach the transform by providing labeled examples of matching and non-matching records. You can create your labeling set yourself or allow AWS Glue to generate the labeling set based on heuristics. AWS Glue extracts records from your source data and suggests potential matching records. The file will contain approximately 100 data samples for you to work with.

On the AWS Glue console, navigate to the ML transforms page.
Select the transform c360-ml-transform and choose Train model.

Select I have labels and choose Browse S3 to upload labels from Amazon S3.

Two labeled files have been created for this example. We upload these files to teach the ML transform.

Navigate to the folder label in your S3 bucket, select the labeled file (Label-1-iteration.csv), and choose Choose. And Click “Upload labeling file from S3”.
A green banner appears for successful uploads.
Upload another label file (Label-2-iteration.csv) and select Append to my existing labels.
Wait for the successful upload, then choose Next.

Review the details in the Estimate quality metrics section and choose Close.

Verify that the ML transform status is Ready for use. Note that the label count is 200 because we successfully uploaded two labeled files to teach the transform. Now we can use it in an AWS Glue ETL job for fuzzy matching of the full dataset.

Before proceeding to the next steps, note the transform ID (tfm-xxxxxxx) for the created ML transform.

Harmonize the data, catalog the harmonized data, and convert the data into CSV files for Neptune

In this step, we run an AWS Glue ML transform job to find matches in the merged data. The job also catalogs the harmonized dataset in the Data Catalog and converts the merged [A1] dataset into CSV files for Neptune to show the relations in the matched records.

On the AWS Glue console, choose Jobs in the navigation pane.
Choose the job perform_ml_dedup.

On the job details page, expand Additional properties.
Under Job parameters, enter the transform ID you saved earlier and save the settings.

1. Choose Run and monitor the job status for completion.

Run the following query in Athena to review the data in the new table ml_matched_auto_property, created and cataloged by the AWS Glue job, and observe the results:

SELECT * FROM c360_workshop_db.ml_matched_auto_property WHERE first_name like 'Jam%' and last_name like 'Sanchez%';

The job has added a new column called match_id. If multiple records follow the match criteria, then all matching records have the same match_id.

Match IDs play a crucial role in data harmonization using Lake Formation FindMatches. Each row is assigned a unique integer match ID based on matching criteria such as first_name, last_name, SSN, or date_of_birth, as defined in the uploaded label file. For instance, match ID 25769803941 is assigned to all records that meet the match criteria, such as row 1, 2, 4, and 5 which share the same last_name, SSN, and date_of_birth. Consequently, the properties with ID 19801-property, 29801-auto, 19800-property, and 29800-auto all share the same match ID. It’s important to take note of the match ID because it will be utilized for Neptune Gremlin queries.

The output of the AWS Glue job also has created two files, master_vertex.csv and master_edge.csv, in the S3 bucket output/master_data. We use these files to load data into the Neptune database to find the relationship among different entities.

Load and visualize the harmonized data in Neptune

This section uses Jupyter notebooks to load, query, explore, and visualize the ML matched auto and property insurance data in Neptune. Complete the following steps:

Start Jupyter Notebook.
Choose the Neptune folder on the Files tab.
Under the Customer360 folder, choose the notebook. explore_harmonized_insurance_data.ipynb.
Run Steps 1–5 in the notebook to analyze and visualize the raw insurance data.

The rest of the instructions are inside the notebook itself. The following is a summary of the tasks for each step in the notebook:

Step 1. Retrieve Config – Run this cell to run the commands to connect to Neptune for Bulk Loader.
Step 2. Load Harmonized Customer Data – Load the final vertex and edge files into Neptune.
Step 3. Initialize Neptune node traversals – This block sets up the UI config and provides UI hints.
Step 4. Exploring Customer 360 graph – Replace the Match_id 25769803941 copied from the previous step into g.V('REPLACE_ME')( If its not replaced already ) and run the cell.

This displays the graph for four different records for a customer with first_name, and James and JamE are is now connected with the SameAs vertex. The Neptune graph helps connect different entities with match criteria; the AWS Glue FindMatches ML transform job has identified these records as customer James and the records show the Match_id is the same for them. The following diagram shows an example of the Neptune graph. The vertex covers represents the coverage of auto or property insurance by the owner (James in this case) and the vertex locatedAt represents the address of the property or vehicle that is covered by the owner’s insurance.

Clean up

To avoid incurring additional charges to your account, on the AWS CloudFormation console, select the stack that you provisioned as part of this post and delete it.

Conclusion

In this post, we showed how to use the AWS Lake Formation FindMatch transform for fuzzy matching data on a data lake to link records if there are no join keys and group records with similar match IDs. You can use Amazon Neptune to establish the relationship between records and visualize the connect graph for deriving insights.

We encourage you to explore our range of services and see how they can help you achieve your goals. For more data and analytics blog posts, check out AWS Blogs.

About the Authors

Nishchai JM is an Analytics Specialist Solutions Architect at Amazon Web services. He specializes in building Big-data applications and help customer to modernize their applications on Cloud. He thinks Data is new oil and spends most of his time in deriving insights out of the Data.

Varad Ram is Senior Solutions Architect in Amazon Web Services. He likes to help customers adopt to cloud technologies and is particularly interested in artificial intelligence. He believes deep learning will power future technology growth. In his spare time, he like to be outdoor with his daughter and son.

Narendra Gupta is a Specialist Solutions Architect at AWS, helping customers on their cloud journey with a focus on AWS analytics services. Outside of work, Narendra enjoys learning new technologies, watching movies, and visiting new places

Arun A K is a Big Data Solutions Architect with AWS. He works with customers to provide architectural guidance for running analytics solutions on the cloud. In his free time, Arun loves to enjoy quality time with his family

Solution overview

Amazon CodeGuru detection engine

Bug Fix Tracking and code fixes

Actionable recommendations and concrete remediation

Conclusion

Overview of Amazon MWAA environments

Solution Overview

Amazon MWAA public network access mode

Amazon MWAA private network access mode

Automation through infrastructure as code

Prerequisites

Integrate to a single existing Amazon MWAA environment

Integrate to multiple existing Amazon MWAA environments

Create new Amazon MWAA environments

Verify access

Clean up

Conclusion

About the Authors

Prerequisites

Solution overview and architecture

Deploy the solution

Step 1: Deploy the CloudFormation template

To deploy the CloudFormation template

Step 2: Manually run the first Step Functions workflow

Enable scheduled scanning

To enable the rule

Extend the solution

Conclusion

Using AWS Config and Security Hub for continuous security checks

Optimizing AWS Config for Security Hub

Set up AWS Config, optimized for Security Hub

Customizing for your use cases

Summary

Solution overview

Prerequisites

Configure OpenSearch Ingestion

Configure IAM permissions

Create the pipeline in OpenSearch Ingestion

Send logs to the OpenSearch Ingestion pipeline

Best practices for monitoring

Managing cost

Clean up

Conclusion

About the authors

Solution Overview

Code Migration to AWS Graviton

Application Deployment

Config 1 – A/B controlled topology

Config 2 – Karpenter Controlled Configuration

Conclusion

Solution Implementation Steps

Option 1: Deploy Datadog Connector App from AWS Serverless Repository

Option 2: Build and Deploy sample Datadog Connector App using AWS SAM Command Line Interface

Implementing multi-party approval: Personas and IAM roles

To create the IAM roles

Preventive security controls recommended for this solution

AWS Systems Manager configuration

Create the SSM document

To create the SSM document

Create the Change Manager templates

To create the change templates

Review and approve the Change Manager templates

To add the change template reviewer

Create the PAA and PAI with multi-party approval for Matter compliance

Create a change request for the PAA

To create a change request for the PAA

Multi-party approval: Approve the change request for the PAA

To approve the change request for the PAA

Validate that the PAA is created in AWS Private CA

To validate the creation of the PAA and retrieve its ARN

Create a change request for the PAI

To create a change request for the PAI

Multi-party approval: Approve the change request for the PAI

To approve the change request for the PAI

Demonstrate compliance with multi-party approval requirements of the Matter CP by using the change manager timeline

To retrieve the change manager timeline report

Conclusion

Overview of new features

The new consolidated controls view

Consolidating control findings between standards