How to deploy the AWS Solution for Security Hub Automated Response and Remediation

Post Syndicated from Ramesh Venkataraman original https://aws.amazon.com/blogs/security/how-to-deploy-the-aws-solution-for-security-hub-automated-response-and-remediation/

In this blog post I show you how to deploy the Amazon Web Services (AWS) Solution for Security Hub Automated Response and Remediation. The first installment of this series was about how to create playbooks using Amazon CloudWatch Events, AWS Lambda functions, and AWS Security Hub custom actions that you can run manually based on triggers from Security Hub in a specific account. That solution requires an analyst to directly trigger an action using Security Hub custom actions and doesn’t work for customers who want to set up fully automated remediation based on findings across one or more accounts from their Security Hub master account.

The solution described in this post automates the cross-account response and remediation lifecycle from executing the remediation action to resolving the findings in Security Hub and notifying users of the remediation via Amazon Simple Notification Service (Amazon SNS). You can also deploy these automated playbooks as custom actions in Security Hub, which allows analysts to run them on-demand against specific findings. You can deploy these remediations as custom actions or as fully automated remediations.

Currently, the solution includes 10 playbooks aligned to the controls in the Center for Internet Security (CIS) AWS Foundations Benchmark standard in Security Hub, but playbooks for other standards such as AWS Foundational Security Best Practices (FSBP) will be added in the future.

Solution overview

Figure 1 shows the flow of events in the solution described in the following text.

Figure 1: Flow of events

Detect

Security Hub gives you a comprehensive view of your security alerts and security posture across your AWS accounts and automatically detects deviations from defined security standards and best practices.

Security Hub also collects findings from various AWS services and supported third-party partner products to consolidate security detection data across your accounts.

Ingest

All of the findings from Security Hub are automatically sent to CloudWatch Events and Amazon EventBridge and you can set up CloudWatch Events and EventBridge rules to be invoked on specific findings. You can also send findings to CloudWatch Events and EventBridge on demand via Security Hub custom actions.
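For example, a rule that matches a specific Security Hub custom action and routes those findings to a remediation Lambda function might look like the following minimal sketch in Python; the custom action and function ARNs are placeholders, not the solution’s actual resources.

import json
import boto3

events = boto3.client("events")

# Placeholder ARNs; substitute the custom action and remediation function you deploy
CUSTOM_ACTION_ARN = "arn:aws:securityhub:us-east-1:111122223333:action/custom/CIS15"
REMEDIATION_LAMBDA_ARN = "arn:aws:lambda:us-east-1:111122223333:function:cis-1-5-remediation"

# Match the event Security Hub emits when an analyst invokes the custom action
event_pattern = {
    "source": ["aws.securityhub"],
    "detail-type": ["Security Hub Findings - Custom Action"],
    "resources": [CUSTOM_ACTION_ARN],
}

events.put_rule(
    Name="cis-1-5-custom-action-rule",
    EventPattern=json.dumps(event_pattern),
    State="ENABLED",
)

events.put_targets(
    Rule="cis-1-5-custom-action-rule",
    Targets=[{"Id": "remediation-lambda", "Arn": REMEDIATION_LAMBDA_ARN}],
)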

Remediate

The CloudWatch Event and EventBridge rules can have AWS Lambda functions, AWS Systems Manager automation documents, or AWS Step Functions workflows as the targets of the rules. This solution uses automation documents and Lambda functions as response and remediation playbooks. Using cross-account AWS Identity and Access Management (IAM) roles, the playbook performs the tasks to remediate the findings using the AWS API when a rule is invoked.
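The cross-account pattern looks roughly like the following minimal sketch in Python; the role name and the specific remediation call (an IAM password policy update) are illustrative assumptions, not the solution’s exact implementation.

import boto3

def remediate_in_member_account(member_account_id, role_name="RemediationRole"):
    # Assume a remediation role in the member account that owns the finding
    sts = boto3.client("sts")
    creds = sts.assume_role(
        RoleArn=f"arn:aws:iam::{member_account_id}:role/{role_name}",
        RoleSessionName="security-hub-remediation",
    )["Credentials"]

    # Call the remediating API with the temporary credentials
    iam = boto3.client(
        "iam",
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )
    # Example remediation for CIS 1.5: require at least one uppercase letter
    iam.update_account_password_policy(RequireUppercaseCharacters=True)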

Log

The playbook logs the results to the solution’s Amazon CloudWatch log group, sends a notification to an Amazon Simple Notification Service (Amazon SNS) topic, and updates the Security Hub finding. The finding notes maintain an audit trail of the actions taken and reflect the remediation performed, and the finding is updated as RESOLVED after the remediation is run.
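A playbook can close out a finding with the Security Hub BatchUpdateFindings API. The following minimal sketch sets the workflow status to RESOLVED and appends a note; the finding identifier and note text are placeholders.

import boto3

securityhub = boto3.client("securityhub")

securityhub.batch_update_findings(
    FindingIdentifiers=[{
        "Id": "arn:aws:securityhub:us-east-1:111122223333:subscription/cis-aws-foundations-benchmark/v/1.2.0/1.5/finding/example",
        "ProductArn": "arn:aws:securityhub:us-east-1::product/aws/securityhub",
    }],
    Workflow={"Status": "RESOLVED"},
    Note={"Text": "Password policy remediated by automated playbook", "UpdatedBy": "remediation-playbook"},
)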

Here are the steps to deploy the solution from this GitHub project.

  • In the Security Hub master account, you deploy the AWS CloudFormation template, which creates an AWS Service Catalog product along with some other resources. You can find the full set of deployed resources in the Resources section of the deployed AWS CloudFormation stack. The solution uses AWS Service Catalog to make the remediations available as a product that users can deploy after they’re granted the required permissions to launch it.
  • Add an IAM role that has administrator access to the AWS Service Catalog portfolio.
  • Deploy the CIS playbook from the AWS Service Catalog product list using the IAM role you added in the previous step.
  • Deploy the AWS Security Hub Automated Response and Remediation template in the master account in addition to the member accounts. This template establishes AssumeRole permissions to allow the playbook Lambda functions to perform remediations. Use AWS CloudFormation StackSets in the master account to have a centralized deployment approach across the master account and multiple member accounts.

Deployment steps for automated response and remediation

This section reviews the steps to implement the solution, including screenshots of the solution launched from an AWS account.

Launch AWS CloudFormation stack on the master account

As part of this AWS CloudFormation stack deployment, you create custom actions to configure Security Hub to send findings to CloudWatch Events. Lambda functions are used to provide remediation in response to actions sent to CloudWatch Events.

Note: In this solution, you create custom actions for the CIS standards. There will be more custom actions added for other security standards in the future.

To launch the AWS CloudFormation stack

  1. Deploy the AWS CloudFormation template in the Security Hub master account. In the AWS Management Console, select CloudFormation, choose Create new stack, and enter the S3 URL of the template.
  2. Select Next to move to the Specify stack details tab, and then enter a Stack name as shown in Figure 2. In this example, I named the stack SO0111-SHARR, but you can use any name you want.
     
    Figure 2: Creating a CloudFormation stack

  3. When you create the stack, it launches automatically and creates 21 new resources, as shown in Figure 3.
     
    Figure 3: Resources launched with AWS CloudFormation

  4. An Amazon SNS topic is automatically created from the AWS CloudFormation stack.
  5. When you create a subscription to that topic, you’re prompted to enter an endpoint for receiving email notifications from Amazon SNS, as shown in Figure 4. To complete the subscription, confirm it from the email address you entered.
     
    Figure 4: Subscribing to Amazon SNS topic

Enable Security Hub

You should already have enabled Security Hub and AWS Config services on your master account and the associated member accounts. If you haven’t, you can refer to the documentation for setting up Security Hub on your master and member accounts. Figure 5 shows an AWS account that doesn’t have Security Hub enabled.
 

Figure 5: Enabling Security Hub for first time

AWS Service Catalog product deployment

In this section, you use the AWS Service Catalog to deploy Service Catalog products.

To use the AWS Service Catalog for product deployment

  1. In the same master account, add roles that have administrator access and can deploy AWS Service Catalog products. To do this, from Services in the AWS Management Console, choose AWS Service Catalog. In AWS Service Catalog, select Administration, and then navigate to Portfolio details and select Groups, roles, and users as shown in Figure 6.
     
    Figure 6: AWS Service Catalog product

  2. After you add the role, you can see the products available to it. Switch roles in the console to assume the role you granted access to, then select the three dots near the product name and select Launch product, as shown in Figure 7.
     
    Figure 7: Launch the product

  3. When you launch the product, you can use the parameters to enable or disable automated remediation. Even if you don’t enable fully automated remediation, you can still invoke a remediation action in the Security Hub console by using a custom action. By default, automated remediation is disabled, as highlighted in Figure 8.
     
    Figure 8: Enable or disable automated remediation

  4. After you launch the product, it can take 3 to 5 minutes to deploy. When the product is deployed, it creates a new CloudFormation stack with a status of CREATE_COMPLETE as part of the provisioned product in the AWS CloudFormation console.

AssumeRole Lambda functions

Deploy the template that establishes AssumeRole permissions to allow the playbook Lambda functions to perform remediations. You must deploy this template in the master account in addition to any member accounts. Choose CloudFormation and create a new stack. In Specify stack details, go to Parameters and specify the Master account number as shown in Figure 9.
 

Figure 9: Deploy AssumeRole Lambda function

Test the automated remediation

Now that you’ve completed the steps to deploy the solution, you can test it to be sure that it works as expected.

To test the automated remediation

  1. Verify that there are 10 actions listed on the Custom actions tab in the Security Hub master account. From the master account, open the Security Hub console, select Settings, and then select Custom actions. You should see 10 actions, as shown in Figure 10.
     
    Figure 10: Custom actions deployed

  2. Make sure you have member accounts available for testing the solution. If not, you can add member accounts to the master account as described in Adding and inviting member accounts.
  3. For testing purposes, you can use the CIS 1.5 control, which requires that the IAM password policy require at least one uppercase letter. Check the existing settings by navigating to IAM, and then to Account settings. Under Password policy, you should see that no password policy is set, as shown in Figure 11.
     
    Figure 11: Password policy not set

  4. To check the security settings, go to the Security Hub console and select Security standards. Choose CIS AWS Foundations Benchmark v1.2.0. Select CIS 1.5 from the list to see the Findings. You will see the Status as Failed. This means that the password policy to require at least one uppercase letter hasn’t been applied to either the master or the member account, as shown in Figure 12.
     
    Figure 12: CIS 1.5 finding

  5. From the Findings section in the previous step, open the Actions dropdown at the top right and select CIS 1.5 – 1.11. You should see a notification with the heading Successfully sent findings to Amazon CloudWatch Events, as shown in Figure 13.
     
    Figure 13: Sending findings to CloudWatch Events

  6. Return to Findings by selecting Security standards and then choosing CIS AWS Foundations Benchmark v1.2.0. Select CIS 1.5 to review Findings and verify that the Workflow status of CIS 1.5 is RESOLVED, as shown in Figure 14.
     
    Figure 14: Resolved findings

  7. After the remediation runs, verify that the password policy is set on the master and member accounts: navigate to IAM, and then to Account settings. Under Password policy, you should see that the account now uses a password policy, as shown in Figure 15. (An API-based check is sketched after this list.)
     
    Figure 15: Password policy set

  8. To check the CloudWatch logs for the Lambda function, go to Services in the console, select Lambda, choose the function, and then select View logs in CloudWatch. You can see the details of the function run, including the password policy update on both the master account and the member account, as shown in Figure 16.
     
    Figure 16: Lambda function log
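As an alternative to the console check in step 7, the following minimal sketch reads the account password policy through the IAM API; run it with credentials for the master account and again for each member account.

import boto3

iam = boto3.client("iam")

# Raises NoSuchEntityException if no password policy is set
policy = iam.get_account_password_policy()["PasswordPolicy"]
print("RequireUppercaseCharacters:", policy["RequireUppercaseCharacters"])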

Conclusion

In this post, you deployed the AWS Solution for Security Hub Automated Response and Remediation using Lambda and CloudWatch Events rules to remediate non-compliant CIS-related controls. With this solution, you can ensure that users in member accounts stay compliant with the CIS AWS Foundations Benchmark by automatically invoking guardrails whenever services move out of compliance. New or updated playbooks will be added to the existing AWS Service Catalog portfolio as they’re developed. You can choose when to take advantage of these new or updated playbooks.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Security Hub forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Ramesh Venkataraman

Ramesh is a Solutions Architect who enjoys working with customers to solve their technical challenges using AWS services. Outside of work, Ramesh enjoys following Stack Overflow questions and answering them in any way he can.

Multi-Region Replication Now Enabled for AWS Managed Microsoft Active Directory

Post Syndicated from Martin Beeby original https://aws.amazon.com/blogs/aws/multi-region-replication-now-enabled-for-aws-managed-microsoft-active-directory/

Our customers build applications that need to serve users that live in all corners of the world. When listening to our customers, we heard that whilst they were comfortable building Active Directory (AD) aware applications on AWS, making them work globally can be a real challenge.

Customers told us that AWS Directory Service for Microsoft Active Directory had saved them time and money and provided them with all the capabilities they need to run their AD-aware applications. However, if they wanted to go global, they needed to create independent AWS Managed Microsoft AD directories per Region. They would then need to create a solution to synchronize data across each Region. This level of management overhead is significant, complex, and costly. It also slowed customers as they sought to migrate their AD-aware workloads to the cloud.

Today, I want to tell you about a new feature that allows customers to deploy a single AWS Managed Microsoft AD across multiple AWS Regions. This new feature called multi-region replication automatically configures inter-region networking connectivity, deploys domain controllers, and replicates all the Active Directory data across multiple Regions, ensuring that Windows and Linux workloads residing in those Regions can connect to and use AWS Managed Microsoft AD with low latency and high performance. AWS Managed Microsoft AD makes it more cost-effective for customers to migrate AD-aware applications and workloads to AWS and easier to operate them globally. In addition, automated multi-region replication provides multi-region resiliency.

AWS can now synchronize all customer directory data, including users, groups, Group Policy Objects (GPOs), and schema across multiple Regions. AWS handles automated software updates, monitoring, recovery, and the security of the underlying AD infrastructure across all Regions, enabling customers to focus on building their applications. Integrating with Amazon CloudWatch Logs and Amazon Simple Notification Service (SNS), AWS Managed Microsoft AD makes it easy for customers to monitor the directory’s health, and security logs globally.

How It Works 
Let me show you how to create an Active Directory that spans multiple Regions using the AWS Managed Microsoft AD console. You do not have to create a new directory to use multi-region replication; it works on all your existing directories too.

First, I create a new Directory following the normal steps. I select Enterprise Edition since this is the only edition that supports multi-region replication.

I give my Directory a name and a description and then set an Admin password. I then click Next which takes me to the Networking setup.

I select an Amazon Virtual Private Cloud that I use for demos and then choose two subnets that are in separate Availability Zones. AWS Managed Microsoft AD deploys two domain controllers per Region and places them in separate subnets in different Availability Zones. This is done for resiliency, so that the directory can still operate even if one of the Availability Zones has issues.

Once I click next, I am presented with the review screen and I click Create Directory.

The directory takes between 20 and 45 minutes to be created. There is now a Multi-Region column on the Directories listing page; for this directory, the value is currently set to No, indicating that it does not yet span multiple Regions.

Once the directory has been created, I click on the Directory ID and drill into the details. I now have a new section called Multi-Region replication and there is a button called Add Region. If I click this button I can then configure an additional Region.

I select the Region that I want to add to my directory, in this example US West (Oregon) us-west-2. I then select a VPC in that Region and two subnets that must reside in separate Availability Zones. Finally, I click the Add button to add this new Region to my directory. (The same step can be scripted through the API, as sketched below.)
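If you prefer to script this step, the Directory Service AddRegion API does the same thing. Here is a minimal sketch in Python with placeholder directory, VPC, and subnet IDs; the two subnets must be in different Availability Zones in the target Region.

import boto3

# Call the API in the directory's primary Region
ds = boto3.client("ds", region_name="us-east-1")

ds.add_region(
    DirectoryId="d-1234567890",   # placeholder directory ID
    RegionName="us-west-2",       # Region to add
    VPCSettings={
        "VpcId": "vpc-0abc123def4567890",                      # placeholder VPC in us-west-2
        "SubnetIds": ["subnet-0aaa1111", "subnet-0bbb2222"],   # subnets in separate AZs
    },
)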

Back on the directory details page, I see that there are two Regions listed: one in US East (N. Virginia) and one in US West (Oregon). Again, the creation process can take up to 45 minutes, but once it has completed, my directory is replicated across two Regions.

Costs
You pay by the hour for the domain controllers in each Region, plus cross-Region data transfer. It’s important to understand that this feature creates two domain controllers in each Region that you add, so applications that reside in those Regions can communicate with a local directory, which lowers costs by minimizing the need for data transfer. To learn more, visit the pricing page.

Available Now
This new feature can be used today and is available for both new and existing directories that use the Enterprise Edition in any of the following Regions: US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), AWS GovCloud (US-East), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Paris), Europe (Stockholm), and South America (São Paulo).

Head over to the product page to learn more, view pricing, and get started creating directories that span multiple AWS Regions.

Happy Administering

— Martin

Simplifying cross-account access with Amazon EventBridge resource policies

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/simplifying-cross-account-access-with-amazon-eventbridge-resource-policies/

This post is courtesy of Stephen Liedig, Sr Serverless Specialist SA.

Amazon EventBridge is a serverless event bus used to decouple event producers and consumers. Event producers publish events onto an event bus, which then uses rules to determine where to send those events. The rules determine the targets and EventBridge routes the events accordingly.

A common architectural approach adopted by customers is to isolate these application components or services by using separate AWS accounts. This “account-per-service” strategy limits the blast radius by providing a logical and physical separation of resources. It provides additional security boundaries and allows customers to easily track service costs without having to adopt a complex tagging strategy.

To enable the flow of events from one account to another, you must create a rule on one event bus that routes events to an event bus in another account. To enable this routing, you need to configure the resource policy for your event buses.

This blog post shows you how to use EventBridge resource policies to publish events and create rules on event buses in another account.

Overview

Today, EventBridge launches improvements to resource policies that make it easier to build applications that work across accounts. The service expands the use of the policy associated with an event bus to the authorization of API calls.

This means you can manage permissions for API calls that interact with the event bus, such as PutEvents, PutRule, and PutTargets, directly from that event bus’s resource policy. This replaces the need to create different IAM roles that are assumed by each account that interacts with the event bus. It also provides a central resource to manage your permissions.

There is support for organizations and tags via IAM conditions. Now when you call an API, it considers both the user’s IAM policy and the event bus resource policy in the authorization process.

EventBridge APIs that accept an event bus name parameter (including PutRule, PutTargets, DeleteRule, RemoveTargets, DisableRule, and EnableRule) now also support an event bus ARN. This allows you to target cross-account event buses through the APIs. For example, you can call PutRule to create a rule on an event bus in another account, without needing to assume a role.
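For example, from account C you can create a rule on the central event bus in account B with a sketch like the following, provided account B’s bus policy allows it; the account placeholders match those used elsewhere in this post.

import json
import boto3

events = boto3.client("events")

CENTRAL_BUS_ARN = "arn:aws:events:us-east-1:[ACCOUNT-B]:event-bus/central-event-bus"

events.put_rule(
    Name="InvoiceProcessingNewOrderCreatedSubscription",
    EventBusName=CENTRAL_BUS_ARN,  # an event bus ARN, not just a name
    EventPattern=json.dumps({
        "source": ["com.exampleCorp.webStore"],
        "detail-type": ["newOrderCreated"],
    }),
    State="ENABLED",
)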

EventBridge now supports using policy conditions for the following authorization context keys in the APIs, to help scope down permissions.

  • events:detail-type (APIs: PutEvents): Used to restrict PutEvents calls for events with a specific “detail-type” field.
  • events:source (APIs: PutEvents): Used to restrict PutEvents calls for events with a specific “source” field.
  • events:creatorAccount (APIs: PutRule, PutTargets, DeleteRule, RemoveTargets, DisableRule, EnableRule, TagResource, UntagResource, DescribeRule, ListTargetsByRule, ListTagsForResource): Used to restrict control plane API calls on rules belonging to a certain account. This can be used to allow a customer to edit or disable only rules created by their own account.
  • events:eventBusInvocation (APIs: PutEvents): Used to differentiate a PutEvents API call from a cross-account event bus target invocation. This context key is set to true during a cross-account event bus target invocation authorization, for example, when a rule matches an event and sends that event to another event bus. For a direct PutEvents API call, this context key is set to false.

Ecommerce example walkthrough

In this ecommerce example, there are multiple services distributed across different accounts. A web store publishes an event when a new order is created. The event is sent via a central event bus, which is in another account. The bus has two rules with target services in different AWS accounts.

Walkthrough architecture

The goal is to create fine-grained permissions that only allow:

  • The web store to publish events for a specific detail-type and source.
  • The invoice processing service to create and manage its own rules on the central bus.

To complete this walkthrough, you set up three accounts. For account A (Web Store), you deploy an AWS Lambda function that sends the “newOrderCreated” event directly to the “central event bus” in account B. The invoice processing Lambda function in account C creates a rule on the central event bus to process the event published by account A.

Create the central event bus in account B

Account B event bus

Create the central event bus in account B, adding the following resource policy. Be sure to substitute your account numbers for accounts A, B, and C.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "WebStoreCrossAccountPublish",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::[ACCOUNT-A]:root"
      },
      "Action": "events:PutEvents",
      "Resource": "arn:aws:events:us-east-1:[ACCOUNT-B]:event-bus/central-event-bus",
      "Condition": {
        "StringEquals": {
          "events:detail-type": "newOrderCreated",
          "events:source": "com.exampleCorp.webStore"
        }
      }
    },
    {
      "Sid": "InvoiceProcessingRuleCreation",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::[ACCOUNT-C]:root"
      },
      "Action": [
        "events:PutRule",
        "events:DeleteRule",
        "events:DescribeRule",
        "events:DisableRule",
        "events:EnableRule",
        "events:PutTargets",
        "events:RemoveTargets"
      ],
      "Resource": "arn:aws:events:us-east-1:[ACCOUNT-B]:rule/central-event-bus/*",
      "Condition": {
        "StringEqualsIfExists": {
          "events:creatorAccount": "${aws:PrincipalAccount}",
          "events:source": "com.exampleCorp.webStore"
        }
      }
    }
  ]
}

Create event bus

There are two statements in the resource policy: WebStoreCrossAccountPublish and InvoiceProcessingRuleCreation.

The WebStoreCrossAccountPublish statement allows the Lambda function in account A to publish events directly to the central event bus. There are two conditions in the statement that further restrict the types of events that can be sent to the event bus. The first restricts the event detail-type to equal “newOrderCreated” and the second condition requires that the event source equals “com.exampleCorp.webStore”.

The InvoiceProcessingRuleCreation statement allows the invoice processing function in account C to describe, add, update, enable, disable, and delete any rules created by account C. You apply this restriction by using the events:creatorAccount context key in the statement’s condition.

Importantly, you should set the StringEqualsIfExists type for the events:creatorAccount condition. If you use StringEquals, it results in an AccessDeniedException. AWS CloudFormation calls DescribeRule to check if the rule already exists. However, as this is a new rule, and because you set a condition for events:creatorAccount for DescribeRule, this key is not populated and CloudFormation receives an AccessDeniedException instead of ResourceNotFoundException.

Here is how you create the event bus using AWS CloudFormation:

  CentralEventBus: 
      Type: AWS::Events::EventBus
      Properties: 
          Name: !Ref EventBusName

  WebStoreCrossAccountPublishStatement: 
      Type: AWS::Events::EventBusPolicy
      Properties: 
          EventBusName: !Ref CentralEventBus
          StatementId: "WebStoreCrossAccountPublish"
          Statement: 
              Effect: "Allow"
              Principal: 
                  AWS: !Sub arn:aws:iam::${AccountA}:root
              Action: "events:PutEvents"
              Resource: !GetAtt CentralEventBus.Arn
              Condition:
                  StringEquals:
                      "events:detail-type": "newOrderCreated"
                      "events:source" : "com.exampleCorp.webStore"
                      
  InvoiceProcessingRuleCreationStatement: 
      Type: AWS::Events::EventBusPolicy
      Properties: 
          EventBusName: !Ref CentralEventBus
          StatementId: "InvoiceProcessingRuleCreation"
          Statement: 
              Effect: "Allow"
              Principal: 
                  AWS: !Sub arn:aws:iam::${AccountC}:root
              Action: 
                  - "events:PutRule"
                  - "events:DeleteRule"
                  - "events:DescribeRule"
                  - "events:DisableRule"
                  - "events:EnableRule"
                  - "events:PutTargets"
                  - "events:RemoveTargets"
              Resource: 
                  - !Sub arn:aws:events:${AWS::Region}:${AWS::AccountId}:rule/${CentralEventBus.Name}/*
              Condition:
                  StringEqualsIfExists:
                      "events:creatorAccount" : "${aws:PrincipalAccount}"
                  StringEquals:
                      "events:source": "com.exampleCorp.webStore"

Now that you have a policy set up on the central event bus, configure the client applications to send and process events. The client application must also have permissions configured.

Create the web store order function in account A

Web Store order function

In account A, create a Lambda function to send the event to the central bus in account B. Set the EventBusName parameter to the central event bus ARN on the PutEvents API call. This allows you to target cross-account event buses directly.

import json
import boto3

EVENT_BUS_ARN = 'arn:aws:events:us-east-1:[ACCOUNT-B]:event-bus/central-event-bus'

# Create EventBridge client
events = boto3.client('events')

def lambda_handler(event, context):

  # new order created event detail
  eventDetail  = {
    "orderNo": "123",
    "orderDate": "2020-09-09T22:01:02Z",
    "customerId": "789",
    "lineItems": {
      "productCode": "P1",
      "quantityOrdered": 3,
      "unitPrice": 23.5,
      "currency": "USD"
    }
  }
  
  try:
    # Put an event
    response = events.put_events(
        Entries=[
            {
                'EventBusName': EVENT_BUS_ARN,
                'Source': 'com.exampleCorp.webStore',
                'DetailType': 'newOrderCreated',
                'Detail': json.dumps(eventDetail)
            }
        ]
    )
    
    print(response['Entries'])
    print('Event sent to the event bus ' + EVENT_BUS_ARN )
    print('EventID is ' + response['Entries'][0]['EventId'])
    
  except Exception as e:
      print(e)

Create the Invoice Processing service in account C

Invoice processing service in account C

Next, create the invoice processing function that processes the newOrderCreated event. You use the AWS Serverless Application Model (AWS SAM) to create the invoice processing function and other application resources. Before you can process any events from the central event bus, you must create a new event bus in account C to receive incoming events.

Next, you define the function that processes the events. Here, you define a Lambda event source that is an EventBridge rule. You set the EventBusName to the receiving invoice processing event bus. When this Lambda function is deployed, AWS SAM creates the rule on the event bus with the specified pattern and target. It configures the event source that triggers the function when an event is received.

Parameters:
  EventBusName:
    Description: Name of the central event bus
    Type: String
    Default: invoice-processing-event-bus
  CentralEventBusArn:
    Description: The ARN of the central event bus # e.g. arn:aws:events:us-east-1:[ACCOUNT-B]:event-bus/central-event-bus
    Type: String
Resources:
  # This is the receiving invoice processing event bus in account C.
  InvoiceProcessingEventBus: 
    Type: AWS::Events::EventBus
    Properties: 
        Name: !Ref EventBusName
# AWS Lambda function processes the newOrderCreated event
  InvoiceProcessingFunction:
    Type: AWS::Serverless::Function 
    Properties:
      CodeUri: invoice_processing
      Handler: invoice_processing_function/app.lambda_handler
      Runtime: python3.8
      Events:
        NewOrderCreatedRule:
          Type: EventBridgeRule
          Properties:
            EventBusName: !Ref InvoiceProcessingEventBus
            Pattern:
              source:
                - com.exampleCorp.webStore
              detail-type:
                - newOrderCreated

The next resource in the AWS SAM template is the rule that creates the target on the central event bus. It sends events to the invoice processing event bus. Though the rule is added to the central event bus, its definition is managed by the invoice processing service template. The rule definition sets the EventBusName parameter to the ARN of the central event bus.

  # This is the rule that the invoice processing service creates on the central event bus
  InvoiceProcessingRule:
    Type: AWS::Events::Rule
    Properties:
      Name: InvoiceProcessingNewOrderCreatedSubscription
      Description: Cross account rule created by Invoice Processing service
      EventBusName: !Ref CentralEventBusArn # ARN of the central event bus
      EventPattern:
        source:
          - com.exampleCorp.webStore
        detail-type:
          - newOrderCreated
      State: ENABLED
      Targets: 
        - Id: SendEventsToInvoiceProcessingEventBus
          Arn: !GetAtt InvoiceProcessingEventBus.Arn
          RoleArn: !GetAtt CentralEventBusToInvoiceProcessingEventBusRole.Arn
          DeadLetterConfig:
            Arn: !GetAtt InvoiceProcessingTargetDLQ.Arn

For the central event bus target to send the event to the invoice processing event bus in account C, EventBridge needs the necessary permissions to invoke the PutEvents API. The CentralEventBusToInvoiceProcessingEventBusRole IAM role provides that permission. It is assumed by the central event bus in account B when it needs to send events to the invoice processing event bus, without you having to create an additional resource policy on the invoice processing event bus.

  CentralEventBusToInvoiceProcessingEventBusRole:
    Type: 'AWS::IAM::Role'
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service:
              - events.amazonaws.com
            Action:
              - 'sts:AssumeRole'
      Path: /
      Policies:
        - PolicyName: PutEventsOnInvoiceProcessingEventBus
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action: 'events:PutEvents'
                Resource: !GetAtt InvoiceProcessingEventBus.Arn

You can also set up a dead-letter queue (DLQ) configuration for the rule in account C. This allows the subscriber of the event to control where events that fail to get delivered to the invoice processing event bus are sent. All you need to do is create an Amazon SQS queue in account C, and a queue policy that allows EventBridge to send failed events from account B to the queue.

  # Invoice Processing Target Dead Letter Queue 
  InvoiceProcessingTargetDLQ:
    Type: AWS::SQS::Queue

  # SQS resource policy required to allow target on central bus to send failed messages to target DLQ
  InvoiceProcessingTargetDLQPolicy: 
    Type: AWS::SQS::QueuePolicy
    Properties: 
      Queues: 
        - !Ref InvoiceProcessingTargetDLQ
      PolicyDocument: 
        Statement: 
          - Action: 
              - "SQS:SendMessage" 
            Effect: "Allow"
            Resource: !GetAtt InvoiceProcessingTargetDLQ.Arn
            Principal:  
              Service: "events.amazonaws.com"
            Condition:
              ArnEquals:
                "aws:SourceArn": !GetAtt InvoiceProcessingRule.Arn 

Conclusion

This post shows you how to use the new Amazon EventBridge resource policy features that make it easier to build applications that work across accounts. Resource policies provide you with a powerful mechanism for modeling your event buses across multiple accounts, and give you fine-grained control over EventBridge API invocations.

Download the code in this blog from https://github.com/aws-samples/amazon-eventbridge-resource-policy-samples.

For more serverless learning resources, visit Serverless Land.

Improving Performance and Search Rankings with Cloudflare for Fun and Profit

Post Syndicated from Rustam Lalkaka original https://blog.cloudflare.com/improving-performance-and-search-rankings-with-cloudflare-for-fun-and-profit/


Making things fast is one of the things we do at Cloudflare. More responsive websites, apps, APIs, and networks directly translate into improved conversion and user experience. Today, Google announced that Google Search will directly take web performance and page experience data into account when ranking results on their search engine results pages (SERPs), beginning in May 2021.

Specifically, Google Search will prioritize results based on how pages score on Core Web Vitals, a measurement methodology that Cloudflare has worked closely with Google to establish and that we have implemented support for in our analytics tools.

Source: “Search Page Experience Graphic” by Google is licensed under CC BY 4.0

The Core Web Vitals metrics are Largest Contentful Paint (LCP, a loading measurement), First Input Delay (FID, a measure of interactivity), and Cumulative Layout Shift (CLS, a measure of visual stability). Each one is directly associated with user perceptible page experience milestones. All three can be improved using our performance products, and all three can be measured with our Cloudflare Browser Insights product, and soon, with our free privacy-aware Cloudflare Web Analytics.

SEO experts have always suspected that faster pages lead to better search ranking. With today’s announcement from Google, we can say with confidence that Cloudflare helps you achieve the web performance trifecta: our product suite makes your site faster, gives you direct visibility into how it is performing (so you can use that data to iteratively improve), and directly drives improved search ranking and business results.

“Google providing more transparency about how Search ranking works is great for the open Web. The fact they are ranking using real metrics that are easy to measure with tools like Cloudflare’s analytics suite makes today’s announcement all the more exciting. Cloudflare offers a full set of tools to make sites incredibly fast and measure ‘incredibly’ directly.”

Matt Weinberg, president of Happy Cog, a full-service digital agency.

Cloudflare helps make your site faster

Cloudflare offers a diverse, easy to deploy set of products to improve page experience for your visitors. We offer a rich, configurable set of tools to improve page speed, which this post is too small to contain. Unlike Fermat, who once famously described a math problem and then said “the margin is too small to contain the solution”, and then let folks spend three hundred plus years trying to figure out his enigma, I’m going to tell you how to solve web performance problems with Cloudflare. Here are the highlights:

Caching and Smart Routing

The typical website is composed of a mix of static assets, like images and product descriptions, and dynamic content, like the contents of a shopping cart or a user’s profile page. Cloudflare caches customers’ static content at our edge, avoiding the need for a full roundtrip to origin servers each time content is requested. Because our edge network places content very close (in physical terms) to users, there is less distance to travel and page loads are consequently faster. Thanks, Einstein.

And Argo Smart Routing helps speed page loads that require dynamic content. It analyzes and optimizes routing decisions across the global Internet in real-time. Think Waze, the automobile route optimization app, but for Internet traffic.

Just as Waze can tell you which route to take when driving by monitoring which roads are congested or blocked, Smart Routing can route connections across the Internet efficiently by avoiding packet loss, congestion, and outages.

Using caching and Smart Routing directly improves page speed and experience scores like Web Vitals. With today’s announcement from Google, this also means improved search ranking.

Content optimization

Caching and Smart Routing are designed to reduce and speed up round trips from your users to your origin servers, respectively. Cloudflare also offers features to optimize the content we do serve.

Cloudflare Image Resizing allows on-demand sizing, quality, and format adjustments to images, including the ability to convert images to modern file formats like WebP and AVIF.

Delivering images this way to your end-users helps you save bandwidth costs and improve performance, since Cloudflare allows you to optimize images already cached at the edge.

For WordPress operators, we recently launched Automatic Platform Optimization (APO). With APO, Cloudflare will serve your entire site from our edge network, ensuring that customers see improved performance when visiting your site. By default, Cloudflare only caches static content, but with APO we can also cache dynamic content (like HTML) so the entire site is served from cache. This removes round trips from the origin drastically improving TTFB and other site performance metrics. In addition to caching dynamic content, APO caches third party scripts to further reduce the need to make requests that leave Cloudflare’s edge network.

Workers and Workers Sites

Reducing load on customer origins and making sure we serve the right content to the right clients at the right time are great, but what if customers want to take things a step further and eliminate origin round trips entirely? What if there was no origin? Before we get into Schrödinger’s cat/server territory, we can make this concrete: Cloudflare offers tools to serve entire websites from our edge, without an origin server being involved at all. For more on Workers Sites, check out our introductory blog post and peruse our Built With Workers project gallery.

As big proponents of dogfooding, many of Cloudflare’s own web properties are deployed to Workers Sites, and we use Web Vitals to measure our customers’ experiences.

Using Workers Sites, our developers.cloudflare.com site, which gets hundreds of thousands of visits a day and is critical to developers building atop our platform, is able to attain incredible Web Vitals scores:


These scores are superb, showing the performance and ease of use of our edge, our static website delivery system, and our analytics toolchain.

Cloudflare Web Analytics and Browser Insights directly measure the signals Google is prioritizing

As illustrated above, Cloudflare makes it easy to directly measure Web Vitals with Browser Insights. Enabling Browser Insights for websites proxied by Cloudflare takes one click in the Speed tab of the Cloudflare dashboard. And if you’re not proxying sites through Cloudflare, Web Vitals measurements will be supported in our upcoming, free, Cloudflare Web Analytics product that any site, using Cloudflare’s proxy or not, can use.

Web Vitals breaks down user experience into three components:

  • Loading: How long did it take for content to become available?
  • Interactivity: How responsive is the website when you interact with it?
  • Visual stability: How much does the page move around while loading?
This image is reproduced from work created and shared by Google and used according to terms described in the Creative Commons 4.0 Attribution License.

It’s challenging to create a single metric that captures these high-level components. Thankfully, the folks on the Google Chrome team have thought about this, and earlier this year they introduced three “Core” Web Vitals metrics: Largest Contentful Paint, First Input Delay, and Cumulative Layout Shift.

Cloudflare Browser Insights measures all three metrics directly in your users’ browsers, all with one-click enablement from the Cloudflare dashboard.

Once enabled, Browser Insights works by inserting a JavaScript “beacon” into HTML pages. You can control where the beacon loads if you only want to measure specific pages or hostnames. If you’re using CSP version 3, we’ll even automatically detect the nonce (if present) and add it to the script.

To start using Browser Insights, just head over to the Speed tab in the dashboard.

An example Browser Insights report, showing what pages on blog.cloudflare.com need improvement.

Making pages fast is better for everyone

Google’s announcement today, that Web Vitals measurements will be a key part of search ranking starting in May 2021, places even more emphasis on running fast, accessible websites.

Using Cloudflare’s performance tools, like our best-of-breed caching, Argo Smart Routing, content optimization, and Cloudflare Workers® products, directly improves page experience and Core Web Vitals measurements, and now, very directly, where your pages appear in Google Search results. And you don’t have to take our word for this — our analytics tools directly measure Web Vitals scores, instrumenting your real users’ experiences.

We’re excited to help our customers build fast websites, understand exactly how fast they are, and rank highly on Google search as a result. Render on!

Set up centralized monitoring for DDoS events and auto-remediate noncompliant resources

Post Syndicated from Fola Bolodeoku original https://aws.amazon.com/blogs/security/set-up-centralized-monitoring-for-ddos-events-and-auto-remediate-noncompliant-resources/

When you build applications on Amazon Web Services (AWS), it’s a common security practice to isolate production resources from non-production resources by logically grouping them into functional units or organizational units. There are many benefits to this approach, such as making it easier to implement the principle of least privilege, or reducing the scope of adversely impactful activities that may occur in non-production environments. After building these applications, setting up monitoring for resource compliance and security risks, such as distributed denial of service (DDoS) attacks across your AWS accounts, is just as important. The recommended best practice to perform this type of monitoring involves using AWS Shield Advanced with AWS Firewall Manager, and integrating these with AWS Security Hub.

In this blog post, I show you how to set up centralized monitoring for Shield Advanced–protected resources across multiple AWS accounts by using Firewall Manager and Security Hub. This enables you to easily manage resources that are out of compliance from your security policy and to view DDoS events that are detected across multiple accounts in a single view.

Shield Advanced is a managed application security service that provides DDoS protection for your workloads against infrastructure layer (Layer 3–4) attacks, as well as application layer (Layer 7) attacks, by using AWS WAF. Firewall Manager is a security management service that enables you to centrally configure and manage firewall rules across your accounts and applications in an organization in AWS. Security Hub analyzes and aggregates security events produced by your applications running on AWS by consuming their security findings. Security Hub integrates with Firewall Manager without the need for any action to be taken by you.

I’m going to cover two different scenarios that show you how to use Firewall Manager for:

  1. Centralized visibility into Shield Advanced DDoS events
  2. Automatic remediation of noncompliant resources

Scenario 1: Centralized visibility of DDoS detected events

This scenario represents a fully native and automated integration, where Shield Advanced DDoSDetected events (which indicate whether a DDoS event is underway for a particular Amazon Resource Name (ARN)) are made visible as security findings in Security Hub, through Firewall Manager.

Solution overview

Figure 1 shows the solution architecture for scenario 1.
 

Figure 1: Scenario 1 – Shield Advanced DDoS detected events visible in Security Hub

The diagram illustrates a customer using AWS Organizations to isolate their production resources into the Production Organizational Unit (OU), with further separation into multiple accounts for each of the mission-critical applications. The resources in Account 1 are protected by Shield Advanced. The Security OU was created to centralize security functions across all AWS accounts and OUs, obscuring the visibility of the production environment resources from the Security Operations Center (SOC) engineers and other security staff. The Security OU is home to the designated administrator account for Firewall Manager and the Security Hub dashboard.

Scenario 1 implementation

You will be setting up Security Hub in an account that has the prerequisite services configured in it, as explained below. Before you proceed, see the architecture requirements in the next section. Once Security Hub is enabled for your organization, you can simulate a DDoS event in strict accordance with the AWS DDoS Simulation Testing Policy or use one of the AWS DDoS Test Partners.

Architecture requirements

In order to implement these steps, you must have the following:

Once you have all these requirements completed, you can move on to enable Security Hub.

Enable Security Hub

Note: If you plan to protect resources with Shield Advanced across multiple accounts and in multiple Regions, we recommend that you use the AWS Security Hub Multiaccount Scripts from AWS Labs. Security Hub needs to be enabled in all the Regions and all the accounts where you have Shield protected resources. For global resources, like Amazon CloudFront, you should enable Security Hub in the us-east-1 Region.

To enable Security Hub

  1. In the AWS Security Hub console, switch to the account you want to use as the designated Security Hub administrator account.
  2. Select the security standard or standards that are applicable to your application’s use-case, and choose Enable Security Hub.
     
    Figure 2: Enabling Security Hub

  3. From the designated Security Hub administrator account, go to the Settings – Account tab, and add accounts by sending invites to all the accounts you want added as member accounts. The invited accounts become associated as member accounts once the owner of the invited account accepts the invite and Security Hub is enabled in that account. You can also upload a comma-separated list of the accounts you want to send invites to. (The API equivalent is sketched after this list.)
     
    Figure 3: Designating a Security Hub administrator account by adding member accounts
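The same member-account setup can be scripted from the designated administrator account. Here is a minimal sketch with placeholder account details; each invited account still has to accept the invitation.

import boto3

securityhub = boto3.client("securityhub")

# Placeholder member account ID and contact email
securityhub.create_members(
    AccountDetails=[{"AccountId": "222233334444", "Email": "security-member@example.com"}]
)
securityhub.invite_members(AccountIds=["222233334444"])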

View detected events in Shield and Security Hub

When Shield Advanced detects signs of DDoS traffic that is destined for a protected resource, the Events tab in the Shield console displays information about the event detected and provides a status on the mitigation that has been performed. Following is an example of how this looks in the Shield console.
 

Figure 4: Scenario 1 – The Events tab on the Shield console showing a Shield event in progress

If you’re managing multiple accounts, switching between them to keep track of DDoS incidents in the Shield console can be cumbersome. By using the Amazon CloudWatch metrics that Shield Advanced reports for Shield events, you can gain visibility across multiple accounts and Regions through a custom CloudWatch dashboard or by consuming these metrics in a third-party tool. For example, the DDoSDetected CloudWatch metric has a binary value, where a value of 1 indicates that an event that might be a DDoS has been detected. Shield automatically updates this metric when the DDoS event starts and ends. With the Security Hub integration, you only need permissions to access the Security Hub dashboard in order to monitor all events on production resources. Following is an example of what you see in the Security Hub console.
 

Figure 5: Scenario 1 – Shield Advanced DDoS alarm showing in Security Hub
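As a minimal sketch of the metric-based approach described above, the following code polls the DDoSDetected metric for one protected resource, assuming the AWS/DDoSProtection namespace and ResourceArn dimension that Shield Advanced reports to CloudWatch; the resource ARN is a placeholder.

import datetime
import boto3

cloudwatch = boto3.client("cloudwatch")

response = cloudwatch.get_metric_statistics(
    Namespace="AWS/DDoSProtection",
    MetricName="DDoSDetected",
    Dimensions=[{"Name": "ResourceArn",
                 "Value": "arn:aws:cloudfront::111122223333:distribution/EDFDVBD6EXAMPLE"}],
    StartTime=datetime.datetime.utcnow() - datetime.timedelta(hours=1),
    EndTime=datetime.datetime.utcnow(),
    Period=300,
    Statistics=["Maximum"],
)

# A maximum of 1 in any period indicates a possible DDoS event on the resource
in_progress = any(dp["Maximum"] >= 1 for dp in response["Datapoints"])
print("Possible DDoS event detected:", in_progress)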

Configure Shield event notification in Firewall Manager

In order to increase your visibility into possible Shield events across your accounts, you must configure Firewall Manager to monitor your protected resources by using Amazon Simple Notification Service (Amazon SNS). With this configuration, Firewall Manager sends you notifications of possible attacks by creating an Amazon SNS topic in Regions where you might have protected resources.
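The console steps that follow can also be expressed through the Firewall Manager PutNotificationChannel API. Here is a minimal sketch with placeholder topic and role ARNs; the role must allow Firewall Manager to publish to the topic.

import boto3

fms = boto3.client("fms")

fms.put_notification_channel(
    SnsTopicArn="arn:aws:sns:us-east-1:111122223333:fms-shield-notifications",  # placeholder topic
    SnsRoleName="arn:aws:iam::111122223333:role/fms-sns-publish-role",          # placeholder role
)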

To configure SNS topics in Firewall Manager

  1. In the Firewall Manager console, go to the Settings page.
  2. Under Amazon SNS Topic Configuration, select a Region.
  3. Choose Configure SNS Topic.
     
    Figure 6: The Firewall Manager Settings page for configuring SNS topics

  4. Select an existing topic or create a new topic, and then choose Configure SNS Topic.
     
    Figure 7: Configure an SNS topic in a Region

Scenario 2: Automatic remediation of noncompliant resources

The second scenario is an example in which a new production resource is created, and Security Hub has full visibility of the compliance state of the resource.

Solution overview

Figure 8 shows the solution architecture for scenario 2.
 

Figure 8: Scenario 2 – Visibility of Shield Advanced noncompliant resources in Security Hub

Firewall Manager identifies that the resource is out of compliance with the defined policy for Shield Advanced and posts a finding to Security Hub, notifying your operations team that a manual action is required to bring the resource into compliance. If configured, Firewall Manager can automatically bring the resource into compliance by creating it as a Shield Advanced–protected resource, and then update Security Hub when the resource is in a compliant state.

Scenario 2 implementation

The following steps describe how to use Firewall Manager to enforce Shield Advanced protection compliance of an application that is deployed to a member account within AWS Organizations. This implementation assumes that you set up Security Hub as described for scenario 1.

Create a Firewall Manager security policy for Shield Advanced protected resources

In this step, you create a Shield Advanced security policy that will be enforced by Firewall Manager. For the purposes of this walkthrough, you’ll choose to automatically remediate noncompliant resources and apply the policy to Application Load Balancer (ALB) resources.
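If you prefer to define the policy through the API, the following minimal sketch uses the Firewall Manager PutPolicy call with the SHIELD_ADVANCED policy type; the policy name is a placeholder, and the settings mirror the console walkthrough that follows. Run it in the designated Firewall Manager administrator account.

import boto3

fms = boto3.client("fms")

fms.put_policy(
    Policy={
        "PolicyName": "shield-advanced-alb-policy",
        "SecurityServicePolicyData": {"Type": "SHIELD_ADVANCED"},
        "ResourceType": "AWS::ElasticLoadBalancingV2::LoadBalancer",  # Application Load Balancers
        "ExcludeResourceTags": False,
        "RemediationEnabled": True,  # auto remediate noncompliant resources
    }
)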

To create the Shield Advanced policy

  1. Open the Firewall Manager console in the designated Firewall Manager administrator account.
  2. In the left navigation pane, choose Security policies, and then choose Create a security policy.
  3. Select AWS Shield Advanced as the policy type, and select the Region where your protected resources are. Choose Next.

    Note: You will need to create a security policy for each Region where you have regional resources, such as Elastic Load Balancers and Elastic IP addresses, and a security policy for global resources such as CloudFront distributions.

    Figure 9: Select the policy type and Region

  4. On the Describe policy page, for Policy name, enter a name for your policy.
  5. For Policy action, you have the option to configure automatic remediation of noncompliant resources or to only send alerts when resources are noncompliant. You can change this setting after the policy has been created. For the purposes of this blog post, I’m selecting Auto remediate any noncompliant resources. Select your option, and then choose Next.

    Important: It’s a best practice to first identify and review noncompliant resources before you enable automatic remediation.

  6. On the Define policy scope page, define the scope of the policy by choosing which AWS accounts, resource type, or resource tags the policy should be applied to. For the purposes of this blog post, I’m selecting to manage Application Load Balancer (ALB) resources across all accounts in my organization, with no preference for resource tags. When you’re finished defining the policy scope, choose Next.
     
    Figure 10: Define the policy scope

  7. Review and create the policy. Once you’ve reviewed and created the policy in the Firewall Manager designated administrator account, the policy will be pushed to all the Firewall Manager member accounts for enforcement. The new policy could take up to 5 minutes to appear in the console. Figure 11 shows a successful security policy propagation across accounts.
     
    Figure 11: View security policies in an account

Test the Firewall Manager and Security Hub integration

You’ve now defined a policy to cover only ALB resources, so the best way to test this configuration is to create an ALB in one of the Firewall Manager member accounts. This policy causes resources within the policy scope to be added as protected resources.

To test the policy

  1. Switch to the Security Hub administrator account and open the Security Hub console in the same Region where you created the ALB. On the Findings page, set the Title filter to Resource lacks Shield Advanced protection and set the Product name filter to Firewall Manager.
     
    Figure 12: Security Hub findings filter

    You should see a new security finding flagging the ALB as a noncompliant resource, according to the Shield Advanced policy defined in Firewall Manager. This confirms that Security Hub and Firewall Manager have been enabled correctly.
     

    Figure 13: Security Hub with a noncompliant resource finding

  2. With the automatic remediation feature enabled, you should see the “Updated at” time reflect exactly when the automatic remediation actions were completed. The completion of the automatic remediation actions can take up to 5 minutes to be reflected in Security Hub.
     
    Figure 14: Security Hub with an auto-remediated compliance finding

  3. Go back to the account where you created the ALB, open the AWS Shield console, and navigate to the Protected resources page, where you should see the ALB listed as a protected resource.
     
    Figure 15: Shield console in the member account shows that the new ALB is a protected resource

    Confirming that the ALB has been added automatically as a Shield Advanced–protected resource means that you have successfully configured the Firewall Manager and Security Hub integration.

(Optional): Send a custom action to a third-party provider

You can send all regional Security Hub findings to a ticketing system, Slack, AWS Chatbot, a Security Information and Event Management (SIEM) tool, a Security Orchestration, Automation, and Response (SOAR) platform, an incident management tool, or custom remediation playbooks by using Security Hub custom actions.
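
As an illustration, a minimal boto3 sketch of wiring a custom action to an Amazon SNS topic might look like the following; the action name, rule name, and topic ARN are placeholders rather than values from this solution:

import json
import boto3

securityhub = boto3.client("securityhub")
events = boto3.client("events")

# Create a custom action that analysts can invoke from the Security Hub console.
action = securityhub.create_action_target(
    Name="SendToTicketing",
    Description="Send selected findings to an external ticketing system",
    Id="SendToTicketing",
)

# Route invocations of that custom action to an SNS topic that your
# ticketing, SIEM, or SOAR integration subscribes to.
events.put_rule(
    Name="securityhub-send-to-ticketing",
    EventPattern=json.dumps({
        "source": ["aws.securityhub"],
        "detail-type": ["Security Hub Findings - Custom Action"],
        "resources": [action["ActionTargetArn"]],
    }),
)
events.put_targets(
    Rule="securityhub-send-to-ticketing",
    Targets=[{
        "Id": "ticketing-sns-target",
        "Arn": "arn:aws:sns:us-east-1:111122223333:ticketing-topic",  # placeholder topic ARN
    }],
)

For delivery to work, the topic's access policy must allow EventBridge (events.amazonaws.com) to publish to it; each invocation of the custom action then sends the selected findings to the topic as EventBridge events.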

Conclusion

In this blog post, I showed you how to set up a Firewall Manager security policy for Shield Advanced so that you can monitor your applications for DDoS events, and their compliance with DDoS protection policies, across your multi-account environment from the Security Hub findings console. In line with best practices for account governance, organizations should have a centralized security account that performs monitoring for multiple accounts. Security Hub and Firewall Manager provide a centralized solution to help you achieve your compliance and monitoring goals for DDoS protection.

If you’re interested in exploring how Shield Advanced and AWS WAF help to improve the security posture of your application, have a look at the following resources:

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Security Hub forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Fola Bolodeoku

Fola is a Security Engineer on the AWS Threat Research Team, where he focuses on helping customers improve their application security posture against DDoS and other application threats. When he is not working, he enjoys spending time exploring the natural beauty of the Western Cape.

Monitoring dashboard for AWS ParallelCluster

Post Syndicated from Ben Peven original https://aws.amazon.com/blogs/compute/monitoring-dashboard-for-aws-parallelcluster/

This post is contributed by Nicola Venuti, Sr. HPC Solutions Architect.

AWS ParallelCluster is an AWS-supported, open source cluster management tool that makes it easy to deploy and manage High Performance Computing (HPC) clusters on AWS. While AWS ParallelCluster includes many benefits for its users, it has not provided straightforward support for monitoring your workloads. In this post, I walk you through an add-on extension that you can use to help monitor your cloud resources with AWS ParallelCluster.

Product overview

AWS ParallelCluster offers many benefits that hide the complexity of the underlying platform and processes. Some of these benefits include:

  • Automatic resource scaling
  • A batch scheduler
  • Easy cluster management that allows you to build and rebuild your infrastructure without the need for manual actions
  • Seamless migration to the cloud supporting a wide variety of operating systems.

While all of these benefits streamline processes for you, one crucial task that HPC stakeholders have found challenging with AWS ParallelCluster is monitoring.

Resources in an HPC cluster, like schedulers, instances, storage, and networking, are important assets to monitor, especially when you pay for what you use. Organizing and displaying all of these metrics becomes challenging as the scale of the infrastructure increases or changes over time, which is the typical scenario in the cloud.

Given these challenges, AWS wants to provide AWS ParallelCluster users with two additional benefits: (1) facilitate the optimization of price performance, and (2) visualize and monitor the components of cost for their workloads.

The HPC Solution Architects team has created an AWS ParallelCluster add-on that is easy to use and customize. In this blog post, I demonstrate how Grafana – a platform for monitoring and observability – can run on AWS ParallelCluster to enable infrastructure monitoring.

AWS ParallelCluster integrated dashboard and Grafana add-on dashboards

Many customers want a tool that consolidates information coming from different AWS services and makes it easy to monitor the computing infrastructure created and managed by AWS ParallelCluster. The AWS ParallelCluster team – just like every other service team at AWS – is open and keen to hear from our customers.

This feedback helps us inform our product roadmap and build new, customer-focused features.

We recently released AWS ParallelCluster 2.10.0 based on customer feedback. Here are some key features that were released with it:

  • Support for the CentOS 8 Operating System: Customers can now choose CentOS 8 as their base operating system of choice to run their clusters, for both x86 and Arm architectures.
  • Support for P4d instances along with NVIDIA GPUDirect Remote Direct Memory Access (RDMA).
  • FSx for Lustre enhancements (support for FSx AutoImport and HDD-based file system options).
  • And finally, new CloudWatch dashboards designed to help aggregate cluster information, metrics, and logs already available in CloudWatch.

The last of these features, the integrated CloudWatch dashboard, is designed to help customers face the challenge of cluster monitoring mentioned above. In addition, the Grafana dashboards demonstrated later in this blog are a complementary add-on to this new, CloudWatch-based dashboard. There are some key differences between the two, summarized in the table below.

The CloudWatch-based dashboard does not require any additional component or agent running on either the head or the compute nodes. It aggregates the metrics already pushed by AWS ParallelCluster to CloudWatch into a single dashboard. This translates into zero overhead for the cluster, but at the expense of less flexibility and expandability.

The Grafana-based dashboard, instead, offers additional flexibility and customizability, and requires a few lightweight components installed on either the head or the compute nodes. Another key difference between the two monitoring dashboards is that the CloudWatch-based one requires AWS credentials, configured IAM users and roles, and access to the AWS Management Console, while the Grafana-based one has its own built-in authentication and authorization system, independent of the AWS account. This matters because end users or HPC admins might not have permissions (or simply might not want) to access the AWS Management Console in order to monitor their clusters.

CloudWatch-based dashboard                      Grafana-based monitoring add-on
No additional component                         Grafana + Prometheus
No overhead                                     Minimal overhead
Little to no expandability                      Supports full customizability
Requires AWS credentials and IAM configured     Custom credentials, no AWS access required
                                                Custom user interface

Grafana add-on dashboards on GitHub

There are many components of an HPC cluster to monitor. Moreover, the cluster is built on a system that is continuously evolving: AWS ParallelCluster and AWS services in general are updated, and new features are released often. Because of this, we wanted a monitoring solution developed on flexible components that can evolve rapidly. So, we released a Grafana add-on as an open-source project in this GitHub repository.

Releasing it as an open-source project allows us to more easily and frequently release new updates and enhancements. It also enables users to customize and extend its functionality by adding new dashboards or extending the existing ones (for example, with GPU monitoring or tracking of EC2 Spot Instance interruptions).

At the moment, this new solution is composed of the following open-source components:

  • Grafana is an open-source platform for monitoring and observability. Grafana allows you to query, visualize, alert on, and understand your metrics. It also helps you create, explore, and share dashboards, fostering a data-driven culture.
  • Prometheus is an open source project for systems and service monitoring from the Cloud Native Computing Foundation. It collects metrics from configured targets at given intervals, evaluates rule expressions, displays the results, and can trigger alerts if some condition is observed to be true.
  • The Prometheus Pushgateway is an open source tool that allows ephemeral and batch jobs to expose their metrics to Prometheus.
  • NGINX is an HTTP and reverse proxy server, a mail proxy server, and a generic TCP/UDP proxy server.
  • Prometheus-Slurm-Exporter is a Prometheus collector and exporter for metrics extracted from the Slurm resource scheduling system.
  • Node_exporter is a Prometheus exporter for hardware and OS metrics exposed by *NIX kernels, written in Go with pluggable metric collectors.

Note: while almost all components are under the Apache2 license, only Prometheus-Slurm-Exporter is licensed under GPLv3. You should be aware of this license and accept its terms before proceeding and installing this component.

The dashboards

I demonstrate a few different Grafana dashboards in this post. These dashboards are available for you in the AWS Samples GitHub repository. In addition, two dashboards – still under development – are proposed in beta. The first shows the cluster logs coming from Amazon CloudWatch Logs. The second one shows the costs associated to each AWS service utilized.

All these dashboards can be used as they are or customized as you need:

  • AWS ParallelCluster Summary – This is the main dashboard that shows general monitoring info and metrics for the whole cluster. It includes: Slurm metrics, compute related metrics, storage performance metrics, and network usage.

ParallelCluster dashboard

  • Master Node Details – This dashboard shows detailed metrics for the head node, including CPU, memory, network, and storage utilization.

Master node details

  • Compute Node List – This dashboard shows the list of the available compute nodes. Each entry is a link to a more detailed dashboard: the compute node details (see the following image).

Compute node list

  • Compute Node Details – Similar to the head node details, this dashboard shows detailed metrics for the compute nodes.
  • Cluster Logs – This dashboard (still under development) shows all the logs of your HPC cluster. The logs are pushed by AWS ParallelCluster to Amazon CloudWatch Logs and are reported here.

Cluster logs

  • Cluster Costs (also under development) – This dashboard shows the costs associated with each AWS service utilized by your cluster. It includes EC2, Amazon EBS, FSx, Amazon S3, and Amazon EFS, as well as an aggregation of all the costs of every single component.

Cluster costs

How to deploy it

You can simply use the post-install script that you can find in this GitHub repo as it is, or customize it as you need. For instance, you might want to change your Grafana password to something more secure and meaningful for you, or you might want to customize some dashboards by adding additional components to monitor.

#Load AWS ParallelCluster environment variables
. /etc/parallelcluster/cfnconfig
#Get the GitHub repo to clone and the installation script
monitoring_url=$(echo ${cfn_postinstall_args}| cut -d ',' -f 1 )
monitoring_dir_name=$(echo ${cfn_postinstall_args}| cut -d ',' -f 2 )
monitoring_tarball="${monitoring_dir_name}.tar.gz"
setup_command=$(echo ${cfn_postinstall_args}| cut -d ',' -f 3 )
monitoring_home="/home/${cfn_cluster_user}/${monitoring_dir_name}"
case ${cfn_node_type} in
    MasterServer)
        wget ${monitoring_url} -O ${monitoring_tarball}
        mkdir -p ${monitoring_home}
        tar xvf ${monitoring_tarball} -C ${monitoring_home} --strip-components 1
    ;;
    ComputeFleet)
    ;;
esac
#Execute the monitoring installation script
bash -x "${monitoring_home}/parallelcluster-setup/${setup_command}" >/tmp/monitoring-setup.log 2>&1
exit $?

The proposed post-install script takes care of installing and configuring everything for you, although a few additional parameters are needed in the AWS ParallelCluster configuration file: the post-install arguments, additional IAM policies, a custom security group, and a tag. You can find an AWS ParallelCluster template here.

Please note that, at the moment, the post install script has only been tested using Amazon Linux 2.

Add the following parameters to your AWS ParallelCluster configuration file, and then build up your cluster:

base_os = alinux2
post_install = s3://<my-bucket-name>/post-install.sh
post_install_args = https://github.com/aws-samples/aws-parallelcluster-monitoring/tarball/main,aws-parallelcluster-monitoring,install-monitoring.sh
additional_iam_policies = arn:aws:iam::aws:policy/CloudWatchFullAccess,arn:aws:iam::aws:policy/AWSPriceListServiceFullAccess,arn:aws:iam::aws:policy/AmazonSSMFullAccess,arn:aws:iam::aws:policy/AWSCloudFormationReadOnlyAccess
tags = {"Grafana" : "true"}

Make sure that port 80 and port 443 of your head node are accessible from the internet or from your network. You can achieve this by creating the appropriate security group via the AWS Management Console or via the AWS Command Line Interface (AWS CLI), as shown in the following example:

aws ec2 create-security-group --group-name my-grafana-sg --description "Open Grafana dashboard ports" --vpc-id vpc-1a2b3c4d
aws ec2 authorize-security-group-ingress --group-id sg-12345 --protocol tcp --port 443 --cidr 0.0.0.0/0
aws ec2 authorize-security-group-ingress --group-id sg-12345 --protocol tcp --port 80 --cidr 0.0.0.0/0

There is more information on how to create your security groups here.

Finally, set the additional_sg parameter in the [VPC] section of your AWS ParallelCluster configuration file.

After your cluster is created, you can open a web-browser and connect to https://your_public_ip. You should see a landing page with links to the Prometheus database service and the Grafana dashboards.

Size your compute and head nodes

We looked into the resource utilization of the components required for building this monitoring solution. In particular, the Prometheus node exporter installed on the compute nodes uses a small (almost negligible) amount of CPU, memory, and network resources.

Depending on the size of your cluster, the components installed on the head node (see the list of solution components earlier in this post) might require additional CPU, memory, and network capacity. In particular, if you expect to run a large-scale cluster (hundreds of instances), we recommend using a larger instance type for the head node than you originally planned, because of the higher volume of network traffic generated by the compute nodes continuously pushing metrics to it.

We cannot advise you exactly how to size your head node because there are many factors that can influence resource utilization. The best recommendation we could give you is to use the Grafana dashboard itself to monitor the CPU, memory, disk, and most importantly, network utilization, and then resize your head node (or other components) accordingly.

Conclusions

This monitoring solution for AWS ParallelCluster is an add-on for your HPC Cluster on AWS and a complement to new features recently released in AWS ParallelCluster 2.10. This blog aimed to provide you with instructions and basic tooling that can be customized based on your needs and that can be adapted quickly as the underlying AWS services evolve.

We are open to hearing from you and welcome feedback, issues, and pull requests to extend its functionality.

Using the Amazon Redshift Data API to interact from an Amazon SageMaker Jupyter notebook

Post Syndicated from Saunak Chandra original https://aws.amazon.com/blogs/big-data/using-the-amazon-redshift-data-api-to-interact-from-an-amazon-sagemaker-jupyter-notebook/

The Amazon Redshift Data API makes it easy for any application written in Python, Go, Java, Node.JS, PHP, Ruby, and C++ to interact with Amazon Redshift. Traditionally, these applications use JDBC connectors to connect, send a query to run, and retrieve results from the Amazon Redshift cluster. This requires extra steps like managing the cluster credentials and configuring the VPC subnet and security group.

In some use cases, you don’t want to manage connections or pass credentials on the wire. The Data API simplifies these steps so you can focus on data consumption instead of managing resources such as the cluster credentials, VPCs, and security groups.

This post demonstrates how you can connect an Amazon SageMaker Jupyter notebook to the Amazon Redshift cluster and run Data API commands in Python. The in-place analysis is an effective way to pull data directly into a Jupyter notebook object. We provide sample code to demonstrate in-place analysis by fetching Data API results into a Pandas DataFrame for quick analysis. For more information about the Data API, see Using the Amazon Redshift Data API to interact with Amazon Redshift clusters.

After exploring the mechanics of the Data API in a Jupyter notebook, we demonstrate how to implement a machine learning (ML) model in Amazon SageMaker, using data stored in the Amazon Redshift cluster. We use sample data to build, train, and test an ML algorithm in Amazon SageMaker. Finally, we deploy the model in an Amazon SageMaker instance and draw inference.

Using the Data API in a Jupyter notebook

Jupyter Notebook is a popular data exploration tool primarily used for ML. To work with ML-based analysis, data scientists pull data from sources like websites, Amazon Simple Storage Service (Amazon S3), and databases using Jupyter notebooks. Many Jupyter Notebook users rely on Amazon Redshift as the primary source of truth for their organization's data warehouse, alongside event data stored in an Amazon S3 data lake.

When you use Amazon Redshift as a data source in Jupyter Notebook, the aggregated data is visualized first for preliminary analysis, followed by extensive ML model building, training, and deployment. Jupyter Notebook connects and runs SQL queries on Amazon Redshift using a Python-based JDBC driver. Data extraction via JDBC drivers poses the following challenges:

  • Dealing with driver installations, credentials and network security management, connection pooling, and caching the result set
  • Additional administrative overhead to bundle the drivers into the Jupyter notebook before sharing the notebook with others

The Data API simplifies these management steps. Jupyter Notebook is pre-loaded with libraries needed to access the Data API, which you import when you use data from Amazon Redshift.

Prerequisites

To provision the resources for this post, you launch the following AWS CloudFormation stack:

The CloudFormation template is tested in the us-east-2 Region. It launches a 2-node DC2.large Amazon Redshift cluster to work on for this post. It also launches an AWS Secrets Manager secret and an Amazon SageMaker Jupyter notebook instance.

The following screenshot shows the Outputs tab for the stack on the AWS CloudFormation console.

The Secrets Manager secret is updated with cluster details required to work with the Data API. An AWS Lambda function is spun up and run during the launch of the CloudFormation template to update the secret (it receives input from the launched Amazon Redshift cluster). The following code updates the secret:

SecretsUpdateFunction:
    Type: AWS::Lambda::Function
    Properties:
      Role: !GetAtt 'LambdaExecutionRole.Arn'
      FunctionName: !Join ['-', ['update_secret', !Ref 'AWS::StackName']]
      MemorySize: 2048
      Runtime: python3.7
      Timeout: 900
      Handler: index.handler
      Code:
        ZipFile:
          Fn::Sub:
          - |-
           import json
           import boto3
           import os
           import logging
           import cfnresponse

           LOGGER = logging.getLogger()
           LOGGER.setLevel(logging.INFO)

           ROLE_ARN = '${Role}'
           SECRET = '${Secret}'
           CLUSTER_ENDPOINT = '${Endpoint}'  # substituted in via Fn::Sub from the cluster resource
           CLUSTER_ID = CLUSTER_ENDPOINT.split('.')[0]
           DBNAME = 'nyctaxi'

           def handler(event, context):

               # Get CloudFormation-specific parameters, if they exist
               cfn_stack_id = event.get('StackId')
               cfn_request_type = event.get('RequestType')

               #update Secrets Manager secret with host and port
               sm = boto3.client('secretsmanager')
               sec = json.loads(sm.get_secret_value(SecretId=SECRET)['SecretString'])
               sec['dbClusterIdentifier'] = CLUSTER_ID
               sec['db'] = DBNAME
               newsec = json.dumps(sec)
               response = sm.update_secret(SecretId=SECRET, SecretString=newsec)


               # Send a response to CloudFormation pre-signed URL
               cfnresponse.send(event, context, cfnresponse.SUCCESS, {
                    'Message': 'Secrets updated'
                   },
                   context.log_stream_name)

               return {
                   'statusCode': 200,
                   'body': json.dumps('Secrets updated')
               }

          - {
            Role : !GetAtt RedshiftSagemakerRole.Arn,
            Endpoint: !GetAtt RedshiftCluster.Endpoint.Address,
            Secret: !Ref RedshiftSecret
            }

Working with the Data API in Jupyter Notebook

In this section, we walk through the details of working with the Data API in a Jupyter notebook.

  1. On the Amazon SageMaker console, under Notebook, choose Notebook instances.
  2. Locate the notebook you created with the CloudFormation template.
  3. Choose Open Jupyter.

This opens up an empty Amazon SageMaker notebook page.

  1. Download the file RedshiftDeepAR-DataAPI.ipynb to your local storage.
  2. Choose Upload.
  3. Upload RedshiftDeepAR-DataAPI.ipynb.

Importing Python packages

We first import the necessary boto3 package. A few other packages are also relevant for the analysis, which we import in the first cell. See the following code:

import botocore.session as s
from botocore.exceptions import ClientError
import boto3.session
import json
import base64  # used to decode the secret if it is stored as binary
import boto3
import sagemaker
import operator
from botocore.exceptions import WaiterError
from botocore.waiter import WaiterModel
from botocore.waiter import create_waiter_with_client

import s3fs
import time
import os
import random
import datetime

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from ipywidgets import interact, interactive, fixed, interact_manual
import ipywidgets as widgets
from ipywidgets import IntSlider, FloatSlider, Checkbox

Custom waiter

The Data API calls an HTTPS endpoint. Because ExecuteStatement Data API calls are asynchronous, we need a custom waiter. See the following code:

# Create custom waiter for the Redshift Data API to wait for finish execution of current SQL statement
waiter_name = 'DataAPIExecution'
delay=2
max_attempts=3

#Configure the waiter settings
waiter_config = {
  'version': 2,
  'waiters': {
    'DataAPIExecution': {
      'operation': 'DescribeStatement',
      'delay': delay,
      'maxAttempts': max_attempts,
      'acceptors': [
        {
          "matcher": "path",
          "expected": "FINISHED",
          "argument": "Status",
          "state": "success"
        },
        {
          "matcher": "pathAny",
          "expected": ["PICKED","STARTED","SUBMITTED"],
          "argument": "Status",
          "state": "retry"
        },
        {
          "matcher": "pathAny",
          "expected": ["FAILED","ABORTED"],
          "argument": "Status",
          "state": "failure"
        }
      ],
    },
  },
}

Retrieving information from Secrets Manager

We need to retrieve the following information from Secrets Manager for the Data API to use:

  • Cluster identifier
  • Secrets ARN
  • Database name

Retrieve the above information using the following code:

secret_name='redshift-dataapidemo' ## replace the secret name with yours
session = boto3.session.Session()
region = session.region_name

client = session.client(
        service_name='secretsmanager',
        region_name=region
    )

try:
    get_secret_value_response = client.get_secret_value(
            SecretId=secret_name
        )
    secret_arn=get_secret_value_response['ARN']

except ClientError as e:
    print("Error retrieving secret. Error: " + e.response['Error']['Message'])
    
else:
    # Depending on whether the secret is a string or binary, one of these fields will be populated.
    if 'SecretString' in get_secret_value_response:
        secret = get_secret_value_response['SecretString']
    else:
        secret = base64.b64decode(get_secret_value_response['SecretBinary'])
            
secret_json = json.loads(secret)

cluster_id=secret_json['dbClusterIdentifier']
db=secret_json['db']
print("Cluster_id: " + cluster_id + "\nDB: " + db + "\nSecret ARN: " + secret_arn)

We now create the Data API client. For the rest of the notebook, we use the Data API client client_redshift. See the following code:

bc_session = s.get_session()

session = boto3.Session(
        botocore_session=bc_session,
        region_name=region,
    )

# Setup the client
client_redshift = session.client("redshift-data")
print("Data API client successfully loaded")

Listing the schema and tables

To list the schema, enter the following code:

client_redshift.list_schemas(
    Database= db, 
    SecretArn= secret_arn, 
    ClusterIdentifier= cluster_id)["Schemas"]

The following screenshot shows the output.

To list the tables, enter the following code:

client_redshift.list_tables(
    Database= db, 
    SecretArn= secret_arn, 
    ClusterIdentifier= cluster_id)["Tables"]

The following screenshot shows the output.

Creating the schema and table

Before we issue any SQL statement to the Data API, we instantiate the custom waiter. See the following code:

waiter_model = WaiterModel(waiter_config)
custom_waiter = create_waiter_with_client(waiter_name, waiter_model, client_redshift)

query_str = "create schema taxischema;"

res = client_redshift.execute_statement(Database= db, SecretArn= secret_arn, Sql= query_str, ClusterIdentifier= cluster_id)
id=res["Id"]

# Waiter in try block and wait for DATA API to return
try:
    custom_waiter.wait(Id=id)    
except WaiterError as e:
    print (e)
    
desc=client_redshift.describe_statement(Id=id)
print("Status: " + desc["Status"] + ". Excution time: %d miliseconds" %float(desc["Duration"]/pow(10,6)))

query_str = 'create table taxischema.nyc_greentaxi(\
vendorid varchar(10),\
lpep_pickup_datetime timestamp,\
lpep_dropoff_datetime timestamp,\
store_and_fwd_flag char(1),\
ratecodeid int,\
pulocationid int,\
dolocationid int,\
passenger_count int,\
trip_distance decimal(8,2),\
fare_amount decimal(8,2),\
extra decimal(8,2),\
mta_tax decimal(8,2),\
tip_amount decimal(8,2),\
tolls_amount decimal(8,2),\
ehail_fee varchar(100),\
improvement_surcharge decimal(8,2),\
total_amount decimal(8,2),\
payment_type varchar(10),\
trip_type varchar(10),\
congestion_surcharge decimal(8,2)\
)\
sortkey (vendorid);'

res = client_redshift.execute_statement(Database= db, SecretArn= secret_arn, Sql= query_str, ClusterIdentifier= cluster_id)
id=res["Id"]

try:
    custom_waiter.wait(Id=id)
    print("Done waiting to finish Data API.")
except WaiterError as e:
    print (e)
    
desc=client_redshift.describe_statement(Id=id)
print("Status: " + desc["Status"] + ". Excution time: %d miliseconds" %float(desc["Duration"]/pow(10,6)))

Loading data into the cluster

After we create the table, we’re ready to load some data into it. The following code loads Green taxi cab data from two different Amazon S3 locations using individual COPY statements that run in parallel:

redshift_iam_role = sagemaker.get_execution_role() 
print("IAM Role: " + redshift_iam_role)
source_s3_region='us-east-1'

# Increase the 'delay' attribute of the waiter to 20 seconds for the long-running COPY statements.
waiter_config["waiters"]["DataAPIExecution"]["delay"] = 20
waiter_model = WaiterModel(waiter_config)
custom_waiter = create_waiter_with_client(waiter_name, waiter_model, client_redshift)

query_copystr1 = "COPY taxischema.nyc_greentaxi FROM 's3://nyc-tlc/trip data/green_tripdata_2020' IAM_ROLE '" + redshift_iam_role + "' csv ignoreheader 1 region '" + source_s3_region + "';"

query_copystr2 = "COPY taxischema.nyc_greentaxi FROM 's3://nyc-tlc/trip data/green_tripdata_2019' IAM_ROLE '" + redshift_iam_role + "' csv ignoreheader 1 region '" + source_s3_region + "';"

## Execute 2 COPY statements in paralell
res1 = client_redshift.execute_statement(Database= db, SecretArn= secret_arn, Sql= query_copystr1, ClusterIdentifier= cluster_id)
res2 = client_redshift.execute_statement(Database= db, SecretArn= secret_arn, Sql= query_copystr2, ClusterIdentifier= cluster_id)

print("Redshift COPY started ...")

id1 = res1["Id"]
id2 = res2["Id"]
print("\nID: " + id1)
print("\nID: " + id2)

# Waiter in try block and wait for DATA API to return
try:
    custom_waiter.wait(Id=id1)
    print("Done waiting to finish Data API for the 1st COPY statement.")
    custom_waiter.wait(Id=id2)
    print("Done waiting to finish Data API for the 2nd COPY statement.")
except WaiterError as e:
    print (e)

desc=client_redshift.describe_statement(Id=id1)
print("[1st COPY] Status: " + desc["Status"] + ". Excution time: %d miliseconds" %float(desc["Duration"]/pow(10,6)))
desc=client_redshift.describe_statement(Id=id2)
print("[2nd COPY] Status: " + desc["Status"] + ". Excution time: %d miliseconds" %float(desc["Duration"]/pow(10,6)))

Performing in-place analysis

We can run the Data API to fetch the query result into a Pandas DataFrame. This simplifies the in-place analysis of the Amazon Redshift cluster data because we bypass unloading the data first into Amazon S3 and then loading it into a Pandas DataFrame.

The following query lists records loaded in the table nyc_greentaxi by year and month:

query_str = "select to_char(lpep_pickup_datetime, 'YYYY-MM') as Pickup_YearMonth, count(*) as Ride_Count from taxischema.nyc_greentaxi group by 1 order by 1 desc;"

res = client_redshift.execute_statement(Database= db, SecretArn= secret_arn, Sql= query_str, ClusterIdentifier= cluster_id)

id = res["Id"]
# Waiter in try block and wait for DATA API to return
try:
    custom_waiter.wait(Id=id)
    print("Done waiting to finish Data API.")
except WaiterError as e:
    print (e)

output=client_redshift.get_statement_result(Id=id)
nrows=output["TotalNumRows"]
ncols=len(output["ColumnMetadata"])
#print("Number of columns: %d" %ncols)
resultrows=output["Records"]

col_labels=[]
for i in range(ncols): col_labels.append(output["ColumnMetadata"][i]['label'])
                                              
records=[]
for i in range(nrows): records.append(resultrows[i])

df = pd.DataFrame(np.array(resultrows), columns=col_labels)

df[col_labels[0]]=df[col_labels[0]].apply(operator.itemgetter('stringValue'))
df[col_labels[1]]=df[col_labels[1]].apply(operator.itemgetter('longValue'))

df

The following screenshot shows the output.

Now that you’re familiar with the Data API in Jupyter Notebook, let’s proceed with ML model building, training, and deployment in Amazon SageMaker.

ML models with Amazon Redshift

The following diagram shows the ML model building, training, and deployment process. The source of data for ML training and testing is Amazon Redshift.

The workflow includes the following steps:

  1. Launch a Jupyter notebook instance in Amazon SageMaker. You make the Data API call from the notebook instance that runs a query in Amazon Redshift.
  2. The query result is unloaded into an S3 bucket. The output data is formatted as CSV, GZIP, or Parquet.
  3. Read the query result from Amazon S3 into a Pandas DataFrame within the Jupyter notebook. This DataFrame is split between train and test data accordingly.
  4. Build the model using the DataFrame, then train and test the model.
  5. Deploy the model into a dedicated instance in Amazon SageMaker. End-users and other systems can call this instance to directly infer by providing the input data.

Building and training the ML model using data from Amazon Redshift

In this section, we review the steps to build and train an Amazon SageMaker model from data in Amazon Redshift. For this post, we use the Amazon SageMaker built-in forecasting algorithm DeepAR and the DeepAR example code on GitHub.

The source data is in an Amazon Redshift table. We build a forecasting ML model to predict the number of Green taxi rides in New York City.

Before building the model using Amazon SageMaker DeepAR, we need to format the raw table data into a format for the algorithm to use using SQL. The following screenshot shows the original format.

The following screenshot shows the converted format.

We convert the raw table data into the preceding format by running the following SQL query. We run the UNLOAD statement using this SQL to unload the transformed data into Amazon S3.

query_str = "UNLOAD('select
   coalesce(v1.pickup_timestamp_norm, v2.pickup_timestamp_norm) as pickup_timestamp_norm,
   coalesce(v1.vendor_1, 0) as vendor_1,
   coalesce(v2.vendor_2, 0) as vendor_2 
from
   (
      select
         case
            when
               extract(minute 
      from
         lpep_dropoff_datetime) between 0 and 14 
      then
         dateadd(minute, 0, date_trunc(''hour'', lpep_dropoff_datetime)) 
      when
         extract(minute 
      from
         lpep_dropoff_datetime) between 15 and 29 
      then
         dateadd(minute, 15, date_trunc(''hour'', lpep_dropoff_datetime)) 
      when
         extract(minute 
      from
         lpep_dropoff_datetime) between 30 and 44 
      then
         dateadd(minute, 30, date_trunc(''hour'', lpep_dropoff_datetime)) 
      when
         extract(minute 
      from
         lpep_dropoff_datetime) between 45 and 59 
      then
         dateadd(minute, 45, date_trunc(''hour'', lpep_dropoff_datetime)) 
         end
         as pickup_timestamp_norm , count(1) as vendor_1 
      from
         taxischema.nyc_greentaxi 
      where
         vendorid = 1 
      group by
         1
   )
   v1 
   full outer join
      (
         select
            case
               when
                  extract(minute 
         from
            lpep_dropoff_datetime) between 0 and 14 
         then
            dateadd(minute, 0, date_trunc(''hour'', lpep_dropoff_datetime)) 
         when
            extract(minute 
         from
            lpep_dropoff_datetime) between 15 and 29 
         then
            dateadd(minute, 15, date_trunc(''hour'', lpep_dropoff_datetime)) 
         when
            extract(minute 
         from
            lpep_dropoff_datetime) between 30 and 44 
         then
            dateadd(minute, 30, date_trunc(''hour'', lpep_dropoff_datetime)) 
         when
            extract(minute 
         from
            lpep_dropoff_datetime) between 45 and 59 
         then
            dateadd(minute, 45, date_trunc(''hour'', lpep_dropoff_datetime)) 
            end
            as pickup_timestamp_norm , count(1) as vendor_2 
         from
            taxischema.nyc_greentaxi 
         where
            vendorid = 2 
         group by
            1
      )
      v2 
      on v1.pickup_timestamp_norm = v2.pickup_timestamp_norm 
;') to '" + redshift_unload_path + "' iam_role '" + redshift_iam_role + "' format as CSV header ALLOWOVERWRITE GZIP"

After we unload the data into Amazon S3, we load the CSV data into a Pandas DataFrame and visualize the dataset. The following plots show the number of rides aggregated per 15 minutes for each of the vendors.
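
The notebook performs this load outside the code excerpts shown here. As a rough sketch, assuming redshift_unload_path is the same S3 prefix used in the UNLOAD statement above, reading the gzipped CSV parts into a single DataFrame could look like the following:

import pandas as pd
import s3fs  # already imported in the first notebook cell

# Enumerate the gzipped CSV part files written by the UNLOAD statement
# and concatenate them into a single DataFrame.
fs = s3fs.S3FileSystem()
part_keys = fs.glob(redshift_unload_path + "*")  # redshift_unload_path is defined earlier in the notebook

frames = [pd.read_csv("s3://" + key, compression="gzip") for key in part_keys]
rides = pd.concat(frames, ignore_index=True)
rides.head()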

We now train our model using this time series data to forecast the number of rides.

The attached Jupyter notebook contains three steps:

  1. Split the train and test data. Unlike classification and regression ML tasks where the train and split are done by randomly dividing the entire dataset, in this forecasting algorithm, we split the data based on time:
    1. Start date of training data – 2019-01-01
    2. End date of training data – 2020-07-31
  2. Train the model by setting values to the mandatory hyperparameters; a minimal estimator sketch follows this list.
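
As a rough sketch only, training the built-in DeepAR algorithm with the SageMaker Python SDK can look like the code below; the instance type, hyperparameter values, output bucket, and train/test S3 paths are placeholders, not the exact values used in the notebook:

import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator

session = sagemaker.Session()
region = session.boto_region_name

# Built-in DeepAR training image for the current Region
image_uri = image_uris.retrieve("forecasting-deepar", region)

estimator = Estimator(
    image_uri=image_uri,
    role=sagemaker.get_execution_role(),
    instance_count=1,
    instance_type="ml.c4.2xlarge",
    output_path="s3://<my-bucket-name>/deepar-output",  # placeholder bucket
    sagemaker_session=session,
)

# Mandatory DeepAR hyperparameters; the values here are illustrative only.
estimator.set_hyperparameters(
    time_freq="15min",        # matches the 15-minute aggregation used above
    context_length="72",
    prediction_length="72",
    epochs="100",
)

# train_s3_path and test_s3_path point to the datasets produced by the split step.
estimator.fit({"train": train_s3_path, "test": test_s3_path})

The time_freq value mirrors the 15-minute aggregation produced by the UNLOAD query, while context_length and prediction_length control how much history the model sees and how far ahead it forecasts.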

The training job takes around 15 minutes, and the training progress is displayed on the screen. When the job is complete, you see code like the following:

#metrics {"Metrics": {"model.score.time": {"count": 1, "max": 3212.099075317383, "sum": 3212.099075317383, "min": 3212.099075317383}}, "EndTime": 1597702355.281733, "Dimensions": {"Host": "algo-1", "Operation": "training", "Algorithm": "AWS/DeepAR"}, "StartTime": 1597702352.069719}

[08/17/2020 22:12:35 INFO 140060425148224] #test_score (algo-1, RMSE): 24.8660570151
[08/17/2020 22:12:35 INFO 140060425148224] #test_score (algo-1, mean_absolute_QuantileLoss): 20713.306262554062
[08/17/2020 22:12:35 INFO 140060425148224] #test_score (algo-1, mean_wQuantileLoss): 0.18868379682658334
[08/17/2020 22:12:35 INFO 140060425148224] #test_score (algo-1, wQuantileLoss[0.1]): 0.13653619964790314
[08/17/2020 22:12:35 INFO 140060425148224] #test_score (algo-1, wQuantileLoss[0.2]): 0.18786255278771358
[08/17/2020 22:12:35 INFO 140060425148224] #test_score (algo-1, wQuantileLoss[0.3]): 0.21525202142165195
[08/17/2020 22:12:35 INFO 140060425148224] #test_score (algo-1, wQuantileLoss[0.4]): 0.2283095901515685
[08/17/2020 22:12:35 INFO 140060425148224] #test_score (algo-1, wQuantileLoss[0.5]): 0.2297682531655401
[08/17/2020 22:12:35 INFO 140060425148224] #test_score (algo-1, wQuantileLoss[0.6]): 0.22057919827603453
[08/17/2020 22:12:35 INFO 140060425148224] #test_score (algo-1, wQuantileLoss[0.7]): 0.20157691985194473
[08/17/2020 22:12:35 INFO 140060425148224] #test_score (algo-1, wQuantileLoss[0.8]): 0.16576246442811773
[08/17/2020 22:12:35 INFO 140060425148224] #test_score (algo-1, wQuantileLoss[0.9]): 0.11250697170877594
[08/17/2020 22:12:35 INFO 140060425148224] #quality_metric: host=algo-1, test mean_wQuantileLoss <loss>=0.188683796827
[08/17/2020 22:12:35 INFO 140060425148224] #quality_metric: host=algo-1, test RMSE <loss>=24.8660570151
#metrics {"Metrics": {"totaltime": {"count": 1, "max": 917344.633102417, "sum": 917344.633102417, "min": 917344.633102417}, "setuptime": {"count": 1, "max": 10.606050491333008, "sum": 10.606050491333008, "min": 10.606050491333008}}, "EndTime": 1597702355.338799, "Dimensions": {"Host": "algo-1", "Operation": "training", "Algorithm": "AWS/DeepAR"}, "StartTime": 1597702355.281794}


2020-08-17 22:12:56 Uploading - Uploading generated training model
2020-08-17 22:12:56 Completed - Training job completed
Training seconds: 991
Billable seconds: 991
CPU times: user 3.48 s, sys: 210 ms, total: 3.69 s
Wall time: 18min 56s

 

  1. Deploy the trained model in an Amazon SageMaker endpoint.

We use the endpoint to make predictions on the fly. In this post, we create an endpoint on an ml.m4.xlarge instance class; a minimal deployment sketch follows the prediction plot below. For displaying prediction results, we provide an interactive time series graph. You can adjust four control values:

  • vendor_id – The vendor ID.
  • forecast_day – The offset from the training end date. This is the first date of the forecast prediction.
  • confidence – The confidence interval.
  • history_weeks_plot – The number of weeks in the plot prior to the forecast day.

The prediction plot looks like the following screenshot.
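
A minimal sketch of that deployment step, assuming the estimator object from the training sketch above, looks like the following; the cleanup call is optional but stops the endpoint from incurring charges once you're done experimenting:

# Deploy the trained DeepAR model behind a real-time Amazon SageMaker endpoint.
predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.m4.xlarge",
)

# ... invoke the predictor here to generate the forecasts shown in the interactive plot ...

# Delete the endpoint when you no longer need it.
predictor.delete_endpoint()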

Conclusion

In this post, we walked through the steps to interact with Amazon Redshift from an Amazon SageMaker Jupyter notebook using the Data API. We provided sample code for the notebook that waits for the Data API to finish specific steps. The sample code showed how to configure the wait time for different SQL statements.

The length of wait time depends on the type of query you submit. A COPY command, which loads a large number of Amazon S3 objects, is usually longer than a SELECT query.

You can retrieve query results directly into a Pandas DataFrame by calling the GetStatementResult API. This approach simplifies in-place analysis by delegating complex SQL queries to Amazon Redshift and visualizing the data by fetching the query result into the Jupyter notebook.

We further explored building and deploying an ML model on Amazon SageMaker using train and test data from Amazon Redshift.

For more information about the Data API, watch the video Introducing the Amazon Redshift Data API on YouTube and see Using the Amazon Redshift Data API.


About the Authors

Saunak Chandra is a senior partner solutions architect for Redshift at AWS. Saunak likes to experiment with new products in the technology space, alongside his day-to-day work. He loves exploring nature in the Pacific Northwest. A short hike or bike ride on the trails is his favorite weekend morning routine. He also likes to do yoga when he gets time from his kid.

 

 

Debu Panda, a senior product manager at AWS, is an industry leader in analytics, application platform, and database technologies. He has more than 20 years of experience in the IT industry and has published numerous articles on analytics, enterprise Java, and databases and has presented at multiple conferences. He is lead author of the EJB 3 in Action (Manning Publications 2007, 2014) and Middleware Management (Packt).

 

 

Chao Duan is a software development manager at Amazon Redshift, where he leads the development team focusing on enabling self-maintenance and self-tuning with comprehensive monitoring for Redshift. Chao is passionate about building high-availability, high-performance, and cost-effective database to empower customers with data-driven decision making.

[$] ID mapping for mounted filesystems

Post Syndicated from original https://lwn.net/Articles/837566/rss

Almost every filesystem (excepting relics like VFAT) implements the concept of the owner and group of each file; the higher levels of the operating system then use that information to control access to those files. For decades, it has usually sufficed to track a single owner and group for each file, but there is an increasing number of use cases wanting to make that ownership relative to the environment any given process is running in. Developers have been working for a few years to find solutions to this problem; the latest attempt is the ID-mapped mounts patch set from Christian Brauner.

Development Simplified: CORS Support for Backblaze S3 Compatible APIs

Post Syndicated from Amrit Singh original https://www.backblaze.com/blog/development-simplified-cors-support-for-backblaze-s3-compatible-apis/

Since its inception in 2009, Cross-Origin Resource Sharing (CORS) has offered developers a convenient way of bypassing an inherently secure default setting—namely the same-origin policy (SOP). Allowing selective cross-origin requests via CORS has saved developers countless hours and money by reducing maintenance costs and code complexity. And now with CORS support for Backblaze’s recently launched S3 Compatible APIs, developers can continue to scale their experience without needing a complete code overhaul.

If you haven’t been able to adopt Backblaze B2 Cloud Storage in your development environment because of issues related to CORS, we hope this latest release gives you an excuse to try it out. Whether you are using our B2 Native APIs or S3 Compatible APIs, CORS support allows you to build rich client-side web applications with Backblaze B2. With the simplicity and affordability this service offers, you can put your time and money back to work on what’s really important: serving end users.

Top Three Reasons to Enable CORS

B2 Cloud Storage is popular among agile teams and developers who want to take advantage of easy to use and affordable cloud storage while continuing to seamlessly support their applications and workflows with minimal to no code changes. With Backblaze S3 Compatible APIs, pointing to Backblaze B2 for storage is dead simple. But if CORS is key to your workflow, there are three additional compelling reasons for you to test it out today:

  • Compatible storage with no re-coding. By enabling CORS rules for your custom web application or SaaS service that uses our S3 Compatible APIs, your development team can serve and upload data via B2 Cloud Storage without any additional coding or reconfiguring required. This will save you valuable development time as you continue to deliver a robust experience for your end users.
  • Seamless integration with your plugins. Even if you don’t choose B2 Cloud Storage as the primary backend for your business but you do use it for discrete plugins or content-serving sites, enabling CORS rules for those applications will come in handy. Developers who configure PHP, NodeJS, and WordPress plugins via the S3 Compatible APIs to upload or download files from web applications can do so easily by enabling CORS rules in their Backblaze B2 Buckets. With CORS support enabled, these plugins work seamlessly.
  • Serving your web assets with ease. Consider an even simpler scenario in which you want to serve a custom web font from your B2 Cloud Storage Bucket. Most modern browsers will require a preflight check for loading the font. By configuring the CORS rules in that bucket to allow the font to be served in the origin(s) of your choice, you will be able to use your custom font seamlessly across your domains from a single source.

Whether you are relying on B2 Cloud Storage as your primary cloud infrastructure for your web application or simply using it to serve cross-origin assets such as fonts or images, enabling CORS rules in your buckets will allow for proper and secure resource sharing.

Enabling CORS Made Simple and Fast

If your web page or application is hosted in a different origin from images, fonts, videos, or stylesheets stored in B2 Cloud Storage, you need to add CORS rules to your bucket to achieve proper functionality. Thankfully, enabling CORS rules is easy and can be found in your B2 Cloud Storage settings:

You will have the option of sharing everything in your bucket with every origin, select origins, or defining custom rules with the Backblaze B2 CLI.

Learning More and Getting Started

If you’re dying to learn more about the fundamentals of CORS as well as additional specifics about how it works with B2 Cloud Storage, you can dig into this informative Knowledge Base article. If you’re just pumped that CORS is now easily available in our S3 Compatible APIs suite, well then, you’re probably already on your way to a smoother, more reasonably priced development experience. If you’ve got a question or a response, we always love to hear from you in the comments or you can contact us for assistance.

The post Development Simplified: CORS Support for Backblaze S3 Compatible APIs appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

The US Military Buys Commercial Location Data

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/11/the-us-military-buys-commercial-location-data.html

Vice has a long article about how the US military buys commercial location data worldwide.

The U.S. military is buying the granular movement data of people around the world, harvested from innocuous-seeming apps, Motherboard has learned. The most popular app among a group Motherboard analyzed connected to this sort of data sale is a Muslim prayer and Quran app that has more than 98 million downloads worldwide. Others include a Muslim dating app, a popular Craigslist app, an app for following storms, and a “level” app that can be used to help, for example, install shelves in a bedroom.

This isn’t new, this isn’t just data of non-US citizens, and this isn’t the US military. We have lots of instances where the government buys data that it cannot legally collect itself.

Some app developers Motherboard spoke to were not aware who their users’ location data ends up with, and even if a user examines an app’s privacy policy, they may not ultimately realize how many different industries, companies, or government agencies are buying some of their most sensitive data. U.S. law enforcement purchase of such information has raised questions about authorities buying their way to location data that may ordinarily require a warrant to access. But the USSOCOM contract and additional reporting is the first evidence that U.S. location data purchases have extended from law enforcement to military agencies.

Rust 1.48.0 released

Post Syndicated from original https://lwn.net/Articles/837771/rss

Version 1.48.0 of the Rust language has been released. The biggest change appears to be improvements to the documentation system, but there’s more: “The most significant API change is kind of a mouthful: [T; N]: TryFrom<Vec<T>> is now stable. What does this mean? Well, you can use this to try and turn a vector into an array of a given length.”

Security updates for Thursday

Post Syndicated from original https://lwn.net/Articles/837767/rss

Security updates have been issued by Arch Linux (chromium and firefox), CentOS (bind, curl, fence-agents, kernel, librepo, libvirt, microcode_ctl, python, python3, qt and qt5-qtbase, resource-agents, and tomcat), Debian (drupal7, firefox-esr, jupyter-notebook, packer, python3.5, and rclone), Fedora (firefox), Mageia (firefox, nss), openSUSE (gdm, kernel-firmware, and moinmoin-wiki), Oracle (net-snmp), SUSE (libzypp, zypper), and Ubuntu (c-ares).

Managing COVID-19 exposure with crowd tracing

Post Syndicated from Aspire Ventures original https://aws.amazon.com/blogs/big-data/managing-covid-19-exposure-with-crowd-tracing/

This is a guest blog post by AWS partner Aspire Ventures

As we enter winter, with fewer options to be outdoors, our personal choices can impact our risk of contracting the COVID-19 virus even more. The New England Journal of Medicine publication showed real-world examples of the effectiveness of masks and social distancing in mitigating severity of COVID-19 infection and keeping people asymptomatic. CNN reported on a study that showed people who contracted COVID-19 were twice as likely to have visited a restaurant in the prior two weeks. What if we had actionable crowding, mask usage, and social distancing data that we could analyze to inform our daily decisions to keep us safe?

Aspire Ventures, an AWS partner, has developed the Clio GO pass system — a new venue-entry system that helps track COVID-19 exposure through kiosks and mobile phones in a completely privacy-preserving way. It uses a new technology called crowd tracing, which allows users to assess whether certain locations and venues meet their risk profile. Crowd tracing data is COVID-19 location-scouting data, which helps answer the question of how much risk may be associated with entering a particular crowd.

Today, Aspire Ventures is collaborating with AWS to open source anonymized crowd tracing data from the Clio GO pass system and make it available in the public AWS COVID-19 data lake. Aspire Ventures is a venture fund dedicated to fast-tracking precision medicine technologies and practices that leverage AI and IoT to deliver affordable, individualized solutions at a massive scale. The AWS COVID-19 data lake is a public repository of up-to-date and curated datasets on or related to COVID-19 to help experts track, contain, and neutralize the virus causing the illness. With the Clio GO pass system and the open-source crowd tracing dataset in the AWS COVID-19 data lake, we believe the global community can come together and develop techniques to better fight the COVID-19 pandemic.

The Clio GO app functions like an airline mobile boarding pass system. Prior to arrival, you check in via the app by answering a few questions and receive a mobile entry GO Pass in either QR-code or NFC ticket format. When you arrive at a venue, you validate your GO Pass by scanning it at a kiosk or smartphone. GO Pass is being used by thousands of venues who have tens of millions of annual visitors. These venues run the spectrum from schools and medical practices to office buildings and food manufacturing facilities. Further adoption of Aspire’s Clio GO app will generate more anonymized data that AWS will make available for advancing COVID-19 solutions.

Crowd tracing vs. contact tracing

As Dr. Fauci said, contact tracing is “not working.” As the primary technology used by public health authorities, it’s fraught with poor adoption, poor accuracy, high cost, and serious privacy concerns. To understand the issues with contact tracing, we analogize to a first-person video game to introduce the immunological concepts of viral dose, viral load, and undetectable asymptomatic carriers.

Imagine a video game scenario with attackers and shields to protect from attacks. Viral dose is analogous to incremental hits that weaken a player’s shield. Avoiding those hits prevents your shield from collapsing.

Viral load is analogous to how strongly any one attacker can hit a player’s shield. Certain infected individuals who are more progressed in their infection may hit you harder. Just as in the game, your shields may be destroyed by many weak hits from multiple attackers or one very strong hit from a single attacker.

Asymptomatic carriers are like players who, from a distance, look like they have no weapons. Undetectable asymptomatic carriers are like players whose weapons can’t even be detected when you search their belongings—the science indicates that infectious asymptomatic carriers may not be detectable with COVID-19 PCR swab tests. CDC blood test surveys show between 6 times to 25 times as many asymptomatic carriers are lurking out there for any single known case.

Clio GO uses crowd tracing and adaptive artificial intelligence (A2I) to progressively improve the estimates of each player’s shield strength, the intensity of hits you might encounter, and the likely hits from attackers whose weapons are completely undetectable.

In contrast, contact tracing requires that an attacker have a visible weapon, and if so, it assumes shields are obliterated immediately. However, if there is no visible weapon, the shields remain at 100%. In either case, contact tracing doesn’t decrease your shield level based on cumulative small hits (viral dose) or how intense the hits (viral load of others) are by taking into account the conditions at the time, such as mask usage, social distancing, and duration of contact.

The more people who have checked in to the same place, the lower each person’s shield is computed to be. Shield levels are further refined by reported mask usage and social distancing within the crowd. The venue never sees the person’s shield level. It sees a green or red check mark indicating whether you’re entering with a valid pass that meets their entry requirements, but no symptom data is shared with the venue.

After your visit, you can rate the venue’s use of masks and social distancing, and this report helps compute your own shield level and that of others. When using the Clio GO app to scout venues prior to visiting, the venue’s listing shows the aggregate mask and social distancing ratings by visitors.

As part of our collaboration with AWS, and to broaden the adoption of crowd tracing, Clio GO app users can now pre-screen their visitors at no cost for personal, non-profit, faith-based, educational, and amateur athletic event use.

How you can contribute

We welcome everyone to participate in this collaborative effort. Using the app improves your and your visitor’s safety while contributing anonymized crowd tracing data to the open-source public AWS COVID-19 data lake.

In just a few minutes, you can get a free Clio GO account and use the Clio GO app to pre-screen people attending personal events and private clubs—whether small dinner parties, soccer matches, or religious gatherings. You can purchase additional hardware for unmanned door screening or mobile kiosk functionality, as well as solutions for commercial enterprises.

AWS COVID-19 data lake and crowd tracing data

To make the data from the AWS COVID-19 data lake available in the Data Catalog in your AWS account, create a CloudFormation stack using the following template. This template creates a covid-19 database in your AWS Glue Data Catalog and tables that point to the public AWS COVID-19 data lake. These include an aspirevc_crowd_tracing table, which points to up-to-date crowd tracing data, and an aspirevc_crowd_tracing_zipcode_3digits table, which points to a lookup table that translates the 3-digit ZIP codes used in the aspirevc_crowd_tracing table to their respective states.

You can query these tables using Amazon Athena. Athena is a serverless interactive query service that makes it easy to analyze the data in the AWS COVID-19 data lake. Athena supports SQL, a common language that data analysts use for analyzing structured data. To query the data, complete the following steps:

  1. Sign in to the Athena console.
    1. If this is the first time you are using Athena, you must specify a query result location on Amazon S3.
  2. From the drop-down menu, choose the covid-19 database.
  3. Enter your query.

The following query returns daily statistics, including the number of people marked as symptoms, diagnosed, contact, and near, for each scan date in the given range:

SELECT
  cast(from_iso8601_timestamp(scandate) as date) as date,
  COUNT_IF(symptoms) as symptoms_count,
  COUNT_IF(diagnosed) as diagnosed_count,
  COUNT_IF(contact) as contact_count,
  COUNT_IF(near) as near_count, COUNT(*) as total_count
FROM "covid-19"."aspirevc_crowd_tracing"
WHERE from_iso8601_timestamp(scandate)
BETWEEN parse_datetime('2020-10-01:00:00:00','yyyy-MM-dd:HH:mm:ss')
AND parse_datetime('2020-10-16:00:00:00','yyyy-MM-dd:HH:mm:ss')
GROUP BY 1
ORDER BY 1

symptoms: Past 2 weeks, have you had any of the following symptoms: shortness of breath, fever, loss of taste or smell, new cough?

diagnosed: Past 2 weeks, have you been diagnosed with COVID or are waiting for COVID test results?

contact: Past 2 weeks, have you been in contact with anyone who has been diagnosed with COVID or is waiting for COVID test results?

near: Past 2 weeks, have you been near anyone with the following symptoms: shortness of breath, fever, loss of taste or smell, new cough?

The following screenshot shows the results of this query:

To see more details, you can run the following query to retrieve the same statistics per ZIP code prefix, state, risk level, and scan result for the given scan date:

SELECT
  cast(from_iso8601_timestamp(scandate) as date) as date,
  SUBSTR(scannerdevice_zipcode, 1, 3) as zip,
  state,
  risklevel,
  result,
  COUNT_IF(symptoms) as symptoms_count,
  COUNT_IF(diagnosed) as diagnosed_count,
  COUNT_IF(contact) as contact_count,
  COUNT_IF(near) as near_count, 
  COUNT(*) as total_count
FROM "covid-19"."aspirevc_crowd_tracing"
JOIN "covid-19"."aspirevc_crowd_tracing_zipcode_3digits" ON SUBSTR(scannerdevice_zipcode, 1, 3) = aspirevc_crowd_tracing_zipcode_3digits.zip
WHERE from_iso8601_timestamp(scandate)
BETWEEN parse_datetime('2020-10-15:00:00:00','yyyy-MM-dd:HH:mm:ss')
AND parse_datetime('2020-10-16:00:00:00','yyyy-MM-dd:HH:mm:ss')
AND scannerdevice_zipcode<>''
GROUP BY 1,2,3,4,5
ORDER BY 1,3

The following screenshot shows the results of this query:

You can see that a small number of people are marked as symptoms, diagnosed, contact, and near in each state and risk level.
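
If you’d rather run these queries programmatically than in the Athena console, a minimal boto3 sketch might look like the following. The query is a trimmed variant of the first query above, and the S3 output bucket is a placeholder you would replace with one you own:

import time
import boto3

athena = boto3.client("athena")

QUERY = """
SELECT cast(from_iso8601_timestamp(scandate) as date) as date,
       COUNT_IF(symptoms) as symptoms_count,
       COUNT(*) as total_count
FROM "covid-19"."aspirevc_crowd_tracing"
GROUP BY 1 ORDER BY 1
"""

# Start the query; the output location is a placeholder bucket you own.
execution = athena.start_query_execution(
    QueryString=QUERY,
    QueryExecutionContext={"Database": "covid-19"},
    ResultConfiguration={"OutputLocation": "s3://your-athena-results-bucket/"},
)
query_id = execution["QueryExecutionId"]

# Poll until the query finishes.
while True:
    state = athena.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(2)

if state == "SUCCEEDED":
    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    for row in rows:
        print([col.get("VarCharValue", "") for col in row["Data"]])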

Open-sourcing the data creates possibilities to combine it with other AWS COVID-19 data lake datasets, such as hospitalization or COVIDcast data. This could enable new features, such as a radar that predicts the emergence of regional hotspots.

If you’re a data analyst, we encourage you to contribute to building better crowd tracing algorithms using any of the data provided in the public AWS COVID-19 data lake. Even if you’re building a different solution, you can use this dataset without license fees. The following section can help you quickly get started.

The data

Although our public AWS COVID-19 data lake has excellent data sources, such as hospitalization data down to regional levels, Clio GO provides data even at the zip-code level. Below is a description of the schema of the data made available in the data lake:

Strong data privacy and cryptographic pseudonymity via Powch

In contrast to the significant privacy challenges associated with contact tracing, crowd tracing and the Clio GO system don’t require disclosure of contact identities to reduce your risk of COVID-19 infection. Returning to the video game analogy, learning after the fact the identity of whoever hit your shields doesn’t matter. However, knowing that you’re entering a crowded map of strongly armed assailants might cause you to choose a different crowd. Therefore, the aggregate risk of the crowd becomes the only relevant concern for your risk of contracting COVID-19.

To protect your identity, the Clio GO app uses Powch, a powerful cryptographic technology that protects identity and data. Similar to Bitcoin, Powch enables pseudonymity—a way to log in without any linkage to your true identity. You don’t need to use any personally identifiable information for Clio GO registration. Instead, an unguessable and random secret ID is stored in a personal QR code, which only you have access to, and GO Pass has no knowledge of the owner of the secret ID. The secret ID is used during registration as the only form of identity in the GO Pass app.
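
As a rough illustration of the pseudonymous registration idea (not Powch’s actual implementation), the following Python sketch generates an unguessable secret ID and stores it in a QR code image. The qrcode package and the registration flow here are assumptions made for the sake of the example:

import secrets
import qrcode  # third-party package: pip install qrcode[pil]

# Generate an unguessable, random secret ID with no link to the user's real identity.
secret_id = secrets.token_urlsafe(32)

# Store the secret ID in a personal QR code; only the holder of this image can present it.
img = qrcode.make(secret_id)
img.save("my_secret_id.png")

# During registration, the service would see only this opaque token, never a name or email.
print("Registered pseudonymous ID:", secret_id)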

After exposure to a possibly infected individual, contact tracing requires you to identify exactly which individuals you interacted with during that period of time, which compromises their privacy as well as your own.

Although you may choose to be identified and share your name with the venue you visit, the Clio GO app never shows personal data, like actual temperature readings, to the venue owner. The app only tells the venue whether the GO Pass was accepted or denied based on the entry requirements of the venue.

Crowd-sourced solutions, open data, restrictions, and uses

We hope that free access to the crowd tracing data via the public AWS COVID-19 data lake encourages the development of new creative, low-cost COVID-19 mitigation solutions. You can use the data within commercial products under a Creative Commons license, with the explicit requirement that algorithms developed from the public dataset are open and published.

We encourage using the crowd tracing data in the public AWS COVID-19 data lake in conjunction with other free data sources also in the lake. A commercial data feed with fewer restrictions and data limitations is being made available via AWS Data Exchange to commercial organizations who pass verification requirements.


About the authors and Aspire Ventures

Aspire Ventures has developed a novel artificial intelligence engine called A2i and forms joint ventures with mission-driven health systems. Aspire’s first joint venture, with Penn Medicine Lancaster General Health and Capital BlueCross, established the Smart Health Innovation Lab, an entity that accelerates healthcare technologies that impact the quadruple aim. Aspire is partnering with Clalit, the majority health system of Israel, in a similar joint venture focused on Israeli start-ups to encourage innovation in healthcare.

Aspire Ventures was founded by Essam Abadir, SB MIT Mathematics, SB Sloan School of Management, and JD with Distinction from the University of Iowa School of Law. Essam founded Aspire Ventures as an impact investment AI firm in 2014 after selling an apps platform to Intel in 2013.

A2i is overseen by Victor Owuor, SB & MS Aeronautical and Astronautical Engineering MIT, SB & MS Electrical Engineering MIT, and JD Harvard School of Law. Prior to Aspire, Victor headed a significant cloud P&L at Oracle.

The Aspire Ventures CIO and Smart Health Innovation Lab CEO is Kim Ireland, MSIS Penn State University. Kim formerly managed health system EHR rollouts at Cerner and was CEO of startup MedStatix.

Scott Schell, PhD Immunology University of Chicago, MD University of Chicago, and MBA University of Michigan, heads Clio Health Go and is Chief Medical Officer for the Aspire portfolio. Scott was Founding Chair of Cleveland Clinic’s Department of Population Health and led development of two of healthcare’s largest platforms at Alere and UPMC.

Clio Health GO’s mission is to reinvent the healthcare experience via advanced telehealth, starting with COVID-19. GO Pass and the crowd tracing data and algorithm are products of Medstatix. Powch provides patented cryptographic privacy and security technologies. Connexion Health provides the GO Pass kiosks. Clio Health GO, Medstatix, Connexion Health, and Powch are portfolio companies of Aspire.

Many services, one cloudflared

Post Syndicated from Adam Chalmers original https://blog.cloudflare.com/many-services-one-cloudflared/

Route many different local services through many different URLs, with only one cloudflared

I work on the Argo Tunnel team, and we make a program called cloudflared, which lets you securely expose your web service to the Internet while ensuring that all its traffic goes through Cloudflare.

Say you have some local service (a website, an API, a TCP server, etc.), and you want to securely expose it to the Internet using Argo Tunnel. First, you run cloudflared, which establishes some long-lived TCP connections to the Cloudflare edge. Then, when Cloudflare receives a request for your chosen hostname, it proxies the request through those connections to cloudflared, which in turn proxies the request to your local service. This means anyone accessing your service has to go through Cloudflare, and Cloudflare can do caching, rewrite parts of the page, block attackers, or build Zero Trust rules to control who can reach your application (e.g. users with a @corp.com email). Previously, companies had to use VPNs or firewalls to achieve this, but Argo Tunnel aims to be more flexible, more secure, and more scalable than the alternatives.

Some of our larger customers have deployed hundreds of services with Argo Tunnel, but they’re consistently experiencing a pain point with these larger deployments. Each instance of cloudflared can only proxy a single service. This means if you want to put, say, 100 services on the internet, you’ll need 100 instances of cloudflared running on your server. This is inefficient (because you’re using 100x as many system resources) and, even worse, it’s a pain to manage 100 long-lived services!

Today, we’re thrilled to announce our most-requested feature: you can now expose unlimited services using one cloudflared. Any customer can start using this today, at no extra cost, using the Named Tunnels we released a few months ago.

Named Tunnels

Earlier this year, we announced Named Tunnels—tunnels with immutable IDs that you can run and stop as you please. You can route traffic into the tunnel by adding a DNS or Cloudflare Load Balancer record, and you can route traffic from the tunnel into your local services by running cloudflared.

You can create a tunnel by running $ cloudflared tunnel create my_tunnel_name. Once you’ve got a tunnel, you can use DNS records or Cloudflare Load Balancers to route traffic into the tunnel. Once traffic is routed into the tunnel, you can use our new ingress rules to map traffic to local services.

Map traffic with ingress rules

An ingress rule basically says “send traffic for this internet URL to this local service.” When you invoke cloudflared, it’ll read these ingress rules from the configuration file. You write ingress rules under the ingress key of your config file, like this:

$ cat ~/cloudflared_config.yaml

tunnel: my_tunnel_name
credentials-file: .cloudflared/e0000000-e650-4190-0000-19c97abb503b.json
ingress:
 # Rules map traffic from a hostname to a local service:
 - hostname: example.com
   service: https://localhost:8000
 # Rules can match the request's path to a regular expression:
 - hostname: static.example.com
   path: /images/*\.(jpg|png|gif)
   service: https://machine1.local:3000
 # Rules can match the request's hostname to a wildcard character:
 - hostname: "*.ssh.foo.com"
   service: ssh://localhost:2222
 # You can map traffic to the built-in "Hello World" test server:
 - hostname: foo.com
   service: hello_world
 # This "catch-all" rule doesn't have a hostname/path, so it matches everything
 - service: http_status:404

This example maps traffic to three different local services. But cloudflared can map traffic to more than just addresses: it can respond with a given HTTP status (as in the last rule) or with the built-in Hello World test server (as in the second-last rule). See the docs for a full list of supported services.

You can match traffic using the hostname, a path regex, or both. If you don’t use any filters, the ingress rule will match everything (so if you have DNS records from different zones routing into the tunnel, the rule will match all their URLs). Traffic is matched to rules from top to bottom, so in this example, the last rule will match anything that wasn’t matched by an earlier rule. We actually require the last rule to match everything; otherwise, cloudflared could receive a request and not know what to respond with.
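
To illustrate the top-to-bottom, first-match-wins behaviour, here is a simplified sketch in Python of how such matching could work. It is a conceptual model only, not cloudflared’s actual implementation (which is written in Go), and the path patterns are written as standard Python regexes:

import fnmatch
import re
from urllib.parse import urlparse

# (hostname pattern or None, path regex or None, service) -- order matters.
RULES = [
    ("example.com", None, "https://localhost:8000"),
    ("static.example.com", r"/images/.*\.(jpg|png|gif)", "https://machine1.local:3000"),
    ("*.ssh.foo.com", None, "ssh://localhost:2222"),
    ("foo.com", None, "hello_world"),
    (None, None, "http_status:404"),  # catch-all: must match everything
]

def match(url):
    """Return the service for the first rule that matches the URL."""
    parsed = urlparse(url)
    for hostname, path, service in RULES:
        if hostname and not fnmatch.fnmatch(parsed.hostname, hostname):
            continue  # hostname filter did not match
        if path and not re.search(path, parsed.path):
            continue  # path filter did not match
        return service
    raise ValueError("no rule matched (the last rule should be a catch-all)")

print(match("https://static.example.com/images/dog.gif"))  # https://machine1.local:3000
print(match("https://unknown.example.org/"))               # http_status:404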

Testing your rules

To make sure all your rules are valid, you can run

$ cat ~/cloudflared_config_invalid.yaml

ingress:
 - hostname: example.com
   service: https://localhost:8000

$ cloudflared tunnel ingress validate
Validating rules from /usr/local/etc/cloudflared/config.yml
Validation failed: The last ingress rule must match all URLs (i.e. it should not have a hostname or path filter)

This will check that all your ingress rules use valid regexes and map to valid services, and it’ll ensure that your last rule (and only your last rule) matches all traffic. To make sure your ingress rules do what you expect them to do, you can run

$ cloudflared tunnel ingress rule https://static.example.com/images/dog.gif
Using rules from ~/cloudflared_config.yaml
Matched rule #2
        Hostname: static.example.com
        path: /images/*\.(jpg|png|gif)

This will check which rule matches the given URL, almost like a dry run for the ingress rules (no tunnels are run and no requests are actually sent). It’s helpful for making sure you’re routing the right URLs to the right services!

Per-rule configuration

Whenever cloudflared gets a request from the internet, it proxies that request to the matching local service on your origin. Different services might need different configurations for this request; for example, you might want to tweak the timeout or HTTP headers for a certain origin. You can set a default configuration for all your local services, and then override it for specific ones, e.g.

# Set configuration for all services
originRequest:
  connectTimeout: 30s

ingress:
  # This service inherits all the default (root-level) configuration
  - hostname: example.com
    service: https://localhost:8000
  # This service overrides the default configuration
  - service: https://localhost:8001
    originRequest:
      connectTimeout: 10s
      disableChunkedEncoding: true

For a full list of configuration options, check out the docs.

What’s next?

We really hope this makes Argo Tunnel an even easier way to deploy services onto the Internet. If you have any questions, file an issue on our GitHub. Happy developing!

Read RFID and NFC tokens with Raspberry Pi | HackSpace 37

Post Syndicated from Andrew Gregory original https://www.raspberrypi.org/blog/read-rfid-and-nfc-tokens-with-raspberry-pi-hackspace-37/

Add a bit of security to your project or make things selectable by using different cards. In the latest issue of HackSpace magazine, PJ Evans goes contactless.

The HAT is not hard on resources, so you can use many variants of Raspberry Pi

NFC (near-field communication) is based on the RFID (radio-frequency identification) standard. Both allow a device to receive data from a passive token or tag (meaning it doesn’t require external power to work). RFID supports a simple ID message that shouts ‘I exist’, whereas NFC allows for both reading and writing of data.

Most people come into contact with these systems every day, whether it’s using contactless payment, or a card to unlock a hotel or office door. In this tutorial we’ll look at the Waveshare NFC HAT, an add-on for Raspberry Pi computers that allows you to interact with NFC and RFID tokens.

Prepare your Raspberry Pi

We start with the usual step of preparing a Raspberry Pi model for the job. Reading RFID tags is not strenuous work for our diminutive friend, so you can use pretty much any variant of the Raspberry Pi range you like, so long as it has the 40-pin GPIO. We only need Raspberry Pi OS Lite (Buster) for this tutorial; however, you can install any version you wish. Make sure you’ve configured it how you want, have a network connection, and have updated everything by running sudo apt -y update && sudo apt -y upgrade on the command line.

Enable the serial interface

This NFC HAT is capable of communicating over three different interfaces: I2C, SPI, and UART. We’re going with UART as it’s the simplest to demonstrate, but you may wish to use the others. Start by running sudo raspi-config, going to ‘Interfacing options’, and selecting ‘Serial Interface’. When asked if you want to log into the console, say ‘No’. Then, when asked if you want to enable the serial interface, say ‘Yes’. You’ll need to reboot now. This will allow the HAT to talk to our Raspberry Pi over the serial interface.

Configure and install the HAT

As mentioned in the previous step, we have a choice of interfaces, and swapping between them means changing some physical settings on the NFC HAT itself. Do not do this while the HAT is powered up in any way. Our HAT should be configured for UART/serial by default, but do check the wiki at hsmag.cc/iHj1XA. The jumpers at I1 and I0 should both be shorting ‘L’, D16 and D20 should be shorted, and on the DIP switch everything should be off except RX and TX. Check, double-check, attach the HAT to the GPIO, and boot up.

The Waveshare HAT contains many settings. Make sure to read the instructions!

Download the examples

You can download some examples directly from Waveshare. First, we need to install some dependencies. Run the following at the command line:
sudo apt install python3-rpi.gpio p7zip-full python3-pip
pip3 install spidev pyserial

Now, download the files and unpack them:
cd
wget https://www.waveshare.com/w/upload/6/67/Pn532-nfc-hat-code.7z
7z x Pn532-nfc-hat-code.7z

Before you try anything out, you need to edit the example file so that we use UART (see the accompanying code listing).
cd ~/raspberrypi/python
nano example_get_uid.py

Find the three lines that start pn532 = and add a # to the top one (to comment it out). Now remove the # from the line starting pn532 = PN532_UART. Save, and exit.
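
After the edit, the top of that section should look something like the following. The constructor arguments shown here are illustrative and may differ in your copy of the Waveshare examples:

# Comment out the SPI (and I2C) lines and leave only the UART one active:
# pn532 = PN532_SPI(debug=False, reset=20, cs=4)
# pn532 = PN532_I2C(debug=False, reset=20, req=16)
pn532 = PN532_UART(debug=False, reset=20)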

Try it out!

Finally, we get to the fun part. Start the example code as follows:
python3 example_get_uid.py

If all is well, the connection to the HAT will be announced. You can now place your RFID token over the area of the HAT marked ‘NFC’. Hexadecimal numbers will start scrolling up the screen; your token has been detected! Each RFID token has a unique number, so it can be used to uniquely identify someone. However, this HAT is capable of much more than that as it also supports NFC and can communicate with common standards like MIFARE Classic, which allows for 1kB of storage on the card. Check out example_dump_mifare.py in the same directory (but make sure you make the same edits as above to use the serial connection).
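
As a sketch of how you might build on this for simple access control, the following assumes the same pn532 library used by the Waveshare examples; the PN532_UART constructor arguments and the read_passive_target() call are taken from those examples, so check them against your copy of the code. The UIDs in the whitelist are hypothetical:

# Sketch: gate an action on a known card UID. Assumes the Waveshare pn532 library.
from pn532 import PN532_UART

# UIDs of cards you have previously read and decided to trust (hypothetical values).
AUTHORISED_UIDS = {"04a224b2c15e80", "045f1a2b3c4d5e"}

pn532 = PN532_UART(debug=False, reset=20)   # arguments as in the Waveshare examples
pn532.SAM_configuration()                   # configure the chip to talk to MIFARE cards

print("Hold a card near the HAT...")
while True:
    uid = pn532.read_passive_target(timeout=0.5)  # returns bytes, or None if no card seen
    if uid is None:
        continue
    uid_hex = "".join("{:02x}".format(b) for b in uid)
    if uid_hex in AUTHORISED_UIDS:
        print("Access granted for", uid_hex)
    else:
        print("Unknown card:", uid_hex)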

Going further

You can now read unique identifiers on RFID and NFC tokens. As we just mentioned, if you’re using the MIFARE or NTAG2 standards, you can also write data back to the card. The examples folder contains some C programs that let you do just that. The ability to read and write small amounts of data onto cards can lead to some fun projects. At the Electromagnetic Field festival in 2018, an entire game was based around finding physical locations and registering your presence with a MIFARE card. Even more is possible with smartphones, where NFC can be used to exchange data in any form.

Get HackSpace magazine 37 – Out Now!

Each month, HackSpace magazine brings you the best projects, tips, tricks and tutorials from the makersphere. You can get it from the Raspberry Pi Press online store, The Raspberry Pi store in Cambridge, or your local newsagents.

Each issue is free to download from the HackSpace magazine website.

The post Read RFID and NFC tokens with Raspberry Pi | HackSpace 37 appeared first on Raspberry Pi.
