Tag Archives: Amazon S3

Serverless Automated Cost Controls, Part1

Post Syndicated from Shankar Ramachandran original https://aws.amazon.com/blogs/compute/serverless-automated-cost-controls-part1/

This post courtesy of Shankar Ramachandran, Pubali Sen, and George Mao

In line with AWS’s continual efforts to reduce costs for customers, this series focuses on how customers can build serverless automated cost controls. This post provides an architecture blueprint and a sample implementation to prevent budget overruns.

This solution uses the following AWS products:

  • AWS Budgets – An AWS Cost Management tool that helps customers define and track budgets for AWS costs, and forecast for up to three months.
  • Amazon SNS – An AWS service that makes it easy to set up, operate, and send notifications from the cloud.
  • AWS Lambda – An AWS service that lets you run code without provisioning or managing servers.

You can fine-tune a budget for various parameters, for example filtering by service or tag. The Budgets tool lets you post notifications on an SNS topic. A Lambda function that subscribes to the SNS topic can act on the notification. Any programmatically implementable action can be taken.

The diagram below describes the architecture blueprint.

In this post, we describe how to use this blueprint with AWS Step Functions and IAM to effectively revoke the ability of a user to start new Amazon EC2 instances, after a budget amount is exceeded.

Freedom with guardrails

AWS lets you quickly spin up resources as you need them, deploying hundreds or even thousands of servers in minutes. This means you can quickly develop and roll out new applications. Teams can experiment and innovate more quickly and frequently. If an experiment fails, you can always de-provision those servers without risk.

This improved agility also brings in the need for effective cost controls. Your Finance and Accounting department must budget, monitor, and control the AWS spend. For example, this could be a budget per project. Further, Finance and Accounting must take appropriate actions if the budget for the project has been exceeded, for example. Call it “freedom with guardrails” – where Finance wants to give developers freedom, but with financial constraints.

Architecture

This section describes how to use the blueprint introduced earlier to implement a “freedom with guardrails” solution.

  1. The budget for “Project Beta” is set up in Budgets. In this example, we focus on EC2 usage and identify the instances that belong to this project by filtering on the tag Project with the value Beta. For more information, see Creating a Budget.
  2. The budget configuration also includes settings to send a notification on an SNS topic when the usage exceeds 100% of the budgeted amount. For more information, see Creating an Amazon SNS Topic for Budget Notifications.
  3. The master Lambda function receives the SNS notification.
  4. It triggers execution of a Step Functions state machine with the parameters for completing the configured action.
  5. The action Lambda function is triggered as a task in the state machine. The function interacts with IAM to effectively remove the user’s permissions to create an EC2 instance.

This decoupled modular design allows for extensibility.  New actions (serially or in parallel) can be added by simply adding new steps.

Implementing the solution

All the instructions and code needed to implement the architecture have been posted on the Serverless Automated Cost Controls GitHub repo. We recommend that you try this first in a Dev/Test environment.

This implementation description can be broken down into two parts:

  1. Create a solution stack for serverless automated cost controls.
  2. Verify the solution by testing the EC2 fleet.

To tie this back to the “freedom with guardrails” scenario, the Finance department performs a one-time implementation of the solution stack. To simulate resources for Project Beta, the developers spin up the test EC2 fleet.

Prerequisites

There are two prerequisites:

  • Make sure that you have the necessary IAM permissions. For more information, see the section titled “Required IAM permissions” in the README.
  • Define and activate a cost allocation tag with the key Project. For more information, see Using Cost Allocation Tags. It can take up to 12 hours for the tags to propagate to Budgets.

Create resources

The solution stack includes creating the following resources:

  • Three Lambda functions
  • One Step Functions state machine
  • One SNS topic
  • One IAM group
  • One IAM user
  • IAM policies as needed
  • One budget

Two of the Lambda functions were described in the previous section, to a) receive the SNS notification and b) trigger the Step Functions state machine. Another Lambda function is used to create the budget, as a custom AWS CloudFormation resource. The SNS topic connects Budgets with Lambda function A. Lambda function B is configured as a task in Step Functions. A budget for $2 is created which is filtered by Service: EC2 and Tag: Project, Beta. A test IAM group and user is created to enable you to validate this Cost Control Solution.

To create the serverless automated cost control solution stack, choose the button below. It takes few minutes to spin up the stack. You can monitor the progress in the CloudFormation console.

When you see the CREATE_COMPLETE status for the stack you had created, choose Outputs. Copy the following four values that you need later:

  • TemplateURL
  • UserName
  • SignInURL
  • Password

Verify the stack

The next step is to verify the serverless automated cost controls solution stack that you just created. To do this, spin up an EC2 fleet of t2.micro instances, representative of the resources needed for Project Beta, and tag them with Project, Beta.

  1. Browse to the SignInURL, and log in using the UserName and Password values copied on from the stack output.
  2. In the CloudFormation console, choose Create Stack.
  3. For Choose a template, select Choose an Amazon S3 template URL and paste the TemplateURL value from the preceding section. Choose Next.
  4. Give this stack a name, such as “testEc2FleetForProjectBeta”. Choose Next.
  5. On the Specify Details page, enter parameters such as the UserName and Password copied in the previous section. Choose Next.
  6. Ignore any errors related to listing IAM roles. The test user has a minimal set of permissions that is just sufficient to spin up this test stack (in line with security best practices).
  7. On the Options page, choose Next.
  8. On the Review page, choose Create. It takes a few minutes to spin up the stack, and you can monitor the progress in the CloudFormation console. 
  9. When you see the status “CREATE_COMPLETE”, open the EC2 console to verify that four t2.micro instances have been spun up, with the tag of Project, Beta.

The hourly cost for these instances depends on the region in which they are running. On the average (irrespective of the region), you can expect the aggregate cost for this EC2 fleet to exceed the set $2 budget in 48 hours.

Verify the solution

The first step is to identify the test IAM group that was created in the previous section. The group should have “projectBeta” in the name, prepended with the CloudFormation stack name and appended with an alphanumeric string. Verify that the managed policy associated is: “EC2FullAccess”, which indicates that the users in this group have unrestricted access to EC2.

There are two stages of verification for this serverless automated cost controls solution: simulating a notification and waiting for a breach.

Simulated notification

Because it takes at least a few hours for the aggregate cost of the EC2 fleet to breach the set budget, you can verify the solution by simulating the notification from Budgets.

  1. Log in to the SNS console (using your regular AWS credentials).
  2. Publish a message on the SNS topic that has “budgetNotificationTopic” in the name. The complete name is appended by the CloudFormation stack identifier.  
  3. Copy the following text as the body of the notification: “This is a mock notification”.
  4. Choose Publish.
  5. Open the IAM console to verify that the policy for the test group has been switched to “EC2ReadOnly”. This prevents users in this group from creating new instances.
  6. Verify that the test user created in the previous section cannot spin up new EC2 instances.  You can log in as the test user and try creating a new EC2 instance (via the same CloudFormation stack or the EC2 console). You should get an error message indicating that you do not have the necessary permissions.
  7. If you are proceeding to stage 2 of the verification, then you must switch the permissions back to “EC2FullAccess” for the test group, which can be done in the IAM console.

Automatic notification

Within 48 hours, the aggregate cost of the EC2 fleet spun up in the earlier section breaches the budget rule and triggers an automatic notification. This results in the permissions getting switched out, just as in the simulated notification.

Clean up

Use the following steps to delete your resources and stop incurring costs.

  1. Open the CloudFormation console.
  2. Delete the EC2 fleet by deleting the appropriate stack (for example, delete the stack named “testEc2FleetForProjectBeta”).                                               
  3. Next, delete the “costControlStack” stack.                                                                                                                                                    

Conclusion

Using Lambda in tandem with Budgets, you can build Serverless automated cost controls on AWS. Find all the resources (instructions, code) for implementing the solution discussed in this post on the Serverless Automated Cost Controls GitHub repo.

Stay tuned to this series for more tips about building serverless automated cost controls. In the next post, we discuss using smart lighting to influence developer behavior and describe a solution to encourage cost-aware development practices.

If you have questions or suggestions, please comment below.

 

How to Patch, Inspect, and Protect Microsoft Windows Workloads on AWS—Part 2

Post Syndicated from Koen van Blijderveen original https://aws.amazon.com/blogs/security/how-to-patch-inspect-and-protect-microsoft-windows-workloads-on-aws-part-2/

Yesterday in Part 1 of this blog post, I showed you how to:

  1. Launch an Amazon EC2 instance with an AWS Identity and Access Management (IAM) role, an Amazon Elastic Block Store (Amazon EBS) volume, and tags that Amazon EC2 Systems Manager (Systems Manager) and Amazon Inspector use.
  2. Configure Systems Manager to install the Amazon Inspector agent and patch your EC2 instances.

Today in Steps 3 and 4, I show you how to:

  1. Take Amazon EBS snapshots using Amazon EBS Snapshot Scheduler to automate snapshots based on instance tags.
  2. Use Amazon Inspector to check if your EC2 instances running Microsoft Windows contain any common vulnerabilities and exposures (CVEs).

To catch up on Steps 1 and 2, see yesterday’s blog post.

Step 3: Take EBS snapshots using EBS Snapshot Scheduler

In this section, I show you how to use EBS Snapshot Scheduler to take snapshots of your instances at specific intervals. To do this, I will show you how to:

  • Determine the schedule for EBS Snapshot Scheduler by providing you with best practices.
  • Deploy EBS Snapshot Scheduler by using AWS CloudFormation.
  • Tag your EC2 instances so that EBS Snapshot Scheduler backs up your instances when you want them backed up.

In addition to making sure your EC2 instances have all the available operating system patches applied on a regular schedule, you should take snapshots of the EBS storage volumes attached to your EC2 instances. Taking regular snapshots allows you to restore your data to a previous state quickly and cost effectively. With Amazon EBS snapshots, you pay only for the actual data you store, and snapshots save only the data that has changed since the previous snapshot, which minimizes your cost. You will use EBS Snapshot Scheduler to make regular snapshots of your EC2 instance. EBS Snapshot Scheduler takes advantage of other AWS services including CloudFormation, Amazon DynamoDB, and AWS Lambda to make backing up your EBS volumes simple.

Determine the schedule

As a best practice, you should back up your data frequently during the hours when your data changes the most. This reduces the amount of data you lose if you have to restore from a snapshot. For the purposes of this blog post, the data for my instances changes the most between the business hours of 9:00 A.M. to 5:00 P.M. Pacific Time. During these hours, I will make snapshots hourly to minimize data loss.

In addition to backing up frequently, another best practice is to establish a strategy for retention. This will vary based on how you need to use the snapshots. If you have compliance requirements to be able to restore for auditing, your needs may be different than if you are able to detect data corruption within three hours and simply need to restore to something that limits data loss to five hours. EBS Snapshot Scheduler enables you to specify the retention period for your snapshots. For this post, I only need to keep snapshots for recent business days. To account for weekends, I will set my retention period to three days, which is down from the default of 15 days when deploying EBS Snapshot Scheduler.

Deploy EBS Snapshot Scheduler

In Step 1 of Part 1 of this post, I showed how to configure an EC2 for Windows Server 2012 R2 instance with an EBS volume. You will use EBS Snapshot Scheduler to take eight snapshots each weekday of your EC2 instance’s EBS volumes:

  1. Navigate to the EBS Snapshot Scheduler deployment page and choose Launch Solution. This takes you to the CloudFormation console in your account. The Specify an Amazon S3 template URL option is already selected and prefilled. Choose Next on the Select Template page.
  2. On the Specify Details page, retain all default parameters except for AutoSnapshotDeletion. Set AutoSnapshotDeletion to Yes to ensure that old snapshots are periodically deleted. The default retention period is 15 days (you will specify a shorter value on your instance in the next subsection).
  3. Choose Next twice to move to the Review step, and start deployment by choosing the I acknowledge that AWS CloudFormation might create IAM resources check box and then choosing Create.

Tag your EC2 instances

EBS Snapshot Scheduler takes a few minutes to deploy. While waiting for its deployment, you can start to tag your instance to define its schedule. EBS Snapshot Scheduler reads tag values and looks for four possible custom parameters in the following order:

  • <snapshot time> – Time in 24-hour format with no colon.
  • <retention days> – The number of days (a positive integer) to retain the snapshot before deletion, if set to automatically delete snapshots.
  • <time zone> – The time zone of the times specified in <snapshot time>.
  • <active day(s)>all, weekdays, or mon, tue, wed, thu, fri, sat, and/or sun.

Because you want hourly backups on weekdays between 9:00 A.M. and 5:00 P.M. Pacific Time, you need to configure eight tags—one for each hour of the day. You will add the eight tags shown in the following table to your EC2 instance.

Tag Value
scheduler:ebs-snapshot:0900 0900;3;utc;weekdays
scheduler:ebs-snapshot:1000 1000;3;utc;weekdays
scheduler:ebs-snapshot:1100 1100;3;utc;weekdays
scheduler:ebs-snapshot:1200 1200;3;utc;weekdays
scheduler:ebs-snapshot:1300 1300;3;utc;weekdays
scheduler:ebs-snapshot:1400 1400;3;utc;weekdays
scheduler:ebs-snapshot:1500 1500;3;utc;weekdays
scheduler:ebs-snapshot:1600 1600;3;utc;weekdays

Next, you will add these tags to your instance. If you want to tag multiple instances at once, you can use Tag Editor instead. To add the tags in the preceding table to your EC2 instance:

  1. Navigate to your EC2 instance in the EC2 console and choose Tags in the navigation pane.
  2. Choose Add/Edit Tags and then choose Create Tag to add all the tags specified in the preceding table.
  3. Confirm you have added the tags by choosing Save. After adding these tags, navigate to your EC2 instance in the EC2 console. Your EC2 instance should look similar to the following screenshot.
    Screenshot of how your EC2 instance should look in the console
  4. After waiting a couple of hours, you can see snapshots beginning to populate on the Snapshots page of the EC2 console.Screenshot of snapshots beginning to populate on the Snapshots page of the EC2 console
  5. To check if EBS Snapshot Scheduler is active, you can check the CloudWatch rule that runs the Lambda function. If the clock icon shown in the following screenshot is green, the scheduler is active. If the clock icon is gray, the rule is disabled and does not run. You can enable or disable the rule by selecting it, choosing Actions, and choosing Enable or Disable. This also allows you to temporarily disable EBS Snapshot Scheduler.Screenshot of checking to see if EBS Snapshot Scheduler is active
  1. You can also monitor when EBS Snapshot Scheduler has run by choosing the name of the CloudWatch rule as shown in the previous screenshot and choosing Show metrics for the rule.Screenshot of monitoring when EBS Snapshot Scheduler has run by choosing the name of the CloudWatch rule

If you want to restore and attach an EBS volume, see Restoring an Amazon EBS Volume from a Snapshot and Attaching an Amazon EBS Volume to an Instance.

Step 4: Use Amazon Inspector

In this section, I show you how to you use Amazon Inspector to scan your EC2 instance for common vulnerabilities and exposures (CVEs) and set up Amazon SNS notifications. To do this I will show you how to:

  • Install the Amazon Inspector agent by using EC2 Run Command.
  • Set up notifications using Amazon SNS to notify you of any findings.
  • Define an Amazon Inspector target and template to define what assessment to perform on your EC2 instance.
  • Schedule Amazon Inspector assessment runs to assess your EC2 instance on a regular interval.

Amazon Inspector can help you scan your EC2 instance using prebuilt rules packages, which are built and maintained by AWS. These prebuilt rules packages tell Amazon Inspector what to scan for on the EC2 instances you select. Amazon Inspector provides the following prebuilt packages for Microsoft Windows Server 2012 R2:

  • Common Vulnerabilities and Exposures
  • Center for Internet Security Benchmarks
  • Runtime Behavior Analysis

In this post, I’m focused on how to make sure you keep your EC2 instances patched, backed up, and inspected for common vulnerabilities and exposures (CVEs). As a result, I will focus on how to use the CVE rules package and use your instance tags to identify the instances on which to run the CVE rules. If your EC2 instance is fully patched using Systems Manager, as described earlier, you should not have any findings with the CVE rules package. Regardless, as a best practice I recommend that you use Amazon Inspector as an additional layer for identifying any unexpected failures. This involves using Amazon CloudWatch to set up weekly Amazon Inspector scans, and configuring Amazon Inspector to notify you of any findings through SNS topics. By acting on the notifications you receive, you can respond quickly to any CVEs on any of your EC2 instances to help ensure that malware using known CVEs does not affect your EC2 instances. In a previous blog post, Eric Fitzgerald showed how to remediate Amazon Inspector security findings automatically.

Install the Amazon Inspector agent

To install the Amazon Inspector agent, you will use EC2 Run Command, which allows you to run any command on any of your EC2 instances that have the Systems Manager agent with an attached IAM role that allows access to Systems Manager.

  1. Choose Run Command under Systems Manager Services in the navigation pane of the EC2 console. Then choose Run a command.
    Screenshot of choosing "Run a command"
  2. To install the Amazon Inspector agent, you will use an AWS managed and provided command document that downloads and installs the agent for you on the selected EC2 instance. Choose AmazonInspector-ManageAWSAgent. To choose the target EC2 instance where this command will be run, use the tag you previously assigned to your EC2 instance, Patch Group, with a value of Windows Servers. For this example, set the concurrent installations to 1 and tell Systems Manager to stop after 5 errors.
    Screenshot of installing the Amazon Inspector agent
  3. Retain the default values for all other settings on the Run a command page and choose Run. Back on the Run Command page, you can see if the command that installed the Amazon Inspector agent executed successfully on all selected EC2 instances.
    Screenshot showing that the command that installed the Amazon Inspector agent executed successfully on all selected EC2 instances

Set up notifications using Amazon SNS

Now that you have installed the Amazon Inspector agent, you will set up an SNS topic that will notify you of any findings after an Amazon Inspector run.

To set up an SNS topic:

  1. In the AWS Management Console, choose Simple Notification Service under Messaging in the Services menu.
  2. Choose Create topic, name your topic (only alphanumeric characters, hyphens, and underscores are allowed) and give it a display name to ensure you know what this topic does (I’ve named mine Inspector). Choose Create topic.
    "Create new topic" page
  3. To allow Amazon Inspector to publish messages to your new topic, choose Other topic actions and choose Edit topic policy.
  4. For Allow these users to publish messages to this topic and Allow these users to subscribe to this topic, choose Only these AWS users. Type the following ARN for the US East (N. Virginia) Region in which you are deploying the solution in this post: arn:aws:iam::316112463485:root. This is the ARN of Amazon Inspector itself. For the ARNs of Amazon Inspector in other AWS Regions, see Setting Up an SNS Topic for Amazon Inspector Notifications (Console). Amazon Resource Names (ARNs) uniquely identify AWS resources across all of AWS.
    Screenshot of editing the topic policy
  5. To receive notifications from Amazon Inspector, subscribe to your new topic by choosing Create subscription and adding your email address. After confirming your subscription by clicking the link in the email, the topic should display your email address as a subscriber. Later, you will configure the Amazon Inspector template to publish to this topic.
    Screenshot of subscribing to the new topic

Define an Amazon Inspector target and template

Now that you have set up the notification topic by which Amazon Inspector can notify you of findings, you can create an Amazon Inspector target and template. A target defines which EC2 instances are in scope for Amazon Inspector. A template defines which packages to run, for how long, and on which target.

To create an Amazon Inspector target:

  1. Navigate to the Amazon Inspector console and choose Get started. At the time of writing this blog post, Amazon Inspector is available in the US East (N. Virginia), US West (N. California), US West (Oregon), EU (Ireland), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Sydney), and Asia Pacific (Tokyo) Regions.
  2. For Amazon Inspector to be able to collect the necessary data from your EC2 instance, you must create an IAM service role for Amazon Inspector. Amazon Inspector can create this role for you if you choose Choose or create role and confirm the role creation by choosing Allow.
    Screenshot of creating an IAM service role for Amazon Inspector
  3. Amazon Inspector also asks you to tag your EC2 instance and install the Amazon Inspector agent. You already performed these steps in Part 1 of this post, so you can proceed by choosing Next. To define the Amazon Inspector target, choose the previously used Patch Group tag with a Value of Windows Servers. This is the same tag that you used to define the targets for patching. Then choose Next.
    Screenshot of defining the Amazon Inspector target
  4. Now, define your Amazon Inspector template, and choose a name and the package you want to run. For this post, use the Common Vulnerabilities and Exposures package and choose the default duration of 1 hour. As you can see, the package has a version number, so always select the latest version of the rules package if multiple versions are available.
    Screenshot of defining an assessment template
  5. Configure Amazon Inspector to publish to your SNS topic when findings are reported. You can also choose to receive a notification of a started run, a finished run, or changes in the state of a run. For this blog post, you want to receive notifications if there are any findings. To start, choose Assessment Templates from the Amazon Inspector console and choose your newly created Amazon Inspector assessment template. Choose the icon below SNS topics (see the following screenshot).
    Screenshot of choosing an assessment template
  6. A pop-up appears in which you can choose the previously created topic and the events about which you want SNS to notify you (choose Finding reported).
    Screenshot of choosing the previously created topic and the events about which you want SNS to notify you

Schedule Amazon Inspector assessment runs

The last step in using Amazon Inspector to assess for CVEs is to schedule the Amazon Inspector template to run using Amazon CloudWatch Events. This will make sure that Amazon Inspector assesses your EC2 instance on a regular basis. To do this, you need the Amazon Inspector template ARN, which you can find under Assessment templates in the Amazon Inspector console. CloudWatch Events can run your Amazon Inspector assessment at an interval you define using a Cron-based schedule. Cron is a well-known scheduling agent that is widely used on UNIX-like operating systems and uses the following syntax for CloudWatch Events.

Image of Cron schedule

All scheduled events use a UTC time zone, and the minimum precision for schedules is one minute. For more information about scheduling CloudWatch Events, see Schedule Expressions for Rules.

To create the CloudWatch Events rule:

  1. Navigate to the CloudWatch console, choose Events, and choose Create rule.
    Screenshot of starting to create a rule in the CloudWatch Events console
  2. On the next page, specify if you want to invoke your rule based on an event pattern or a schedule. For this blog post, you will select a schedule based on a Cron expression.
  3. You can schedule the Amazon Inspector assessment any time you want using the Cron expression, or you can use the Cron expression I used in the following screenshot, which will run the Amazon Inspector assessment every Sunday at 10:00 P.M. GMT.
    Screenshot of scheduling an Amazon Inspector assessment with a Cron expression
  4. Choose Add target and choose Inspector assessment template from the drop-down menu. Paste the ARN of the Amazon Inspector template you previously created in the Amazon Inspector console in the Assessment template box and choose Create a new role for this specific resource. This new role is necessary so that CloudWatch Events has the necessary permissions to start the Amazon Inspector assessment. CloudWatch Events will automatically create the new role and grant the minimum set of permissions needed to run the Amazon Inspector assessment. To proceed, choose Configure details.
    Screenshot of adding a target
  5. Next, give your rule a name and a description. I suggest using a name that describes what the rule does, as shown in the following screenshot.
  6. Finish the wizard by choosing Create rule. The rule should appear in the Events – Rules section of the CloudWatch console.
    Screenshot of completing the creation of the rule
  7. To confirm your CloudWatch Events rule works, wait for the next time your CloudWatch Events rule is scheduled to run. For testing purposes, you can choose your CloudWatch Events rule and choose Edit to change the schedule to run it sooner than scheduled.
    Screenshot of confirming the CloudWatch Events rule works
  8. Now navigate to the Amazon Inspector console to confirm the launch of your first assessment run. The Start time column shows you the time each assessment started and the Status column the status of your assessment. In the following screenshot, you can see Amazon Inspector is busy Collecting data from the selected assessment targets.
    Screenshot of confirming the launch of the first assessment run

You have concluded the last step of this blog post by setting up a regular scan of your EC2 instance with Amazon Inspector and a notification that will let you know if your EC2 instance is vulnerable to any known CVEs. In a previous Security Blog post, Eric Fitzgerald explained How to Remediate Amazon Inspector Security Findings Automatically. Although that blog post is for Linux-based EC2 instances, the post shows that you can learn about Amazon Inspector findings in other ways than email alerts.

Conclusion

In this two-part blog post, I showed how to make sure you keep your EC2 instances up to date with patching, how to back up your instances with snapshots, and how to monitor your instances for CVEs. Collectively these measures help to protect your instances against common attack vectors that attempt to exploit known vulnerabilities. In Part 1, I showed how to configure your EC2 instances to make it easy to use Systems Manager, EBS Snapshot Scheduler, and Amazon Inspector. I also showed how to use Systems Manager to schedule automatic patches to keep your instances current in a timely fashion. In Part 2, I showed you how to take regular snapshots of your data by using EBS Snapshot Scheduler and how to use Amazon Inspector to check if your EC2 instances running Microsoft Windows contain any common vulnerabilities and exposures (CVEs).

If you have comments about today’s or yesterday’s post, submit them in the “Comments” section below. If you have questions about or issues implementing any part of this solution, start a new thread on the Amazon EC2 forum or the Amazon Inspector forum, or contact AWS Support.

– Koen

How to Enable Caching for AWS CodeBuild

Post Syndicated from Karthik Thirugnanasambandam original https://aws.amazon.com/blogs/devops/how-to-enable-caching-for-aws-codebuild/

AWS CodeBuild is a fully managed build service. There are no servers to provision and scale, or software to install, configure, and operate. You just specify the location of your source code, choose your build settings, and CodeBuild runs build scripts for compiling, testing, and packaging your code.

A typical application build process includes phases like preparing the environment, updating the configuration, downloading dependencies, running unit tests, and finally, packaging the built artifact.

Downloading dependencies is a critical phase in the build process. These dependent files can range in size from a few KBs to multiple MBs. Because most of the dependent files do not change frequently between builds, you can noticeably reduce your build time by caching dependencies.

In this post, I will show you how to enable caching for AWS CodeBuild.

Requirements

  • Create an Amazon S3 bucket for storing cache archives (You can use existing s3 bucket as well).
  • Create a GitHub account (if you don’t have one).

Create a sample build project:

1. Open the AWS CodeBuild console at https://console.aws.amazon.com/codebuild/.

2. If a welcome page is displayed, choose Get started.

If a welcome page is not displayed, on the navigation pane, choose Build projects, and then choose Create project.

3. On the Configure your project page, for Project name, type a name for this build project. Build project names must be unique across each AWS account.

4. In Source: What to build, for Source provider, choose GitHub.

5. In Environment: How to build, for Environment image, select Use an image managed by AWS CodeBuild.

  • For Operating system, choose Ubuntu.
  • For Runtime, choose Java.
  • For Version,  choose aws/codebuild/java:openjdk-8.
  • For Build specification, select Insert build commands.

Note: The build specification file (buildspec.yml) can be configured in two ways. You can package it along with your source root directory, or you can override it by using a project environment configuration. In this example, I will use the override option and will use the console editor to specify the build specification.

6. Under Build commands, click Switch to editor to enter the build specification.

Copy the following text.

version: 0.2

phases:
  build:
    commands:
      - mvn install
      
cache:
  paths:
    - '/root/.m2/**/*'

Note: The cache section in the build specification instructs AWS CodeBuild about the paths to be cached. Like the artifacts section, the cache paths are relative to $CODEBUILD_SRC_DIR and specify the directories to be cached. In this example, Maven stores the downloaded dependencies to the /root/.m2/ folder, but other tools use different folders. For example, pip uses the /root/.cache/pip folder, and Gradle uses the /root/.gradle/caches folder. You might need to configure the cache paths based on your language platform.

7. In Artifacts: Where to put the artifacts from this build project:

  • For Type, choose No artifacts.

8. In Cache:

  • For Type, choose Amazon S3.
  • For Bucket, choose your S3 bucket.
  • For Path prefix, type cache/archives/

9. In Service role, the Create a service role in your account option will display a default role name.  You can accept the default name or type your own.

If you already have an AWS CodeBuild service role, choose Choose an existing service role from your account.

10. Choose Continue.

11. On the Review page, to run a build, choose Save and build.

Review build and cache behavior:

Let us review our first build for the project.

In the first run, where no cache exists, overall build time would look something like below (notice the time for DOWNLOAD_SOURCE, BUILD and POST_BUILD):

If you check the build logs, you will see log entries for dependency downloads. The dependencies are downloaded directly from configured external repositories. At the end of the log, you will see an entry for the cache uploaded to your S3 bucket.

Let’s review the S3 bucket for the cached archive. You’ll see the cache from our first successful build is uploaded to the configured S3 path.

Let’s try another build with the same CodeBuild project. This time the build should pick up the dependencies from the cache.

In the second run, there was a cache hit (cache was generated from the first run):

You’ll notice a few things:

  1. DOWNLOAD_SOURCE took slightly longer. Because, in addition to the source code, this time the build also downloaded the cache from user’s s3 bucket.
  2. BUILD time was faster. As the dependencies didn’t need to get downloaded, but were reused from cache.
  3. POST_BUILD took slightly longer, but was relatively the same.

Overall, build duration was improved with cache.

Best practices for cache

  • By default, the cache archive is encrypted on the server side with the customer’s artifact KMS key.
  • You can expire the cache by manually removing the cache archive from S3. Alternatively, you can expire the cache by using an S3 lifecycle policy.
  • You can override cache behavior by updating the project. You can use the AWS CodeBuild the AWS CodeBuild console, AWS CLI, or AWS SDKs to update the project. You can also invalidate cache setting by using the new InvalidateProjectCache API. This API forces a new InvalidationKey to be generated, ensuring that future builds receive an empty cache. This API does not remove the existing cache, because this could cause inconsistencies with builds currently in flight.
  • The cache can be enabled for any folders in the build environment, but we recommend you only cache dependencies/files that will not change frequently between builds. Also, to avoid unexpected application behavior, don’t cache configuration and sensitive information.

Conclusion

In this blog post, I showed you how to enable and configure cache setting for AWS CodeBuild. As you see, this can save considerable build time. It also improves resiliency by avoiding external network connections to an artifact repository.

I hope you found this post useful. Feel free to leave your feedback or suggestions in the comments.

Access Resources in a VPC from AWS CodeBuild Builds

Post Syndicated from John Pignata original https://aws.amazon.com/blogs/devops/access-resources-in-a-vpc-from-aws-codebuild-builds/

John Pignata, Startup Solutions Architect, Amazon Web Services

In this blog post we’re going to discuss a new AWS CodeBuild feature that is available starting today. CodeBuild builds can now access resources in a VPC directly without these resources being exposed to the public internet. These resources include Amazon Relational Database Service (Amazon RDS) databases, Amazon ElastiCache clusters, internal services running on Amazon Elastic Compute Cloud (Amazon EC2), and Amazon EC2 Container Service (Amazon ECS), or any service endpoints that are only reachable from within a specific VPC.

CodeBuild is a fully managed build service that compiles source code, runs tests, and produces software packages that are ready to deploy. As part of the build process, developers often require access to resources that should be isolated from the public Internet. Now CodeBuild builds can be optionally configured to have VPC connectivity and access these resources directly.

Accessing Resources in a VPC

You can configure builds to have access to a VPC when you create a CodeBuild project or you can update an existing CodeBuild project with VPC configuration attributes. Here’s how it looks in the console:

 

To configure VPC connectivity: select a VPC, one or more subnets within that VPC, and one or more VPC security groups that CodeBuild should apply when attaching to your VPC. Once configured, commands running as part of your build will be able to access resources in your VPC without transiting across the public Internet.

Use Cases

The availability of VPC connectivity from CodeBuild builds unlocks many potential uses. For example, you can:

  • Run integration tests from your build against data in an Amazon RDS instance that’s isolated on a private subnet.
  • Query data in an ElastiCache cluster directly from tests.
  • Interact with internal web services hosted on Amazon EC2, Amazon ECS, or services that use internal Elastic Load Balancing.
  • Retrieve dependencies from self-hosted, internal artifact repositories such as PyPI for Python, Maven for Java, npm for Node.js, and so on.
  • Access objects in an Amazon S3 bucket configured to allow access only through a VPC endpoint.
  • Query external web services that require fixed IP addresses through the Elastic IP address of the NAT gateway associated with your subnet(s).

… and more! Your builds can now access any resource that’s hosted in your VPC without any compromise on network isolation.

Internet Connectivity

CodeBuild requires access to resources on the public Internet to successfully execute builds. At a minimum, it must be able to reach your source repository system (such as AWS CodeCommit, GitHub, Bitbucket), Amazon Simple Storage Service (Amazon S3) to deliver build artifacts, and Amazon CloudWatch Logs to stream logs from the build process. The interface attached to your VPC will not be assigned a public IP address so to enable Internet access from your builds, you will need to set up a managed NAT Gateway or NAT instance for the subnets you configure. You must also ensure your security groups allow outbound access to these services.

IP Address Space

Each running build will be assigned an IP address from one of the subnets in your VPC that you designate for CodeBuild to use. As CodeBuild scales to meet your build volume, ensure that you select subnets with enough address space to accommodate your expected number of concurrent builds.

Service Role Permissions

CodeBuild requires new permissions in order to manage network interfaces on your VPCs. If you create a service role for your new projects, these permissions will be included in that role’s policy automatically. For existing service roles, you can edit the policy document to include the additional actions. For the full policy document to apply to your service role, see Advanced Setup in the CodeBuild documentation.

For more information, see VPC Support in the CodeBuild documentation. We hope you find the ability to access internal resources on a VPC useful in your build processes! If you have any questions or feedback, feel free to reach out to us through the AWS CodeBuild forum or leave a comment!

Amazon QuickSight Update – Geospatial Visualization, Private VPC Access, and More

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-quicksight-update-geospatial-visualization-private-vpc-access-and-more/

We don’t often recognize or celebrate anniversaries at AWS. With nearly 100 services on our list, we’d be eating cake and drinking champagne several times a week. While that might sound like fun, we’d rather spend our working hours listening to customers and innovating. With that said, Amazon QuickSight has now been generally available for a little over a year and I would like to give you a quick update!

QuickSight in Action
Today, tens of thousands of customers (from startups to enterprises, in industries as varied as transportation, legal, mining, and healthcare) are using QuickSight to analyze and report on their business data.

Here are a couple of examples:

Gemini provides legal evidence procurement for California attorneys who represent injured workers. They have gone from creating custom reports and running one-off queries to creating and sharing dynamic QuickSight dashboards with drill-downs and filtering. QuickSight is used to track sales pipeline, measure order throughput, and to locate bottlenecks in the order processing pipeline.

Jivochat provides a real-time messaging platform to connect visitors to website owners. QuickSight lets them create and share interactive dashboards while also providing access to the underlying datasets. This has allowed them to move beyond the sharing of static spreadsheets, ensuring that everyone is looking at the same and is empowered to make timely decisions based on current data.

Transfix is a tech-powered freight marketplace that matches loads and increases visibility into logistics for Fortune 500 shippers in retail, food and beverage, manufacturing, and other industries. QuickSight has made analytics accessible to both BI engineers and non-technical business users. They scrutinize key business and operational metrics including shipping routes, carrier efficient, and process automation.

Looking Back / Looking Ahead
The feedback on QuickSight has been incredibly helpful. Customers tell us that their employees are using QuickSight to connect to their data, perform analytics, and make high-velocity, data-driven decisions, all without setting up or running their own BI infrastructure. We love all of the feedback that we get, and use it to drive our roadmap, leading to the introduction of over 40 new features in just a year. Here’s a summary:

Looking forward, we are watching an interesting trend develop within our customer base. As these customers take a close look at how they analyze and report on data, they are realizing that a serverless approach offers some tangible benefits. They use Amazon Simple Storage Service (S3) as a data lake and query it using a combination of QuickSight and Amazon Athena, giving them agility and flexibility without static infrastructure. They also make great use of QuickSight’s dashboards feature, monitoring business results and operational metrics, then sharing their insights with hundreds of users. You can read Building a Serverless Analytics Solution for Cleaner Cities and review Serverless Big Data Analytics using Amazon Athena and Amazon QuickSight if you are interested in this approach.

New Features and Enhancements
We’re still doing our best to listen and to learn, and to make sure that QuickSight continues to meet your needs. I’m happy to announce that we are making seven big additions today:

Geospatial Visualization – You can now create geospatial visuals on geographical data sets.

Private VPC Access – You can now sign up to access a preview of a new feature that allows you to securely connect to data within VPCs or on-premises, without the need for public endpoints.

Flat Table Support – In addition to pivot tables, you can now use flat tables for tabular reporting. To learn more, read about Using Tabular Reports.

Calculated SPICE Fields – You can now perform run-time calculations on SPICE data as part of your analysis. Read Adding a Calculated Field to an Analysis for more information.

Wide Table Support – You can now use tables with up to 1000 columns.

Other Buckets – You can summarize the long tail of high-cardinality data into buckets, as described in Working with Visual Types in Amazon QuickSight.

HIPAA Compliance – You can now run HIPAA-compliant workloads on QuickSight.

Geospatial Visualization
Everyone seems to want this feature! You can now take data that contains a geographic identifier (country, city, state, or zip code) and create beautiful visualizations with just a few clicks. QuickSight will geocode the identifier that you supply, and can also accept lat/long map coordinates. You can use this feature to visualize sales by state, map stores to shipping destinations, and so forth. Here’s a sample visualization:

To learn more about this feature, read Using Geospatial Charts (Maps), and Adding Geospatial Data.

Private VPC Access Preview
If you have data in AWS (perhaps in Amazon Redshift, Amazon Relational Database Service (RDS), or on EC2) or on-premises in Teradata or SQL Server on servers without public connectivity, this feature is for you. Private VPC Access for QuickSight uses an Elastic Network Interface (ENI) for secure, private communication with data sources in a VPC. It also allows you to use AWS Direct Connect to create a secure, private link with your on-premises resources. Here’s what it looks like:

If you are ready to join the preview, you can sign up today.

Jeff;

 

Terabytes Of US Military Social Media Spying S3 Data Exposed

Post Syndicated from Darknet original https://www.darknet.org.uk/2017/11/terabytes-us-military-social-media-spying-s3-data-exposed/?utm_source=rss&utm_medium=social&utm_campaign=darknetfeed

Terabytes Of US Military Social Media Spying S3 Data Exposed

Once again the old, default Amazon AWS S3 settings are catching people out, this time the US Military has left terabytes of social media spying S3 data exposed to everyone for years.

It’s not long ago since a Time Warner vendor and their sloppy AWS S3 config leaked over 4 million customer records and left S3 data exposed, and that’s not the only case – there’s plenty more.

Three misconfigured AWS S3 buckets have been discovered wide open on the public internet containing “dozens of terabytes” of social media posts and similar pages – all scraped from around the world by the US military to identify and profile persons of interest.

Read the rest of Terabytes Of US Military Social Media Spying S3 Data Exposed now! Only available at Darknet.

Use the New Visual Editor to Create and Modify Your AWS IAM Policies

Post Syndicated from Joy Chatterjee original https://aws.amazon.com/blogs/security/use-the-new-visual-editor-to-create-and-modify-your-aws-iam-policies/

Today, AWS Identity and Access Management (IAM) made it easier for you to create and modify your IAM policies by using a point-and-click visual editor in the IAM console. The new visual editor guides you through granting permissions for IAM policies without requiring you to write policies in JSON (although you can still author and edit policies in JSON, if you prefer). This update to the IAM console makes it easier to grant least privilege for the AWS service actions you select by listing all the supported resource types and request conditions you can specify. Policy summaries identify unrecognized services and actions and permissions errors when you import existing policies, and now you can use the visual editor to correct them. In this blog post, I give a brief overview of policy concepts and show you how to create a new policy by using the visual editor.

IAM policy concepts

You use IAM policies to define permissions for your IAM entities (groups, users, and roles). Policies are composed of one or more statements that include the following elements:

  • Effect: Determines if a policy statement allows or explicitly denies access.
  • Action: Defines AWS service actions in a policy (these typically map to individual AWS APIs.)
  • Resource: Defines the AWS resources to which actions can apply. The defined resources must be supported by the actions defined in the Action element for permissions to be granted.
  • Condition: Defines when a permission is allowed or denied. The conditions defined in a policy must be supported by the actions defined in the Action element for the permission to be granted.

To grant permissions, you attach policies to groups, users, or roles. Now that I have reviewed the elements of a policy, I will demonstrate how to create an IAM policy with the visual editor.

How to create an IAM policy with the visual editor

Let’s say my human resources (HR) recruiter, Casey, needs to review files located in an Amazon S3 bucket for all the product manager (PM) candidates our HR team has interviewed in 2017. To grant this access, I will create and attach a policy to Casey that grants list and limited read access to all folders that begin with PM_Candidate in the pmrecruiting2017 S3 bucket. To create this new policy, I navigate to the Policies page in the IAM console and choose Create policy. Note that I could also use the visual editor to modify existing policies by choosing Import existing policy; however, for Casey, I will create a new policy.

Image of the "Create policy" button

On the Visual editor tab, I see a section that includes Service, Actions, Resources, and Request Conditions.

Image of the "Visual editor" tab

Select a service

To grant S3 permissions, I choose Select a service, type S3 in the search box, and choose S3 from the list.

Image of choosing "S3"

Select actions

After selecting S3, I can define actions for Casey by using one of four options:

  1. Filter actions in the service by using the search box.
  2. Type actions by choosing Add action next to Manual actions. For example, I can type List* to grant all S3 actions that begin with List*.
  3. Choose access levels from List, Read, Write, Permissions management, and Tagging.
  4. Select individual actions by expanding each access level.

In the following screenshot, I choose options 3 and 4, and choose List and s3:GetObject from the Read access level.

Screenshot of options in the "Select actions" section

We introduced access levels when we launched policy summaries earlier in 2017. Access levels give you a way to categorize actions and help you understand the permissions in a policy. The following table gives you a quick overview of access levels.

Access level Description Example actions
List Actions that allow you to see a list of resources s3:ListBucket, s3:ListAllMyBuckets
Read Actions that allow you to read the content in resources s3:GetObject, s3:GetBucketTagging
Write Actions that allow you to create, delete, or modify resources s3:PutObject, s3:DeleteBucket
Permissions management Actions that allow you to grant or modify permissions to resources s3:PutBucketPolicy
Tagging Actions that allow you to create, delete, or modify tags
Note: Some services support authorization based on tags.
s3:PutBucketTagging, s3:DeleteObjectVersionTagging

Note: By default, all actions you choose will be allowed. To deny actions, choose Switch to deny permissions in the upper right corner of the Actions section.

As shown in the preceding screenshot, if I choose the question mark icon next to GetObject, I can see the description and supported resources and conditions for this action, which can help me scope permissions.

Screenshot of GetObject

The visual editor makes it easy to decide which actions I should select by providing in an integrated documentation panel the action description, supported resources or conditions, and any required actions for every AWS service action. Some AWS service actions have required actions, which are other AWS service actions that need to be granted in a policy for an action to run. For example, the AWS Directory Service action, ds:CreateDirectory, requires seven Amazon EC2 actions to be able to create a Directory Service directory.

Choose resources

In the Resources section, I can choose the resources on which actions can be taken. I choose Resources and see two ways that I can define or select resources:

  1. Define specific resources
  2. Select all resources

Specific is the default option, and only the applicable resources are presented based on the service and actions I chose previously. Because I want to grant Casey access to some objects in a specific bucket, I choose Specific and choose Add ARN under bucket.

Screenshot of Resources section

In the pop-up, I type the bucket name, pmrecruiting2017, and choose Add to specify the S3 bucket resource.

Screenshot of specifying the S3 bucket resource

To specify the objects, I choose Add ARN under object and grant Casey access to all objects starting with PM_Candidate in the pmrecruiting2017 bucket. The visual editor helps you build your Amazon Resource Name (ARN) and validates that it is structured correctly. For AWS services that are AWS Region specific, the visual editor prompts for AWS Region and account number.

The visual editor displays all applicable resources in the Resources section based on the actions I choose. For Casey, I defined an S3 bucket and object in the Resources section. In this example, when the visual editor creates the policy, it creates three statements. The first statement includes all actions that require a wildcard (*) for the Resource element because this action does not support resource-level permissions. The second statement includes all S3 actions that support an S3 bucket. The third statement includes all actions that support an S3 object resource. The visual editor generates policy syntax for you based on supported permissions in AWS services.

Specify request conditions

For additional security, I specify a condition to restrict access to the S3 bucket from inside our internal network. To do this, I choose Specify request conditions in the Request Conditions section, and choose the Source IP check box. A condition is composed of a condition key, an operator, and a value. I choose aws:SourceIp for my Key so that I can control from where the S3 files can be accessed. By default, IpAddress is the Operator, and I set the Value to my internal network.

Screenshot of "Request conditions" section

To add other conditions, choose Add condition and choose Save changes after choosing the key, operator, and value.

After specifying my request condition, I am now able to review all the elements of these S3 permissions.

Screenshot of S3 permissions

Next, I can choose to grant permissions for another service by choosing Add new permissions (bottom left of preceding screenshot), or I can review and create this new policy. Because I have granted all the permissions Casey needs, I choose Review policy. I type a name and a description, and I review the policy summary before choosing Create policy. 

Now that I have created the policy, I attach it to Casey by choosing the Attached entities tab of the policy I just created. I choose Attach and choose Casey. I then choose Attach policy. Casey should now be able to access the interview files she needs to review.

Summary

The visual editor makes it easier to create and modify your IAM policies by guiding you through each element of the policy. The visual editor helps you define resources and request conditions so that you can grant least privilege and generate policies. To start using the visual editor, sign in to the IAM console, navigate to the Policies page, and choose Create policy.

If you have comments about this post, submit them in the “Comments” section below. If you have questions about or suggestions for this solution, start a new thread on the IAM forum.

– Joy

Event-Driven Computing with Amazon SNS and AWS Compute, Storage, Database, and Networking Services

Post Syndicated from Christie Gifrin original https://aws.amazon.com/blogs/compute/event-driven-computing-with-amazon-sns-compute-storage-database-and-networking-services/

Contributed by Otavio Ferreira, Manager, Software Development, AWS Messaging

Like other developers around the world, you may be tackling increasingly complex business problems. A key success factor, in that case, is the ability to break down a large project scope into smaller, more manageable components. A service-oriented architecture guides you toward designing systems as a collection of loosely coupled, independently scaled, and highly reusable services. Microservices take this even further. To improve performance and scalability, they promote fine-grained interfaces and lightweight protocols.

However, the communication among isolated microservices can be challenging. Services are often deployed onto independent servers and don’t share any compute or storage resources. Also, you should avoid hard dependencies among microservices, to preserve maintainability and reusability.

If you apply the pub/sub design pattern, you can effortlessly decouple and independently scale out your microservices and serverless architectures. A pub/sub messaging service, such as Amazon SNS, promotes event-driven computing that statically decouples event publishers from subscribers, while dynamically allowing for the exchange of messages between them. An event-driven architecture also introduces the responsiveness needed to deal with complex problems, which are often unpredictable and asynchronous.

What is event-driven computing?

Given the context of microservices, event-driven computing is a model in which subscriber services automatically perform work in response to events triggered by publisher services. This paradigm can be applied to automate workflows while decoupling the services that collectively and independently work to fulfil these workflows. Amazon SNS is an event-driven computing hub, in the AWS Cloud, that has native integration with several AWS publisher and subscriber services.

Which AWS services publish events to SNS natively?

Several AWS services have been integrated as SNS publishers and, therefore, can natively trigger event-driven computing for a variety of use cases. In this post, I specifically cover AWS compute, storage, database, and networking services, as depicted below.

Compute services

  • Auto Scaling: Helps you ensure that you have the correct number of Amazon EC2 instances available to handle the load for your application. You can configure Auto Scaling lifecycle hooks to trigger events, as Auto Scaling resizes your EC2 cluster.As an example, you may want to warm up the local cache store on newly launched EC2 instances, and also download log files from other EC2 instances that are about to be terminated. To make this happen, set an SNS topic as your Auto Scaling group’s notification target, then subscribe two Lambda functions to this SNS topic. The first function is responsible for handling scale-out events (to warm up cache upon provisioning), whereas the second is in charge of handling scale-in events (to download logs upon termination).

  • AWS Elastic Beanstalk: An easy-to-use service for deploying and scaling web applications and web services developed in a number of programming languages. You can configure event notifications for your Elastic Beanstalk environment so that notable events can be automatically published to an SNS topic, then pushed to topic subscribers.As an example, you may use this event-driven architecture to coordinate your continuous integration pipeline (such as Jenkins CI). That way, whenever an environment is created, Elastic Beanstalk publishes this event to an SNS topic, which triggers a subscribing Lambda function, which then kicks off a CI job against your newly created Elastic Beanstalk environment.

  • Elastic Load Balancing: Automatically distributes incoming application traffic across Amazon EC2 instances, containers, or other resources identified by IP addresses.You can configure CloudWatch alarms on Elastic Load Balancing metrics, to automate the handling of events derived from Classic Load Balancers. As an example, you may leverage this event-driven design to automate latency profiling in an Amazon ECS cluster behind a Classic Load Balancer. In this example, whenever your ECS cluster breaches your load balancer latency threshold, an event is posted by CloudWatch to an SNS topic, which then triggers a subscribing Lambda function. This function runs a task on your ECS cluster to trigger a latency profiling tool, hosted on the cluster itself. This can enhance your latency troubleshooting exercise by making it timely.

Storage services

  • Amazon S3: Object storage built to store and retrieve any amount of data.You can enable S3 event notifications, and automatically get them posted to SNS topics, to automate a variety of workflows. For instance, imagine that you have an S3 bucket to store incoming resumes from candidates, and a fleet of EC2 instances to encode these resumes from their original format (such as Word or text) into a portable format (such as PDF).In this example, whenever new files are uploaded to your input bucket, S3 publishes these events to an SNS topic, which in turn pushes these messages into subscribing SQS queues. Then, encoding workers running on EC2 instances poll these messages from the SQS queues; retrieve the original files from the input S3 bucket; encode them into PDF; and finally store them in an output S3 bucket.

  • Amazon EFS: Provides simple and scalable file storage, for use with Amazon EC2 instances, in the AWS Cloud.You can configure CloudWatch alarms on EFS metrics, to automate the management of your EFS systems. For example, consider a highly parallelized genomics analysis application that runs against an EFS system. By default, this file system is instantiated on the “General Purpose” performance mode. Although this performance mode allows for lower latency, it might eventually impose a scaling bottleneck. Therefore, you may leverage an event-driven design to handle it automatically.Basically, as soon as the EFS metric “Percent I/O Limit” breaches 95%, CloudWatch could post this event to an SNS topic, which in turn would push this message into a subscribing Lambda function. This function automatically creates a new file system, this time on the “Max I/O” performance mode, then switches the genomics analysis application to this new file system. As a result, your application starts experiencing higher I/O throughput rates.

  • Amazon Glacier: A secure, durable, and low-cost cloud storage service for data archiving and long-term backup.You can set a notification configuration on an Amazon Glacier vault so that when a job completes, a message is published to an SNS topic. Retrieving an archive from Amazon Glacier is a two-step asynchronous operation, in which you first initiate a job, and then download the output after the job completes. Therefore, SNS helps you eliminate polling your Amazon Glacier vault to check whether your job has been completed, or not. As usual, you may subscribe SQS queues, Lambda functions, and HTTP endpoints to your SNS topic, to be notified when your Amazon Glacier job is done.

  • AWS Snowball: A petabyte-scale data transport solution that uses secure appliances to transfer large amounts of data.You can leverage Snowball notifications to automate workflows related to importing data into and exporting data from AWS. More specifically, whenever your Snowball job status changes, Snowball can publish this event to an SNS topic, which in turn can broadcast the event to all its subscribers.As an example, imagine a Geographic Information System (GIS) that distributes high-resolution satellite images to users via Web browser. In this example, the GIS vendor could capture up to 80 TB of satellite images; create a Snowball job to import these files from an on-premises system to an S3 bucket; and provide an SNS topic ARN to be notified upon job status changes in Snowball. After Snowball changes the job status from “Importing” to “Completed”, Snowball publishes this event to the specified SNS topic, which delivers this message to a subscribing Lambda function, which finally creates a CloudFront web distribution for the target S3 bucket, to serve the images to end users.

Database services

  • Amazon RDS: Makes it easy to set up, operate, and scale a relational database in the cloud.RDS leverages SNS to broadcast notifications when RDS events occur. As usual, these notifications can be delivered via any protocol supported by SNS, including SQS queues, Lambda functions, and HTTP endpoints.As an example, imagine that you own a social network website that has experienced organic growth, and needs to scale its compute and database resources on demand. In this case, you could provide an SNS topic to listen to RDS DB instance events. When the “Low Storage” event is published to the topic, SNS pushes this event to a subscribing Lambda function, which in turn leverages the RDS API to increase the storage capacity allocated to your DB instance. The provisioning itself takes place within the specified DB maintenance window.

  • Amazon ElastiCache: A web service that makes it easy to deploy, operate, and scale an in-memory data store or cache in the cloud.ElastiCache can publish messages using Amazon SNS when significant events happen on your cache cluster. This feature can be used to refresh the list of servers on client machines connected to individual cache node endpoints of a cache cluster. For instance, an ecommerce website fetches product details from a cache cluster, with the goal of offloading a relational database and speeding up page load times. Ideally, you want to make sure that each web server always has an updated list of cache servers to which to connect.To automate this node discovery process, you can get your ElastiCache cluster to publish events to an SNS topic. Thus, when ElastiCache event “AddCacheNodeComplete” is published, your topic then pushes this event to all subscribing HTTP endpoints that serve your ecommerce website, so that these HTTP servers can update their list of cache nodes.

  • Amazon Redshift: A fully managed data warehouse that makes it simple to analyze data using standard SQL and BI (Business Intelligence) tools.Amazon Redshift uses SNS to broadcast relevant events so that data warehouse workflows can be automated. As an example, imagine a news website that sends clickstream data to a Kinesis Firehose stream, which then loads the data into Amazon Redshift, so that popular news and reading preferences might be surfaced on a BI tool. At some point though, this Amazon Redshift cluster might need to be resized, and the cluster enters a ready-only mode. Hence, this Amazon Redshift event is published to an SNS topic, which delivers this event to a subscribing Lambda function, which finally deletes the corresponding Kinesis Firehose delivery stream, so that clickstream data uploads can be put on hold.At a later point, after Amazon Redshift publishes the event that the maintenance window has been closed, SNS notifies a subscribing Lambda function accordingly, so that this function can re-create the Kinesis Firehose delivery stream, and resume clickstream data uploads to Amazon Redshift.

  • AWS DMS: Helps you migrate databases to AWS quickly and securely. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database.DMS also uses SNS to provide notifications when DMS events occur, which can automate database migration workflows. As an example, you might create data replication tasks to migrate an on-premises MS SQL database, composed of multiple tables, to MySQL. Thus, if replication tasks fail due to incompatible data encoding in the source tables, these events can be published to an SNS topic, which can push these messages into a subscribing SQS queue. Then, encoders running on EC2 can poll these messages from the SQS queue, encode the source tables into a compatible character set, and restart the corresponding replication tasks in DMS. This is an event-driven approach to a self-healing database migration process.

Networking services

  • Amazon Route 53: A highly available and scalable cloud-based DNS (Domain Name System). Route 53 health checks monitor the health and performance of your web applications, web servers, and other resources.You can set CloudWatch alarms and get automated Amazon SNS notifications when the status of your Route 53 health check changes. As an example, imagine an online payment gateway that reports the health of its platform to merchants worldwide, via a status page. This page is hosted on EC2 and fetches platform health data from DynamoDB. In this case, you could configure a CloudWatch alarm for your Route 53 health check, so that when the alarm threshold is breached, and the payment gateway is no longer considered healthy, then CloudWatch publishes this event to an SNS topic, which pushes this message to a subscribing Lambda function, which finally updates the DynamoDB table that populates the status page. This event-driven approach avoids any kind of manual update to the status page visited by merchants.

  • AWS Direct Connect (AWS DX): Makes it easy to establish a dedicated network connection from your premises to AWS, which can reduce your network costs, increase bandwidth throughput, and provide a more consistent network experience than Internet-based connections.You can monitor physical DX connections using CloudWatch alarms, and send SNS messages when alarms change their status. As an example, when a DX connection state shifts to 0 (zero), indicating that the connection is down, this event can be published to an SNS topic, which can fan out this message to impacted servers through HTTP endpoints, so that they might reroute their traffic through a different connection instead. This is an event-driven approach to connectivity resilience.

More event-driven computing on AWS

In addition to SNS, event-driven computing is also addressed by Amazon CloudWatch Events, which delivers a near real-time stream of system events that describe changes in AWS resources. With CloudWatch Events, you can route each event type to one or more targets, including:

Many AWS services publish events to CloudWatch. As an example, you can get CloudWatch Events to capture events on your ETL (Extract, Transform, Load) jobs running on AWS Glue and push failed ones to an SQS queue, so that you can retry them later.

Conclusion

Amazon SNS is a pub/sub messaging service that can be used as an event-driven computing hub to AWS customers worldwide. By capturing events natively triggered by AWS services, such as EC2, S3 and RDS, you can automate and optimize all kinds of workflows, namely scaling, testing, encoding, profiling, broadcasting, discovery, failover, and much more. Business use cases presented in this post ranged from recruiting websites, to scientific research, geographic systems, social networks, retail websites, and news portals.

Start now by visiting Amazon SNS in the AWS Management Console, or by trying the AWS 10-Minute Tutorial, Send Fan-out Event Notifications with Amazon SNS and Amazon SQS.

 

Updated AWS SOC Reports Are Now Available with 19 Additional Services in Scope

Post Syndicated from Chad Woolf original https://aws.amazon.com/blogs/security/updated-aws-soc-reports-are-now-available-with-19-additional-services-in-scope/

AICPA SOC logo

Newly updated reports are available for AWS System and Organization Control Report 1 (SOC 1), formerly called AWS Service Organization Control Report 1, and AWS SOC 2: Security, Availability, & Confidentiality Report. You can download both reports for free and on demand in the AWS Management Console through AWS Artifact. The updated AWS SOC 3: Security, Availability, & Confidentiality Report also was just released. All three reports cover April 1, 2017, through September 30, 2017.

With the addition of the following 19 services, AWS now supports 51 SOC-compliant AWS services and is committed to increasing the number:

  • Amazon API Gateway
  • Amazon Cloud Directory
  • Amazon CloudFront
  • Amazon Cognito
  • Amazon Connect
  • AWS Directory Service for Microsoft Active Directory
  • Amazon EC2 Container Registry
  • Amazon EC2 Container Service
  • Amazon EC2 Systems Manager
  • Amazon Inspector
  • AWS IoT Platform
  • Amazon Kinesis Streams
  • AWS Lambda
  • AWS [email protected]
  • AWS Managed Services
  • Amazon S3 Transfer Acceleration
  • AWS Shield
  • AWS Step Functions
  • AWS WAF

With this release, we also are introducing a separate spreadsheet, eliminating the need to extract the information from multiple PDFs.

If you are not yet an AWS customer, contact AWS Compliance to access the SOC Reports.

– Chad

Visualize AWS Cloudtrail Logs using AWS Glue and Amazon Quicksight

Post Syndicated from Luis Caro Perez original https://aws.amazon.com/blogs/big-data/streamline-aws-cloudtrail-log-visualization-using-aws-glue-and-amazon-quicksight/

Being able to easily visualize AWS CloudTrail logs gives you a better understanding of how your AWS infrastructure is being used. It can also help you audit and review AWS API calls and detect security anomalies inside your AWS account. To do this, you must be able to perform analytics based on your CloudTrail logs.

In this post, I walk through using AWS Glue and AWS Lambda to convert AWS CloudTrail logs from JSON to a query-optimized format dataset in Amazon S3. I then use Amazon Athena and Amazon QuickSight to query and visualize the data.

Solution overview

To process CloudTrail logs, you must implement the following architecture:

CloudTrail delivers log files in an Amazon S3 bucket folder. To correctly crawl these logs, you modify the file contents and folder structure using an Amazon S3-triggered Lambda function that stores the transformed files in an S3 bucket single folder. When the files are in a single folder, AWS Glue scans the data, converts it into Apache Parquet format, and catalogs it to allow for querying and visualization using Amazon Athena and Amazon QuickSight.

Walkthrough

Let’s look at the steps that are required to build the solution.

Set up CloudTrail logs

First, you need to set up a trail that delivers log files to an S3 bucket. To create a trail in CloudTrail, follow the instructions in Creating a Trail.

When you finish, the trail settings page should look like the following screenshot:

In this example, I set up log files to be delivered to the cloudtraillfcaro bucket.

Consolidate CloudTrail reports into a single folder using Lambda

AWS CloudTrail delivers log files using the following folder structure inside the configured Amazon S3 bucket:

AWSLogs/ACCOUNTID/CloudTrail/REGION/YEAR/MONTH/HOUR/filename.json.gz

Additionally, log files have the following structure:

{
    "Records": [{
        "eventVersion": "1.01",
        "userIdentity": {
            "type": "IAMUser",
            "principalId": "AIDAJDPLRKLG7UEXAMPLE",
            "arn": "arn:aws:iam::123456789012:user/Alice",
            "accountId": "123456789012",
            "accessKeyId": "AKIAIOSFODNN7EXAMPLE",
            "userName": "Alice",
            "sessionContext": {
                "attributes": {
                    "mfaAuthenticated": "false",
                    "creationDate": "2014-03-18T14:29:23Z"
                }
            }
        },
        "eventTime": "2014-03-18T14:30:07Z",
        "eventSource": "cloudtrail.amazonaws.com",
        "eventName": "StartLogging",
        "awsRegion": "us-west-2",
        "sourceIPAddress": "72.21.198.64",
        "userAgent": "signin.amazonaws.com",
        "requestParameters": {
            "name": "Default"
        },
        "responseElements": null,
        "requestID": "cdc73f9d-aea9-11e3-9d5a-835b769c0d9c",
        "eventID": "3074414d-c626-42aa-984b-68ff152d6ab7"
    },
    ... additional entries ...
    ]

If AWS Glue crawlers are used to catalog these files as they are written, the following obstacles arise:

  1. AWS Glue identifies different tables per different folders because they don’t follow a traditional partition format.
  2. Based on the structure of the file content, AWS Glue identifies the tables as having a single column of type array.
  3. CloudTrail logs have JSON attributes that use uppercase letters. According to the Best Practices When Using Athena with AWS Glue, it is recommended that you convert these to lowercase.

To have AWS Glue catalog all log files in a single table with all the columns describing each event, implement the following Lambda function:

from __future__ import print_function
import json
import urllib
import boto3
import gzip

s3 = boto3.resource('s3')
client = boto3.client('s3')

def convertColumntoLowwerCaps(obj):
    for key in obj.keys():
        new_key = key.lower()
        if new_key != key:
            obj[new_key] = obj[key]
            del obj[key]
    return obj


def lambda_handler(event, context):

    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.unquote_plus(event['Records'][0]['s3']['object']['key'].encode('utf8'))
    print(bucket)
    print(key)
    try:
        newKey = 'flatfiles/' + key.replace("/", "")
        client.download_file(bucket, key, '/tmp/file.json.gz')
        with gzip.open('/tmp/out.json.gz', 'w') as output, gzip.open('/tmp/file.json.gz', 'rb') as file:
            i = 0
            for line in file: 
                for record in json.loads(line,object_hook=convertColumntoLowwerCaps)['records']:
            		if i != 0:
            		    output.write("\n")
            		output.write(json.dumps(record))
            		i += 1
        client.upload_file('/tmp/out.json.gz', bucket,newKey)
        return "success"
    except Exception as e:
        print(e)
        print('Error processing object {} from bucket {}. Make sure they exist and your bucket is in the same region as this function.'.format(key, bucket))
        raise e

The function goes over each element of the records array, changes uppercase letters to lowercase in column names, and inserts each element of the array as a single line of a new file. The new file is saved inside a flatfiles folder created by the function without any subfolders in the S3 bucket.

The function should have a role containing a policy with at least the following permissions:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:*"
            ],
            "Resource": [
                "arn:aws:s3:::cloudtraillfcaro/*",
                "arn:aws:s3:::cloudtraillfcaro"
            ],
            "Effect": "Allow"
        }
    ]
}

In this example, CloudTrail delivers logs to the cloudtraillfcaro bucket. Make sure that you replace this name with your bucket name in the policy. For more information about how to work with inline policies, see Working with Inline Policies.

After the Lambda function is created, you can set up the following trigger using the Triggers tab on the AWS Lambda console.

Choose Add trigger, and choose S3 as a source of the trigger.

After choosing the source, configure the following settings:

In the trigger, any file that is written to the path for the log files—which in this case is AWSLogs/119582755581/CloudTrail/—is processed. Make sure that the Enable trigger check box is selected and that the bucket and prefix parameters match your use case.

After you set up the function and receive log files, the bucket (in this case cloudtraillfcaro) should contain the processed files inside the flatfiles folder.

Catalog source data

Once the files are processed by the Lambda function, set up a crawler named cloudtrail to catalog them.

The crawler must point to the flatfiles folder.

All the crawlers and AWS Glue jobs created for this solution must have a role with the AWSGlueServiceRole managed policy and an inline policy with permissions to modify the S3 buckets used on the Lambda function. For more information, see Working with Managed Policies.

The role should look like the following:

In this example, the inline policy named s3perms contains the permissions to modify the S3 buckets.

After you choose the role, you can schedule the crawler to run on demand.

A new database is created, and the crawler is set to use it. In this case, the cloudtrail database is used for all the tables.

After the crawler runs, a single table should be created in the catalog with the following structure:

The table should contain the following columns:

Create and run the AWS Glue job

To convert all the CloudTrail logs to a columnar store in Parquet, set up an AWS Glue job by following these steps.

Upload the following script into a bucket in Amazon S3:

import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
import boto3
import time

## @params: [JOB_NAME]
args = getResolvedOptions(sys.argv, ['JOB_NAME'])

sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)

datasource0 = glueContext.create_dynamic_frame.from_catalog(database = "cloudtrail", table_name = "flatfiles", transformation_ctx = "datasource0")
resolvechoice1 = ResolveChoice.apply(frame = datasource0, choice = "make_struct", transformation_ctx = "resolvechoice1")
relationalized1 = resolvechoice1.relationalize("trail", args["TempDir"]).select("trail")
datasink = glueContext.write_dynamic_frame.from_options(frame = relationalized1, connection_type = "s3", connection_options = {"path": "s3://cloudtraillfcaro/parquettrails"}, format = "parquet", transformation_ctx = "datasink4")
job.commit()

In the example, you load the script as a file named cloudtrailtoparquet.py. Make sure that you modify the script and update the “{"path": "s3://cloudtraillfcaro/parquettrails"}” with the destination in which you want to store your results.

After uploading the script, add a new AWS Glue job. Choose a name and role for the job, and choose the option of running the job from An existing script that you provide.

To avoid processing the same data twice, enable the Job bookmark setting in the Advanced properties section of the job properties.

Choose Next twice, and then choose Finish.

If logs are already in the flatfiles folder, you can run the job on demand to generate the first set of results.

Once the job starts running, wait for it to complete.

When the job is finished, its Run status should be Succeeded. After that, you can verify that the Parquet files are written to the Amazon S3 location.

Catalog results

To be able to process results from Athena, you can use an AWS Glue crawler to catalog the results of the AWS Glue job.

In this example, the crawler is set to use the same database as the source named cloudtrail.

You can run the crawler using the console. When the crawler finishes running and has processed the Parquet results, a new table should be created in the AWS Glue Data Catalog. In this example, it’s named parquettrails.

The table should have the classification set to parquet.

It should have the same columns as the flatfiles table, with the exception of the struct type columns, which should be relationalized into several columns:

In this example, notice how the requestparameters column, which was a struct in the original table (flatfiles), was transformed to several columns—one for each key value inside it. This is done using a transformation native to AWS Glue called relationalize.

Query results with Athena

After crawling the results, you can query them using Athena. For example, to query what events took place in the time frame between 2017-10-23t12:00:00 and 2017-10-23t13:00, use the following select statement:

select *
from cloudtrail.parquettrails
where eventtime > '2017-10-23T12:00:00Z' AND eventtime < '2017-10-23T13:00:00Z'
order by eventtime asc;

Be sure to replace cloudtrail.parquettrails with the names of your database and table that references the Parquet results. Replace the datetimes with an hour when your account had activity and was processed by the AWS Glue job.

Visualize results using Amazon QuickSight

Once you can query the data using Athena, you can visualize it using Amazon QuickSight. Before connecting Amazon QuickSight to Athena, be sure to grant QuickSight access to Athena and the associated S3 buckets in your account. For more information, see Managing Amazon QuickSight Permissions to AWS Resources. You can then create a new data set in Amazon QuickSight based on the Athena table that you created.

After setting up permissions, you can create a new analysis in Amazon QuickSight by choosing New analysis.

Then add a new data set.

Choose Athena as the source.

Give the data source a name (in this case, I named it cloudtrail).

Choose the name of the database and the table referencing the Parquet results.

Then choose Visualize.

After that, you should see the following screen:

Now you can create some visualizations. First, search for the sourceipaddress column, and drag it to the AutoGraph section.

You can see a list of the IP addresses that you have used to interact with AWS. To review whether these IP addresses have been used from IAM users, internal AWS services, or roles, use the type value that is inside the useridentity field of the original log files. Thanks to the relationalize transformation, this value is available as the useridentity.type column. After the column is added into the Group/Color box, the visualization should look like the following:

You can now see and distinguish the most used IPs and whether they are used from roles, AWS services, or IAM users.

After following all these steps, you can use Amazon QuickSight to add different columns from CloudTrail and perform different types of visualizations. You can build operational dashboards that continuously monitor AWS infrastructure usage and access. You can share those dashboards with others in your organization who might need to see this data.

Summary

In this post, you saw how you can use a simple Lambda function and an AWS Glue script to convert text files into Parquet to improve Athena query performance and data compression. The post also demonstrated how to use AWS Lambda to preprocess files in Amazon S3 and transform them into a format that is recognizable by AWS Glue crawlers.

This example, used AWS CloudTrail logs, but you can apply the proposed solution to any set of files that after preprocessing, can be cataloged by AWS Glue.


Additional Reading

Learn how to Harmonize, Query, and Visualize Data from Various Providers using AWS Glue, Amazon Athena, and Amazon QuickSight.


About the Authors

Luis Caro is a Big Data Consultant for AWS Professional Services. He works with our customers to provide guidance and technical assistance on big data projects, helping them improving the value of their solutions when using AWS.

 

 

 

B2 Cloud Storage Roundup

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/b2-cloud-storage-roundup/

B2 Integrations
Over the past several months, B2 Cloud Storage has continued to grow like we planted magic beans. During that time we have added a B2 Java SDK, and certified integrations with GoodSync, Arq, Panic, UpdraftPlus, Morro Data, QNAP, Archiware, Restic, and more. In addition, B2 customers like Panna Cooking, Sermon Audio, and Fellowship Church are happy they chose B2 as their cloud storage provider. If any of that sounds interesting, read on.

The B2 Java SDK

While the Backblaze B2 API is well documented and straight-forward to implement, we were asked by a few of our Integration Partners if we had an SDK they could use. So we developed one as an open-course project on GitHub, where we hope interested parties will not only use our Java SDK, but make it better for everyone else.

There are different reasons one might use the Java SDK, but a couple of areas where the SDK can simplify the coding process are:

Expiring Authorization — B2 requires an application key for a given account be reissued once a day when using the API. If the application key expires while you are in the middle of transferring files or some other B2 activity (bucket list, etc.), the SDK can be used to detect and then update the application key on the fly. Your B2 related activities will continue without incident and without having to capture and code your own exception case.

Error Handling — There are different types of error codes B2 will return, from expired application keys to detecting malformed requests to command time-outs. The SDK can dramatically simplify the coding needed to capture and account for the various things that can happen.

While Backblaze has created the Java SDK, developers in the GitHub community have also created other SDKs for B2, for example, for PHP (https://github.com/cwhite92/b2-sdk-php,) and Go (https://github.com/kurin/blazer.) Let us know in the comments about other SDKs you’d like to see or perhaps start your own GitHub project. We will publish any updates in our next B2 roundup.

What You Can Do with Affordable and Available Cloud Storage

You’re probably aware that B2 is up to 75% less expensive than other similar cloud storage services like Amazon S3 and Microsoft Azure. Businesses and organizations are finding that projects that previously weren’t economically feasible with other Cloud Storage services are now not only possible, but a reality with B2. Here are a few recent examples:

SermonAudio logo SermonAudio wanted their media files to be readily available, but didn’t want to build and manage their own internal storage farm. Until B2, cloud storage was just too expensive to use. Now they use B2 to store their audio and video files, and also as the primary source of downloads and streaming requests from their subscribers.
Fellowship Church logo Fellowship Church wanted to escape from the ever increasing amount of time they were spending saving their data to their LTO-based system. Using B2 saved countless hours of personnel time versus LTO, fit easily into their video processing workflow, and provided instant access at any time to their media library.
Panna logo Panna Cooking replaced their closet full of archive hard drives with a cost-efficient hybrid-storage solution combining 45Drives and Backblaze B2 Cloud Storage. Archived media files that used to take hours to locate are now readily available regardless of whether they reside in local storage or in the B2 Cloud.

B2 Integrations

Leading companies in backup, archive, and sync continue to add B2 Cloud Storage as a storage destination for their customers. These companies realize that by offering B2 as an option, they can dramatically lower the total cost of ownership for their customers — and that’s always a good thing.

If your favorite application is not integrated to B2, you can do something about it. One integration partner told us they received over 200 customer requests for a B2 integration. The partner got the message and the integration is currently in beta test.

Below are some of the partner integrations completed in the past few months. You can check the B2 Partner Integrations page for a complete list.

Archiware — Both P5 Archive and P5 Backup can now store data in the B2 Cloud making your offsite media files readily available while keeping your off-site storage costs predictable and affordable.

Arq — Combine Arq and B2 for amazingly affordable backup of external drives, network drives, NAS devices, Windows PCs, Windows Servers, and Macs to the cloud.

GoodSync — Automatically synchronize and back up all your photos, music, email, and other important files between all your desktops, laptops, servers, external drives, and sync, or back up to B2 Cloud Storage for off-site storage.

QNAP — QNAP Hybrid Backup Sync consolidates backup, restoration, and synchronization functions into a single QTS application to easily transfer your data to local, remote, and cloud storage.

Morro Data — Their CloudNAS solution stores files in the cloud, caches them locally as needed, and syncs files globally among other CloudNAS systems in an organization.

Restic – Restic is a fast, secure, multi-platform command line backup program. Files are uploaded to a B2 bucket as de-duplicated, encrypted chunks. Each backup is a snapshot of only the data that has changed, making restores of a specific date or time easy.

Transmit 5 by Panic — Transmit 5, the gold standard for macOS file transfer apps, now supports B2. Upload, download, and manage files on tons of servers with an easy, familiar, and powerful UI.

UpdraftPlus — WordPress developers and admins can now use the UpdraftPlus Premium WordPress plugin to affordably back up their data to the B2 Cloud.

Getting Started with B2 Cloud Storage

If you’re using B2 today, thank you. If you’d like to try B2, but don’t know where to start, here’s a guide to getting started with the B2 Web Interface — no programming or scripting is required. You get 10 gigabytes of free storage and 1 gigabyte a day in free downloads. Give it a try.

The post B2 Cloud Storage Roundup appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Backing Up the Modern Enterprise with Backblaze for Business

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/endpoint-backup-solutions/

Endpoint backup diagram

Organizations of all types and sizes need reliable and secure backup. Whether they have as few as 3 or as many as 300,000 computer users, an organization’s computer data is a valuable business asset that needs to be protected.

Modern organizations are changing how they work and where they work, which brings new challenges to making sure that company’s data assets are not only available, but secure. Larger organizations have IT departments that are prepared to address these needs, but often times in smaller and newer organizations the challenge falls upon office management who might not be as prepared or knowledgeable to face a work environment undergoing dramatic changes.

Whether small or large, local or world-wide, for-profit or non-profit, organizations need a backup strategy and solution that matches the new ways of working in the enterprise.

The Enterprise Has Changed, and So Has Data Use

More and more, organizations are working in the cloud. These days organizations can operate just fine without their own file servers, database servers, mail servers, or other IT infrastructure that used to be standard for all but the smallest organization.

The reality is that for most organizations, though, it’s a hybrid work environment, with a combination of cloud-based and PC and Macintosh-based applications. Legacy apps aren’t going away any time soon. They will be with us for a while, with their accompanying data scattered amongst all the desktops, laptops and other endpoints in corporate headquarters, home offices, hotel rooms, and airport waiting areas.

In addition, the modern workforce likely combines regular full-time employees, remote workers, contractors, and sometimes interns, volunteers, and other temporary workers who also use company IT assets.

The Modern Enterprise Brings New Challenges for IT

These changes in how enterprises work present a problem for anyone tasked with making sure that data — no matter who uses it or where it lives — is adequately backed-up. Cloud-based applications, when properly used and managed, can be adequately backed up, provided that users are connected to the internet and data transfers occur regularly — which is not always the case. But what about the data on the laptops, desktops, and devices used by remote employees, contractors, or just employees whose work keeps them on the road?

The organization’s backup solution must address all the needs of the modern organization or enterprise using both cloud and PC and Mac-based applications, and not be constrained by employee or computer location.

A Ten-Point Checklist for the Modern Enterprise for Backing Up

What should the modern enterprise look for when evaluating a backup solution?

1) Easy to deploy to workers’ computers

Whether installed by the computer user or an IT person locally or remotely, the backup solution must be easy to implement quickly with minimal demands on the user or administrator.

2) Fast and unobtrusive client software

Backups should happen in the background by efficient (native) PC and Macintosh software clients that don’t consume valuable processing power or take memory away from applications the user needs.

3) Easy to configure

The backup solutions must be easy to configure for both the user and the IT professional. Ease-of-use means less time to deploy, configure, and manage.

4) Defaults to backing up all valuable data

By default, the solution backs up commonly used files and folders or directories, including desktops. Some backup solutions are difficult and intimidating because they require that the user chose what needs to be backed up, often missing files and folders/directories that contain valuable data.

5) Works automatically in the background

Backups should happen automatically, no matter where the computer is located. The computer user, especially the remote or mobile one, shouldn’t be required to attach cables or drives, or remember to initiate backups. A working solution backs up automatically without requiring action by the user or IT administrator.

6) Data restores are fast and easy

Whether it’s a single file, directory, or an entire system that must be restored, a user or IT sysadmin needs to be able to restore backed up data as quickly as possible. In cases of large restores to remote locations, the ability to send a restore via physical media is a must.

7) No limitations on data

Throttling, caps, and data limits complicate backups and require guesses about how much storage space will be needed.

8) Safe & Secure

Organizations require that their data is secure during all phases of initial upload, storage, and restore.

9) Easy-to-manage

The backup solution needs to provide a clear and simple web management interface for all functions. Designing for ease-of-use leads to efficiency in management and operation.

10) Affordable and transparent pricing

Backup costs should be predictable, understandable, and without surprises.

Two Scenarios for the Modern Enterprise

Enterprises exist in many forms and types, but wanting to meet the above requirements is common across all of them. Below, we take a look at two common scenarios showing how enterprises face these challenges. Three case studies are available that provide more information about how Backblaze customers have succeeded in these environments.

Enterprise Profile 1

The needs of a smaller enterprise differ from those of larger, established organizations. This organization likely doesn’t have anyone who is devoted full-time to IT. The job of on-boarding new employees and getting them set up with a computer likely falls upon an executive assistant or office manager. This person might give new employees a checklist with the software and account information and lets users handle setting up the computer themselves.

Organizations in this profile need solutions that are easy to install and require little to no configuration. Backblaze, by default, backs up all user data, which lets the organization be secure in knowing all the data will be backed up to the cloud — including files left on the desktop. Combined with Backblaze’s unlimited data policy, organizations have a truly “set it and forget it” platform.

Customizing Groups To Meet Teams’ Needs

The Groups feature of Backblaze for Business allows an organization to decide whether an individual client’s computer will be Unmanaged (backups and restores under the control of the worker), or Managed, in which an administrator can monitor the status and frequency of backups and handle restores should they become necessary. One group for the entire organization might be adequate at this stage, but the organization has the option to add additional groups as it grows and needs more flexibility and control.

The organization, of course, has the choice of managing and monitoring users using Groups. With Backblaze’s Groups, organizations can set user-based access rules, which allows the administrator to create restores for lost files or entire computers on an employee’s behalf, to centralize billing for all client computers in the organization, and to redeploy a recovered computer or new computer with the backed up data.

Restores

In this scenario, the decision has been made to let each user manage her own backups, including restores, if necessary, of individual files or entire systems. If a restore of a file or system is needed, the restore process is easy enough for the user to handle it by herself.

Case Study 1

Read about how PagerDuty uses Backblaze for Business in a mixed enterprise of cloud and desktop/laptop applications.

PagerDuty Case Study

In a common approach, the employee can retrieve an accidentally deleted file or an earlier version of a document on her own. The Backblaze for Business interface is easy to navigate and was designed with feedback from thousands of customers over the course of a decade.

In the event of a lost, damaged, or stolen laptop,  administrators of Managed Groups can  initiate the restore, which could be in the form of a download of a restore ZIP file from the web management console, or the overnight shipment of a USB drive directly to the organization or user.

Enterprise Profile 2

This profile is for an organization with a full-time IT staff. When a new worker joins the team, the IT staff is tasked with configuring the computer and delivering it to the new employee.

Backblaze for Business Groups

Case Study 2

Global charitable organization charity: water uses Backblaze for Business to back up workers’ and volunteers’ laptops as they travel to developing countries in their efforts to provide clean and safe drinking water.

charity: water Case Study

This organization can take advantage of additional capabilities in Groups. A Managed Group makes sense in an organization with a geographically dispersed work force as it lets IT ensure that workers’ data is being regularly backed up no matter where they are. Billing can be company-wide or assigned to individual departments or geographical locations. The organization has the choice of how to divide the organization into Groups (location, function, subsidiary, etc.) and whether the Group should be Managed or Unmanaged. Using Managed Groups might be suitable for most of the organization, but there are exceptions in which sensitive data might dictate using an Unmanaged Group, such as could be the case with HR, the executive team, or finance.

Deployment

By Invitation Email, Link, or Domain

Backblaze for Business allows a number of options for deploying the client software to workers’ computers. Client installation is fast and easy on both Windows and Macintosh, so sending email invitations to users or automatically enrolling users by domain or invitation link, is a common approach.

By Remote Deployment

IT might choose to remotely and silently deploy Backblaze for Business across specific Groups or the entire organization. An administrator can silently deploy the Backblaze backup client via the command-line, or use common RMM (Remote Monitoring and Management) tools such as Jamf and Munki.

Restores

Case Study 3

Read about how Bright Bear Technology Solutions, an IT Managed Service Provider (MSP), uses the Groups feature of Backblaze for Business to manage customer backups and restores, deploy Backblaze licenses to their customers, and centralize billing for all their client-based backup services.

Bright Bear Case Study

Some organizations are better equipped to manage or assist workers when restores become necessary. Individual users will be pleased to discover they can roll-back files to an earlier version if they wish, but IT will likely manage any complete system restore that involves reconfiguring a computer after a repair or requisitioning an entirely new system when needed.

This organization might chose to retain a client’s entire computer backup for archival purposes, using Backblaze B2 as the cloud storage solution. This is another advantage of having a cloud storage provider that combines both endpoint backup and cloud object storage among its services.

The Next Step: Server Backup & Data Archiving with B2 Cloud Storage

As organizations grow, they have increased needs for cloud storage beyond Macintosh and PC data backup. Backblaze’s object cloud storage, Backblaze B2, provides low-cost storage and archiving of records, media, and server data that can grow with the organization’s size and needs.

B2 Cloud Storage is available through the same Backblaze management console as Backblaze Computer Backup. This means that Admins have one console for billing, monitoring, deployment, and role provisioning. B2 is priced at 1/4 the cost of Amazon S3, or $0.005 per month per gigabyte (which equals $5/month per terabyte).

Why Modern Enterprises Chose Backblaze

Backblaze for Business

Businesses and organizations select Backblaze for Business for backup because Backblaze is designed to meet the needs of the modern enterprise. Backblaze customers are part of a a platform that has a 10+ year track record of innovation and over 400 petabytes of customer data already under management.

Backblaze’s backup model is proven through head-to-head comparisons to back up data that other backup solutions overlook in their default configurations — including valuable files that are needed after an accidental deletion, theft, or computer failure.

Backblaze is the only enterprise-level backup company that provides TOTP (Time-based One-time Password) via both SMS and Authentication app to all accounts at no incremental charge. At just $50/year/computer, Backblaze is affordable for any size of enterprise.

Modern Enterprises can Meet The Challenge of The Changing Data Environment

With the right backup solution and strategy, the modern enterprise will be prepared to ensure that its data is protected from accident, disaster, or theft, whether its data is in one office or dispersed among many locations, and remote and mobile employees.

Backblaze for Business is an affordable solution that enables organizations to meet the evolving data demands facing the modern enterprise.

The post Backing Up the Modern Enterprise with Backblaze for Business appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Just in Case You Missed It: Catching Up on Some Recent AWS Launches

Post Syndicated from Tara Walker original https://aws.amazon.com/blogs/aws/just-in-case-you-missed-it-catching-up-on-some-recent-aws-launches/

So many launches and cloud innovations, that you simply may not believe.  In order to catch up on some service launches and features, this post will be a round-up of some cool releases that happened this summer and through the end of September.

The launches and features I want to share with you today are:

  • AWS IAM for Authenticating Database Users for RDS MySQL and Amazon Aurora
  • Amazon SES Reputation Dashboard
  • Amazon SES Open and Click Tracking Metrics
  • Serverless Image Handler by the Solutions Builder Team
  • AWS Ops Automator by the Solutions Builder Team

Let’s dive in, shall we!

AWS IAM for Authenticating Database Users for RDS MySQL and Amazon Aurora

Wished you could manage access to your Amazon RDS database instances and clusters using AWS IAM? Well, wish no longer. Amazon RDS has launched the ability for you to use IAM to manage database access for Amazon RDS for MySQL and Amazon Aurora DB.

What I like most about this new service feature is, it’s very easy to get started.  To enable database user authentication using IAM, you would select a checkbox Enable IAM DB Authentication when creating, modifying, or restoring your DB instance or cluster. You can enable IAM access using the RDS console, the AWS CLI, and/or the Amazon RDS API.

After configuring the database for IAM authentication, client applications authenticate to the database engine by providing temporary security credentials generated by the IAM Security Token Service. These credentials can be used instead of providing a password to the database engine.

You can learn more about using IAM to provide targeted permissions and authentication to MySQL and Aurora by reviewing the Amazon RDS user guide.

Amazon SES Reputation Dashboard

In order to aid Amazon Simple Email Service customers’ in utilizing best practice guidelines for sending email, I am thrilled to announce we launched the Reputation Dashboard to provide comprehensive reporting on email sending health. To aid in proactively managing emails being sent, customers now have visibility into overall account health, sending metrics, and compliance or enforcement status.

The Reputation Dashboard will provide the following information:

  • Account status: A description of your account health status.
    • Healthy – No issues currently impacting your account.
    • Probation – Account is on probation; Issues causing probation must be resolved to prevent suspension
    • Pending end of probation decision – Your account is on probation. Amazon SES team member must review your account prior to action.
    • Shutdown – Your account has been shut down. No email will be able to be sent using Amazon SES.
    • Pending shutdown – Your account is on probation and issues causing probation are unresolved.
  • Bounce Rate: Percentage of emails sent that have bounced and bounce rate status messages.
  • Complaint Rate: Percentage of emails sent that recipients have reported as spam and complaint rate status messages.
  • Notifications: Messages about other account reputation issues.

Amazon SES Open and Click Tracking Metrics

Another exciting feature recently added to Amazon SES is support for Email Open and Click Tracking Metrics. With Email Open and Click Tracking Metrics feature, SES customers can now track when email they’ve sent has been opened and track when links within the email have been clicked.  Using this SES feature will allow you to better track email campaign engagement and effectiveness.

How does this work?

When using the email open tracking feature, SES will add a transparent, miniature image into the emails that you choose to track. When the email is opened, the mail application client will load the aforementioned tracking which triggers an open track event with Amazon SES. For the email click (link) tracking, links in email and/or email templates are replaced with a custom link.  When the custom link is clicked, a click event is recorded in SES and the custom link will redirect the email user to the link destination of the original email.

You can take advantage of the new open tracking and click tracking features by creating a new configuration set or altering an existing configuration set within SES. After choosing either; Amazon SNS, Amazon CloudWatch, or Amazon Kinesis Firehose as the AWS service to receive the open and click metrics, you would only need to select a new configuration set to successfully enable these new features for any emails you want to send.

AWS Solutions: Serverless Image Handler & AWS Ops Automator

The AWS Solution Builder team has been hard at work helping to make it easier for you all to find answers to common architectural questions to aid in building and running applications on AWS. You can find these solutions on the AWS Answers page. Two new solutions released earlier this fall on AWS Answers are  Serverless Image Handler and the AWS Ops Automator.
Serverless Image Handler was developed to provide a solution to help customers dynamically process, manipulate, and optimize the handling of images on the AWS Cloud. The solution combines Amazon CloudFront for caching, AWS Lambda to dynamically retrieve images and make image modifications, and Amazon S3 bucket to store images. Additionally, the Serverless Image Handler leverages the open source image-processing suite, Thumbor, for additional image manipulation, processing, and optimization.

AWS Ops Automator solution helps you to automate manual tasks using time-based or event-based triggers to automatically such as snapshot scheduling by providing a framework for automated tasks and includes task audit trails, logging, resource selection, scaling, concurrency handling, task completion handing, and API request retries. The solution includes the following AWS services:

  • AWS CloudFormation: a templates to launches the core framework of microservices and solution generated task configurations
  • Amazon DynamoDB: a table which stores task configuration data to defines the event triggers, resources, and saves the results of the action and the errors.
  • Amazon CloudWatch Logs: provides logging to track warning and error messages
  • Amazon SNS: topic to send messages to a subscribed email address to which to send the logging information from the solution

Have fun exploring and coding.

Tara

Amazon ElastiCache for Redis Is Now a HIPAA Eligible Service and You Can Use It to Power Real-Time Healthcare Applications

Post Syndicated from Manan Goel original https://aws.amazon.com/blogs/security/now-you-can-use-amazon-elasticache-for-redis-a-hipaa-eligible-service-to-power-real-time-healthcare-applications/

HIPAA image

Amazon ElastiCache for Redis is now a HIPAA Eligible Service and has been added to the AWS Business Associate Addendum (BAA). This means you can use ElastiCache for Redis to help you power healthcare applications as well as process, maintain, and store protected health information (PHI). ElastiCache for Redis is a Redis-compatible, fully-managed, in-memory data store and cache in the cloud that provides sub-millisecond latency to power applications. Now you can use the speed, simplicity, and flexibility of ElastiCache for Redis to build secure, fast, and internet-scale healthcare applications.

ElastiCache for Redis with HIPAA eligibility is available for all current-generation instance node types and requires Redis engine version 3.2.6. You must ensure that nodes are configured to encrypt the data in transit and at rest, and to authenticate Redis commands before the engine executes them. See Architecting for HIPAA Security and Compliance on Amazon Web Services for information about how to configure Amazon HIPAA Eligible Services to store, process, and transmit PHI.

ElastiCache for Redis uses Advanced Encryption Standard (AES)-512 symmetric keys to encrypt data on disk. The Redis backups stored in Amazon S3 are encrypted with server-side encryption (SSE) using AES-256 symmetric keys. ElastiCache for Redis uses Transport Layer Security (TLS) to encrypt data in transit. It uses the Redis AUTH token that you provide at the time of Redis cluster creation to authenticate the Redis commands coming from clients. The AUTH token is encrypted using AWS Key Management Service.

There is no additional charge for using ElastiCache for Redis clusters with HIPAA eligibility. To get started, see HIPAA Compliance for Amazon ElastiCache for Redis.

– Manan

Tableau 10.4 Supports Amazon Redshift Spectrum with External Amazon S3 Tables

Post Syndicated from Robin Cottiss original https://aws.amazon.com/blogs/big-data/tableau-10-4-supports-amazon-redshift-spectrum-with-external-amazon-s3-tables/

This is a guest post by Robin Cottiss, strategic customer consultant, Russell Christopher, staff product manager, and Vaidy Krishnan, senior manager of product marketing, at Tableau. Tableau, in their own words, “helps anyone quickly analyze, visualize, and share information. More than 61,000 customer accounts get rapid results with Tableau in the office and on the go. Over 300,000 people use Tableau Public to share public data in their blogs and websites.”

We’re excited to announce today an update to our Amazon Redshift connector with support for Amazon Redshift Spectrum to analyze data in external Amazon S3 tables. This feature, the direct result of joint engineering and testing work performed by the teams at Tableau and AWS, was released as part of Tableau 10.3.3 and will be available broadly in Tableau 10.4.1. With this update, you can quickly and directly connect Tableau to data in Amazon Redshift and analyze it in conjunction with data in Amazon S3—all with drag-and-drop ease.

This connector is yet another in a series of market-leading integrations of Tableau with AWS’s analytics platform, with services such as Amazon Redshift, Amazon EMR, and Amazon Athena. These integrations have allowed Tableau to become the natural choice of tool for analyzing data stored on AWS. Beyond this, Tableau Server runs seamlessly in the AWS Cloud infrastructure. If you prefer to deploy all your applications inside AWS, you have a complete solution offering from Tableau.

How does support for Amazon Redshift Spectrum help you?

If you’re like many Tableau customers, you have large buckets of data stored in Amazon S3. You might need to access this data frequently and store it in a consistent, highly structured format. If so, you can provision it to a data warehouse like Amazon Redshift. You might also want to explore this S3 data on an ad hoc basis. For example, you might want to determine whether or not to provision the data, and where—options might be Hadoop, Impala, Amazon EMR, or Amazon Redshift. To do so, you can use Amazon Athena, a serverless interactive query service from AWS that requires no infrastructure setup and management.

But what if you want to analyze both the frequently accessed data stored locally in Amazon Redshift AND your full datasets stored cost-effectively in Amazon S3? What if you want the throughput of disk and sophisticated query optimization of Amazon Redshift AND a service that combines a serverless scale-out processing capability with the massively reliable and scalable S3 infrastructure? What if you want the super-fast performance of Amazon Redshift AND support for open storage formats (for example, Parquet or ORC) in S3?

To enable these AND and resolve the tyranny of ORs, AWS launched Amazon Redshift Spectrum earlier this year.

Amazon Redshift Spectrum gives you the freedom to store your data where you want, in the format you want, and have it available for processing when you need it. Since the Amazon Redshift Spectrum launch, Tableau has worked tirelessly to provide best-in-class support for this new service. With Tableau and Redshift Spectrum, you can extend your Amazon Redshift analyses out to the entire universe of data in your S3 data lakes.

This latest update has been tested by many customers with very positive feedback. One such customer is the world’s largest food product distributor, Sysco—you can watch their session referencing the Amazon Spectrum integration at Tableau Conference 2017. Sysco also plans to reprise its “Tableau on AWS” story again in a month’s time at AWS re:Invent.

Now, I’d like to use a concrete example to demonstrate how Tableau works with Amazon Redshift Spectrum. In this example, I also show you how and why you might want to connect to your AWS data in different ways.

The setup

I use the pipeline described following to ingest, process, and analyze data with Tableau on an AWS stack. The source data is the New York City Taxi dataset, which has 9 years’ worth of taxi rides activity (including pick-up and drop-off location, amount paid, payment type, and so on) captured in 1.2 billion records.

In this pipeline, this data lands in S3, is cleansed and partitioned by using Amazon EMR, and is then converted to a columnar Parquet format that is analytically optimized. You can point Tableau to the raw data in S3 by using Amazon Athena. You can also access the cleansed data with Tableau using Presto through your Amazon EMR cluster.

Why use Tableau this early in the pipeline? Because sometimes you want to understand what’s there and what questions are worth asking before you even start the analysis.

After you find out what those questions are and determine if this sort of analysis has long-term usefulness, you can automate and optimize that pipeline. You do this to add new data as soon as possible as it arrives, to get it to the processes and people that need it. You might also want to provision this data to a highly performant “hotter” layer (Amazon Redshift or Tableau Extract) for repeated access.

In the illustration preceding, S3 contains the raw denormalized ride data at the timestamp level of granularity. This S3 data is the fact table. Amazon Redshift has the time dimensions broken out by date, month, and year, and also has the taxi zone information.

Now imagine I want to know where and when taxi pickups happen on a certain date in a certain borough. With support for Amazon Redshift Spectrum, I can now join the S3 tables with the Amazon Redshift dimensions, as shown following.

I can next analyze the data in Tableau to produce a borough-by-borough view of New York City ride density on Christmas Day 2015.

Or I can hone in on just Manhattan and identify pickup hotspots, with ride charges way above the average!

With Amazon Redshift Spectrum, you now have a fast, cost-effective engine that minimizes data processed with dynamic partition pruning. You can further improve query performance by reducing the data scanned. You do this by partitioning and compressing data and by using a columnar format for storage.

At the end of the day, which engine you use behind Tableau is a function of what you want to optimize for. Some possible engines are Amazon Athena, Amazon Redshift, and Redshift Spectrum, or you can bring a subset of data into Tableau Extract. Factors in planning optimization include these:

  • Are you comfortable with the serverless cost model of Amazon Athena and potential full scans? Or do you prefer the advantages of no setup?
  • Do you want the throughput of local disk?
  • Effort and time of setup. Are you okay with the lead-time of an Amazon Redshift cluster setup, as opposed to just bringing everything into Tableau Extract?

To meet the many needs of our customers, Tableau’s approach is simple: It’s all about choice. The choice of how you want to connect to and analyze your data. Throughout the history of our product and into the future, we have and will continue to empower choice for customers.

For more on how to deal with choice, as you go about making architecture decisions for your enterprise, watch this big data strategy session my friend Robin Cottiss and I delivered at Tableau Conference 2017. This session includes several customer examples leveraging the Tableau on AWS platform, and also a run-through of the aforementioned demonstration.

If you’re curious to learn more about analyzing data with Tableau on Amazon Redshift we encourage you to check out the following resources:

Hot Startups on AWS – October 2017

Post Syndicated from Tina Barr original https://aws.amazon.com/blogs/aws/hot-startups-on-aws-october-2017/

In 2015, the Centers for Medicare and Medicaid Services (CMS) reported that healthcare spending made up 17.8% of the U.S. GDP – that’s almost $3.2 trillion or $9,990 per person. By 2025, the CMS estimates this number will increase to nearly 20%. As cloud technology evolves in the healthcare and life science industries, we are seeing how companies of all sizes are using AWS to provide powerful and innovative solutions to customers across the globe. This month we are excited to feature the following startups:

  • ClearCare – helping home care agencies operate efficiently and grow their business.
  • DNAnexus – providing a cloud-based global network for sharing and managing genomic data.

ClearCare (San Francisco, CA)

ClearCare envisions a future where home care is the only choice for aging in place. Home care agencies play a critical role in the economy and their communities by significantly lowering the overall cost of care, reducing the number of hospital admissions, and bending the cost curve of aging. Patients receiving home care typically have multiple chronic conditions and functional limitations, driving over $190 billion in healthcare spending in the U.S. each year. To offset these costs, health insurance payers are developing in-home care management programs for patients. ClearCare’s goal is to help home care agencies leverage technology to improve costs, outcomes, and quality of life for the aging population. The company’s powerful software platform is specifically designed for use by non-medical, in-home care agencies to manage their businesses.

Founder and CEO Geoff Nudd created ClearCare because of his own grandmother’s need for care. Keeping family members and caregivers up to date on a loved one’s well being can be difficult, so Geoff created what is now ClearCare’s Family Room, which enables caregivers and agency staff to check schedules and receive real-time updates about what’s happening in the home. Since then, agencies have provided feedback on others areas of their businesses that could be streamlined. ClearCare has now built over 20 modules to help home care agencies optimize operations with services including a telephony service, billing and payroll, and more. ClearCare now serves over 4,000 home care agencies, representing 500,000 caregivers and 400,000 seniors.

Using AWS, ClearCare is able to spin up reliable infrastructure for proofs of concept and iterate on those systems to quickly get value to market. The company runs many AWS services including Amazon Elasticsearch Service, Amazon RDS, and Amazon CloudFront. Amazon EMR and Amazon Athena have enabled ClearCare to build a Hadoop-based ETL and data warehousing system that processes terabytes of data each day. By utilizing these managed services, ClearCare has been able to go from concept to customer delivery in less than three months.

To learn more about ClearCare, check out their website.

DNAnexus (Mountain View, CA)

DNAnexus is accelerating the application of genomic data in precision medicine by providing a cloud-based platform for sharing and managing genomic and biomedical data and analysis tools. The company was founded in 2009 by Stanford graduate student Andreas Sundquist and two Stanford professors Arend Sidow and Serafim Batzoglou, to address the need for scaling secondary analysis of next-generation sequencing (NGS) data in the cloud. The founders quickly learned that users needed a flexible solution to build complex analysis workflows and tools that enable them to share and manage large volumes of data. DNAnexus is optimized to address the challenges of security, scalability, and collaboration for organizations that are pursuing genomic-based approaches to health, both in clinics and research labs. DNAnexus has a global customer base – spanning North America, Europe, Asia-Pacific, South America, and Africa – that runs a million jobs each month and is doubling their storage year-over-year. The company currently stores more than 10 petabytes of biomedical and genomic data. That is equivalent to approximately 100,000 genomes, or in simpler terms, over 50 billion Facebook photos!

DNAnexus is working with its customers to help expand their translational informatics research, which includes expanding into clinical trial genomic services. This will help companies developing different medicines to better stratify clinical trial populations and develop companion tests that enable the right patient to get the right medicine. In collaboration with Janssen Human Microbiome Institute, DNAnexus is also launching Mosaic – a community platform for microbiome research.

AWS provides DNAnexus and its customers the flexibility to grow and scale research programs. Building the technology infrastructure required to manage these projects in-house is expensive and time-consuming. DNAnexus removes that barrier for labs of any size by using AWS scalable cloud resources. The company deploys its customers’ genomic pipelines on Amazon EC2, using Amazon S3 for high-performance, high-durability storage, and Amazon Glacier for low-cost data archiving. DNAnexus is also an AWS Life Sciences Competency Partner.

Learn more about DNAnexus here.

-Tina