Tag Archives: seo

T2 Unlimited – Going Beyond the Burst with High Performance

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-t2-unlimited-going-beyond-the-burst-with-high-performance/

I first wrote about the T2 instances in the summer of 2014, and talked about how many workloads have a modest demand for continuous compute power and an occasional need for a lot more. This model resonated with our customers; the T2 instances are very popular and are now used to host microservices, low-latency interactive applications, virtual desktops, build & staging environments, prototypes, and the like.

New T2 Unlimited
Today we are extending the burst model that we pioneered with the T2, giving you the ability to sustain high CPU performance over any desired time frame while still keeping your costs as low as possible. You simply enable this feature when you launch your instance; you can also enable it for an instance that is already running. The hourly T2 instance price covers all interim spikes in usage if the average CPU utilization is lower than the baseline over a 24-hour window. There’s a small hourly charge if the instance runs at higher CPU utilization for a prolonged period of time. For example, if you run a t2.micro instance at an average of 15% utilization (5% above the baseline) for 24 hours you will be charged an additional 6 cents (5 cents per vCPU-hour * 1 vCPU * 5% * 24 hours).

To launch a T2 Unlimited instance from the EC2 Console, select any T2 instance and then click on Enable next to T2 Unlimited:

And here’s how to switch a running instance from T2 Standard to T2 Unlimited:

Behind the Scenes
As I described in my original post, each T2 instance accumulates CPU Credits as it runs and consumes them while it is running at full-core speed, decelerating to a baseline level when the supply of Credits is exhausted. T2 Unlimited instances have the ability to borrow an entire day’s worth of future credits, allowing them to perform additional bursting. This borrowing is tracked by the new CPUSurplusCreditBalance CloudWatch metric. When this balance rises to the level where it represents an entire day’s worth of future credits, the instance continues to deliver full-core performance, charged at the rate of $0.05 per vCPU per hour for Linux and $0.096 for Windows. These charged surplus credits are tracked by the new CPUSurplusCreditsCharged metric. You will be charged on a per-millisecond basis for partial hours of bursting (further reducing your costs) if you exhaust your surplus late in a given hour.

The charge for any remaining CPUSurplusCreditBalance is processed when the instance is terminated or configured as a T2 Standard. Any accumulated CPUCreditBalance carries over during the transition to T2 Standard.

The T2 Unlimited model is designed to spare you the trouble of watching the CloudWatch metrics, but (if you are like me) you will do it anyway. Let’s take a quick look at a t2.nano and watch the credits over time. First, CPU utilization grows to 100% and the instance begins to consume 5 credits every 5 minutes (one credit is equivalent to a VCPU-minute):

The CPU credit balance remains at 0 because the credits are being produced and consumed at the same rate. The surplus credit balance (tracked by the CPUSurplusCreditBalance metric) ramps up to 72, representing the credits that are being borrowed from the future:

Once the surplus credit balance hits 72, there’s nothing more to borrow from the future, and any further CPU usage is charged at the end of the hour, tracked with the CPUSurplusCreditsCharged metric. The instance consumes 5 credits every 5 minutes and earns 0.25, resulting in a net charge of 4.75 VCPU-minutes for each 5 minutes of bursting:

You can switch each of your instances back and forth between T2 Standard and T2 Unlimited at any time; all credit balances except CPUSurplusCreditsCharged remain and are carried over. Because T2 Unlimited instances have the ability to burst at any time, they do not receive the 30 minutes of credits given to newly launched T2 Standard instances. Also, since each AWS account can launch a limited number of T2 Standard instances with initial CPU credits each day, T2 Unlimited instances can be a better fit for use in Auto Scaling Groups and other scenarios where large numbers of instances come and go each day.

Available Now
You can launch T2 Unlimited instances today in the US East (Northern Virginia), US East (Ohio), US West (Northern California), US West (Oregon), Canada (Central), South America (São Paulo), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Asia Pacific (Mumbai), Asia Pacific (Seoul), EU (Frankfurt), EU (Ireland), and EU (London) Regions today.

Jeff;

 

Amazon GuardDuty – Continuous Security Monitoring & Threat Detection

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-guardduty-continuous-security-monitoring-threat-detection/

Threats to your IT infrastructure (AWS accounts & credentials, AWS resources, guest operating systems, and applications) come in all shapes and sizes! The online world can be a treacherous place and we want to make sure that you have the tools, knowledge, and perspective to keep your IT infrastructure safe & sound.

Amazon GuardDuty is designed to give you just that. Informed by a multitude of public and AWS-generated data feeds and powered by machine learning, GuardDuty analyzes billions of events in pursuit of trends, patterns, and anomalies that are recognizable signs that something is amiss. You can enable it with a click and see the first findings within minutes.

How it Works
GuardDuty voraciously consumes multiple data streams, including several threat intelligence feeds, staying aware of malicious IP addresses, devious domains, and more importantly, learning to accurately identify malicious or unauthorized behavior in your AWS accounts. In combination with information gleaned from your VPC Flow Logs, AWS CloudTrail Event Logs, and DNS logs, this allows GuardDuty to detect many different types of dangerous and mischievous behavior including probes for known vulnerabilities, port scans and probes, and access from unusual locations. On the AWS side, it looks for suspicious AWS account activity such as unauthorized deployments, unusual CloudTrail activity, patterns of access to AWS API functions, and attempts to exceed multiple service limits. GuardDuty will also look for compromised EC2 instances talking to malicious entities or services, data exfiltration attempts, and instances that are mining cryptocurrency.

GuardDuty operates completely on AWS infrastructure and does not affect the performance or reliability of your workloads. You do not need to install or manage any agents, sensors, or network appliances. This clean, zero-footprint model should appeal to your security team and allow them to green-light the use of GuardDuty across all of your AWS accounts.

Findings are presented to you at one of three levels (low, medium, or high), accompanied by detailed evidence and recommendations for remediation. The findings are also available as Amazon CloudWatch Events; this allows you to use your own AWS Lambda functions to automatically remediate specific types of issues. This mechanism also allows you to easily push GuardDuty findings into event management systems such as Splunk, Sumo Logic, and PagerDuty and to workflow systems like JIRA, ServiceNow, and Slack.

A Quick Tour
Let’s take a quick tour. I open up the GuardDuty Console and click on Get started:

Then I confirm that I want to enable GuardDuty. This gives it permission to set up the appropriate service-linked roles and to analyze my logs by clicking on Enable GuardDuty:

My own AWS environment isn’t all that exciting, so I visit the General Settings and click on Generate sample findings to move ahead. Now I’ve got some intriguing findings:

I can click on a finding to learn more:

The magnifying glass icons allow me to create inclusion or exclusion filters for the associated resource, action, or other value. I can filter for all of the findings related to this instance:

I can customize GuardDuty by adding lists of trusted IP addresses and lists of malicious IP addresses that are peculiar to my environment:

After I enable GuardDuty in my administrator account, I can invite my other accounts to participate:

Once the accounts decide to participate, GuardDuty will arrange for their findings to be shared with the administrator account.

I’ve barely scratched the surface of GuardDuty in the limited space and time that I have. You can try it out at no charge for 30 days; after that you pay based on the number of entries it processes from your VPC Flow, CloudTrail, and DNS logs.

Available Now
Amazon GuardDuty is available in production form in the US East (Northern Virginia), US East (Ohio), US West (Oregon), US West (Northern California), EU (Ireland), EU (Frankfurt), EU (London), South America (São Paulo), Canada (Central), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), and Asia Pacific (Mumbai) Regions and you can start using it today!

Jeff;

How to Patch, Inspect, and Protect Microsoft Windows Workloads on AWS—Part 2

Post Syndicated from Koen van Blijderveen original https://aws.amazon.com/blogs/security/how-to-patch-inspect-and-protect-microsoft-windows-workloads-on-aws-part-2/

Yesterday in Part 1 of this blog post, I showed you how to:

  1. Launch an Amazon EC2 instance with an AWS Identity and Access Management (IAM) role, an Amazon Elastic Block Store (Amazon EBS) volume, and tags that Amazon EC2 Systems Manager (Systems Manager) and Amazon Inspector use.
  2. Configure Systems Manager to install the Amazon Inspector agent and patch your EC2 instances.

Today in Steps 3 and 4, I show you how to:

  1. Take Amazon EBS snapshots using Amazon EBS Snapshot Scheduler to automate snapshots based on instance tags.
  2. Use Amazon Inspector to check if your EC2 instances running Microsoft Windows contain any common vulnerabilities and exposures (CVEs).

To catch up on Steps 1 and 2, see yesterday’s blog post.

Step 3: Take EBS snapshots using EBS Snapshot Scheduler

In this section, I show you how to use EBS Snapshot Scheduler to take snapshots of your instances at specific intervals. To do this, I will show you how to:

  • Determine the schedule for EBS Snapshot Scheduler by providing you with best practices.
  • Deploy EBS Snapshot Scheduler by using AWS CloudFormation.
  • Tag your EC2 instances so that EBS Snapshot Scheduler backs up your instances when you want them backed up.

In addition to making sure your EC2 instances have all the available operating system patches applied on a regular schedule, you should take snapshots of the EBS storage volumes attached to your EC2 instances. Taking regular snapshots allows you to restore your data to a previous state quickly and cost effectively. With Amazon EBS snapshots, you pay only for the actual data you store, and snapshots save only the data that has changed since the previous snapshot, which minimizes your cost. You will use EBS Snapshot Scheduler to make regular snapshots of your EC2 instance. EBS Snapshot Scheduler takes advantage of other AWS services including CloudFormation, Amazon DynamoDB, and AWS Lambda to make backing up your EBS volumes simple.

Determine the schedule

As a best practice, you should back up your data frequently during the hours when your data changes the most. This reduces the amount of data you lose if you have to restore from a snapshot. For the purposes of this blog post, the data for my instances changes the most between the business hours of 9:00 A.M. to 5:00 P.M. Pacific Time. During these hours, I will make snapshots hourly to minimize data loss.

In addition to backing up frequently, another best practice is to establish a strategy for retention. This will vary based on how you need to use the snapshots. If you have compliance requirements to be able to restore for auditing, your needs may be different than if you are able to detect data corruption within three hours and simply need to restore to something that limits data loss to five hours. EBS Snapshot Scheduler enables you to specify the retention period for your snapshots. For this post, I only need to keep snapshots for recent business days. To account for weekends, I will set my retention period to three days, which is down from the default of 15 days when deploying EBS Snapshot Scheduler.

Deploy EBS Snapshot Scheduler

In Step 1 of Part 1 of this post, I showed how to configure an EC2 for Windows Server 2012 R2 instance with an EBS volume. You will use EBS Snapshot Scheduler to take eight snapshots each weekday of your EC2 instance’s EBS volumes:

  1. Navigate to the EBS Snapshot Scheduler deployment page and choose Launch Solution. This takes you to the CloudFormation console in your account. The Specify an Amazon S3 template URL option is already selected and prefilled. Choose Next on the Select Template page.
  2. On the Specify Details page, retain all default parameters except for AutoSnapshotDeletion. Set AutoSnapshotDeletion to Yes to ensure that old snapshots are periodically deleted. The default retention period is 15 days (you will specify a shorter value on your instance in the next subsection).
  3. Choose Next twice to move to the Review step, and start deployment by choosing the I acknowledge that AWS CloudFormation might create IAM resources check box and then choosing Create.

Tag your EC2 instances

EBS Snapshot Scheduler takes a few minutes to deploy. While waiting for its deployment, you can start to tag your instance to define its schedule. EBS Snapshot Scheduler reads tag values and looks for four possible custom parameters in the following order:

  • <snapshot time> – Time in 24-hour format with no colon.
  • <retention days> – The number of days (a positive integer) to retain the snapshot before deletion, if set to automatically delete snapshots.
  • <time zone> – The time zone of the times specified in <snapshot time>.
  • <active day(s)>all, weekdays, or mon, tue, wed, thu, fri, sat, and/or sun.

Because you want hourly backups on weekdays between 9:00 A.M. and 5:00 P.M. Pacific Time, you need to configure eight tags—one for each hour of the day. You will add the eight tags shown in the following table to your EC2 instance.

Tag Value
scheduler:ebs-snapshot:0900 0900;3;utc;weekdays
scheduler:ebs-snapshot:1000 1000;3;utc;weekdays
scheduler:ebs-snapshot:1100 1100;3;utc;weekdays
scheduler:ebs-snapshot:1200 1200;3;utc;weekdays
scheduler:ebs-snapshot:1300 1300;3;utc;weekdays
scheduler:ebs-snapshot:1400 1400;3;utc;weekdays
scheduler:ebs-snapshot:1500 1500;3;utc;weekdays
scheduler:ebs-snapshot:1600 1600;3;utc;weekdays

Next, you will add these tags to your instance. If you want to tag multiple instances at once, you can use Tag Editor instead. To add the tags in the preceding table to your EC2 instance:

  1. Navigate to your EC2 instance in the EC2 console and choose Tags in the navigation pane.
  2. Choose Add/Edit Tags and then choose Create Tag to add all the tags specified in the preceding table.
  3. Confirm you have added the tags by choosing Save. After adding these tags, navigate to your EC2 instance in the EC2 console. Your EC2 instance should look similar to the following screenshot.
    Screenshot of how your EC2 instance should look in the console
  4. After waiting a couple of hours, you can see snapshots beginning to populate on the Snapshots page of the EC2 console.Screenshot of snapshots beginning to populate on the Snapshots page of the EC2 console
  5. To check if EBS Snapshot Scheduler is active, you can check the CloudWatch rule that runs the Lambda function. If the clock icon shown in the following screenshot is green, the scheduler is active. If the clock icon is gray, the rule is disabled and does not run. You can enable or disable the rule by selecting it, choosing Actions, and choosing Enable or Disable. This also allows you to temporarily disable EBS Snapshot Scheduler.Screenshot of checking to see if EBS Snapshot Scheduler is active
  1. You can also monitor when EBS Snapshot Scheduler has run by choosing the name of the CloudWatch rule as shown in the previous screenshot and choosing Show metrics for the rule.Screenshot of monitoring when EBS Snapshot Scheduler has run by choosing the name of the CloudWatch rule

If you want to restore and attach an EBS volume, see Restoring an Amazon EBS Volume from a Snapshot and Attaching an Amazon EBS Volume to an Instance.

Step 4: Use Amazon Inspector

In this section, I show you how to you use Amazon Inspector to scan your EC2 instance for common vulnerabilities and exposures (CVEs) and set up Amazon SNS notifications. To do this I will show you how to:

  • Install the Amazon Inspector agent by using EC2 Run Command.
  • Set up notifications using Amazon SNS to notify you of any findings.
  • Define an Amazon Inspector target and template to define what assessment to perform on your EC2 instance.
  • Schedule Amazon Inspector assessment runs to assess your EC2 instance on a regular interval.

Amazon Inspector can help you scan your EC2 instance using prebuilt rules packages, which are built and maintained by AWS. These prebuilt rules packages tell Amazon Inspector what to scan for on the EC2 instances you select. Amazon Inspector provides the following prebuilt packages for Microsoft Windows Server 2012 R2:

  • Common Vulnerabilities and Exposures
  • Center for Internet Security Benchmarks
  • Runtime Behavior Analysis

In this post, I’m focused on how to make sure you keep your EC2 instances patched, backed up, and inspected for common vulnerabilities and exposures (CVEs). As a result, I will focus on how to use the CVE rules package and use your instance tags to identify the instances on which to run the CVE rules. If your EC2 instance is fully patched using Systems Manager, as described earlier, you should not have any findings with the CVE rules package. Regardless, as a best practice I recommend that you use Amazon Inspector as an additional layer for identifying any unexpected failures. This involves using Amazon CloudWatch to set up weekly Amazon Inspector scans, and configuring Amazon Inspector to notify you of any findings through SNS topics. By acting on the notifications you receive, you can respond quickly to any CVEs on any of your EC2 instances to help ensure that malware using known CVEs does not affect your EC2 instances. In a previous blog post, Eric Fitzgerald showed how to remediate Amazon Inspector security findings automatically.

Install the Amazon Inspector agent

To install the Amazon Inspector agent, you will use EC2 Run Command, which allows you to run any command on any of your EC2 instances that have the Systems Manager agent with an attached IAM role that allows access to Systems Manager.

  1. Choose Run Command under Systems Manager Services in the navigation pane of the EC2 console. Then choose Run a command.
    Screenshot of choosing "Run a command"
  2. To install the Amazon Inspector agent, you will use an AWS managed and provided command document that downloads and installs the agent for you on the selected EC2 instance. Choose AmazonInspector-ManageAWSAgent. To choose the target EC2 instance where this command will be run, use the tag you previously assigned to your EC2 instance, Patch Group, with a value of Windows Servers. For this example, set the concurrent installations to 1 and tell Systems Manager to stop after 5 errors.
    Screenshot of installing the Amazon Inspector agent
  3. Retain the default values for all other settings on the Run a command page and choose Run. Back on the Run Command page, you can see if the command that installed the Amazon Inspector agent executed successfully on all selected EC2 instances.
    Screenshot showing that the command that installed the Amazon Inspector agent executed successfully on all selected EC2 instances

Set up notifications using Amazon SNS

Now that you have installed the Amazon Inspector agent, you will set up an SNS topic that will notify you of any findings after an Amazon Inspector run.

To set up an SNS topic:

  1. In the AWS Management Console, choose Simple Notification Service under Messaging in the Services menu.
  2. Choose Create topic, name your topic (only alphanumeric characters, hyphens, and underscores are allowed) and give it a display name to ensure you know what this topic does (I’ve named mine Inspector). Choose Create topic.
    "Create new topic" page
  3. To allow Amazon Inspector to publish messages to your new topic, choose Other topic actions and choose Edit topic policy.
  4. For Allow these users to publish messages to this topic and Allow these users to subscribe to this topic, choose Only these AWS users. Type the following ARN for the US East (N. Virginia) Region in which you are deploying the solution in this post: arn:aws:iam::316112463485:root. This is the ARN of Amazon Inspector itself. For the ARNs of Amazon Inspector in other AWS Regions, see Setting Up an SNS Topic for Amazon Inspector Notifications (Console). Amazon Resource Names (ARNs) uniquely identify AWS resources across all of AWS.
    Screenshot of editing the topic policy
  5. To receive notifications from Amazon Inspector, subscribe to your new topic by choosing Create subscription and adding your email address. After confirming your subscription by clicking the link in the email, the topic should display your email address as a subscriber. Later, you will configure the Amazon Inspector template to publish to this topic.
    Screenshot of subscribing to the new topic

Define an Amazon Inspector target and template

Now that you have set up the notification topic by which Amazon Inspector can notify you of findings, you can create an Amazon Inspector target and template. A target defines which EC2 instances are in scope for Amazon Inspector. A template defines which packages to run, for how long, and on which target.

To create an Amazon Inspector target:

  1. Navigate to the Amazon Inspector console and choose Get started. At the time of writing this blog post, Amazon Inspector is available in the US East (N. Virginia), US West (N. California), US West (Oregon), EU (Ireland), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Sydney), and Asia Pacific (Tokyo) Regions.
  2. For Amazon Inspector to be able to collect the necessary data from your EC2 instance, you must create an IAM service role for Amazon Inspector. Amazon Inspector can create this role for you if you choose Choose or create role and confirm the role creation by choosing Allow.
    Screenshot of creating an IAM service role for Amazon Inspector
  3. Amazon Inspector also asks you to tag your EC2 instance and install the Amazon Inspector agent. You already performed these steps in Part 1 of this post, so you can proceed by choosing Next. To define the Amazon Inspector target, choose the previously used Patch Group tag with a Value of Windows Servers. This is the same tag that you used to define the targets for patching. Then choose Next.
    Screenshot of defining the Amazon Inspector target
  4. Now, define your Amazon Inspector template, and choose a name and the package you want to run. For this post, use the Common Vulnerabilities and Exposures package and choose the default duration of 1 hour. As you can see, the package has a version number, so always select the latest version of the rules package if multiple versions are available.
    Screenshot of defining an assessment template
  5. Configure Amazon Inspector to publish to your SNS topic when findings are reported. You can also choose to receive a notification of a started run, a finished run, or changes in the state of a run. For this blog post, you want to receive notifications if there are any findings. To start, choose Assessment Templates from the Amazon Inspector console and choose your newly created Amazon Inspector assessment template. Choose the icon below SNS topics (see the following screenshot).
    Screenshot of choosing an assessment template
  6. A pop-up appears in which you can choose the previously created topic and the events about which you want SNS to notify you (choose Finding reported).
    Screenshot of choosing the previously created topic and the events about which you want SNS to notify you

Schedule Amazon Inspector assessment runs

The last step in using Amazon Inspector to assess for CVEs is to schedule the Amazon Inspector template to run using Amazon CloudWatch Events. This will make sure that Amazon Inspector assesses your EC2 instance on a regular basis. To do this, you need the Amazon Inspector template ARN, which you can find under Assessment templates in the Amazon Inspector console. CloudWatch Events can run your Amazon Inspector assessment at an interval you define using a Cron-based schedule. Cron is a well-known scheduling agent that is widely used on UNIX-like operating systems and uses the following syntax for CloudWatch Events.

Image of Cron schedule

All scheduled events use a UTC time zone, and the minimum precision for schedules is one minute. For more information about scheduling CloudWatch Events, see Schedule Expressions for Rules.

To create the CloudWatch Events rule:

  1. Navigate to the CloudWatch console, choose Events, and choose Create rule.
    Screenshot of starting to create a rule in the CloudWatch Events console
  2. On the next page, specify if you want to invoke your rule based on an event pattern or a schedule. For this blog post, you will select a schedule based on a Cron expression.
  3. You can schedule the Amazon Inspector assessment any time you want using the Cron expression, or you can use the Cron expression I used in the following screenshot, which will run the Amazon Inspector assessment every Sunday at 10:00 P.M. GMT.
    Screenshot of scheduling an Amazon Inspector assessment with a Cron expression
  4. Choose Add target and choose Inspector assessment template from the drop-down menu. Paste the ARN of the Amazon Inspector template you previously created in the Amazon Inspector console in the Assessment template box and choose Create a new role for this specific resource. This new role is necessary so that CloudWatch Events has the necessary permissions to start the Amazon Inspector assessment. CloudWatch Events will automatically create the new role and grant the minimum set of permissions needed to run the Amazon Inspector assessment. To proceed, choose Configure details.
    Screenshot of adding a target
  5. Next, give your rule a name and a description. I suggest using a name that describes what the rule does, as shown in the following screenshot.
  6. Finish the wizard by choosing Create rule. The rule should appear in the Events – Rules section of the CloudWatch console.
    Screenshot of completing the creation of the rule
  7. To confirm your CloudWatch Events rule works, wait for the next time your CloudWatch Events rule is scheduled to run. For testing purposes, you can choose your CloudWatch Events rule and choose Edit to change the schedule to run it sooner than scheduled.
    Screenshot of confirming the CloudWatch Events rule works
  8. Now navigate to the Amazon Inspector console to confirm the launch of your first assessment run. The Start time column shows you the time each assessment started and the Status column the status of your assessment. In the following screenshot, you can see Amazon Inspector is busy Collecting data from the selected assessment targets.
    Screenshot of confirming the launch of the first assessment run

You have concluded the last step of this blog post by setting up a regular scan of your EC2 instance with Amazon Inspector and a notification that will let you know if your EC2 instance is vulnerable to any known CVEs. In a previous Security Blog post, Eric Fitzgerald explained How to Remediate Amazon Inspector Security Findings Automatically. Although that blog post is for Linux-based EC2 instances, the post shows that you can learn about Amazon Inspector findings in other ways than email alerts.

Conclusion

In this two-part blog post, I showed how to make sure you keep your EC2 instances up to date with patching, how to back up your instances with snapshots, and how to monitor your instances for CVEs. Collectively these measures help to protect your instances against common attack vectors that attempt to exploit known vulnerabilities. In Part 1, I showed how to configure your EC2 instances to make it easy to use Systems Manager, EBS Snapshot Scheduler, and Amazon Inspector. I also showed how to use Systems Manager to schedule automatic patches to keep your instances current in a timely fashion. In Part 2, I showed you how to take regular snapshots of your data by using EBS Snapshot Scheduler and how to use Amazon Inspector to check if your EC2 instances running Microsoft Windows contain any common vulnerabilities and exposures (CVEs).

If you have comments about today’s or yesterday’s post, submit them in the “Comments” section below. If you have questions about or issues implementing any part of this solution, start a new thread on the Amazon EC2 forum or the Amazon Inspector forum, or contact AWS Support.

– Koen

Resume AWS Step Functions from Any State

Post Syndicated from Andy Katz original https://aws.amazon.com/blogs/compute/resume-aws-step-functions-from-any-state/


Yash Pant, Solutions Architect, AWS


Aaron Friedman, Partner Solutions Architect, AWS

When we discuss how to build applications with customers, we often align to the Well Architected Framework pillars of security, reliability, performance efficiency, cost optimization, and operational excellence. Designing for failure is an essential component to developing well architected applications that are resilient to spurious errors that may occur.

There are many ways you can use AWS services to achieve high availability and resiliency of your applications. For example, you can couple Elastic Load Balancing with Auto Scaling and Amazon EC2 instances to build highly available applications. Or use Amazon API Gateway and AWS Lambda to rapidly scale out a microservices-based architecture. Many AWS services have built in solutions to help with the appropriate error handling, such as Dead Letter Queues (DLQ) for Amazon SQS or retries in AWS Batch.

AWS Step Functions is an AWS service that makes it easy for you to coordinate the components of distributed applications and microservices. Step Functions allows you to easily design for failure, by incorporating features such as error retries and custom error handling from AWS Lambda exceptions. These features allow you to programmatically handle many common error modes and build robust, reliable applications.

In some rare cases, however, your application may fail in an unexpected manner. In these situations, you might not want to duplicate in a repeat execution those portions of your state machine that have already run. This is especially true when orchestrating long-running jobs or executing a complex state machine as part of a microservice. Here, you need to know the last successful state in your state machine from which to resume, so that you don’t duplicate previous work. In this post, we present a solution to enable you to resume from any given state in your state machine in the case of an unexpected failure.

Resuming from a given state

To resume a failed state machine execution from the state at which it failed, you first run a script that dynamically creates a new state machine. When the new state machine is executed, it resumes the failed execution from the point of failure. The script contains the following two primary steps:

  1. Parse the execution history of the failed execution to find the name of the state at which it failed, as well as the JSON input to that state.
  2. Create a new state machine, which adds an additional state to failed state machine, called "GoToState". "GoToState" is a choice state at the beginning of the state machine that branches execution directly to the failed state, allowing you to skip states that had succeeded in the previous execution.

The full script along with a CloudFormation template that creates a demo of this is available in the aws-sfn-resume-from-any-state GitHub repo.

Diving into the script

In this section, we walk you through the script and highlight the core components of its functionality. The script contains a main function, which adds a command line parameter for the failedExecutionArn so that you can easily call the script from the command line:

python gotostate.py --failedExecutionArn '<Failed_Execution_Arn>'

Identifying the failed state in your execution

First, the script extracts the name of the failed state along with the input to that state. It does so by using the failed state machine execution history, which is identified by the Amazon Resource Name (ARN) of the execution. The failed state is marked in the execution history, along with the input to that state (which is also the output of the preceding successful state). The script is able to parse these values from the log.

The script loops through the execution history of the failed state machine, and traces it backwards until it finds the failed state. If the state machine failed in a parallel state, then it must restart from the beginning of the parallel state. The script is able to capture the name of the parallel state that failed, rather than any substate within the parallel state that may have caused the failure. The following code is the Python function that does this.


def parseFailureHistory(failedExecutionArn):

    '''
    Parses the execution history of a failed state machine to get the name of failed state and the input to the failed state:
    Input failedExecutionArn = A string containing the execution ARN of a failed state machine y
    Output = A list with two elements: [name of failed state, input to failed state]
    '''
    failedAtParallelState = False
    try:
        #Get the execution history
        response = client.get\_execution\_history(
            executionArn=failedExecutionArn,
            reverseOrder=True
        )
        failedEvents = response['events']
    except Exception as ex:
        raise ex
    #Confirm that the execution actually failed, raise exception if it didn't fail.
    try:
        failedEvents[0]['executionFailedEventDetails']
    except:
        raise('Execution did not fail')
        
    '''
    If you have a 'States.Runtime' error (for example, if a task state in your state machine attempts to execute a Lambda function in a different region than the state machine), get the ID of the failed state, and use it to determine the failed state name and input.
    '''
    
    if failedEvents[0]['executionFailedEventDetails']['error'] == 'States.Runtime':
        failedId = int(filter(str.isdigit, str(failedEvents[0]['executionFailedEventDetails']['cause'].split()[13])))
        failedState = failedEvents[-1 \* failedId]['stateEnteredEventDetails']['name']
        failedInput = failedEvents[-1 \* failedId]['stateEnteredEventDetails']['input']
        return (failedState, failedInput)
        
    '''
    You need to loop through the execution history, tracing back the executed steps.
    The first state you encounter is the failed state. If you failed on a parallel state, you need the name of the parallel state rather than the name of a state within a parallel state that it failed on. This is because you can only attach goToState to the parallel state, but not a substate within the parallel state.
    This loop starts with the ID of the latest event and uses the previous event IDs to trace back the execution to the beginning (id 0). However, it returns as soon it finds the name of the failed state.
    '''

    currentEventId = failedEvents[0]['id']
    while currentEventId != 0:
        #multiply event ID by -1 for indexing because you're looking at the reversed history
        currentEvent = failedEvents[-1 \* currentEventId]
        
        '''
        You can determine if the failed state was a parallel state because it and an event with 'type'='ParallelStateFailed' appears in the execution history before the name of the failed state
        '''

        if currentEvent['type'] == 'ParallelStateFailed':
            failedAtParallelState = True

        '''
        If the failed state is not a parallel state, then the name of failed state to return is the name of the state in the first 'TaskStateEntered' event type you run into when tracing back the execution history
        '''

        if currentEvent['type'] == 'TaskStateEntered' and failedAtParallelState == False:
            failedState = currentEvent['stateEnteredEventDetails']['name']
            failedInput = currentEvent['stateEnteredEventDetails']['input']
            return (failedState, failedInput)

        '''
        If the failed state was a parallel state, then you need to trace execution back to the first event with 'type'='ParallelStateEntered', and return the name of the state
        '''

        if currentEvent['type'] == 'ParallelStateEntered' and failedAtParallelState:
            failedState = failedState = currentEvent['stateEnteredEventDetails']['name']
            failedInput = currentEvent['stateEnteredEventDetails']['input']
            return (failedState, failedInput)
        #Update the ID for the next execution of the loop
        currentEventId = currentEvent['previousEventId']
        

Create the new state machine

The script uses the name of the failed state to create the new state machine, with "GoToState" branching execution directly to the failed state.

To do this, the script requires the Amazon States Language (ASL) definition of the failed state machine. It modifies the definition to append "GoToState", and create a new state machine from it.

The script gets the ARN of the failed state machine from the execution ARN of the failed state machine. This ARN allows it to get the ASL definition of the failed state machine by calling the DesribeStateMachine API action. It creates a new state machine with "GoToState".

When the script creates the new state machine, it also adds an additional input variable called "resuming". When you execute this new state machine, you specify this resuming variable as true in the input JSON. This tells "GoToState" to branch execution to the state that had previously failed. Here’s the function that does this:

def attachGoToState(failedStateName, stateMachineArn):

    '''
    Given a state machine ARN and the name of a state in that state machine, create a new state machine that starts at a new choice state called 'GoToState'. "GoToState" branches to the named state, and sends the input of the state machine to that state, when a variable called "resuming" is set to True.
    Input failedStateName = A string with the name of the failed state
          stateMachineArn = A string with the ARN of the state machine
    Output response from the create_state_machine call, which is the API call that creates a new state machine
    '''

    try:
        response = client.describe\_state\_machine(
            stateMachineArn=stateMachineArn
        )
    except:
        raise('Could not get ASL definition of state machine')
    roleArn = response['roleArn']
    stateMachine = json.loads(response['definition'])
    #Create a name for the new state machine
    newName = response['name'] + '-with-GoToState'
    #Get the StartAt state for the original state machine, because you point the 'GoToState' to this state
    originalStartAt = stateMachine['StartAt']

    '''
    Create the GoToState with the variable $.resuming.
    If new state machine is executed with $.resuming = True, then the state machine skips to the failed state.
    Otherwise, it executes the state machine from the original start state.
    '''

    goToState = {'Type':'Choice', 'Choices':[{'Variable':'$.resuming', 'BooleanEquals':False, 'Next':originalStartAt}], 'Default':failedStateName}
    #Add GoToState to the set of states in the new state machine
    stateMachine['States']['GoToState'] = goToState
    #Add StartAt
    stateMachine['StartAt'] = 'GoToState'
    #Create new state machine
    try:
        response = client.create_state_machine(
            name=newName,
            definition=json.dumps(stateMachine),
            roleArn=roleArn
        )
    except:
        raise('Failed to create new state machine with GoToState')
    return response

Testing the script

Now that you understand how the script works, you can test it out.

The following screenshot shows an example state machine that has failed, called "TestMachine". This state machine successfully completed "FirstState" and "ChoiceState", but when it branched to "FirstMatchState", it failed.

Use the script to create a new state machine that allows you to rerun this state machine, but skip the "FirstState" and the "ChoiceState" steps that already succeeded. You can do this by calling the script as follows:

python gotostate.py --failedExecutionArn 'arn:aws:states:us-west-2:<AWS_ACCOUNT_ID>:execution:TestMachine-with-GoToState:b2578403-f41d-a2c7-e70c-7500045288595

This creates a new state machine called "TestMachine-with-GoToState", and returns its ARN, along with the input that had been sent to "FirstMatchState". You can then inspect the input to determine what caused the error. In this case, you notice that the input to "FirstMachState" was the following:

{
"foo": 1,
"Message": true
}

However, this state machine expects the "Message" field of the JSON to be a string rather than a Boolean. Execute the new "TestMachine-with-GoToState" state machine, change the input to be a string, and add the "resuming" variable that "GoToState" requires:

{
"foo": 1,
"Message": "Hello!",
"resuming":true
}

When you execute the new state machine, it skips "FirstState" and "ChoiceState", and goes directly to "FirstMatchState", which was the state that failed:

Look at what happens when you have a state machine with multiple parallel steps. This example is included in the GitHub repository associated with this post. The repo contains a CloudFormation template that sets up this state machine and provides instructions to replicate this solution.

The following state machine, "ParallelStateMachine", takes an input through two subsequent parallel states before doing some final processing and exiting, along with the JSON with the ASL definition of the state machine.

{
  "Comment": "An example of the Amazon States Language using a parallel state to execute two branches at the same time.",
  "StartAt": "Parallel",
  "States": {
    "Parallel": {
      "Type": "Parallel",
      "ResultPath":"$.output",
      "Next": "Parallel 2",
      "Branches": [
        {
          "StartAt": "Parallel Step 1, Process 1",
          "States": {
            "Parallel Step 1, Process 1": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:us-west-2:XXXXXXXXXXXX:function:LambdaA",
              "End": true
            }
          }
        },
        {
          "StartAt": "Parallel Step 1, Process 2",
          "States": {
            "Parallel Step 1, Process 2": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:us-west-2:XXXXXXXXXXXX:function:LambdaA",
              "End": true
            }
          }
        }
      ]
    },
    "Parallel 2": {
      "Type": "Parallel",
      "Next": "Final Processing",
      "Branches": [
        {
          "StartAt": "Parallel Step 2, Process 1",
          "States": {
            "Parallel Step 2, Process 1": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:us-west-2:XXXXXXXXXXXXX:function:LambdaB",
              "End": true
            }
          }
        },
        {
          "StartAt": "Parallel Step 2, Process 2",
          "States": {
            "Parallel Step 2, Process 2": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:us-west-2:XXXXXXXXXXXX:function:LambdaB",
              "End": true
            }
          }
        }
      ]
    },
    "Final Processing": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-west-2:XXXXXXXXXXXX:function:LambdaC",
      "End": true
    }
  }
}

First, use an input that initially fails:

{
  "Message": "Hello!"
}

This fails because the state machine expects you to have a variable in the input JSON called "foo" in the second parallel state to run "Parallel Step 2, Process 1" and "Parallel Step 2, Process 2". Instead, the original input gets processed by the first parallel state and produces the following output to pass to the second parallel state:

{
"output": [
    {
      "Message": "Hello!"
    },
    {
      "Message": "Hello!"
    }
  ],
}

Run the script on the failed state machine to create a new state machine that allows it to resume directly at the second parallel state instead of having to redo the first parallel state. This creates a new state machine called "ParallelStateMachine-with-GoToState". The following JSON was created by the script to define the new state machine in ASL. It contains the "GoToState" value that was attached by the script.

{
   "Comment":"An example of the Amazon States Language using a parallel state to execute two branches at the same time.",
   "States":{
      "Final Processing":{
         "Resource":"arn:aws:lambda:us-west-2:XXXXXXXXXXXX:function:LambdaC",
         "End":true,
         "Type":"Task"
      },
      "GoToState":{
         "Default":"Parallel 2",
         "Type":"Choice",
         "Choices":[
            {
               "Variable":"$.resuming",
               "BooleanEquals":false,
               "Next":"Parallel"
            }
         ]
      },
      "Parallel":{
         "Branches":[
            {
               "States":{
                  "Parallel Step 1, Process 1":{
                     "Resource":"arn:aws:lambda:us-west-2:XXXXXXXXXXXX:function:LambdaA",
                     "End":true,
                     "Type":"Task"
                  }
               },
               "StartAt":"Parallel Step 1, Process 1"
            },
            {
               "States":{
                  "Parallel Step 1, Process 2":{
                     "Resource":"arn:aws:lambda:us-west-2:XXXXXXXXXXXX:LambdaA",
                     "End":true,
                     "Type":"Task"
                  }
               },
               "StartAt":"Parallel Step 1, Process 2"
            }
         ],
         "ResultPath":"$.output",
         "Type":"Parallel",
         "Next":"Parallel 2"
      },
      "Parallel 2":{
         "Branches":[
            {
               "States":{
                  "Parallel Step 2, Process 1":{
                     "Resource":"arn:aws:lambda:us-west-2:XXXXXXXXXXXX:function:LambdaB",
                     "End":true,
                     "Type":"Task"
                  }
               },
               "StartAt":"Parallel Step 2, Process 1"
            },
            {
               "States":{
                  "Parallel Step 2, Process 2":{
                     "Resource":"arn:aws:lambda:us-west-2:XXXXXXXXXXXX:function:LambdaB",
                     "End":true,
                     "Type":"Task"
                  }
               },
               "StartAt":"Parallel Step 2, Process 2"
            }
         ],
         "Type":"Parallel",
         "Next":"Final Processing"
      }
   },
   "StartAt":"GoToState"
}

You can then execute this state machine with the correct input by adding the "foo" and "resuming" variables:

{
  "foo": 1,
  "output": [
    {
      "Message": "Hello!"
    },
    {
      "Message": "Hello!"
    }
  ],
  "resuming": true
}

This yields the following result. Notice that this time, the state machine executed successfully to completion, and skipped the steps that had previously failed.


Conclusion

When you’re building out complex workflows, it’s important to be prepared for failure. You can do this by taking advantage of features such as automatic error retries in Step Functions and custom error handling of Lambda exceptions.

Nevertheless, state machines still have the possibility of failing. With the methodology and script presented in this post, you can resume a failed state machine from its point of failure. This allows you to skip the execution of steps in the workflow that had already succeeded, and recover the process from the point of failure.

To see more examples, please visit the Step Functions Getting Started page.

If you have questions or suggestions, please comment below.

Reports from Netconf and Netdev

Post Syndicated from jake original https://lwn.net/Articles/738912/rss

The Netconf 2017,
Part 2
and Netdev 2.2 conferences were
recently held in Seoul, South Korea. Netconf is an invitation-only
gathering of kernel
networking developers, while Netdev is an open conference for the Linux
networking community. Attendees have put together reports
from all five days (two for Netconf and three for Netdev) that LWN is
happy to publish for them. So far, we have coverage from the first day of
each—with more coming soon.

Say Hello To Our Newest AWS Community Heroes (Fall 2017 Edition)

Post Syndicated from Sara Rodas original https://aws.amazon.com/blogs/aws/say-hello-to-our-newest-aws-community-heroes-fall-2017-edition/

The AWS Community Heroes program helps shine a spotlight on some of the innovative work being done by rockstar AWS developers around the globe. Marrying cloud expertise with a passion for community building and education, these heroes share their time and knowledge across social media and through in-person events. Heroes also actively help drive community-led tracks at conferences. At this year’s re:Invent, many Heroes will be speaking during the Monday Community Day track.

This November, we are thrilled to have four Heroes joining our network of cloud innovators. Without further ado, meet to our newest AWS Community Heroes!

 

Anh Ho Viet

Anh Ho Viet is the founder of AWS Vietnam User Group, Co-founder & CEO of OSAM, an AWS Consulting Partner in Vietnam, an AWS Certified Solutions Architect, and a cloud lover.

At OSAM, Anh and his enthusiastic team have helped many companies, from SMBs to Enterprises, move to the cloud with AWS. They offer a wide range of services, including migration, consultation, architecture, and solution design on AWS. Anh’s vision for OSAM is beyond a cloud service provider; the company will take part in building a complete AWS ecosystem in Vietnam, where other companies are encouraged to become AWS partners through training and collaboration activities.

In 2016, Anh founded the AWS Vietnam User Group as a channel to share knowledge and hands-on experience among cloud practitioners. Since then, the community has reached more than 4,800 members and is still expanding. The group holds monthly meetups, connects many SMEs to AWS experts, and provides real-time, free-of-charge consultancy to startups. In August 2017, Anh joined as lead content creator of a program called “Cloud Computing Lectures for Universities” which includes translating AWS documentation & news into Vietnamese, providing students with fundamental, up-to-date knowledge of AWS cloud computing, and supporting students’ career paths.

 

Thorsten Höger

Thorsten Höger is CEO and Cloud consultant at Taimos, where he is advising customers on how to use AWS. Being a developer, he focuses on improving development processes and automating everything to build efficient deployment pipelines for customers of all sizes.

Before being self-employed, Thorsten worked as a developer and CTO of Germany’s first private bank running on AWS. With his colleagues, he migrated the core banking system to the AWS platform in 2013. Since then he organizes the AWS user group in Stuttgart and is a frequent speaker at Meetups, BarCamps, and other community events.

As a supporter of open source software, Thorsten is maintaining or contributing to several projects on Github, like test frameworks for AWS Lambda, Amazon Alexa, or developer tools for CloudFormation. He is also the maintainer of the Jenkins AWS Pipeline plugin.

In his spare time, he enjoys indoor climbing and cooking.

 

Becky Zhang

Yu Zhang (Becky Zhang) is COO of BootDev, which focuses on Big Data solutions on AWS and high concurrency web architecture. Before she helped run BootDev, she was working at Yubis IT Solutions as an operations manager.

Becky plays a key role in the AWS User Group Shanghai (AWSUGSH), regularly organizing AWS UG events including AWS Tech Meetups and happy hours, gathering AWS talent together to communicate the latest technology and AWS services. As a female in technology industry, Becky is keen on promoting Women in Tech and encourages more woman to get involved in the community.

Becky also connects the China AWS User Group with user groups in other regions, including Korea, Japan, and Thailand. She was invited as a panelist at AWS re:Invent 2016 and spoke at the Seoul AWS Summit this April to introduce AWS User Group Shanghai and communicate with other AWS User Groups around the world.

Besides events, Becky also promotes the Shanghai AWS User Group by posting AWS-related tech articles, event forecasts, and event reports to Weibo, Twitter, Meetup.com, and WeChat (which now has over 2000 official account followers).

 

Nilesh Vaghela

Nilesh Vaghela is the founder of ElectroMech Corporation, an AWS Cloud and open source focused company (the company started as an open source motto). Nilesh has been very active in the Linux community since 1998. He started working with AWS Cloud technologies in 2013 and in 2014 he trained a dedicated cloud team and started full support of AWS cloud services as an AWS Standard Consulting Partner. He always works to establish and encourage cloud and open source communities.

He started the AWS Meetup community in Ahmedabad in 2014 and as of now 12 Meetups have been conducted, focusing on various AWS technologies. The Meetup has quickly grown to include over 2000 members. Nilesh also created a Facebook group for AWS enthusiasts in Ahmedabad, with over 1500 members.

Apart from the AWS Meetup, Nilesh has delivered a number of seminars, workshops, and talks around AWS introduction and awareness, at various organizations, as well as at colleges and universities. He has also been active in working with startups, presenting AWS services overviews and discussing how startups can benefit the most from using AWS services.

Nilesh is Red Hat Linux Technologies and AWS Cloud Technologies trainer as well.

 

To learn more about the AWS Community Heroes Program and how to get involved with your local AWS community, click here.

Moonhack 2017: a new world record!

Post Syndicated from Katherine Leadbetter original https://www.raspberrypi.org/blog/moonhack-2017-world-record/

With the incredible success of this year’s Moonhack under their belt, here’s Code Club Australia‘s Kelly Tagalan with a lowdown on the event, and why challenges such as these are so important.

On 15 August 2017, Code Clubs around the globe set a world record for the most kids coding in a day! From Madrid to Manila and from Sydney to Seoul, kids in Code Clubs, homes, and community centres around the world used code in order to ‘hack the moon’.

Moonhack 2017 Recap: WORLDWIDE CODING

We set a world record of the most kids coding at the same time not only across Australia….but across the WORLD! Watch our recap of our day hackathon of kids coding across the globe.

The Moonhack movement

The first Moonhack took place in Sydney in 2016, where we set a record of 10207 kids coding in a day.

Images of children taking part in Code Club Australia's Moonhack 2017

The response to Moonhack, not just in Australia but around the world, blew us away, and this year we decided to make the challenge as global as possible.

“I want to create anything that can benefit the life of one person, hundreds of people, or maybe even thousands.” – Moonhack Code Club kid, Australia.

The Code Club New Zealand team helped to create and execute projects with help from Code Club in the UK, and Code Club Canada, France, South Korea, Bangladesh, and Croatia created translated materials to allow even more kids to take part.

Moonhack 2017

The children had 24 hours to try coding a specially made Moonhack project using Python, Scratch or Scratch Jr. Creative Moonhackers even made their own custom projects, and we saw amazing submissions on a range of themes, from moon football to heroic dogs saving our natural satellite from alien invaders!

Images of children taking part in Code Club Australia's Moonhack 2017

In the end, 28575 kids from 56 countries and from 600 Code Clubs took part in Moonhack to set a new record. Record Setter founder and Senior Adjudicator, Corey Henderson, travelled to Sydney to Moonhack Mission Control to verify the record, and we were thrilled to hear that we came close to tripling the number of kids who took part last year!

The top five Moonhack contributing countries were Australia, New Zealand, the USA, the UK, and Croatia, but we saw contributions from so many more amazing places, including Syria and Guatemala. The event was a truly international Code Club collaboration!

Images of children taking part in Code Club Australia's Moonhack 2017

The founder of Code Club Bangladesh, Shajan Miah, summed up the spirit of Moonhack well: “Moonhack was a great opportunity for children in Bangladesh to take part in a global event. It connected the children with like-minded people across the world, and this motivated them to want to continue learning coding and programming. They really enjoyed the challenge!”

Images of children taking part in Code Club Australia's Moonhack 2017

Of course, the most important thing about Moonhack was that the kids had fun taking part and experienced what it feels like to create with code. One astute nine-year-old told us, “What I love about coding is that you can create your own games. Coding is becoming more important in the work environment and I want to understand it and write it.”

This is why we Moonhack: to get kids excited about coding, and to bring them into the global Code Club community. We hope that every Moonhacker who isn’t yet part of a Code Club will decide to join one soon, and that their experience will help guide them towards a future involving digital making. Here’s to Moonhack 2018!

Join Code Club

With new school terms starting and new clubs forming, there’s never been a better time to volunteer for a Code Club! With the official extension of the Code Club age range from 9-11 to 9-13, there are even more opportunities to get involved.

The Code Club logo with added robots - Moonhack 2017

If you’re ready to volunteer and are looking for a club to join, head to the Code Club International website to find your local network. There you’ll also find information on starting a new club from scratch, anywhere in the world, and you can read all about making your venue, such as a library, youth club, or office, available as a space for a Code Club.

The post Moonhack 2017: a new world record! appeared first on Raspberry Pi.

How Aussie ecommerce stores can compete with the retail giant Amazon

Post Syndicated from chris desantis original https://www.anchor.com.au/blog/2017/08/aussie-ecommerce-stores-vs-amazon/

The powerhouse Amazon retail store is set to launch in Australia toward the end of 2018 and Aussie ecommerce retailers need to ready themselves for the competition storm ahead.

2018 may seem a while away but getting your ecommerce site in tip top shape and ready to compete can take time. Check out these helpful hints from the Anchor crew.

Speed kills

If you’ve ever heard of the tale of the tortoise and the hare, the moral is that “slow and steady wins the race”. This is definitely not the place for that phrase, because if your site loads as slowly as a 1995 dial up connection, your ecommerce store will not, I repeat, will not win the race.

Site speed can be impacted by a number of factors and getting the balance right between a site that loads at lightning speed and delivering engaging content to your audience. There are many ways to check the performance of your site including Anchor’s free hosting check up or pingdom.

Taking action can boost the performance of your site:

Here’s an interesting blog from the WebCEO team about site speed’s impact on conversion rates on-page, or check out our previous blog on maximising site performance.

Show me the money

As an ecommerce store, getting credit card details as fast as possible is probably at the top of your list, but it’s important to remember that it’s an actual person that needs to hand over the details.

Consider the customer’s experience whilst checking out. Making people log in to their account before checkout, can lead to abandoned carts as customers try to remember the vital details. Similarly, making a customer enter all their details before displaying shipping costs is more of an annoyance than a benefit.

Built for growth

Before you blast out a promo email to your entire database or spend up big on PPC, consider what happens when this 5 fold increase in traffic, all jumps onto your site at around the same time.

Will your site come screeching to a sudden halt with a 504 or 408 error message, or ride high on the wave of increased traffic? If you have fixed infrastructure such as a dedicated server, or are utilising a VPS, then consider the maximum concurrent users that your site can handle.

Consider this. Amazon.com.au will be built on the scalable cloud infrastructure of Amazon Web Services and will utilise all the microservices and data mining technology to offer customers a seamless, personalised shopping experience. How will your business compete?

Search ready

Being found online is important for any business, but for ecommerce sites, it’s essential. Gaining results from SEO practices can take time so beware of ‘quick fix guarantees’ from outsourced agencies.

Search Engine Optimisation (SEO) practices can have lasting effects. Good practices can ensure your site is found via organic search without huge advertising budgets, on the other hand ‘black hat’ practices can push your ecommerce store into search oblivion.

SEO takes discipline and focus to get right. Here are some of our favourite hints for SEO greatness from those who live and breathe SEO:

  • Optimise your site for mobile
  • Use Meta Tags wisely
  • Leverage Descriptive alt tags and image file names
  • Create content for people, not bots (keyword stuffing is a no no!)

SEO best practices are continually evolving, but creating a site that is designed to give users a great experience and give them the content they expect to find.

Google My Business is a free service that EVERY business should take advantage of. It is a listing service where your business can provide details such as address, phone number, website, and trading hours. It’s easy to update and manage, you can add photos, a physical address (if applicable), and display shopper reviews.

Get your site ship shape

Overwhelmed by these starter tips? If you are ready to get your site into tip top shape–get in touch. We work with awesome partners like eWave who can help create a seamless online shopping experience.

 

The post How Aussie ecommerce stores can compete with the retail giant Amazon appeared first on AWS Managed Services by Anchor.

NetDev 2.2 registration is now open

Post Syndicated from jake original https://lwn.net/Articles/731573/rss

The registration for the NetDev 2.2 networking conference is now open. It will be held in Seoul, Korea November 8-10. As usual, it will be preceded by the invitation-only Netconf for core kernel networking hackers. “Netdev 2.2 is a community-driven conference geared towards Linux netheads. Linux kernel networking and user space utilization of the interfaces to the Linux kernel networking subsystem are the focus. If you are using Linux as a boot system for proprietary networking, then this conference _may not be for you_.” LWN covered these conferences in 2016 and earlier this year; with luck, we will cover these upcoming conferences as well.

Analyzing AWS Cost and Usage Reports with Looker and Amazon Athena

Post Syndicated from Dillon Morrison original https://aws.amazon.com/blogs/big-data/analyzing-aws-cost-and-usage-reports-with-looker-and-amazon-athena/

This is a guest post by Dillon Morrison at Looker. Looker is, in their own words, “a new kind of analytics platform–letting everyone in your business make better decisions by getting reliable answers from a tool they can use.” 

As the breadth of AWS products and services continues to grow, customers are able to more easily move their technology stack and core infrastructure to AWS. One of the attractive benefits of AWS is the cost savings. Rather than paying upfront capital expenses for large on-premises systems, customers can instead pay variables expenses for on-demand services. To further reduce expenses AWS users can reserve resources for specific periods of time, and automatically scale resources as needed.

The AWS Cost Explorer is great for aggregated reporting. However, conducting analysis on the raw data using the flexibility and power of SQL allows for much richer detail and insight, and can be the better choice for the long term. Thankfully, with the introduction of Amazon Athena, monitoring and managing these costs is now easier than ever.

In the post, I walk through setting up the data pipeline for cost and usage reports, Amazon S3, and Athena, and discuss some of the most common levers for cost savings. I surface tables through Looker, which comes with a host of pre-built data models and dashboards to make analysis of your cost and usage data simple and intuitive.

Analysis with Athena

With Athena, there’s no need to create hundreds of Excel reports, move data around, or deploy clusters to house and process data. Athena uses Apache Hive’s DDL to create tables, and the Presto querying engine to process queries. Analysis can be performed directly on raw data in S3. Conveniently, AWS exports raw cost and usage data directly into a user-specified S3 bucket, making it simple to start querying with Athena quickly. This makes continuous monitoring of costs virtually seamless, since there is no infrastructure to manage. Instead, users can leverage the power of the Athena SQL engine to easily perform ad-hoc analysis and data discovery without needing to set up a data warehouse.

After the data pipeline is established, cost and usage data (the recommended billing data, per AWS documentation) provides a plethora of comprehensive information around usage of AWS services and the associated costs. Whether you need the report segmented by product type, user identity, or region, this report can be cut-and-sliced any number of ways to properly allocate costs for any of your business needs. You can then drill into any specific line item to see even further detail, such as the selected operating system, tenancy, purchase option (on-demand, spot, or reserved), and so on.

Walkthrough

By default, the Cost and Usage report exports CSV files, which you can compress using gzip (recommended for performance). There are some additional configuration options for tuning performance further, which are discussed below.

Prerequisites

If you want to follow along, you need the following resources:

Enable the cost and usage reports

First, enable the Cost and Usage report. For Time unit, select Hourly. For Include, select Resource IDs. All options are prompted in the report-creation window.

The Cost and Usage report dumps CSV files into the specified S3 bucket. Please note that it can take up to 24 hours for the first file to be delivered after enabling the report.

Configure the S3 bucket and files for Athena querying

In addition to the CSV file, AWS also creates a JSON manifest file for each cost and usage report. Athena requires that all of the files in the S3 bucket are in the same format, so we need to get rid of all these manifest files. If you’re looking to get started with Athena quickly, you can simply go into your S3 bucket and delete the manifest file manually, skip the automation described below, and move on to the next section.

To automate the process of removing the manifest file each time a new report is dumped into S3, which I recommend as you scale, there are a few additional steps. The folks at Concurrency labs wrote a great overview and set of scripts for this, which you can find in their GitHub repo.

These scripts take the data from an input bucket, remove anything unnecessary, and dump it into a new output bucket. We can utilize AWS Lambda to trigger this process whenever new data is dropped into S3, or on a nightly basis, or whatever makes most sense for your use-case, depending on how often you’re querying the data. Please note that enabling the “hourly” report means that data is reported at the hour-level of granularity, not that a new file is generated every hour.

Following these scripts, you’ll notice that we’re adding a date partition field, which isn’t necessary but improves query performance. In addition, converting data from CSV to a columnar format like ORC or Parquet also improves performance. We can automate this process using Lambda whenever new data is dropped in our S3 bucket. Amazon Web Services discusses columnar conversion at length, and provides walkthrough examples, in their documentation.

As a long-term solution, best practice is to use compression, partitioning, and conversion. However, for purposes of this walkthrough, we’re not going to worry about them so we can get up-and-running quicker.

Set up the Athena query engine

In your AWS console, navigate to the Athena service, and click “Get Started”. Follow the tutorial and set up a new database (we’ve called ours “AWS Optimizer” in this example). Don’t worry about configuring your initial table, per the tutorial instructions. We’ll be creating a new table for cost and usage analysis. Once you walked through the tutorial steps, you’ll be able to access the Athena interface, and can begin running Hive DDL statements to create new tables.

One thing that’s important to note, is that the Cost and Usage CSVs also contain the column headers in their first row, meaning that the column headers would be included in the dataset and any queries. For testing and quick set-up, you can remove this line manually from your first few CSV files. Long-term, you’ll want to use a script to programmatically remove this row each time a new file is dropped in S3 (every few hours typically). We’ve drafted up a sample script for ease of reference, which we run on Lambda. We utilize Lambda’s native ability to invoke the script whenever a new object is dropped in S3.

For cost and usage, we recommend using the DDL statement below. Since our data is in CSV format, we don’t need to use a SerDe, we can simply specify the “separatorChar, quoteChar, and escapeChar”, and the structure of the files (“TEXTFILE”). Note that AWS does have an OpenCSV SerDe as well, if you prefer to use that.

 

CREATE EXTERNAL TABLE IF NOT EXISTS cost_and_usage	 (
identity_LineItemId String,
identity_TimeInterval String,
bill_InvoiceId String,
bill_BillingEntity String,
bill_BillType String,
bill_PayerAccountId String,
bill_BillingPeriodStartDate String,
bill_BillingPeriodEndDate String,
lineItem_UsageAccountId String,
lineItem_LineItemType String,
lineItem_UsageStartDate String,
lineItem_UsageEndDate String,
lineItem_ProductCode String,
lineItem_UsageType String,
lineItem_Operation String,
lineItem_AvailabilityZone String,
lineItem_ResourceId String,
lineItem_UsageAmount String,
lineItem_NormalizationFactor String,
lineItem_NormalizedUsageAmount String,
lineItem_CurrencyCode String,
lineItem_UnblendedRate String,
lineItem_UnblendedCost String,
lineItem_BlendedRate String,
lineItem_BlendedCost String,
lineItem_LineItemDescription String,
lineItem_TaxType String,
product_ProductName String,
product_accountAssistance String,
product_architecturalReview String,
product_architectureSupport String,
product_availability String,
product_bestPractices String,
product_cacheEngine String,
product_caseSeverityresponseTimes String,
product_clockSpeed String,
product_currentGeneration String,
product_customerServiceAndCommunities String,
product_databaseEdition String,
product_databaseEngine String,
product_dedicatedEbsThroughput String,
product_deploymentOption String,
product_description String,
product_durability String,
product_ebsOptimized String,
product_ecu String,
product_endpointType String,
product_engineCode String,
product_enhancedNetworkingSupported String,
product_executionFrequency String,
product_executionLocation String,
product_feeCode String,
product_feeDescription String,
product_freeQueryTypes String,
product_freeTrial String,
product_frequencyMode String,
product_fromLocation String,
product_fromLocationType String,
product_group String,
product_groupDescription String,
product_includedServices String,
product_instanceFamily String,
product_instanceType String,
product_io String,
product_launchSupport String,
product_licenseModel String,
product_location String,
product_locationType String,
product_maxIopsBurstPerformance String,
product_maxIopsvolume String,
product_maxThroughputvolume String,
product_maxVolumeSize String,
product_maximumStorageVolume String,
product_memory String,
product_messageDeliveryFrequency String,
product_messageDeliveryOrder String,
product_minVolumeSize String,
product_minimumStorageVolume String,
product_networkPerformance String,
product_operatingSystem String,
product_operation String,
product_operationsSupport String,
product_physicalProcessor String,
product_preInstalledSw String,
product_proactiveGuidance String,
product_processorArchitecture String,
product_processorFeatures String,
product_productFamily String,
product_programmaticCaseManagement String,
product_provisioned String,
product_queueType String,
product_requestDescription String,
product_requestType String,
product_routingTarget String,
product_routingType String,
product_servicecode String,
product_sku String,
product_softwareType String,
product_storage String,
product_storageClass String,
product_storageMedia String,
product_technicalSupport String,
product_tenancy String,
product_thirdpartySoftwareSupport String,
product_toLocation String,
product_toLocationType String,
product_training String,
product_transferType String,
product_usageFamily String,
product_usagetype String,
product_vcpu String,
product_version String,
product_volumeType String,
product_whoCanOpenCases String,
pricing_LeaseContractLength String,
pricing_OfferingClass String,
pricing_PurchaseOption String,
pricing_publicOnDemandCost String,
pricing_publicOnDemandRate String,
pricing_term String,
pricing_unit String,
reservation_AvailabilityZone String,
reservation_NormalizedUnitsPerReservation String,
reservation_NumberOfReservations String,
reservation_ReservationARN String,
reservation_TotalReservedNormalizedUnits String,
reservation_TotalReservedUnits String,
reservation_UnitsPerReservation String,
resourceTags_userName String,
resourceTags_usercostcategory String  


)
    ROW FORMAT DELIMITED
      FIELDS TERMINATED BY ','
      ESCAPED BY '\\'
      LINES TERMINATED BY '\n'

STORED AS TEXTFILE
    LOCATION 's3://<<your bucket name>>';

Once you’ve successfully executed the command, you should see a new table named “cost_and_usage” with the below properties. Now we’re ready to start executing queries and running analysis!

Start with Looker and connect to Athena

Setting up Looker is a quick process, and you can try it out for free here (or download from Amazon Marketplace). It takes just a few seconds to connect Looker to your Athena database, and Looker comes with a host of pre-built data models and dashboards to make analysis of your cost and usage data simple and intuitive. After you’re connected, you can use the Looker UI to run whatever analysis you’d like. Looker translates this UI to optimized SQL, so any user can execute and visualize queries for true self-service analytics.

Major cost saving levers

Now that the data pipeline is configured, you can dive into the most popular use cases for cost savings. In this post, I focus on:

  • Purchasing Reserved Instances vs. On-Demand Instances
  • Data transfer costs
  • Allocating costs over users or other Attributes (denoted with resource tags)

On-Demand, Spot, and Reserved Instances

Purchasing Reserved Instances vs On-Demand Instances is arguably going to be the biggest cost lever for heavy AWS users (Reserved Instances run up to 75% cheaper!). AWS offers three options for purchasing instances:

  • On-Demand—Pay as you use.
  • Spot (variable cost)—Bid on spare Amazon EC2 computing capacity.
  • Reserved Instances—Pay for an instance for a specific, allotted period of time.

When purchasing a Reserved Instance, you can also choose to pay all-upfront, partial-upfront, or monthly. The more you pay upfront, the greater the discount.

If your company has been using AWS for some time now, you should have a good sense of your overall instance usage on a per-month or per-day basis. Rather than paying for these instances On-Demand, you should try to forecast the number of instances you’ll need, and reserve them with upfront payments.

The total amount of usage with Reserved Instances versus overall usage with all instances is called your coverage ratio. It’s important not to confuse your coverage ratio with your Reserved Instance utilization. Utilization represents the amount of reserved hours that were actually used. Don’t worry about exceeding capacity, you can still set up Auto Scaling preferences so that more instances get added whenever your coverage or utilization crosses a certain threshold (we often see a target of 80% for both coverage and utilization among savvy customers).

Calculating the reserved costs and coverage can be a bit tricky with the level of granularity provided by the cost and usage report. The following query shows your total cost over the last 6 months, broken out by Reserved Instance vs other instance usage. You can substitute the cost field for usage if you’d prefer. Please note that you should only have data for the time period after the cost and usage report has been enabled (though you can opt for up to 3 months of historical data by contacting your AWS Account Executive). If you’re just getting started, this query will only show a few days.

 

SELECT 
	DATE_FORMAT(from_iso8601_timestamp(cost_and_usage.lineitem_usagestartdate),'%Y-%m') AS "cost_and_usage.usage_start_month",
	COALESCE(SUM(cost_and_usage.lineitem_unblendedcost ), 0) AS "cost_and_usage.total_unblended_cost",
	COALESCE(SUM(CASE WHEN (CASE
         WHEN cost_and_usage.lineitem_lineitemtype = 'DiscountedUsage' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'RIFee' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'Fee' THEN 'RI Line Item'
         ELSE 'Non RI Line Item'
        END = 'RI Line Item') THEN cost_and_usage.lineitem_unblendedcost  ELSE NULL END), 0) AS "cost_and_usage.total_reserved_unblended_cost",
	1.0 * (COALESCE(SUM(CASE WHEN (CASE
         WHEN cost_and_usage.lineitem_lineitemtype = 'DiscountedUsage' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'RIFee' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'Fee' THEN 'RI Line Item'
         ELSE 'Non RI Line Item'
        END = 'RI Line Item') THEN cost_and_usage.lineitem_unblendedcost  ELSE NULL END), 0)) / NULLIF((COALESCE(SUM(cost_and_usage.lineitem_unblendedcost ), 0)),0)  AS "cost_and_usage.percent_spend_on_ris",
	COALESCE(SUM(CASE WHEN (CASE
         WHEN cost_and_usage.lineitem_lineitemtype = 'DiscountedUsage' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'RIFee' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'Fee' THEN 'RI Line Item'
         ELSE 'Non RI Line Item'
        END = 'Non RI Line Item') THEN cost_and_usage.lineitem_unblendedcost  ELSE NULL END), 0) AS "cost_and_usage.total_non_reserved_unblended_cost",
	1.0 * (COALESCE(SUM(CASE WHEN (CASE
         WHEN cost_and_usage.lineitem_lineitemtype = 'DiscountedUsage' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'RIFee' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'Fee' THEN 'RI Line Item'
         ELSE 'Non RI Line Item'
        END = 'Non RI Line Item') THEN cost_and_usage.lineitem_unblendedcost  ELSE NULL END), 0)) / NULLIF((COALESCE(SUM(cost_and_usage.lineitem_unblendedcost ), 0)),0)  AS "cost_and_usage.percent_spend_on_non_ris"
FROM aws_optimizer.cost_and_usage  AS cost_and_usage

WHERE 
	(((from_iso8601_timestamp(cost_and_usage.lineitem_usagestartdate)) >= ((DATE_ADD('month', -5, DATE_TRUNC('MONTH', CAST(NOW() AS DATE))))) AND (from_iso8601_timestamp(cost_and_usage.lineitem_usagestartdate)) < ((DATE_ADD('month', 6, DATE_ADD('month', -5, DATE_TRUNC('MONTH', CAST(NOW() AS DATE))))))))
GROUP BY 1
ORDER BY 2 DESC
LIMIT 500

The resulting table should look something like the image below (I’m surfacing tables through Looker, though the same table would result from querying via command line or any other interface).

With a BI tool, you can create dashboards for easy reference and monitoring. New data is dumped into S3 every few hours, so your dashboards can update several times per day.

It’s an iterative process to understand the appropriate number of Reserved Instances needed to meet your business needs. After you’ve properly integrated Reserved Instances into your purchasing patterns, the savings can be significant. If your coverage is consistently below 70%, you should seriously consider adjusting your purchase types and opting for more Reserved instances.

Data transfer costs

One of the great things about AWS data storage is that it’s incredibly cheap. Most charges often come from moving and processing that data. There are several different prices for transferring data, broken out largely by transfers between regions and availability zones. Transfers between regions are the most costly, followed by transfers between Availability Zones. Transfers within the same region and same availability zone are free unless using elastic or public IP addresses, in which case there is a cost. You can find more detailed information in the AWS Pricing Docs. With this in mind, there are several simple strategies for helping reduce costs.

First, since costs increase when transferring data between regions, it’s wise to ensure that as many services as possible reside within the same region. The more you can localize services to one specific region, the lower your costs will be.

Second, you should maximize the data you’re routing directly within AWS services and IP addresses. Transfers out to the open internet are the most costly and least performant mechanisms of data transfers, so it’s best to keep transfers within AWS services.

Lastly, data transfers between private IP addresses are cheaper than between elastic or public IP addresses, so utilizing private IP addresses as much as possible is the most cost-effective strategy.

The following query provides a table depicting the total costs for each AWS product, broken out transfer cost type. Substitute the “lineitem_productcode” field in the query to segment the costs by any other attribute. If you notice any unusually high spikes in cost, you’ll need to dig deeper to understand what’s driving that spike: location, volume, and so on. Drill down into specific costs by including “product_usagetype” and “product_transfertype” in your query to identify the types of transfer costs that are driving up your bill.

SELECT 
	cost_and_usage.lineitem_productcode  AS "cost_and_usage.product_code",
	COALESCE(SUM(cost_and_usage.lineitem_unblendedcost), 0) AS "cost_and_usage.total_unblended_cost",
	COALESCE(SUM(CASE WHEN REGEXP_LIKE(cost_and_usage.product_usagetype, 'DataTransfer')    THEN cost_and_usage.lineitem_unblendedcost  ELSE NULL END), 0) AS "cost_and_usage.total_data_transfer_cost",
	COALESCE(SUM(CASE WHEN REGEXP_LIKE(cost_and_usage.product_usagetype, 'DataTransfer-In')    THEN cost_and_usage.lineitem_unblendedcost  ELSE NULL END), 0) AS "cost_and_usage.total_inbound_data_transfer_cost",
	COALESCE(SUM(CASE WHEN REGEXP_LIKE(cost_and_usage.product_usagetype, 'DataTransfer-Out')    THEN cost_and_usage.lineitem_unblendedcost  ELSE NULL END), 0) AS "cost_and_usage.total_outbound_data_transfer_cost"
FROM aws_optimizer.cost_and_usage  AS cost_and_usage

WHERE 
	(((from_iso8601_timestamp(cost_and_usage.lineitem_usagestartdate)) >= ((DATE_ADD('month', -5, DATE_TRUNC('MONTH', CAST(NOW() AS DATE))))) AND (from_iso8601_timestamp(cost_and_usage.lineitem_usagestartdate)) < ((DATE_ADD('month', 6, DATE_ADD('month', -5, DATE_TRUNC('MONTH', CAST(NOW() AS DATE))))))))
GROUP BY 1
ORDER BY 2 DESC
LIMIT 500

When moving between regions or over the open web, many data transfer costs also include the origin and destination location of the data movement. Using a BI tool with mapping capabilities, you can get a nice visual of data flows. The point at the center of the map is used to represent external data flows over the open internet.

Analysis by tags

AWS provides the option to apply custom tags to individual resources, so you can allocate costs over whatever customized segment makes the most sense for your business. For a SaaS company that hosts software for customers on AWS, maybe you’d want to tag the size of each customer. The following query uses custom tags to display the reserved, data transfer, and total cost for each AWS service, broken out by tag categories, over the last 6 months. You’ll want to substitute the cost_and_usage.resourcetags_customersegment and cost_and_usage.customer_segment with the name of your customer field.

 

SELECT * FROM (
SELECT *, DENSE_RANK() OVER (ORDER BY z___min_rank) as z___pivot_row_rank, RANK() OVER (PARTITION BY z__pivot_col_rank ORDER BY z___min_rank) as z__pivot_col_ordering FROM (
SELECT *, MIN(z___rank) OVER (PARTITION BY "cost_and_usage.product_code") as z___min_rank FROM (
SELECT *, RANK() OVER (ORDER BY CASE WHEN z__pivot_col_rank=1 THEN (CASE WHEN "cost_and_usage.total_unblended_cost" IS NOT NULL THEN 0 ELSE 1 END) ELSE 2 END, CASE WHEN z__pivot_col_rank=1 THEN "cost_and_usage.total_unblended_cost" ELSE NULL END DESC, "cost_and_usage.total_unblended_cost" DESC, z__pivot_col_rank, "cost_and_usage.product_code") AS z___rank FROM (
SELECT *, DENSE_RANK() OVER (ORDER BY CASE WHEN "cost_and_usage.customer_segment" IS NULL THEN 1 ELSE 0 END, "cost_and_usage.customer_segment") AS z__pivot_col_rank FROM (
SELECT 
	cost_and_usage.lineitem_productcode  AS "cost_and_usage.product_code",
	cost_and_usage.resourcetags_customersegment  AS "cost_and_usage.customer_segment",
	COALESCE(SUM(cost_and_usage.lineitem_unblendedcost ), 0) AS "cost_and_usage.total_unblended_cost",
	1.0 * (COALESCE(SUM(CASE WHEN REGEXP_LIKE(cost_and_usage.product_usagetype, 'DataTransfer')    THEN cost_and_usage.lineitem_unblendedcost  ELSE NULL END), 0)) / NULLIF((COALESCE(SUM(cost_and_usage.lineitem_unblendedcost ), 0)),0)  AS "cost_and_usage.percent_spend_data_transfers_unblended",
	1.0 * (COALESCE(SUM(CASE WHEN (CASE
         WHEN cost_and_usage.lineitem_lineitemtype = 'DiscountedUsage' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'RIFee' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'Fee' THEN 'RI Line Item'
         ELSE 'Non RI Line Item'
        END = 'Non RI Line Item') THEN cost_and_usage.lineitem_unblendedcost  ELSE NULL END), 0)) / NULLIF((COALESCE(SUM(cost_and_usage.lineitem_unblendedcost ), 0)),0)  AS "cost_and_usage.unblended_percent_spend_on_ris"
FROM aws_optimizer.cost_and_usage_raw  AS cost_and_usage

WHERE 
	(((from_iso8601_timestamp(cost_and_usage.lineitem_usagestartdate)) >= ((DATE_ADD('month', -5, DATE_TRUNC('MONTH', CAST(NOW() AS DATE))))) AND (from_iso8601_timestamp(cost_and_usage.lineitem_usagestartdate)) < ((DATE_ADD('month', 6, DATE_ADD('month', -5, DATE_TRUNC('MONTH', CAST(NOW() AS DATE))))))))
GROUP BY 1,2) ww
) bb WHERE z__pivot_col_rank <= 16384
) aa
) xx
) zz
 WHERE z___pivot_row_rank <= 500 OR z__pivot_col_ordering = 1 ORDER BY z___pivot_row_rank

The resulting table in this example looks like the results below. In this example, you can tell that we’re making poor use of Reserved Instances because they represent such a small portion of our overall costs.

Again, using a BI tool to visualize these costs and trends over time makes the analysis much easier to consume and take action on.

Summary

Saving costs on your AWS spend is always an iterative, ongoing process. Hopefully with these queries alone, you can start to understand your spending patterns and identify opportunities for savings. However, this is just a peek into the many opportunities available through analysis of the Cost and Usage report. Each company is different, with unique needs and usage patterns. To achieve maximum cost savings, we encourage you to set up an analytics environment that enables your team to explore all potential cuts and slices of your usage data, whenever it’s necessary. Exploring different trends and spikes across regions, services, user types, etc. helps you gain comprehensive understanding of your major cost levers and consistently implement new cost reduction strategies.

Note that all of the queries and analysis provided in this post were generated using the Looker data platform. If you’re already a Looker customer, you can get all of this analysis, additional pre-configured dashboards, and much more using Looker Blocks for AWS.


About the Author

Dillon Morrison leads the Platform Ecosystem at Looker. He enjoys exploring new technologies and architecting the most efficient data solutions for the business needs of his company and their customers. In his spare time, you’ll find Dillon rock climbing in the Bay Area or nose deep in the docs of the latest AWS product release at his favorite cafe (“Arlequin in SF is unbeatable!”).

 

 

 

Showtime Seeks Injunction to Stop Mayweather v McGregor Piracy

Post Syndicated from Andy original https://torrentfreak.com/showtime-seeks-injunction-to-stop-mayweather-v-mcgregor-piracy-170816/

It’s the fight that few believed would become reality but on August 26, at the T-Mobile Arena in Las Vegas, Floyd Mayweather Jr. will duke it out with UFC lightweight champion Conor McGregor.

Despite being labeled a freak show by boxing purists, it is set to become the biggest combat sports event of all time. Mayweather, undefeated in his professional career, will face brash Irishman McGregor, who has gained a reputation for accepting fights with anyone – as long as there’s a lot of money involved. Big money is definitely the theme of the Mayweather bout.

Dubbed “The Money Fight”, some predict it could pull in a billion dollars, with McGregor pocketing $100m and Mayweather almost certainly more. Many of those lucky enough to gain entrance on the night will have spent thousands on their tickets but for the millions watching around the world….iiiiiiiit’s Showtimmme….with hefty PPV prices attached.

Of course, not everyone will be handing over $89.95 to $99.99 to watch the event officially on Showtime. Large numbers will turn to the many hundreds of websites set to stream the fight for free online, which has the potential to reduce revenues for all involved. With that in mind, Showtime Networks has filed a lawsuit in California which attempts to preemptively tackle this piracy threat.

The suit targets a number of John Does said to be behind a network of dozens of sites planning to stream the fight online for free. Defendant 1, using the alias “Kopa Mayweather”, is allegedly the operator of LiveStreamHDQ, a site that Showtime has grappled with previously.

“Plaintiff has had extensive experience trying to prevent live streaming websites from engaging in the unauthorized reproduction and distribution of Plaintiff’s copyrighted works in the past,” the lawsuit reads.

“In addition to bringing litigation, this experience includes sending cease and desist demands to LiveStreamHDQ in response to its unauthorized live streaming of the record-breaking fight between Floyd Mayweather, Jr. and Manny Pacquiao.”

Showtime says that LiveStreamHDQ is involved in the operations of at least 41 other sites that have been set up to specifically target people seeking to watch the fight without paying. Each site uses a .US ccTLD domain name.

Sample of the sites targeted by the lawsuit

Showtime informs the court that the registrant email and IP addresses of the domains overlap, which provides further proof that they’re all part of the same operation. The TV network also highlights various statements on the sites in question which demonstrate intent to show the fight without permission, including the highly dubious “Watch From Here Mayweather vs Mcgregor Live with 4k Display.”

In addition, the lawsuit is highly critical of efforts by the sites’ operator(s) to stuff the pages with fight-related keywords in order to draw in as much search engine traffic as they can.

“Plaintiff alleges that Defendants have engaged in such keyword stuffing as a form of search engine optimization in an effort to attract as much web traffic as possible in the form of Internet users searching for a way to access a live stream of the Fight,” it reads.

While site operators are expected to engage in such behavior, Showtime says that these SEO efforts have been particularly successful, obtaining high-ranking positions in major search engines for the would-be pirate sites.

For instance, Showtime says that a Google search for “Mayweather McGregor Live” results in four of the target websites appearing in the first 100 results, i.e the first 10 pages. Interestingly, however, to get that result searchers would need to put the search in quotes as shown above, since a plain search fails to turn anything up in hundreds of results.

At this stage, the important thing to note is that none of the sites are currently carrying links to the fight, because the fight is yet to happen. Nevertheless, Showtime is convinced that come fight night, all of the target websites will be populated with pirate links, accessible for free or after paying a fee. This needs to be stopped, it argues.

“Defendants’ anticipated unlawful distribution will impair the marketability and profitability of the Coverage, and interfere with Plaintiff’s own authorized distribution of the Coverage, because Defendants will provide consumers with an opportunity to view the Coverage in its entirety for free, rather than paying for the Coverage provided through Plaintiff’s authorized channels.

“This is especially true where, as here, the work at issue is live coverage of a one-time live sporting event whose outcome is unknown,” the network writes.

Showtime informs the court that it made efforts to contact the sites in question but had just a single response from an individual who claimed to be sports blogger who doesn’t offer streaming services. The undertone is one of disbelief.

In closing, Showtime demands a temporary restraining order, preliminary injunction, and permanent injunction, prohibiting the defendants from making the fight available in any way, and/or “forming new entities” in order to circumvent any subsequent court order. Compensation for suspected damages is also requested.

Showtime previously applied for and obtained a similar injunction to cover the (hugely disappointing) Mayweather v Pacquiao fight in 2015. In that case, websites were ordered to be taken down on the day before the fight.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Wanted: Front End Developer

Post Syndicated from Yev original https://www.backblaze.com/blog/wanted-front-end-developer/

Want to work at a company that helps customers in over 150 countries around the world protect the memories they hold dear? Do you want to challenge yourself with a business that serves consumers, SMBs, Enterprise, and developers? If all that sounds interesting, you might be interested to know that Backblaze is looking for a Front End Developer​!

Backblaze is a 10 year old company. Providing great customer experiences is the “secret sauce” that enables us to successfully compete against some of technology’s giants. We’ll finish the year at ~$20MM ARR and are a profitable business. This is an opportunity to have your work shine at scale in one of the fastest growing verticals in tech – Cloud Storage.

You will utilize HTML, ReactJS, CSS and jQuery to develop intuitive, elegant user experiences. As a member of our Front End Dev team, you will work closely with our web development, software design, and marketing teams.

On a day to day basis, you must be able to convert image mockups to HTML or ReactJS – There’s some production work that needs to get done. But you will also be responsible for helping build out new features, rethink old processes, and enabling third party systems to empower our marketing/sales/ and support teams.

Our Front End Developer must be proficient in:

  • HTML, ReactJS
  • UTF-8, Java Properties, and Localized HTML (Backblaze runs in 11 languages!)
  • JavaScript, CSS, Ajax
  • jQuery, Bootstrap
  • JSON, XML
  • Understanding of cross-browser compatibility issues and ways to work around them
  • Basic SEO principles and ensuring that applications will adhere to them
  • Learning about third party marketing and sales tools through reading documentation. Our systems include Google Tag Manager, Google Analytics, Salesforce, and Hubspot

Struts, Java, JSP, Servlet and Apache Tomcat are a plus, but not required.

We’re looking for someone that is:

  • Passionate about building friendly, easy to use Interfaces and APIs.
  • Likes to work closely with other engineers, support, and marketing to help customers.
  • Is comfortable working independently on a mutually agreed upon prioritization queue (we don’t micromanage, we do make sure tasks are reasonably defined and scoped).
  • Diligent with quality control. Backblaze prides itself on giving our team autonomy to get work done, do the right thing for our customers, and keep a pace that is sustainable over the long run. As such, we expect everyone that checks in code that is stable. We also have a small QA team that operates as a secondary check when needed.

Backblaze Employees Have:

  • Good attitude and willingness to do whatever it takes to get the job done
  • Strong desire to work for a small fast, paced company
  • Desire to learn and adapt to rapidly changing technologies and work environment
  • Comfort with well behaved pets in the office

This position is located in San Mateo, California. Regular attendance in the office is expected. Backblaze is an Equal Opportunity Employer and we offer competitive salary and benefits, including our no policy vacation policy.

If this sounds like you
Send an email to [email protected] with:

  1. Front End Dev​ in the subject line
  2. Your resume attached
  3. An overview of your relevant experience

The post Wanted: Front End Developer appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Time-lapse Visualizes Game of Thrones Piracy Around The Globe

Post Syndicated from Ernesto original https://torrentfreak.com/time-lapse-visualizes-game-of-thrones-piracy-around-the-globe-17-730/

Game of Thrones has been the most pirated TV-show online for years, and this isn’t expected to change anytime soon.

While most of today’s piracy takes place through streaming services, BitTorrent traffic remains significant as well. The show’s episodes are generally downloaded millions of times each, by people from all over the world.

In recent years there have been several attempts to quantify this piracy bonanza. While MILLIONS of downloads make for a good headline, there are some other trends worth looking at as well.

TorrentFreak spoke to Abigail De Kosnik, an Associate Professor at the University of California, Berkeley. Together with computer scientist and artist Benjamin De Kosnik, she runs the BitTorrent-oriented research project “alpha60.”

The goal of alpha60 is to quantify and map BitTorrent activity around various media titles, to make this “shadow economy” visible to media scholars and the general public. Over the past two weeks, they’ve taken a close look at Game of Thrones downloads.

Their tracking software collected swarm data from 72 torrents that were released shortly after the first episode premiered. Before being anonymized, the collected IP-addresses were first translated to geographical locations, to reveal various traffic patterns.

The results, summarized in a white paper, reveal that during the first five days, alpha60 registered an estimated 1.77 million downloads. Of particular interest is the five-day time-lapse of the worldwide swarm activity.

Five-day Game of Thrones piracy timelapse

The time-lapse shows that download patterns vary depending on the time of the day. There is a lot of activity in Asia, but cities such as Athens, Toronto, and Sao Paulo also pop up regularly.

When looking at the absolute numbers, Seoul comes out on top as the Game of Thrones download capital of the world, followed by Athens, São Paulo, Guangzhou, Mumbai, and Bangalore.

Perhaps more interesting is the view of the number of downloads relative to the population, or the “over-pirating” cities, as alpha60 calls them. Here, Dallas comes out on top, before Brisbane, Chicago, Riyadh, Saudi Arabia, Seattle, and Perth.

Of course, VPNs may skew the results somewhat, but overall the data should give a pretty accurate impression of the download traffic around the globe.

Below are the complete top tens of most active cities, both in absolute numbers and relative to the population. Further insights and additional information is available in the full whitepaper, which can be accessed here.

Note: The download totals reported by alpha60 are significantly lower than the MUSO figures that came out last week. Alpha60 stresses, however, that their methods and data are accurate. MUSO, for its part, has made some dubious claims in the past.

Most downloads (absolute)

1 Seoul, Rep. of Korea
2 Athens, Greece
3 São Paulo, Brazil
4 Guangzhou, China
5 Mumbai, India
6 Bangalore, India
7 Shanghai, China
8 Riyadh, Saudi Arabia
9 Delhi, India
10 Beijing, China

Most downloads (relative)

1 Dallas, USA
2 Brisbane, Australia
3 Chicago, USA
4 Riyadh, Saudi Arabia
5 Seattle, USA
6 Perth, Australia
7 Phoenix, USA
8 Toronto, Canada
9 Athens, Greece
10 Guangzhou, China

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

How To Get Your First 1,000 Customers

Post Syndicated from Gleb Budman original https://www.backblaze.com/blog/how-to-get-your-first-1000-customers/

PR for getting your first 1000 customers

If you launch your startup and no one knows, did you actually launch? As mentioned in my last post, our initial launch target was to get a 1,000 people to use our service. But how do you get even 1,000 people to sign up for your service when no one knows who you are?

There are a variety of methods to attract your first 1,000 customers, but launching with the press is my favorite. I’ll explain why and how to do it below.

Paths to Attract Your First 1,000 Customers

Social following: If you have a massive social following, those people are a reasonable target for what you’re offering. In particular if your relationship with them is one where they would buy something you recommend, this can be one of the easiest ways to get your initial customers. However, building this type of following is non-trivial and often is done over several years.

Press not only provides awareness and customers, but credibility and SEO benefits as well

Paid advertising: The advantage of paid ads is you have control over when they are presented and what they say. The primary disadvantage is they tend to be expensive, especially before you have your positioning, messaging, and funnel nailed.

Viral: There are certainly examples of companies that launched with a hugely viral video, blog post, or promotion. While fantastic if it happens, even if you do everything right, the likelihood of massive virality is miniscule and the conversion rate is often low.

Press: As I said, this is my favorite. You don’t need to pay a PR agency and can go from nothing to launched in a couple weeks. Press not only provides awareness and customers, but credibility and SEO benefits as well.

How to Pitch the Press

It’s easy: Have a compelling story, find the right journalists, make their life easy, pitch and follow-up. Of course, each one of those has some nuance, so let’s dig in.

Have a compelling story

How to Get Attention When you’ve been working for months on your startup, it’s easy to get lost in the minutiae when talking to others. Stories that a journalist will write about need to be something their readers will care about. Knowing what story to tell and how to tell it is part science and part art. Here’s how you can get there:

The basics of your story

Ask yourself the following questions, and write down the answers:

  • What are we doing? What product service are we offering?
  • Why? What problem are we solving?
  • What is interesting or unique? Either about what we’re doing, how we’re doing it, or for who we’re doing it.

“But my story isn’t that exciting”

Neither was announcing a data backup company, believe me. Look for angles that make it compelling. Here are some:

  • Did someone on your team do something major before? (build a successful company/product, create some innovation, market something we all know, etc.)
  • Do you have an interesting investor or board member?
  • Is there a personal story that drove you to start this company?
  • Are you starting it in a unique place?
  • Did you come upon the idea in a unique way?
  • Can you share something people want to know that’s not usually shared?
  • Are you partnered with a well-known company?
  • …is there something interesting/entertaining/odd/shocking/touching/etc.?

It doesn’t get much less exciting than, “We’re launching a company that will backup your data.” But there were still a lot of compelling stories:

  • Founded by serial entrepreneurs, bootstrapped a capital-intensive company, committed to each other for a year without salary.
  • Challenging the way that every backup company before was set up by not asking customers to pick and choose files to backup.
  • Designing our own storage system.
  • Etc. etc.

For the initial launch, we focused on “unlimited for $5/month” and statistics from a survey we ran with Harris Interactive that said that 94% of people did not regularly backup their data.

It’s an old adage that “Everyone has a story.” Regardless of what you’re doing, there is always something interesting to share. Dig for that.

The headline

Once you’ve captured what you think the interesting story is, you’ve got to boil it down. Yes, you need the elevator pitch, but this is shorter…it’s the headline pitch. Write the headline that you would love to see a journalist write.

Regardless of what you’re doing, there is always something interesting to share. Dig for that.

Now comes the part where you have to be really honest with yourself: if you weren’t involved, would you care?

The “Techmeme Test”

One way I try to ground myself is what I call the “Techmeme Test”. Techmeme lists the top tech articles. Read the headlines. Imagine the headline you wrote in the middle of the page. If you weren’t involved, would you click on it? Is it more or less compelling than the others. Much of tech news is dominated by the largest companies. If you want to get written about, your story should be more compelling. If not, go back above and explore your story some more.

Embargoes, exclusives and calls-to-action

Journalists write about news. Thus, if you’ve already announced something and are then pitching a journalist to cover it, unless you’re giving her something significant that hasn’t been said, it’s no longer news. As a result, there are ‘embargoes’ and ‘exclusives’.

Embargoes

    • : An embargo simply means that you are sharing news with a journalist that they need to keep private until a certain date and time.

If you’re Apple, this may be a formal and legal document. In our case, it’s as simple as saying, “Please keep embargoed until 4/13/17 at 8am California time.” in the pitch. Some sites explicitly will not keep embargoes; for example The Information will only break news. If you want to launch something later, do not share information with journalists at these sites. If you are only working with a single journalist for a story, and your announcement time is flexible, you can jointly work out a date and time to announce. However, if you have a fixed launch time or are working with a few journalists, embargoes are key.

Exclusives: An exclusive means you’re giving something specifically to that journalist. Most journalists love an exclusive as it means readers have to come to them for the story. One option is to give a journalist an exclusive on the entire story. If it is your dream journalist, this may make sense. Another option, however, is to give exclusivity on certain pieces. For example, for your launch you could give an exclusive on funding detail & a VC interview to a more finance-focused journalist and insight into the tech & a CTO interview to a more tech-focused journalist.

Call-to-Action: With our launch we gave TechCrunch, Ars Technica, and SimplyHelp URLs that gave the first few hundred of their readers access to the private beta. Once those first few hundred users from each site downloaded, the beta would be turned off.

Thus, we used a combination of embargoes, exclusives, and a call-to-action during our initial launch to be able to brief journalists on the news before it went live, give them something they could announce as exclusive, and provide a time-sensitive call-to-action to the readers so that they would actually sign up and not just read and go away.

How to Find the Most Authoritative Sites / Authors

“If a press release is published and no one sees it, was it published?” Perhaps the time existed when sending a press release out over the wire meant journalists would read it and write about it. That time has long been forgotten. Over 1,000 unread press releases are published every day. If you want your compelling story to be covered, you need to find the handful of journalists that will care.

Determine the publications

Find the publications that cover the type of story you want to share. If you’re in tech, Techmeme has a leaderboard of publications ranked by leadership and presence. This list will tell you which publications are likely to have influence. Visit the sites and see if your type of story appears on their site. But, once you’ve determined the publication do NOT send a pitch their “[email protected]” or “[email protected]” email addresses. In all the times I’ve done that, I have never had a single response. Those email addresses are likely on every PR, press release, and spam list and unlikely to get read. Instead…

Determine the journalists

Once you’ve determined which publications cover your area, check which journalists are doing the writing. Skim the articles and search for keywords and competitor names.

Over 1,000 unread press releases are published every day.

Identify one primary journalist at the publication that you would love to have cover you, and secondary ones if there are a few good options. If you’re not sure which one should be the primary, consider a few tests:

  • Do they truly seem to care about the space?
  • Do they write interesting/compelling stories that ‘get it’?
  • Do they appear on the Techmeme leaderboard?
  • Do their articles get liked/tweeted/shared and commented on?
  • Do they have a significant social presence?

Leveraging Google

Google author search by date

In addition to Techmeme or if you aren’t in the tech space Google will become a must have tool for finding the right journalists to pitch. Below the search box you will find a number of tabs. Click on Tools and change the Any time setting to Custom range. I like to use the past six months to ensure I find authors that are actively writing about my market. I start with the All results. This will return a combination of product sites and articles depending upon your search term.

Scan for articles and click on the link to see if the article is on topic. If it is find the author’s name. Often if you click on the author name it will take you to a bio page that includes their Twitter, LinkedIn, and/or Facebook profile. Many times you will find their email address in the bio. You should collect all the information and add it to your outreach spreadsheet. Click here to get a copy. It’s always a good idea to comment on the article to start building awareness of your name. Another good idea is to Tweet or Like the article.

Next click on the News tab and set the same search parameters. You will get a different set of results. Repeat the same steps. Between the two searches you will have a list of authors that actively write for the websites that Google considers the most authoritative on your market.

How to find the most socially shared authors

Buzzsumo search for most shared by date

Your next step is to find the writers whose articles get shared the most socially. Go to Buzzsumo and click on the Most Shared tab. Enter search terms for your market as well as competitor names. Again I like to use the past 6 months as the time range. You will get a list of articles that have been shared the most across Facebook, LinkedIn, Twitter, Pinterest, and Google+. In addition to finding the most shared articles and their authors you can also see some of the Twitter users that shared the article. Many of those Twitter users are big influencers in your market so it’s smart to start following and interacting with them as well as the authors.

How to Find Author Email Addresses

Some journalists publish their contact info right on the stories. For those that don’t, a bit of googling will often get you the email. For example, TechCrunch wrote a story a few years ago where they published all of their email addresses, which was in response to this new service that charges a small fee to provide journalist email addresses. Sometimes visiting their twitter pages will link to a personal site, upon which they will share an email address.

Of course all is not lost if you don’t find an email in the bio. There are two good services for finding emails, https://app.voilanorbert.com/ and https://hunter.io/. For Voila Norbert enter the author name and the website you found their article on. The majority of the time you search for an author on a major publication Norbert will return an accurate email address. If it doesn’t try Hunter.io.

On Hunter.io enter the domain name and click on Personal Only. Then scroll through the results to find the author’s email. I’ve found Norbert to be more accurate overall but between the two you will find most major author’s email addresses.

Email, by the way, is not necessarily the best way to engage a journalist. Many are avid Twitter users. Follow them and engage – that means read/retweet/favorite their tweets; reply to their questions, and generally be helpful BEFORE you pitch them. Later when you email them, you won’t be just a random email address.

Don’t spam

Now that you have all these email addresses (possibly thousands if you purchased a list) – do NOT spam. It is incredibly tempting to think “I could try to figure out which of these folks would be interested, but if I just email all of them, I’ll save myself time and be more likely to get some of them to respond.” Don’t do it.

Follow them and engage – that means read/retweet/favorite their tweets; reply to their questions, and generally be helpful BEFORE you pitch them.

First, you’ll want to tailor your pitch to the individual. Second, it’s a small world and you’ll be known as someone who spams – reputation is golden. Also, don’t call journalists. Unless you know them or they’ve said they’re open to calls, you’re most likely to just annoy them.

Build a relationship

Build Trust with reporters Play the long game. You may be focusing just on the launch and hoping to get this one story covered, but if you don’t quickly flame-out, you will have many more opportunities to tell interesting stories that you’ll want the press to cover. Be honest and don’t exaggerate.
When you have 500 users it’s tempting to say, “We’ve got thousands!” Don’t. The good journalists will see through it and it’ll likely come back to bite you later. If you don’t know something, say “I don’t know but let me find out for you.” Most journalists want to write interesting stories that their readers will appreciate. Help them do that. Build deeper relationships with 5 – 10 journalists, rather than spamming thousands.

Stay organized

It doesn’t need to be complicated, but keep a spreadsheet that includes the name, publication, and contact info of the journalists you care about. Then, use it to keep track of who you’ve pitched, who’s responded, whether you’ve sent them the materials they need, and whether they intend to write/have written.

Make their life easy

Journalists have a million PR people emailing them, are actively engaging with readers on Twitter and in the comments, are tracking their metrics, are working their sources…and all the while needing to publish new articles. They’re busy. Make their life easy and they’re more likely to engage with yours.

Get to know them

Before sending them a pitch, know what they’ve written in the space. If you tell them how your story relates to ones they’ve written, it’ll help them put the story in context, and enable them to possibly link back to a story they wrote before.

Prepare your materials

Journalists will need somewhere to get more info (prepare a fact sheet), a URL to link to, and at least one image (ideally a few to choose from.) A fact sheet gives bite-sized snippets of information they may need about your startup or product: what it is, how big the market is, what’s the pricing, who’s on the team, etc. The URL is where their reader will get the product or more information from you. It doesn’t have to be live when you’re pitching, but you should be able to tell what the URL will be. The images are ones that they could embed in the article: a product screenshot, a CEO or team photo, an infographic. Scan the types of images included in their articles. Don’t send any of these in your pitch, but have them ready. Studies, stats, customer/partner/investor quotes are also good to have.

Pitch

A pitch has to be short and compelling.

Subject Line

Think back to the headline you want. Is it really compelling? Can you shorten it to a subject line? Include what’s happening and when. For Mike Arrington at Techcrunch, our first subject line was “Startup doing an ‘online time machine’”. Later I would include, “launching June 6th”.

For John Timmer at ArsTechnica, it was “Demographics data re: your 4/17 article”. Why? Because he wrote an article titled “WiFi popular with the young people; backups, not so much”. Since we had run a demographics survey on backups, I figured as a science editor he’d be interested in this additional data.

Body

A few key things about the body of the email. It should be short and to the point, no more than a few sentences. Here was my actual, original pitch email to John:

Hey John,

We’re launching Backblaze next week which provides a Time Machine-online type of service. As part of doing some research I read your article about backups not being popular with young people and that you had wished Accenture would have given you demographics. In prep for our invite-only launch I sponsored Harris Interactive to get demographic data on who’s doing backups and if all goes well, I should have that data on Friday.

Next week starts Backup Awareness Month (and yes, probably Clean Your House Month and Brush Your Teeth Month)…but nonetheless…good time to remind readers to backup with a bit of data?

Would you be interested in seeing/talking about the data when I get it?

Would you be interested in getting a sneak peak at Backblaze? (I could give you some invite codes for your readers as well.)

Gleb Budman        

CEO and Co-Founder

Backblaze, Inc.

Automatic, Secure, High-Performance Online Backup

Cell: XXX-XXX-XXXX

The Good: It said what we’re doing, why this relates to him and his readers, provides him information he had asked for in an article, ties to something timely, is clearly tailored for him, is pitched by the CEO and Co-Founder, and provides my cell.

The Bad: It’s too long.

I got better later. Here’s an example:

Subject: Does temperature affect hard drive life?

Hi Peter, there has been much debate about whether temperature affects how long a hard drive lasts. Following up on the Backblaze analyses of how long do drives last & which drives last the longest (that you wrote about) we’ve now analyzed the impact of heat on the nearly 40,000 hard drives we have and found that…

We’re going to publish the results this Monday, 5/12 at 5am California-time. Want a sneak peak of the analysis?

Timing

A common question is “When should I launch?” What day, what time? I prefer to launch on Tuesday at 8am California-time. Launching earlier in the week gives breathing room for the news to live longer. While your launch may be a single article posted and that’s that, if it ends up a larger success, earlier in the week allows other journalists (including ones who are in other countries) to build on the story. Monday announcements can be tough because the journalists generally need to have their stories finished by Friday, and while ideally everything is buttoned up beforehand, startups sometimes use the weekend as overflow before a launch.

The 8am California-time is because it allows articles to be published at the beginning of the day West Coast and around lunch-time East Coast. Later and you risk it being past publishing time for the day. We used to launch at 5am in order to be morning for the East Coast, but it did not seem to have a significant benefit in coverage or impact, but did mean that the entire internal team needed to be up at 3am or 4am. Sometimes that’s critical, but I prefer to not burn the team out when it’s not.

Finally, try to stay clear of holidays, major announcements and large conferences. If Apple is coming out with their next iPhone, many of the tech journalists will be busy at least a couple days prior and possibly a week after. Not always obvious, but if you can, find times that are otherwise going to be slow for news.

Follow-up

There is a fine line between persistence and annoyance. I once had a journalist write me after we had an announcement that was covered by the press, “Why didn’t you let me know?! I would have written about that!” I had sent him three emails about the upcoming announcement to which he never responded.

My general rule is 3 emails.

Ugh. However, my takeaway from this isn’t that I should send 10 emails to every journalist. It’s that sometimes these things happen.

My general rule is 3 emails. If I’ve identified a specific journalist that I think would be interested and have a pitch crafted for her, I’ll send her the email ideally 2 weeks prior to the announcement. I’ll follow-up a week later, and one more time 2 days prior. If she ever says, “I’m not interested in this topic,” I note it and don’t email her on that topic again.

If a journalist wrote, I read the article and engage in the comments (or someone on our team, such as our social guy, @YevP does). We’ll often promote the story through our social channels and email our employees who may choose to share the story as well. This helps us, but also helps the journalist get their story broader reach. Again, the goal is to build a relationship with the journalists your space. If there’s something relevant to your customers that the journalist wrote, you’re providing a service to your customers AND helping the journalist get the word out about the article.

At times the stories also end up shared on sites such as Hacker News, Reddit, Slashdot, or become active conversations on Twitter. Again, we try to engage there and respond to questions (when we do, we are always clear that we’re from Backblaze.)

And finally, I’ll often send a short thank you to the journalist.

Getting Your First 1,000 Customers With Press

As I mentioned at the beginning, there is more than one way to get your first 1,000 customers. My favorite is working with the press to share your story. If you figure out your compelling story, find the right journalists, make their life easy, pitch and follow-up, you stand a high likelyhood of getting coverage and customers. Better yet, that coverage will provide credibility for your company, and if done right, will establish you as a resource for the press for the future.

Like any muscle, this process takes working out. The first time may feel a bit daunting, but just take the steps one at a time. As you do this a few times, the process will be easier and you’ll know who to reach out and quickly determine what stories will be compelling.

The post How To Get Your First 1,000 Customers appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

AWS Price Reduction – SQL Server Standard Edition on EC2

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-price-reduction-sql-server-standard-edition-on-ec2/

I’m happy to be able to announce the 62nd AWS price reduction, this one for Microsoft SQL Server Standard Edition on EC2.

Many enterprise workloads run on Microsoft Windows, primarily on-premises or in corporate data centers. We believe that AWS is the best place to build, deploy, scale, and manage Windows applications due to the breadth of services that we provide, backed up by our global reach and our partner ecosystem. Customers like Adobe, Pitney Bowes, and DeVry University have all moved core production Windows Server workloads to AWS. Their applications run the gamut from SharePoint sites to custom .NET applications and SAP, and frequently use SQL Server.

Microsoft SQL Server on AWS runs on an EC2 Windows instance and can support your application development and migration efforts. It gives you control over every setting, just as you would have if you were running your relational database on-premises, with support for 32-bit and 64-bit versions.

Today we are reducing the On-Demand and Reserved Instance prices for Microsoft SQL Server Standard Edition on EC2 running on R4, M4, I3, and X1 instances by up to 52%, depending on instance type, size, and region. You can build and run enterprise-scale applications, massively scalable websites. and mobile applications even more cost-effectively than before.

Here are the largest price reductions for each region and instance type:

Region R4 M4 I3 X1
US East (Northern Virginia) -51% -29% -50% -52%
US East (Ohio) -51% -29% -50% -52%
US West (Oregon) -51% -29% -50% -52%
US West (Northern California) -51% -30% -50%
Canada (Central) -51% -51% -50% -44%
South America (São Paulo) -49% -30% -48%
EU (Ireland) -51% -29% -50% -51%
EU (Frankfurt) -51% -29% -50% -50%
EU (London) -51% -51% -50% -44%
Asia Pacific (Singapore) -51% -31% -50% -50%
Asia Pacific (Sydney) -51% -30% -50% -50%
Asia Pacific (Tokyo) -51% -29% -50% -50%
Asia Pacific (Seoul)  -51% -31% -50% -50%
Asia Pacific (Mumbai)  -51% -33% -50% -50%

The new, lower prices for On-Demand instances are in effect as of July 1, 2017. The new pricing for Reserved Instances is in effect today.

Jeff;

 

In the Works – AWS Region in Hong Kong

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/in-the-works-aws-region-in-hong-kong/

Last year we launched new AWS Regions in Canada, India, Korea, the UK (London), and the United States (Ohio), and announced that new regions are coming to France (Paris), China (Ningxia), and Sweden (Stockholm).

Coming to Hong Kong in 2018
Today, I am happy to be able to tell you that we are planning to open up an AWS Region in Hong Kong, in 2018. Hong Kong is a leading international financial center, well known for its service oriented economy. It is rated highly on innovation and for ease of doing business. As an evangelist, I get to visit many great cities in the world, and was lucky to have spent some time in Hong Kong back in 2014 and met a number of awesome customers there. Many of these customers have given us feedback that they wanted a local AWS Region.

This will be the eighth AWS Region in Asia Pacific joining six other Regions there — Singapore, Tokyo, Sydney, Beijing, Seoul, and Mumbai, and an additional Region in China (Ningxia) expected to launch in the coming months. Together, these Regions will provide our customers with a total of 19 Availability Zones (AZs) and allow them to architect highly fault tolerant applications.

Today, our infrastructure comprises 43 Availability Zones across 16 geographic regions worldwide, with another three AWS Regions (and eight Availability Zones) in France, China, and Sweden coming online throughout 2017 and 2018, (see the AWS Global Infrastructure page for more info).

We are looking forward to serving new and existing customers in Hong Kong and working with partners across Asia-Pacific. Of course, the new region will also be open to existing AWS customers who would like to process and store data in Hong Kong. Public sector organizations such as government agencies, educational institutions, and nonprofits in Hong Kong will be able to use this region to store sensitive data locally (the AWS in the Public Sector page has plenty of success stories drawn from our worldwide customer base).

If you are a customer or a partner and have specific questions about this Region, you can contact our Hong Kong team.

Help Wanted
If you are interested in learning more about AWS positions in Hong Kong, please visit the Amazon Jobs site and set the location to Hong Kong.

Jeff;

 

Build a Visualization and Monitoring Dashboard for IoT Data with Amazon Kinesis Analytics and Amazon QuickSight

Post Syndicated from Karan Desai original https://aws.amazon.com/blogs/big-data/build-a-visualization-and-monitoring-dashboard-for-iot-data-with-amazon-kinesis-analytics-and-amazon-quicksight/

Customers across the world are increasingly building innovative Internet of Things (IoT) workloads on AWS. With AWS, they can handle the constant stream of data coming from millions of new, internet-connected devices. This data can be a valuable source of information if it can be processed, analyzed, and visualized quickly in a scalable, cost-efficient manner. Engineers and developers can monitor performance and troubleshoot issues while sales and marketing can track usage patterns and statistics to base business decisions.

In this post, I demonstrate a sample solution to build a quick and easy monitoring and visualization dashboard for your IoT data using AWS serverless and managed services. There’s no need for purchasing any additional software or hardware. If you are already using AWS IoT, you can build this dashboard to tap into your existing device data. If you are new to AWS IoT, you can be up and running in minutes using sample data. Later, you can customize it to your needs, as your business grows to millions of devices and messages.

Architecture

The following is a high-level architecture diagram showing the serverless setup to configure.

 

AWS service overview

AWS IoT is a managed cloud platform that lets connected devices interact easily and securely with cloud applications and other devices. AWS IoT can process and route billions of messages to AWS endpoints and to other devices reliably and securely.

Amazon Kinesis Firehose is the easiest way to capture, transform, and load streaming data continuously into AWS from thousands of data sources, such as IoT devices. It is a fully managed service that automatically scales to match the throughput of your data and requires no ongoing administration.

Amazon Kinesis Analytics allows you to process streaming data coming from IoT devices in real time with standard SQL, without having to learn new programming languages or processing frameworks, providing actionable insights promptly.

The processed data is fed into Amazon QuickSight, which is a fast, cloud-powered business analytics service that makes it easy to build visualizations, perform ad-hoc analysis, and quickly get business insights from the data.

The most popular way for Internet-connected devices to send data is using MQTT messages. The AWS IoT gateway receives these messages from registered IoT devices. The solution in this post uses device data from AWS Simple Beer Service (SBS), a series of internet-connected kegerators sending sensor outputs such as temperature, humidity, and sound levels in a JSON payload. You can use any existing IoT data source that you may have.

The AWS IoT rules engine allows selecting data from message payloads, processing it, and sending it to other services. You forward the data to a Firehose delivery stream to consolidate the continuous data stream into batches for further processing. The batched data is also stored temporarily in an Amazon S3 bucket for later retrieval and can be set for deletion after a specified time using S3 Lifecycle Management rules.

The incoming data from the Firehose delivery stream is fed into an Analytics application that provides an easy way to process the data in real time using standard SQL queries. Analytics allows writing standard SQL queries to extract specific components from the incoming data stream and perform real-time ETL on it. In this post, you use this feature to aggregate minimum and maximum temperature values from the sensors per minute. You load it in Amazon QuickSight to create a monitoring dashboard and check if the devices are over-heating or cooling down during use. You also extract every device’s location, parameters such as temperature, sound levels, humidity, and the time stamp in Analytics to use on the visualization dashboard.

The processed data from the two queries is fed into two Firehose delivery streams, both of which batch the data into CSV files every minute and store it in S3. The batching time interval is configurable between 1 and 15 minutes in 1-second intervals.

Finally, you use Amazon QuickSight to ingest the processed CSV files from S3 as a data source to build visualizations. Amazon QuickSight’s super-fast, parallel, in-memory, calculation engine (SPICE) parses the ingested data and allows you to create a variety of visualizations with different graph types. You can also use the Amazon QuickSight built-in Story feature to combine visualizations into business dashboards that can be shared in a secure manner.

Implementation

AWS IoT, Amazon Kinesis, and Amazon QuickSight are all fully managed services, which means you can complete the entire setup in just a few steps using the AWS Management Console. Don’t worry about setting up any underlying hardware or installing any additional software. So, get started.

Step 1. Set up your AWS IoT data source

Do you currently use AWS IoT? If you have an existing IoT thing set up and running on AWS IoT, you can skip to Step 2.

If you have an AWS IoT button or other IoT devices that can publish MQTT messages and would like to use that for the setup, follow the Getting Started with AWS IoT topic to connect your thing to AWS IoT. Continue to Step 2.

If you do not have an existing IoT device, you can generate simulated device data using a script on your local machine and have it publish to AWS IoT. The following script lets you set up your AWS IoT environment and publish simulated data that mimics device data from Simple Beer Service.

Generate sample Data

Running the sbs.py Python script generates fictitious AWS IoT messages from multiple SBS devices. The IoT rule sends the message to Firehose for further processing.

The script requires access to AWS CLI credentials and boto3 installation on the machine running the script. Download and run the following Python script:

https://github.com/awslabs/sbs-iot-data-generator/blob/master/sbs.py

The script generates random data that looks like the following:

{"deviceParameter": "Temperature", "deviceValue": 33, "deviceId": "SBS01", "dateTime": "2017-02-03 11:29:37"}
{"deviceParameter": "Sound", "deviceValue": 140, "deviceId": "SBS03", "dateTime": "2017-02-03 11:29:38"}
{"deviceParameter": "Humidity", "deviceValue": 63, "deviceId": "SBS01", "dateTime": "2017-02-03 11:29:39"}
{"deviceParameter": "Flow", "deviceValue": 80, "deviceId": "SBS04", "dateTime": "2017-02-03 11:29:41"}

Run the script and keep it running for the duration of the project to generate sufficient data.

Tip: If you encounter any issues running the script from your local machine, launch an EC2 instance and run the script there as a root user. Remember to assign an appropriate IAM role to your instance at the time of launch that allows it to access AWS IoT.

Step 2. Create three Firehose delivery streams

For this post, you require three Firehose delivery streams:  one to batch raw data from AWS IoT, and two to batch output device data and aggregated data from Analytics.

  1. In the console, choose Firehose.
  2. Create all three Firehose delivery streams using the following field values.

Delivery stream 1:

Name IoT-Source-Stream
S3 bucket <your unique name>-kinesis
S3 prefix source/

Delivery stream 2:

Name IoT-Destination-Data-Stream
S3 bucket <your unique name>-kinesis
S3 prefix data/

Delivery stream 3:

Name IoT-Destination-Aggregate-Stream
S3 bucket <your unique name>-kinesis
S3 prefix aggregate/

Step 3. Set up AWS IoT to receive and forward incoming data

  1. In the console, choose IoT.
  2. Create a new AWS IoT rule with the following field values.
Name IoT_to_Firehose
Attribute *
Topic Filter /sbs/devicedata/#
Add Action Send messages to an Amazon Kinesis Firehose stream (select IoT-Source-Stream from dropdown)
Select Separator “\n (newline)”

A quick check before proceeding further: make sure that you have run the script to generate simulated IoT data or that your IoT Thing is running and delivering data. If not, set it up now. The Amazon Kinesis Analytics application you set up in the next step needs the data to process it further.

Step 4: Create an Analytics application to process data

  1. In the console, choose Kinesis.
  2. Create a new application.
  3. Enter a name of your choice, for example, SBS-IoT-Data.
  4. For the source, choose IoT-Source-Stream.

Analytics auto-discovers the schema on the data by sampling records from the input stream. It also includes an in-built SQL editor that allows you to write standard SQL queries to transform incoming data.

Tip: If Analytics is unable to discover your incoming data, it may be missing the appropriate IAM permissions. In the IAM console, select the role that you assigned to your IoT rule in Step 3. Make sure that it has the ARN of the IoT-Source-Data Firehose stream listed in the firehose:putRecord section.

Here is a sample SQL query that generates two output streams:

  • DESTINATION_SQL_BASIC_STREAM contains the device ID, device parameter, its value, and the time stamp from the incoming stream.
  • DESTINATION_SQL_AGGREGATE_STREAM aggregates the maximum and minimum values of temperatures from the sensors over a one-minute period from the incoming data.
-- Create an output stream with four columns, which is used to send IoT data to the destination
CREATE OR REPLACE STREAM "DESTINATION_SQL_BASIC_STREAM" (dateTime TIMESTAMP, deviceId VARCHAR(8), deviceParameter VARCHAR(16), deviceValue INTEGER);

-- Create a pump that continuously selects from the source stream and inserts it into the output data stream
CREATE OR REPLACE PUMP "STREAM_PUMP_1" AS INSERT INTO "DESTINATION_SQL_BASIC_STREAM"

-- Filter specific columns from the source stream
SELECT STREAM "dateTime", "deviceId", "deviceParameter", "deviceValue" FROM "SOURCE_SQL_STREAM_001";

-- Create a second output stream with three columns, which is used to send aggregated min/max data to the destination
CREATE OR REPLACE STREAM "DESTINATION_SQL_AGGREGATE_STREAM" (dateTime TIMESTAMP, highestTemp SMALLINT, lowestTemp SMALLINT);

-- Create a pump that continuously selects from a source stream 
CREATE OR REPLACE PUMP "STREAM_PUMP_2" AS INSERT INTO "DESTINATION_SQL_AGGREGATE_STREAM"

-- Extract time in minutes, plus the highest and lowest value of device temperature in that minute, into the destination aggregate stream, aggregated per minute
SELECT STREAM FLOOR("SOURCE_SQL_STREAM_001".ROWTIME TO MINUTE) AS "dateTime", MAX("deviceValue") AS "highestTemp", MIN("deviceValue") AS "lowestTemp" FROM "SOURCE_SQL_STREAM_001" WHERE "deviceParameter"='Temperature' GROUP BY FLOOR("SOURCE_SQL_STREAM_001".ROWTIME TO MINUTE);

Real-time analytics shows the results of the SQL query. If everything is working correctly, you see three streams listed, similar to the following screenshot.

Step 5: Connect the Analytics application to output Firehose delivery streams

You create two destinations for the two delivery streams that you created in the previous step. A single Analytics application can have multiple destinations defined; however, this needs to be set up using the AWS CLI, not from the console. If you do not already have it, install the AWS CLI on your local machine and configure it with your credentials.

Tip: If you are running the IoT script from an EC2 instance, it comes pre-installed with the AWS CLI.

Create the first destination delivery stream 

The AWS CLI command to create a new output Firehose delivery stream is as follows:

aws kinesisanalytics add-application-output --application-name <Name of Analytics Application> --current-application-version-id <number> --application-output 'Name=DESTINATION_SQL_BASIC_STREAM,KinesisFirehoseOutput={ResourceARN=<ARN of IoT-Data-Stream>,RoleARN=<ARN of Analytics application>,DestinationSchema={RecordFormatType=CSV}'

Do not copy this into the CLI just yet! Before entering this command, make the following four changes to personalize it:

  • For Name of Analytics Application, enter the value from Step 4, or from the Analytics console.
  • For current-application-version-ID, run the following command:
aws kinesisanalytics describe-application --application-name <application name from above>; | grep ApplicationVersionId
  • For ResourceARN, run the following command:
aws firehose describe-delivery-stream --delivery-stream-name IoT-Destination-Data-Stream | grep DeliveryStreamARN
  • For RoleARN, run the following command:
aws kinesisanalytics describe-application --application-name <application name from above>; | grep RoleARN

Now, paste the complete command in the AWS CLI and press Enter. If there are any errors, the response provides details. If everything goes well, a new destination delivery stream is created to send the first query (DESTINATION_SQL_BASIC_STREAM) to IoT-Destination-Data-Stream.

Create the second destination delivery stream

Following similar steps as above, create a second destination Firehose delivery stream with the following changes:

  • For Name of Analytics Application, enter the same name as the first delivery stream.
  • For current-application-version-ID, increment by 1 from the previous value (unless you made other changes in between these steps). If unsure, run the same command as above to get it again.
  • For ResourceARN, get the value by running the following CLI command:
aws firehose describe-delivery-stream --delivery-stream-name IoT-Destination-Aggregate-Stream | grep DeliveryStreamARN
  • For RoleArn, enter the same value as the first stream.

Run the aws kinesisanalytics CLI command, similar to the previous step but with the new parameters substituted. This creates the second output Firehose destination delivery stream.

Update the IAM role for Analytics to allow writing to both output streams.

  1. In the console, choose IAM, Roles.
  2. Select the role that you created with Analytics in Step 4.
  3. Choose Policy, JSON, and Edit.
  4. Find “Sid”: “WriteOutputFirehose” in the JSON document, go to the “Resource” section and make sure that it includes Resource ARNs of both streams that you found in the previous step.
  5. If it has only one ARN, add the second ARN and choose Save.

This completes the Amazon Kinesis setup. The incoming IoT data is processed by Analytics and delivered, using two output delivery streams, to two separate folders in your S3 bucket.

Step 6: Set up Amazon QuickSight to analyze the data

To build the visualization dashboard, ingest the processed CSV files from the S3 bucket into Amazon QuickSight.

  1. In the console, choose QuickSight.
  2. If this is your first time using Amazon QuickSight, you are asked to create a new account. Follow the prompts.
  3. When you are logged in to your account, choose New Analysis and enter a name of your choice.
  4. Choose New data set for the analysis or, if you have previously imported your data set, select one from the available data sets.
  5. You import two data sets: one with general device parameters information, and the other with aggregates of maximum and minimum temperatures for monitoring. For the first data set, choose S3 from the list of available data sources and enter a name, for example, IoT Device Data.
  6. The location of the S3 bucket and the objects to use are provided to Amazon QuickSight as a manifest file. Create a new manifest file following the supported formats for Amazon S3 manifest files.
  7. In the URIPrefixes section, provide your appropriate S3 bucket and folder location for the general device data. Hint: it should include <your unique name>-kinesis/data/.

Your manifest file should look similar to the following:

{ 
    "fileLocations": [                                                    
              {"URIPrefixes": ["https://s3.amazonaws.com/<YOUR_BUCKET_NAME>/data/<YEAR>/<MONTH>/<DATE>/<HOUR>/"]}
     ],
     "globalUploadSettings": { 
     "format": "CSV",  
     "delimiter": ","
    }
}

Amazon QuickSight imports and parses the data set, and provides available data fields that can be used for making graphs. The Edit/Preview data button allows you to format and transform the data, change data types, and filter or join your data. Make sure that the columns have the correct titles. If not, you can edit them and then save.

Tip: choose the downward arrow on the top right and unselect Files include headers to give each column appropriate headers. Choose Save. This takes you back to the data sets page.

Follow the same steps as above to import the second data set. This time, your manifest should include your aggregate data set folder on S3, which is named <your unique name>-kinesis/aggregate/. Update headers if necessary and choose Save & visualize.

Build an analysis

The visualization screen shows the data set that you last imported, which in this case is the aggregate data. To include the general device data as well, for Fields on the top left, choose Edit analysis data sets. Choose Add data set and select the other data set that you saved earlier.

Now both data sets are available on the analysis screen. For Visual Types at bottom left, select the type of graph to make. For Fields, select the fields to visualize. For example, drag Device ID, Device Parameter, and Value to Field wells, as shown in the screenshot below, to generate a visualization of average parameter values compared across devices.

You can create another visual by choose +Add. This time, select a line graph to show monitoring of the maximum temperature values of the sensors in any minute, from the aggregate data set.

If you would like to create an interactive story to present to your team or organization, you can choose the Story option on the left panel. Create a dashboard with multiple visualizations, to save and share securely with the intended audience. An example of a story is shown below.

Conclusion

Any data is valuable only when it can be actually put to use. In this post, you’ve seen how it’s possible to quickly build a simple Analytics application to ingest, process, and visualize IoT data in near real time entirely using AWS managed services. This solution is scalable and reliable, and costs a fraction of other business intelligence solutions. It is easy enough that anyone with an AWS account can build and use it without any special training.

If you have any questions or suggestions, please comment below.


About the Author

Karan Desai is a Solutions Architect with Amazon Web Services. He works with startups and small businesses in the US, helping them adopt cloud technology to build scalable and secure solutions using AWS. In his spare time, he likes to build personal IoT projects, travel to offbeat places and write about it.

 

 


Related

Visualize Big Data with Amazon QuickSight, Presto, and Apache Spark on Amazon EMR

 

 

 

 

 

 

 

Hiring a Content Director

Post Syndicated from Ahin Thomas original https://www.backblaze.com/blog/hiring-content-director/


Backblaze is looking to hire a full time Content Director. This role is an essential piece of our team, reporting directly to our VP of Marketing. As the hiring manager, I’d like to tell you a little bit more about the role, how I’m thinking about the collaboration, and why I believe this to be a great opportunity.

A Little About Backblaze and the Role

Since 2007, Backblaze has earned a strong reputation as a leader in data storage. Our products are astonishingly easy to use and affordable to purchase. We have engaged customers and an involved community that helps drive our brand. Our audience numbers in the millions and our primary interaction point is the Backblaze blog. We publish content for engineers (data infrastructure, topics in the data storage world), consumers (how to’s, merits of backing up), and entrepreneurs (business insights). In all categories, our Content Director drives our earned positioned as leaders.

Backblaze has a culture focused on being fair and good (to each other and our customers). We have created a sustainable business that is profitable and growing. Our team places a premium on open communication, being cleverly unconventional, and helping each other out. The Content Director, specifically, balances our needs as a commercial enterprise (at the end of the day, we want to sell our products) with the custodianship of our blog (and the trust of our audience).

There’s a lot of ground to be covered at Backblaze. We have three discreet business lines:

  • Computer Backup -> a 10 year old business focusing on backing up consumer computers.
  • B2 Cloud Storage -> Competing with Amazon, Google, and Microsoft… just at ¼ of the price (but with the same performance characteristics).
  • Business Backup -> Both Computer Backup and B2 Cloud Storage, but focused on SMBs and enterprise.

The Best Candidate Is…

An excellent writer – possessing a solid academic understanding of writing, the creative process, and delivering against deadlines. You know how to write with multiple voices for multiple audiences. We do not expect our Content Director to be a storage infrastructure expert; we do expect a facility with researching topics, accessing our engineering and infrastructure team for guidance, and generally translating the technical into something easy to understand. The best Content Director must be an active participant in the business/ strategy / and editorial debates and then must execute with ruthless precision.

Our Content Director’s “day job” is making sure the blog is running smoothly and the sales team has compelling collateral (emails, case studies, white papers).

Specifically, the Perfect Content Director Excels at:

  • Creating well researched, elegantly constructed content on deadline. For example, each week, 2 articles should be published on our blog. Blog posts should rotate to address the constituencies for our 3 business lines – not all blog posts will appeal to everyone, but over the course of a month, we want multiple compelling pieces for each segment of our audience. Similarly, case studies (and outbound emails) should be tailored to our sales team’s proposed campaigns / audiences. The Content Director creates ~75% of all content but is responsible for editing 100%.
  • Understanding organic methods for weaving business needs into compelling content. The majority of our content (but not EVERY piece) must tie to some business strategy. We hate fluff and hold our promotional content to a standard of being worth someone’s time to read. To be effective, the Content Director must understand the target customer segments and use cases for our products.
  • Straddling both Consumer & SaaS mechanics. A key part of the job will be working to augment the collateral used by our sales team for both B2 Cloud Storage and Business Backup. This content should be compelling and optimized for converting leads. And our foundational business line, Computer Backup, deserves to be nurtured and grown.
  • Product marketing. The Content Director “owns” the blog. But also assists in writing cases studies / white papers, creating collateral (email, trade show). Each of these things has a variety of call to action(s) and audiences. Direct experience is a plus, experience that will plausibly translate to these areas is a requirement.
  • Articulating views on storage, backup, and cloud infrastructure. Not everyone has experience with this. That’s fine, but if you do, it’s strongly beneficial.

A Thursday In The Life:

  • Coordinate Collaborators – We are deliverables driven culture, not a meeting driven one. We expect you to collaborate with internal blog authors and the occasional guest poster.
  • Collaborate with Design – Ensure imagery for upcoming posts / collateral are on track.
  • Augment Sales team – Lock content for next week’s outbound campaign.
  • Self directed blog agenda – Feedback for next Tuesday’s post is addressed, next Thursday’s post is circulated to marketing team for feedback & SEO polish.
  • Review Editorial calendar, make any changes.

Oh! And We Have Great Perks:

  • Competitive healthcare plans
  • Competitive compensation and 401k
  • All employees receive Option grants
  • Unlimited vacation days
  • Strong coffee & fully stocked Micro kitchen
  • Catered breakfast and lunches
  • Awesome people who work on awesome projects
  • Childcare bonus
  • Normal work hours
  • Get to bring your pets into the office
  • San Mateo Office – located near Caltrain and Highways 101 & 280.

Interested in Joining Our Team?

Send us an email to [email protected] with the subject “Content Director”. Please include your resume and 3 brief abstracts for content pieces.
Some hints for each of your three abstracts:

  • Create a compelling headline
  • Write clearly and concisely
  • Be brief, each abstract should be 100 words or less – no longer
  • Target each abstract to a different specific audience that is relevant to our business lines

Thank you for taking the time to read and consider all this. I hope it sounds like a great opportunity for you or someone you know. Principles only need apply.

The post Hiring a Content Director appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.