Tag Archives: fashion

AWS IoT 1-Click – Use Simple Devices to Trigger Lambda Functions

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-iot-1-click-use-simple-devices-to-trigger-lambda-functions/

We announced a preview of AWS IoT 1-Click at AWS re:Invent 2017 and have been refining it ever since, focusing on simplicity and a clean out-of-box experience. Designed to make IoT available and accessible to a broad audience, AWS IoT 1-Click is now generally available, along with new IoT buttons from AWS and AT&T.

I sat down with the dev team a month or two ago to learn about the service so that I could start thinking about my blog post. During the meeting they gave me a pair of IoT buttons and I started to think about some creative ways to put them to use. Here are a few that I came up with:

Help Request – Earlier this month I spent a very pleasant weekend at the HackTillDawn hackathon in Los Angeles. As the participants were hacking away, they occasionally had questions about AWS, machine learning, Amazon SageMaker, and AWS DeepLens. While we had plenty of AWS Solution Architects on hand (decked out in fashionable & distinctive AWS shirts for easy identification), I imagined an IoT button for each team. Pressing the button would alert the SA crew via SMS and direct them to the proper table.

Camera ControlTim Bray and I were in the AWS video studio, prepping for the first episode of Tim’s series on AWS Messaging. Minutes before we opened the Twitch stream I realized that we did not have a clean, unobtrusive way to ask the camera operator to switch to a closeup view. Again, I imagined that a couple of IoT buttons would allow us to make the request.

Remote Dog Treat Dispenser – My dog barks every time a stranger opens the gate in front of our house. While it is great to have confirmation that my Ring doorbell is working, I would like to be able to press a button and dispense a treat so that Luna stops barking!

Homes, offices, factories, schools, vehicles, and health care facilities can all benefit from IoT buttons and other simple IoT devices, all managed using AWS IoT 1-Click.

All About AWS IoT 1-Click
As I said earlier, we have been focusing on simplicity and a clean out-of-box experience. Here’s what that means:

Architects can dream up applications for inexpensive, low-powered devices.

Developers don’t need to write any device-level code. They can make use of pre-built actions, which send email or SMS messages, or write their own custom actions using AWS Lambda functions.

Installers don’t have to install certificates or configure cloud endpoints on newly acquired devices, and don’t have to worry about firmware updates.

Administrators can monitor the overall status and health of each device, and can arrange to receive alerts when a device nears the end of its useful life and needs to be replaced, using a single interface that spans device types and manufacturers.

I’ll show you how easy this is in just a moment. But first, let’s talk about the current set of devices that are supported by AWS IoT 1-Click.

Who’s Got the Button?
We’re launching with support for two types of buttons (both pictured above). Both types of buttons are pre-configured with X.509 certificates, communicate to the cloud over secure connections, and are ready to use.

The AWS IoT Enterprise Button communicates via Wi-Fi. It has a 2000-click lifetime, encrypts outbound data using TLS, and can be configured using BLE and our mobile app. It retails for $19.99 (shipping and handling not included) and can be used in the United States, Europe, and Japan.

The AT&T LTE-M Button communicates via the LTE-M cellular network. It has a 1500-click lifetime, and also encrypts outbound data using TLS. The device and the bundled data plan is available an an introductory price of $29.99 (shipping and handling not included), and can be used in the United States.

We are very interested in working with device manufacturers in order to make even more shapes, sizes, and types of devices (badge readers, asset trackers, motion detectors, and industrial sensors, to name a few) available to our customers. Our team will be happy to tell you about our provisioning tools and our facility for pushing OTA (over the air) updates to large fleets of devices; you can contact them at [email protected].

AWS IoT 1-Click Concepts
I’m eager to show you how to use AWS IoT 1-Click and the buttons, but need to introduce a few concepts first.

Device – A button or other item that can send messages. Each device is uniquely identified by a serial number.

Placement Template – Describes a like-minded collection of devices to be deployed. Specifies the action to be performed and lists the names of custom attributes for each device.

Placement – A device that has been deployed. Referring to placements instead of devices gives you the freedom to replace and upgrade devices with minimal disruption. Each placement can include values for custom attributes such as a location (“Building 8, 3rd Floor, Room 1337”) or a purpose (“Coffee Request Button”).

Action – The AWS Lambda function to invoke when the button is pressed. You can write a function from scratch, or you can make use of a pair of predefined functions that send an email or an SMS message. The actions have access to the attributes; you can, for example, send an SMS message with the text “Urgent need for coffee in Building 8, 3rd Floor, Room 1337.”

Getting Started with AWS IoT 1-Click
Let’s set up an IoT button using the AWS IoT 1-Click Console:

If I didn’t have any buttons I could click Buy devices to get some. But, I do have some, so I click Claim devices to move ahead. I enter the device ID or claim code for my AT&T button and click Claim (I can enter multiple claim codes or device IDs if I want):

The AWS buttons can be claimed using the console or the mobile app; the first step is to use the mobile app to configure the button to use my Wi-Fi:

Then I scan the barcode on the box and click the button to complete the process of claiming the device. Both of my buttons are now visible in the console:

I am now ready to put them to use. I click on Projects, and then Create a project:

I name and describe my project, and click Next to proceed:

Now I define a device template, along with names and default values for the placement attributes. Here’s how I set up a device template (projects can contain several, but I just need one):

The action has two mandatory parameters (phone number and SMS message) built in; I add three more (Building, Room, and Floor) and click Create project:

I’m almost ready to ask for some coffee! The next step is to associate my buttons with this project by creating a placement for each one. I click Create placements to proceed. I name each placement, select the device to associate with it, and then enter values for the attributes that I established for the project. I can also add additional attributes that are peculiar to this placement:

I can inspect my project and see that everything looks good:

I click on the buttons and the SMS messages appear:

I can monitor device activity in the AWS IoT 1-Click Console:

And also in the Lambda Console:

The Lambda function itself is also accessible, and can be used as-is or customized:

As you can see, this is the code that lets me use {{*}}include all of the placement attributes in the message and {{Building}} (for example) to include a specific placement attribute.

Now Available
I’ve barely scratched the surface of this cool new service and I encourage you to give it a try (or a click) yourself. Buy a button or two, build something cool, and let me know all about it!

Pricing is based on the number of enabled devices in your account, measured monthly and pro-rated for partial months. Devices can be enabled or disabled at any time. See the AWS IoT 1-Click Pricing page for more info.

To learn more, visit the AWS IoT 1-Click home page or read the AWS IoT 1-Click documentation.

Jeff;

 

Creating a 1.3 Million vCPU Grid on AWS using EC2 Spot Instances and TIBCO GridServer

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/creating-a-1-3-million-vcpu-grid-on-aws-using-ec2-spot-instances-and-tibco-gridserver/

Many of my colleagues are fortunate to be able to spend a good part of their day sitting down with and listening to our customers, doing their best to understand ways that we can better meet their business and technology needs. This information is treated with extreme care and is used to drive the roadmap for new services and new features.

AWS customers in the financial services industry (often abbreviated as FSI) are looking ahead to the Fundamental Review of Trading Book (FRTB) regulations that will come in to effect between 2019 and 2021. Among other things, these regulations mandate a new approach to the “value at risk” calculations that each financial institution must perform in the four hour time window after trading ends in New York and begins in Tokyo. Today, our customers report this mission-critical calculation consumes on the order of 200,000 vCPUs, growing to between 400K and 800K vCPUs in order to meet the FRTB regulations. While there’s still some debate about the magnitude and frequency with which they’ll need to run this expanded calculation, the overall direction is clear.

Building a Big Grid
In order to make sure that we are ready to help our FSI customers meet these new regulations, we worked with TIBCO to set up and run a proof of concept grid in the AWS Cloud. The periodic nature of the calculation, along with the amount of processing power and storage needed to run it to completion within four hours, make it a great fit for an environment where a vast amount of cost-effective compute power is available on an on-demand basis.

Our customers are already using the TIBCO GridServer on-premises and want to use it in the cloud. This product is designed to run grids at enterprise scale. It runs apps in a virtualized fashion, and accepts requests for resources, dynamically provisioning them on an as-needed basis. The cloud version supports Amazon Linux as well as the PostgreSQL-compatible edition of Amazon Aurora.

Working together with TIBCO, we set out to create a grid that was substantially larger than the current high-end prediction of 800K vCPUs, adding a 50% safety factor and then rounding up to reach 1.3 million vCPUs (5x the size of the largest on-premises grid). With that target in mind, the account limits were raised as follows:

  • Spot Instance Limit – 120,000
  • EBS Volume Limit – 120,000
  • EBS Capacity Limit – 2 PB

If you plan to create a grid of this size, you should also bring your friendly local AWS Solutions Architect into the loop as early as possible. They will review your plans, provide you with architecture guidance, and help you to schedule your run.

Running the Grid
We hit the Go button and launched the grid, watching as it bid for and obtained Spot Instances, each of which booted, initialized, and joined the grid within two minutes. The test workload used the Strata open source analytics & market risk library from OpenGamma and was set up with their assistance.

The grid grew to 61,299 Spot Instances (1.3 million vCPUs drawn from 34 instance types spanning 3 generations of EC2 hardware) as planned, with just 1,937 instances reclaimed and automatically replaced during the run, and cost $30,000 per hour to run, at an average hourly cost of $0.078 per vCPU. If the same instances had been used in On-Demand form, the hourly cost to run the grid would have been approximately $93,000.

Despite the scale of the grid, prices for the EC2 instances did not move during the bidding process. This is due to the overall size of the AWS Cloud and the smooth price change model that we launched late last year.

To give you a sense of the compute power, we computed that this grid would have taken the #1 position on the TOP 500 supercomputer list in November 2007 by a considerable margin, and the #2 position in June 2008. Today, it would occupy position #360 on the list.

I hope that you enjoyed this AWS success story, and that it gives you an idea of the scale that you can achieve in the cloud!

Jeff;

[$] A mapping layer for filesystems

Post Syndicated from jake original https://lwn.net/Articles/753650/rss

In a plenary session on the second day of the Linux Storage, Filesystem,
and Memory-Management Summit (LSFMM), Dave Chinner described his ideas for
a virtual block address-space layer. It would allow “space accounting to be
shared and managed at various layers in the storage stack”. One of the
targets for this work is for filesystems on thin-provisioned devices, where
the filesystem
is larger than the storage devices holding it (and administrators are
expected to add storage as needed); in current systems, running out of
space causes huge problems for filesystems and users because the filesystem
cannot communicate that error in a usable fashion.

Whimsical builds and messing things up

Post Syndicated from Helen Lynn original https://www.raspberrypi.org/blog/whimsical-builds-and-messing-things-up/

Today is the early May bank holiday in England and Wales, a public holiday, and while this blog rarely rests, the Pi Towers team does. So, while we take a day with our families, our friends, and/or our favourite pastimes, I thought I’d point you at a couple of features from HackSpace magazine, our monthly magazine for makers.

To my mind, they go quite well with a deckchair in the garden, the buzz of a lawnmower a few houses down, and a view of the weeds I ought to have dealt with by now, but I’m sure you’ll find your own ambience.

Make anything with pencils – HackSpace magazine

If you want a unique piece of jewellery to show your love for pencils, follow Peter Brown’s lead. Peter glued twelve pencils together in two rows of six. He then measured the size of his finger and drilled a hole between the glued pencils using a drill bit.

First off, pencils. It hadn’t occurred to me that you could make super useful stuff like a miniature crossbow and a catapult out of pencils. Not only can you do this, you can probably go ahead and do it right now: all you need is a handful of pencils, some rubber bands, some drawing pins, and a bulldog clip (or, as you might prefer, some push pins and a binder clip). The sentence that really leaps out at me here is “To keep a handful of boys aged three to eleven occupied during a family trip, Marie decided to build mini crossbows to help their target practice.” The internet hasn’t helped me find out much about Marie, but I am in awe of her.

If you haven’t wandered off to make a stationery arsenal by now, read Lucy Rogers‘ reflections on making a right mess of things. I hope you do, because I think it’d be great if more people coped better with the fact that we all, unavoidably, fail. You probably won’t really get anywhere without a few goes where you just completely muck it all up.

A ceramic mug, broken into several pieces on the floor

Never mind. We can always line a plant pot with them.
“In Pieces” by dusk-photography / CC BY

This true of everything. Wet lab work and gardening and coding and parenting. And everything. You can share your heroic failures in the comments, if you like, as well as any historic weaponry you have fashioned from the contents of your desk tidy.

The post Whimsical builds and messing things up appeared first on Raspberry Pi.

YouTube Won’t Put Up With Blatant Piracy Tutorials Forever

Post Syndicated from Andy original https://torrentfreak.com/youtube-wont-put-up-with-blatant-piracy-tutorials-forever-180506/

Once upon a time, Internet users’ voices would be heard in limited circles, on platforms such as Usenet or other niche platforms.

Then, with the rise of forum platforms such as phpBB in 2000 and Invision Power Board in 2002, thriving communities could gather in public to discuss endless specialist topics, including file-sharing of course.

When dedicated piracy forums began to gain traction, it was pretty much a free-for-all. People discussed obtaining free content absolutely openly. Nothing was taboo and no one considered that there would be any repercussions. As such, moderation was limited to keeping troublemakers in check.

As the years progressed and lawsuits against both sites and services became more commonplace, most sites that weren’t actually serving illegal content began to consider their positions. Run by hobbyists, most didn’t want the hassle of a multi-million dollar lawsuit, so links to pirate content began to diminish and the more overt piracy tutorials began to disappear underground.

Those that remained in plain sight became much more considered. Tutorials on how to pirate specific Hollywood blockbusters were no longer needed, a plain general tutorial would suffice. And, as communities matured and took time to understand the implications of their actions, those without political motivations realized that drawing attention to potential criminality was neither required nor necessary.

Then YouTube and social media happened and almost overnight, no one was in charge and anyone could say whatever they liked.

In this new reality, there were no irritating moderator-type figures removing links to this and that, and nobody warning people against breaking rules that suddenly didn’t exist anymore. In essence, previously tight-knit and street-wise file-sharing and piracy communities not only became fragmented, but also chaotic.

This meant that anyone could become a leader and in some cases, this was the utopia that many had hoped for. Not only couldn’t the record labels or Hollywood tell people what to do anymore, discussion site operators couldn’t either. For those who didn’t abuse the power and for those who knew no better, this was a much-needed breath of fresh air. But, like all good things, it was unlikely to last forever.

Where most file-sharing of yesterday was carried out by hobbyist enthusiasts, many of today’s pirates are far more casual. They’re just as thirsty for content, but they don’t want to spend hours hunting for it. They want it all on a plate, at the flick of a switch, delivered to their TV with a minimum of hassle.

With online discussions increasingly seen as laborious and old-fashioned, many mainstream pirates have turned to easy-to-consume videos. In support of their Kodi media player habits, YouTube has become the educational platform of choice for millions.

As a result, there is now a long line of self-declared Kodi piracy specialists scooping up millions of views on YouTube. Their videos – which in many cases are thinly veiled advertisements for third party addons, Kodi ‘builds’, illegal IPTV services, and obscure Android APKs – are now the main way for a new generation to obtain direct advice on pirating.

Many of the videos are incredibly blatant, like the past 15 years of litigation never happened. All the lessons learned by the phpBB board operators of yesteryear, of how to achieve their goals of sharing information without getting shut down, have been long forgotten. In their place, a barrage of daily videos designed to generate clicks and affiliate revenue, no matter what the cost, no matter what the risk.

It’s pretty clear that these videos are at least partly responsible for the phenomenal uptick in Kodi and Android-based piracy over the past few years. In that respect, many lovers of free content will be eternally grateful for the service they’ve provided. But like many piracy movements over the years, people shouldn’t get too attached to them, at least in their current form.

Thanks to the devil-may-care approach of many influential YouTubers, it won’t be long before a whole new set of moderators begin flexing their muscles. While your average phpBB moderator could be reasoned with in order to get a second chance, a determined and largely faceless YouTube will eject offenders without so much as a clear explanation.

When this happens (and it’s only a question of time given the growing blatancy of many tutorials) YouTubers will not only lose their voices but their revenue streams too. While YouTube’s partner programs bring in some welcome cash, the profitable affiliate schemes touted on these channels for external products will also be under threat.

Perhaps the most surprising thing in this drama-waiting-to-happen is that many of the most popular YouTubers can hardly be considered young and naive. While some are of more tender years, most – with their undoubted skill, knowledge and work ethic – should know better for their 30 or 40 years on this planet. Yet not only do they make their names public, they feature their faces heavily in their videos too.

Still, it’s likely that it will take some big YouTube accounts to fall before YouTubers respond by shaving the sharp edges off their blatant promotion of illegal activity. And there’s little doubt that those advertising products (which is most of them) will have to do so sooner rather than later.

Just this week, YouTube made it clear that it won’t tolerate people making money from the promotion of illegal activities.

“YouTube creators may include paid endorsements as part of their content only if the product or service they are endorsing complies with our advertising policies,” YouTube told the BBC.

“We will be working with creators going forward so they better understand that in video promotions [they] must not promote dishonest activity.”

That being said, like many other players in the piracy and file-sharing space over the past 18 years, YouTubers will eventually begin to learn that not only can the smart survive, they can flourish too.

Sure, there will be people out there who’ll protest that free speech allows citizens to express themselves in a manner of their choosing. But try PM’ing that to YouTube in response to a strike, and see how that fares.

When they say you’re done, the road back is a long one.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

EC2 Fleet – Manage Thousands of On-Demand and Spot Instances with One Request

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/ec2-fleet-manage-thousands-of-on-demand-and-spot-instances-with-one-request/

EC2 Spot Fleets are really cool. You can launch a fleet of Spot Instances that spans EC2 instance types and Availability Zones without having to write custom code to discover capacity or monitor prices. You can set the target capacity (the size of the fleet) in units that are meaningful to your application and have Spot Fleet create and then maintain the fleet on your behalf. Our customers are creating Spot Fleets of all sizes. For example, one financial service customer runs Monte Carlo simulations across 10 different EC2 instance types. They routinely make requests for hundreds of thousands of vCPUs and count on Spot Fleet to give them access to massive amounts of capacity at the best possible price.

EC2 Fleet
Today we are extending and generalizing the set-it-and-forget-it model that we pioneered in Spot Fleet with EC2 Fleet, a new building block that gives you the ability to create fleets that are composed of a combination of EC2 On-Demand, Reserved, and Spot Instances with a single API call. You tell us what you need, capacity and instance-wise, and we’ll handle all the heavy lifting. We will launch, manage, monitor and scale instances as needed, without the need for scaffolding code.

You can specify the capacity of your fleet in terms of instances, vCPUs, or application-oriented units, and also indicate how much of the capacity should be fulfilled by Spot Instances. The application-oriented units allow you to specify the relative power of each EC2 instance type in a way that directly maps to the needs of your application. All three capacity specification options (instances, vCPUs, and application-oriented units) are known as weights.

I think you’ll find a number ways this feature makes managing a fleet of instances easier, and believe that you will also appreciate the team’s near-term feature roadmap of interest (more on that in a bit).

Using EC2 Fleet
There are a number of ways that you can use this feature, whether you’re running a stateless web service, a big data cluster or a continuous integration pipeline. Today I’m going to describe how you can use EC2 Fleet for genomic processing, but this is similar to workloads like risk analysis, log processing or image rendering. Modern DNA sequencers can produce multiple terabytes of raw data each day, to process that data into meaningful information in a timely fashion you need lots of processing power. I’ll be showing you how to deploy a “grid” of worker nodes that can quickly crunch through secondary analysis tasks in parallel.

Projects in genomics can use the elasticity EC2 provides to experiment and try out new pipelines on hundreds or even thousands of servers. With EC2 you can access as many cores as you need and only pay for what you use. Prior to today, you would need to use the RunInstances API or an Auto Scaling group for the On-Demand & Reserved Instance portion of your grid. To get the best price performance you’d also create and manage a Spot Fleet or multiple Spot Auto Scaling groups with different instance types if you wanted to add Spot Instances to turbo-boost your secondary analysis. Finally, to automate scaling decisions across multiple APIs and Auto Scaling groups you would need to write Lambda functions that periodically assess your grid’s progress & backlog, as well as current Spot prices – modifying your Auto Scaling Groups and Spot Fleets accordingly.

You can now replace all of this with a single EC2 Fleet, analyzing genomes at scale for as little as $1 per analysis. In my grid, each step in in the pipeline requires 1 vCPU and 4 GiB of memory, a perfect match for M4 and M5 instances with 4 GiB of memory per vCPU. I will create a fleet using M4 and M5 instances with weights that correspond to the number of vCPUs on each instance:

  • m4.16xlarge – 64 vCPUs, weight = 64
  • m5.24xlarge – 96 vCPUs, weight = 96

This is expressed in a template that looks like this:

"Overrides": [
{
  "InstanceType": "m4.16xlarge",
  "WeightedCapacity": 64,
},
{
  "InstanceType": "m5.24xlarge",
  "WeightedCapacity": 96,
},
]

By default, EC2 Fleet will select the most cost effective combination of instance types and Availability Zones (both specified in the template) using the current prices for the Spot Instances and public prices for the On-Demand Instances (if you specify instances for which you have matching RIs, your discounts will apply). The default mode takes weights into account to get the instances that have the lowest price per unit. So for my grid, fleet will find the instance that offers the lowest price per vCPU.

Now I can request capacity in terms of vCPUs, knowing EC2 Fleet will select the lowest cost option using only the instance types I’ve defined as acceptable. Also, I can specify how many vCPUs I want to launch using On-Demand or Reserved Instance capacity and how many vCPUs should be launched using Spot Instance capacity:

"TargetCapacitySpecification": {
	"TotalTargetCapacity": 2880,
	"OnDemandTargetCapacity": 960,
	"SpotTargetCapacity": 1920,
	"DefaultTargetCapacityType": "Spot"
}

The above means that I want a total of 2880 vCPUs, with 960 vCPUs fulfilled using On-Demand and 1920 using Spot. The On-Demand price per vCPU is lower for m5.24xlarge than the On-Demand price per vCPU for m4.16xlarge, so EC2 Fleet will launch 10 m5.24xlarge instances to fulfill 960 vCPUs. Based on current Spot pricing (again, on a per-vCPU basis), EC2 Fleet will choose to launch 30 m4.16xlarge instances or 20 m5.24xlarges, delivering 1920 vCPUs either way.

Putting it all together, I have a single file (fl1.json) that describes my fleet:

    "LaunchTemplateConfigs": [
        {
            "LaunchTemplateSpecification": {
                "LaunchTemplateId": "lt-0e8c754449b27161c",
                "Version": "1"
            }
        "Overrides": [
        {
          "InstanceType": "m4.16xlarge",
          "WeightedCapacity": 64,
        },
        {
          "InstanceType": "m5.24xlarge",
          "WeightedCapacity": 96,
        },
      ]
        }
    ],
    "TargetCapacitySpecification": {
        "TotalTargetCapacity": 2880,
        "OnDemandTargetCapacity": 960,
        "SpotTargetCapacity": 1920,
        "DefaultTargetCapacityType": "Spot"
    }
}

I can launch my fleet with a single command:

$ aws ec2 create-fleet --cli-input-json file://home/ec2-user/fl1.json
{
    "FleetId":"fleet-838cf4e5-fded-4f68-acb5-8c47ee1b248a"
}

My entire fleet is created within seconds and was built using 10 m5.24xlarge On-Demand Instances and 30 m4.16xlarge Spot Instances, since the current Spot price was 1.5¢ per vCPU for m4.16xlarge and 1.6¢ per vCPU for m5.24xlarge.

Now lets imagine my grid has crunched through its backlog and no longer needs the additional Spot Instances. I can then modify the size of my fleet by changing the target capacity in my fleet specification, like this:

{         
    "TotalTargetCapacity": 960,
}

Since 960 was equal to the amount of On-Demand vCPUs I had requested, when I describe my fleet I will see all of my capacity being delivered using On-Demand capacity:

"TargetCapacitySpecification": {
	"TotalTargetCapacity": 960,
	"OnDemandTargetCapacity": 960,
	"SpotTargetCapacity": 0,
	"DefaultTargetCapacityType": "Spot"
}

When I no longer need my fleet I can delete it and terminate the instances in it like this:

$ aws ec2 delete-fleets --fleet-id fleet-838cf4e5-fded-4f68-acb5-8c47ee1b248a \
  --terminate-instances   
{
    "UnsuccessfulFleetDletetions": [],
    "SuccessfulFleetDeletions": [
        {
            "CurrentFleetState": "deleted_terminating",
            "PreviousFleetState": "active",
            "FleetId": "fleet-838cf4e5-fded-4f68-acb5-8c47ee1b248a"
        }
    ]
}

Earlier I described how RI discounts apply when EC2 Fleet launches instances for which you have matching RIs, so you might be wondering how else RI customers benefit from EC2 Fleet. Let’s say that I own regional RIs for M4 instances. In my EC2 Fleet I would remove m5.24xlarge and specify m4.10xlarge and m4.16xlarge. Then when EC2 Fleet creates the grid, it will quickly find M4 capacity across the sizes and AZs I’ve specified, and my RI discounts apply automatically to this usage.

In the Works
We plan to connect EC2 Fleet and EC2 Auto Scaling groups. This will let you create a single fleet that mixed instance types and Spot, Reserved and On-Demand, while also taking advantage of EC2 Auto Scaling features such as health checks and lifecycle hooks. This integration will also bring EC2 Fleet functionality to services such as Amazon ECS, Amazon EKS, and AWS Batch that build on and make use of EC2 Auto Scaling for fleet management.

Available Now
You can create and make use of EC2 Fleets today in all public AWS Regions!

Jeff;

Get Started with Blockchain Using the new AWS Blockchain Templates

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/get-started-with-blockchain-using-the-new-aws-blockchain-templates/

Many of today’s discussions around blockchain technology remind me of the classic Shimmer Floor Wax skit. According to Dan Aykroyd, Shimmer is a dessert topping. Gilda Radner claims that it is a floor wax, and Chevy Chase settles the debate and reveals that it actually is both! Some of the people that I talk to see blockchains as the foundation of a new monetary system and a way to facilitate international payments. Others see blockchains as a distributed ledger and immutable data source that can be applied to logistics, supply chain, land registration, crowdfunding, and other use cases. Either way, it is clear that there are a lot of intriguing possibilities and we are working to help our customers use this technology more effectively.

We are launching AWS Blockchain Templates today. These templates will let you launch an Ethereum (either public or private) or Hyperledger Fabric (private) network in a matter of minutes and with just a few clicks. The templates create and configure all of the AWS resources needed to get you going in a robust and scalable fashion.

Launching a Private Ethereum Network
The Ethereum template offers two launch options. The ecs option creates an Amazon ECS cluster within a Virtual Private Cloud (VPC) and launches a set of Docker images in the cluster. The docker-local option also runs within a VPC, and launches the Docker images on EC2 instances. The template supports Ethereum mining, the EthStats and EthExplorer status pages, and a set of nodes that implement and respond to the Ethereum RPC protocol. Both options create and make use of a DynamoDB table for service discovery, along with Application Load Balancers for the status pages.

Here are the AWS Blockchain Templates for Ethereum:

I start by opening the CloudFormation Console in the desired region and clicking Create Stack:

I select Specify an Amazon S3 template URL, enter the URL of the template for the region, and click Next:

I give my stack a name:

Next, I enter the first set of parameters, including the network ID for the genesis block. I’ll stick with the default values for now:

I will also use the default values for the remaining network parameters:

Moving right along, I choose the container orchestration platform (ecs or docker-local, as I explained earlier) and the EC2 instance type for the container nodes:

Next, I choose my VPC and the subnets for the Ethereum network and the Application Load Balancer:

I configure my keypair, EC2 security group, IAM role, and instance profile ARN (full information on the required permissions can be found in the documentation):

The Instance Profile ARN can be found on the summary page for the role:

I confirm that I want to deploy EthStats and EthExplorer, choose the tag and version for the nested CloudFormation templates that are used by this one, and click Next to proceed:

On the next page I specify a tag for the resources that the stack will create, leave the other options as-is, and click Next:

I review all of the parameters and options, acknowledge that the stack might create IAM resources, and click Create to build my network:

The template makes use of three nested templates:

After all of the stacks have been created (mine took about 5 minutes), I can select JeffNet and click the Outputs tab to discover the links to EthStats and EthExplorer:

Here’s my EthStats:

And my EthExplorer:

If I am writing apps that make use of my private network to store and process smart contracts, I would use the EthJsonRpcUrl.

Stay Tuned
My colleagues are eager to get your feedback on these new templates and plan to add new versions of the frameworks as they become available.

Jeff;

 

Implementing safe AWS Lambda deployments with AWS CodeDeploy

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/implementing-safe-aws-lambda-deployments-with-aws-codedeploy/

This post courtesy of George Mao, AWS Senior Serverless Specialist – Solutions Architect

AWS Lambda and AWS CodeDeploy recently made it possible to automatically shift incoming traffic between two function versions based on a preconfigured rollout strategy. This new feature allows you to gradually shift traffic to the new function. If there are any issues with the new code, you can quickly rollback and control the impact to your application.

Previously, you had to manually move 100% of traffic from the old version to the new version. Now, you can have CodeDeploy automatically execute pre- or post-deployment tests and automate a gradual rollout strategy. Traffic shifting is built right into the AWS Serverless Application Model (SAM), making it easy to define and deploy your traffic shifting capabilities. SAM is an extension of AWS CloudFormation that provides a simplified way of defining serverless applications.

In this post, I show you how to use SAM, CloudFormation, and CodeDeploy to accomplish an automated rollout strategy for safe Lambda deployments.

Scenario

For this walkthrough, you write a Lambda application that returns a count of the S3 buckets that you own. You deploy it and use it in production. Later on, you receive requirements that tell you that you need to change your Lambda application to count only buckets that begin with the letter “a”.

Before you make the change, you need to be sure that your new Lambda application works as expected. If it does have issues, you want to minimize the number of impacted users and roll back easily. To accomplish this, you create a deployment process that publishes the new Lambda function, but does not send any traffic to it. You use CodeDeploy to execute a PreTraffic test to ensure that your new function works as expected. After the test succeeds, CodeDeploy automatically shifts traffic gradually to the new version of the Lambda function.

Your Lambda function is exposed as a REST service via an Amazon API Gateway deployment. This makes it easy to test and integrate.

Prerequisites

To execute the SAM and CloudFormation deployment, you must have the following IAM permissions:

  • cloudformation:*
  • lambda:*
  • codedeploy:*
  • iam:create*

You may use the AWS SAM Local CLI or the AWS CLI to package and deploy your Lambda application. If you choose to use SAM Local, be sure to install it onto your system. For more information, see AWS SAM Local Installation.

All of the code used in this post can be found in this GitHub repository: https://github.com/aws-samples/aws-safe-lambda-deployments.

Walkthrough

For this post, use SAM to define your resources because it comes with built-in CodeDeploy support for safe Lambda deployments.  The deployment is handled and automated by CloudFormation.

SAM allows you to define your Serverless applications in a simple and concise fashion, because it automatically creates all necessary resources behind the scenes. For example, if you do not define an execution role for a Lambda function, SAM automatically creates one. SAM also creates the CodeDeploy application necessary to drive the traffic shifting, as well as the IAM service role that CodeDeploy uses to execute all actions.

Create a SAM template

To get started, write your SAM template and call it template.yaml.

AWSTemplateFormatVersion : '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: An example SAM template for Lambda Safe Deployments.

Resources:

  returnS3Buckets:
    Type: AWS::Serverless::Function
    Properties:
      Handler: returnS3Buckets.handler
      Runtime: nodejs6.10
      AutoPublishAlias: live
      Policies:
        - Version: "2012-10-17"
          Statement: 
          - Effect: "Allow"
            Action: 
              - "s3:ListAllMyBuckets"
            Resource: '*'
      DeploymentPreference:
          Type: Linear10PercentEvery1Minute
          Hooks:
            PreTraffic: !Ref preTrafficHook
      Events:
        Api:
          Type: Api
          Properties:
            Path: /test
            Method: get

  preTrafficHook:
    Type: AWS::Serverless::Function
    Properties:
      Handler: preTrafficHook.handler
      Policies:
        - Version: "2012-10-17"
          Statement: 
          - Effect: "Allow"
            Action: 
              - "codedeploy:PutLifecycleEventHookExecutionStatus"
            Resource:
              !Sub 'arn:aws:codedeploy:${AWS::Region}:${AWS::AccountId}:deploymentgroup:${ServerlessDeploymentApplication}/*'
        - Version: "2012-10-17"
          Statement: 
          - Effect: "Allow"
            Action: 
              - "lambda:InvokeFunction"
            Resource: !Ref returnS3Buckets.Version
      Runtime: nodejs6.10
      FunctionName: 'CodeDeployHook_preTrafficHook'
      DeploymentPreference:
        Enabled: false
      Timeout: 5
      Environment:
        Variables:
          NewVersion: !Ref returnS3Buckets.Version

This template creates two functions:

  • returnS3Buckets
  • preTrafficHook

The returnS3Buckets function is where your application logic lives. It’s a simple piece of code that uses the AWS SDK for JavaScript in Node.JS to call the Amazon S3 listBuckets API action and return the number of buckets.

'use strict';

var AWS = require('aws-sdk');
var s3 = new AWS.S3();

exports.handler = (event, context, callback) => {
	console.log("I am here! " + context.functionName  +  ":"  +  context.functionVersion);

	s3.listBuckets(function (err, data){
		if(err){
			console.log(err, err.stack);
			callback(null, {
				statusCode: 500,
				body: "Failed!"
			});
		}
		else{
			var allBuckets = data.Buckets;

			console.log("Total buckets: " + allBuckets.length);
			callback(null, {
				statusCode: 200,
				body: allBuckets.length
			});
		}
	});	
}

Review the key parts of the SAM template that defines returnS3Buckets:

  • The AutoPublishAlias attribute instructs SAM to automatically publish a new version of the Lambda function for each new deployment and link it to the live alias.
  • The Policies attribute specifies additional policy statements that SAM adds onto the automatically generated IAM role for this function. The first statement provides the function with permission to call listBuckets.
  • The DeploymentPreference attribute configures the type of rollout pattern to use. In this case, you are shifting traffic in a linear fashion, moving 10% of traffic every minute to the new version. For more information about supported patterns, see Serverless Application Model: Traffic Shifting Configurations.
  • The Hooks attribute specifies that you want to execute the preTrafficHook Lambda function before CodeDeploy automatically begins shifting traffic. This function should perform validation testing on the newly deployed Lambda version. This function invokes the new Lambda function and checks the results. If you’re satisfied with the tests, instruct CodeDeploy to proceed with the rollout via an API call to: codedeploy.putLifecycleEventHookExecutionStatus.
  • The Events attribute defines an API-based event source that can trigger this function. It accepts requests on the /test path using an HTTP GET method.
'use strict';

const AWS = require('aws-sdk');
const codedeploy = new AWS.CodeDeploy({apiVersion: '2014-10-06'});
var lambda = new AWS.Lambda();

exports.handler = (event, context, callback) => {

	console.log("Entering PreTraffic Hook!");
	
	// Read the DeploymentId & LifecycleEventHookExecutionId from the event payload
    var deploymentId = event.DeploymentId;
	var lifecycleEventHookExecutionId = event.LifecycleEventHookExecutionId;

	var functionToTest = process.env.NewVersion;
	console.log("Testing new function version: " + functionToTest);

	// Perform validation of the newly deployed Lambda version
	var lambdaParams = {
		FunctionName: functionToTest,
		InvocationType: "RequestResponse"
	};

	var lambdaResult = "Failed";
	lambda.invoke(lambdaParams, function(err, data) {
		if (err){	// an error occurred
			console.log(err, err.stack);
			lambdaResult = "Failed";
		}
		else{	// successful response
			var result = JSON.parse(data.Payload);
			console.log("Result: " +  JSON.stringify(result));

			// Check the response for valid results
			// The response will be a JSON payload with statusCode and body properties. ie:
			// {
			//		"statusCode": 200,
			//		"body": 51
			// }
			if(result.body == 9){	
				lambdaResult = "Succeeded";
				console.log ("Validation testing succeeded!");
			}
			else{
				lambdaResult = "Failed";
				console.log ("Validation testing failed!");
			}

			// Complete the PreTraffic Hook by sending CodeDeploy the validation status
			var params = {
				deploymentId: deploymentId,
				lifecycleEventHookExecutionId: lifecycleEventHookExecutionId,
				status: lambdaResult // status can be 'Succeeded' or 'Failed'
			};
			
			// Pass AWS CodeDeploy the prepared validation test results.
			codedeploy.putLifecycleEventHookExecutionStatus(params, function(err, data) {
				if (err) {
					// Validation failed.
					console.log('CodeDeploy Status update failed');
					console.log(err, err.stack);
					callback("CodeDeploy Status update failed");
				} else {
					// Validation succeeded.
					console.log('Codedeploy status updated successfully');
					callback(null, 'Codedeploy status updated successfully');
				}
			});
		}  
	});
}

The hook is hardcoded to check that the number of S3 buckets returned is 9.

Review the key parts of the SAM template that defines preTrafficHook:

  • The Policies attribute specifies additional policy statements that SAM adds onto the automatically generated IAM role for this function. The first statement provides permissions to call the CodeDeploy PutLifecycleEventHookExecutionStatus API action. The second statement provides permissions to invoke the specific version of the returnS3Buckets function to test
  • This function has traffic shifting features disabled by setting the DeploymentPreference option to false.
  • The FunctionName attribute explicitly tells CloudFormation what to name the function. Otherwise, CloudFormation creates the function with the default naming convention: [stackName]-[FunctionName]-[uniqueID].  Name the function with the “CodeDeployHook_” prefix because the CodeDeployServiceRole role only allows InvokeFunction on functions named with that prefix.
  • Set the Timeout attribute to allow enough time to complete your validation tests.
  • Use an environment variable to inject the ARN of the newest deployed version of the returnS3Buckets function. The ARN allows the function to know the specific version to invoke and perform validation testing on.

Deploy the function

Your SAM template is all set and the code is written—you’re ready to deploy the function for the first time. Here’s how to do it via the SAM CLI. Replace “sam” with “cloudformation” to use CloudFormation instead.

First, package the function. This command returns a CloudFormation importable file, packaged.yaml.

sam package –template-file template.yaml –s3-bucket mybucket –output-template-file packaged.yaml

Now deploy everything:

sam deploy –template-file packaged.yaml –stack-name mySafeDeployStack –capabilities CAPABILITY_IAM

At this point, both Lambda functions have been deployed within the CloudFormation stack mySafeDeployStack. The returnS3Buckets has been deployed as Version 1:

SAM automatically created a few things, including the CodeDeploy application, with the deployment pattern that you specified (Linear10PercentEvery1Minute). There is currently one deployment group, with no action, because no deployments have occurred. SAM also created the IAM service role that this CodeDeploy application uses:

There is a single managed policy attached to this role, which allows CodeDeploy to invoke any Lambda function that begins with “CodeDeployHook_”.

An API has been set up called safeDeployStack. It targets your Lambda function with the /test resource using the GET method. When you test the endpoint, API Gateway executes the returnS3Buckets function and it returns the number of S3 buckets that you own. In this case, it’s 51.

Publish a new Lambda function version

Now implement the requirements change, which is to make returnS3Buckets count only buckets that begin with the letter “a”. The code now looks like the following (see returnS3BucketsNew.js in GitHub):

'use strict';

var AWS = require('aws-sdk');
var s3 = new AWS.S3();

exports.handler = (event, context, callback) => {
	console.log("I am here! " + context.functionName  +  ":"  +  context.functionVersion);

	s3.listBuckets(function (err, data){
		if(err){
			console.log(err, err.stack);
			callback(null, {
				statusCode: 500,
				body: "Failed!"
			});
		}
		else{
			var allBuckets = data.Buckets;

			console.log("Total buckets: " + allBuckets.length);
			//callback(null, allBuckets.length);

			//  New Code begins here
			var counter=0;
			for(var i  in allBuckets){
				if(allBuckets[i].Name[0] === "a")
					counter++;
			}
			console.log("Total buckets starting with a: " + counter);

			callback(null, {
				statusCode: 200,
				body: counter
			});
			
		}
	});	
}

Repackage and redeploy with the same two commands as earlier:

sam package –template-file template.yaml –s3-bucket mybucket –output-template-file packaged.yaml
	
sam deploy –template-file packaged.yaml –stack-name mySafeDeployStack –capabilities CAPABILITY_IAM

CloudFormation understands that this is a stack update instead of an entirely new stack. You can see that reflected in the CloudFormation console:

During the update, CloudFormation deploys the new Lambda function as version 2 and adds it to the “live” alias. There is no traffic routing there yet. CodeDeploy now takes over to begin the safe deployment process.

The first thing CodeDeploy does is invoke the preTrafficHook function. Verify that this happened by reviewing the Lambda logs and metrics:

The function should progress successfully, invoke Version 2 of returnS3Buckets, and finally invoke the CodeDeploy API with a success code. After this occurs, CodeDeploy begins the predefined rollout strategy. Open the CodeDeploy console to review the deployment progress (Linear10PercentEvery1Minute):

Verify the traffic shift

During the deployment, verify that the traffic shift has started to occur by running the test periodically. As the deployment shifts towards the new version, a larger percentage of the responses return 9 instead of 51. These numbers match the S3 buckets.

A minute later, you see 10% more traffic shifting to the new version. The whole process takes 10 minutes to complete. After completion, open the Lambda console and verify that the “live” alias now points to version 2:

After 10 minutes, the deployment is complete and CodeDeploy signals success to CloudFormation and completes the stack update.

Check the results

If you invoke the function alias manually, you see the results of the new implementation.

aws lambda invoke –function [lambda arn to live alias] out.txt

You can also execute the prod stage of your API and verify the results by issuing an HTTP GET to the invoke URL:

Summary

This post has shown you how you can safely automate your Lambda deployments using the Lambda traffic shifting feature. You used the Serverless Application Model (SAM) to define your Lambda functions and configured CodeDeploy to manage your deployment patterns. Finally, you used CloudFormation to automate the deployment and updates to your function and PreTraffic hook.

Now that you know all about this new feature, you’re ready to begin automating Lambda deployments with confidence that things will work as designed. I look forward to hearing about what you’ve built with the AWS Serverless Platform.

AWS Online Tech Talks – April & Early May 2018

Post Syndicated from Betsy Chernoff original https://aws.amazon.com/blogs/aws/aws-online-tech-talks-april-early-may-2018/

We have several upcoming tech talks in the month of April and early May. Come join us to learn about AWS services and solution offerings. We’ll have AWS experts online to help answer questions in real-time. Sign up now to learn more, we look forward to seeing you.

Note – All sessions are free and in Pacific Time.

April & early May — 2018 Schedule

Compute

April 30, 2018 | 01:00 PM – 01:45 PM PTBest Practices for Running Amazon EC2 Spot Instances with Amazon EMR (300) – Learn about the best practices for scaling big data workloads as well as process, store, and analyze big data securely and cost effectively with Amazon EMR and Amazon EC2 Spot Instances.

May 1, 2018 | 01:00 PM – 01:45 PM PTHow to Bring Microsoft Apps to AWS (300) – Learn more about how to save significant money by bringing your Microsoft workloads to AWS.

May 2, 2018 | 01:00 PM – 01:45 PM PTDeep Dive on Amazon EC2 Accelerated Computing (300) – Get a technical deep dive on how AWS’ GPU and FGPA-based compute services can help you to optimize and accelerate your ML/DL and HPC workloads in the cloud.

Containers

April 23, 2018 | 11:00 AM – 11:45 AM PTNew Features for Building Powerful Containerized Microservices on AWS (300) – Learn about how this new feature works and how you can start using it to build and run modern, containerized applications on AWS.

Databases

April 23, 2018 | 01:00 PM – 01:45 PM PTElastiCache: Deep Dive Best Practices and Usage Patterns (200) – Learn about Redis-compatible in-memory data store and cache with Amazon ElastiCache.

April 25, 2018 | 01:00 PM – 01:45 PM PTIntro to Open Source Databases on AWS (200) – Learn how to tap the benefits of open source databases on AWS without the administrative hassle.

DevOps

April 25, 2018 | 09:00 AM – 09:45 AM PTDebug your Container and Serverless Applications with AWS X-Ray in 5 Minutes (300) – Learn how AWS X-Ray makes debugging your Container and Serverless applications fun.

Enterprise & Hybrid

April 23, 2018 | 09:00 AM – 09:45 AM PTAn Overview of Best Practices of Large-Scale Migrations (300) – Learn about the tools and best practices on how to migrate to AWS at scale.

April 24, 2018 | 11:00 AM – 11:45 AM PTDeploy your Desktops and Apps on AWS (300) – Learn how to deploy your desktops and apps on AWS with Amazon WorkSpaces and Amazon AppStream 2.0

IoT

May 2, 2018 | 11:00 AM – 11:45 AM PTHow to Easily and Securely Connect Devices to AWS IoT (200) – Learn how to easily and securely connect devices to the cloud and reliably scale to billions of devices and trillions of messages with AWS IoT.

Machine Learning

April 24, 2018 | 09:00 AM – 09:45 AM PT Automate for Efficiency with Amazon Transcribe and Amazon Translate (200) – Learn how you can increase the efficiency and reach your operations with Amazon Translate and Amazon Transcribe.

April 26, 2018 | 09:00 AM – 09:45 AM PT Perform Machine Learning at the IoT Edge using AWS Greengrass and Amazon Sagemaker (200) – Learn more about developing machine learning applications for the IoT edge.

Mobile

April 30, 2018 | 11:00 AM – 11:45 AM PTOffline GraphQL Apps with AWS AppSync (300) – Come learn how to enable real-time and offline data in your applications with GraphQL using AWS AppSync.

Networking

May 2, 2018 | 09:00 AM – 09:45 AM PT Taking Serverless to the Edge (300) – Learn how to run your code closer to your end users in a serverless fashion. Also, David Von Lehman from Aerobatic will discuss how they used [email protected] to reduce latency and cloud costs for their customer’s websites.

Security, Identity & Compliance

April 30, 2018 | 09:00 AM – 09:45 AM PTAmazon GuardDuty – Let’s Attack My Account! (300) – Amazon GuardDuty Test Drive – Practical steps on generating test findings.

May 3, 2018 | 09:00 AM – 09:45 AM PTProtect Your Game Servers from DDoS Attacks (200) – Learn how to use the new AWS Shield Advanced for EC2 to protect your internet-facing game servers against network layer DDoS attacks and application layer attacks of all kinds.

Serverless

April 24, 2018 | 01:00 PM – 01:45 PM PTTips and Tricks for Building and Deploying Serverless Apps In Minutes (200) – Learn how to build and deploy apps in minutes.

Storage

May 1, 2018 | 11:00 AM – 11:45 AM PTBuilding Data Lakes That Cost Less and Deliver Results Faster (300) – Learn how Amazon S3 Select And Amazon Glacier Select increase application performance by up to 400% and reduce total cost of ownership by extending your data lake into cost-effective archive storage.

May 3, 2018 | 11:00 AM – 11:45 AM PTIntegrating On-Premises Vendors with AWS for Backup (300) – Learn how to work with AWS and technology partners to build backup & restore solutions for your on-premises, hybrid, and cloud native environments.

Using JWT For Sessions

Post Syndicated from Bozho original https://techblog.bozho.net/using-jwt-sessions/

The topic has been discussed many times, on hacker news, reddit, blogs. And the consensus is – DON’T USE JWT (for user sessions).

And I largely agree with the criticism of typical arguments for the JWT, the typical “but I can make it work…” explanations and the flaws of the JWT standard..

I won’t repeat everything here, so please go and read those articles. You can really shoot yourself in the foot with JWT, it’s complex to get to know it well and it has little benefits for most of the usecases. I guess for API calls it makes sense, especially if you reuse the same API in a single-page application and for your RESTful clients, but I’ll focus on the user session usecase.

Having all this criticism, I’ve gone against what the articles above recommend, and use JWT, navigating through their arguments and claiming I’m in a sweet spot. I can very well be wrong.

I store the user ID in a JWT token stored as a cookie. Not local storage, as that’s problematic. Not the whole state, as I don’t need that may lead to problems (pointed out in the linked articles). In fact, I don’t have any session state apart from the user data, which I think is a good practice.

What I want to avoid in my setup is sharing sessions across nodes. And this is a very compelling reason to not use the session mechanism of your web server/framework. No, you don’t need to have millions of users in order to need your application to run on more than one node. In fact, it should almost always run on (at least) two nodes, because nodes die and you don’t want downtime. Sticky sessions at the load balancer are a solution to that problem but you are just outsourcing the centralized session storage to the load balancer (and some load balancers might not support it). Shared session cache (e.g. memcached, elasticache, hazelcast) is also an option, and many web servers (at least in Java) support pluggable session replication mechanisms, but that introduces another component to the archtecture, another part of the stack to be supported and that can possibly break. It is not necessarily bad, but if there’s a simple way to avoid it, I’d go for it.

In order to avoid shared session storage, you need either the whole session state to be passed in the request/response cycle (as cookie, request parameter, header), or to receive a userId and load the user from the database or a cache. As we’ve learned, the former might be a bad choice. Despite that fact that frameworks like ASP.NET and JSF dump the whole state in the HTML of the page, it doesn’t intuitively sound good.

As for the latter – you may say “ok, if you are going to load the user from the database on every request this is going to be slow and if you use a cache, then why not use the cache for the sessions themselves?”. Well, the cache can be local. Remember we have just a few application nodes. Each node can have a local, in-memory cache for the currently active users. The fact that all nodes will have the same user loaded (after a few requests are routed to them by the load balancer in a round-robin fashion) is not important, as that cache is small. But you won’t have to take any care for replicating it across nodes, taking care of new nodes coming and going from the cluster, dealing with network issues between the nodes, etc. Each application node will be an island not caring about any other application node.

So here goes my first objection to the linked articles – just storing the user identifier in a JWT token is not pointless, as it saves you from session replication.

What about the criticism for the JWT standard and the security implications of its cryptography? Entirely correct, it’s easy to shoot yourself in the foot. That’s why I’m using JWT only with MAC, and only with a particular algorithm that I verify upon receiving the token, thus (allegedly) avoiding all the pitfalls. In all fairness, I’m willing to use the alternative proposed in one of the articles – PASETO – but it doesn’t have a Java library and it will take some time implementing one (might do in the future). To summarize – if there was another easy to use way for authenticated encryption of cookies, I’d use it.

So I’m basically using JWT in “PASETO-mode”, with only one operation and only one algorithm. And that should be fine as a general approach – the article doesn’t criticize the idea of having a user identifier in a token (and a stateless application node), it criticizes the complexity and vulnerabilities of the standard. This is sort of my second objection – “Don’t use JWT” is widely understood to mean “Don’t use tokens”, where that is not the case.

Have I introduced some vulnerability in my strive for architectural simplicity and lack of shared state? I hope not.

The post Using JWT For Sessions appeared first on Bozho's tech blog.

Facebook Will Verify the Physical Location of Ad Buyers with Paper Postcards

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/02/facebook_will_v.html

It’s not a great solution, but it’s something:

The process of using postcards containing a specific code will be required for advertising that mentions a specific candidate running for a federal office, Katie Harbath, Facebook’s global director of policy programs, said. The requirement will not apply to issue-based political ads, she said.

“If you run an ad mentioning a candidate, we are going to mail you a postcard and you will have to use that code to prove you are in the United States,” Harbath said at a weekend conference of the National Association of Secretaries of State, where executives from Twitter Inc and Alphabet Inc’s Google also spoke.

“It won’t solve everything,” Harbath said in a brief interview with Reuters following her remarks.

But sending codes through old-fashioned mail was the most effective method the tech company could come up with to prevent Russians and other bad actors from purchasing ads while posing as someone else, Harbath said.

It does mean a several-days delay between purchasing an ad and seeing it run.

Top 8 Best Practices for High-Performance ETL Processing Using Amazon Redshift

Post Syndicated from Thiyagarajan Arumugam original https://aws.amazon.com/blogs/big-data/top-8-best-practices-for-high-performance-etl-processing-using-amazon-redshift/

An ETL (Extract, Transform, Load) process enables you to load data from source systems into your data warehouse. This is typically executed as a batch or near-real-time ingest process to keep the data warehouse current and provide up-to-date analytical data to end users.

Amazon Redshift is a fast, petabyte-scale data warehouse that enables you easily to make data-driven decisions. With Amazon Redshift, you can get insights into your big data in a cost-effective fashion using standard SQL. You can set up any type of data model, from star and snowflake schemas, to simple de-normalized tables for running any analytical queries.

To operate a robust ETL platform and deliver data to Amazon Redshift in a timely manner, design your ETL processes to take account of Amazon Redshift’s architecture. When migrating from a legacy data warehouse to Amazon Redshift, it is tempting to adopt a lift-and-shift approach, but this can result in performance and scale issues long term. This post guides you through the following best practices for ensuring optimal, consistent runtimes for your ETL processes:

  • COPY data from multiple, evenly sized files.
  • Use workload management to improve ETL runtimes.
  • Perform table maintenance regularly.
  • Perform multiple steps in a single transaction.
  • Loading data in bulk.
  • Use UNLOAD to extract large result sets.
  • Use Amazon Redshift Spectrum for ad hoc ETL processing.
  • Monitor daily ETL health using diagnostic queries.

1. COPY data from multiple, evenly sized files

Amazon Redshift is an MPP (massively parallel processing) database, where all the compute nodes divide and parallelize the work of ingesting data. Each node is further subdivided into slices, with each slice having one or more dedicated cores, equally dividing the processing capacity. The number of slices per node depends on the node type of the cluster. For example, each DS2.XLARGE compute node has two slices, whereas each DS2.8XLARGE compute node has 16 slices.

When you load data into Amazon Redshift, you should aim to have each slice do an equal amount of work. When you load the data from a single large file or from files split into uneven sizes, some slices do more work than others. As a result, the process runs only as fast as the slowest, or most heavily loaded, slice. In the example shown below, a single large file is loaded into a two-node cluster, resulting in only one of the nodes, “Compute-0”, performing all the data ingestion:

When splitting your data files, ensure that they are of approximately equal size – between 1 MB and 1 GB after compression. The number of files should be a multiple of the number of slices in your cluster. Also, I strongly recommend that you individually compress the load files using gzip, lzop, or bzip2 to efficiently load large datasets.

When loading multiple files into a single table, use a single COPY command for the table, rather than multiple COPY commands. Amazon Redshift automatically parallelizes the data ingestion. Using a single COPY command to bulk load data into a table ensures optimal use of cluster resources, and quickest possible throughput.

2. Use workload management to improve ETL runtimes

Use Amazon Redshift’s workload management (WLM) to define multiple queues dedicated to different workloads (for example, ETL versus reporting) and to manage the runtimes of queries. As you migrate more workloads into Amazon Redshift, your ETL runtimes can become inconsistent if WLM is not appropriately set up.

I recommend limiting the overall concurrency of WLM across all queues to around 15 or less. This WLM guide helps you organize and monitor the different queues for your Amazon Redshift cluster.

When managing different workloads on your Amazon Redshift cluster, consider the following for the queue setup:

  • Create a queue dedicated to your ETL processes. Configure this queue with a small number of slots (5 or fewer). Amazon Redshift is designed for analytics queries, rather than transaction processing. The cost of COMMIT is relatively high, and excessive use of COMMIT can result in queries waiting for access to the commit queue. Because ETL is a commit-intensive process, having a separate queue with a small number of slots helps mitigate this issue.
  • Claim extra memory available in a queue. When executing an ETL query, you can take advantage of the wlm_query_slot_count to claim the extra memory available in a particular queue. For example, a typical ETL process might involve COPYing raw data into a staging table so that downstream ETL jobs can run transformations that calculate daily, weekly, and monthly aggregates. To speed up the COPY process (so that the downstream tasks can start in parallel sooner), the wlm_query_slot_count can be increased for this step.
  • Create a separate queue for reporting queries. Configure query monitoring rules on this queue to further manage long-running and expensive queries.
  • Take advantage of the dynamic memory parameters. They swap the memory from your ETL to your reporting queue after the ETL job has completed.

3. Perform table maintenance regularly

Amazon Redshift is a columnar database, which enables fast transformations for aggregating data. Performing regular table maintenance ensures that transformation ETLs are predictable and performant. To get the best performance from your Amazon Redshift database, you must ensure that database tables regularly are VACUUMed and ANALYZEd. The Analyze & Vacuum schema utility helps you automate the table maintenance task and have VACUUM & ANALYZE executed in a regular fashion.

  • Use VACUUM to sort tables and remove deleted blocks

During a typical ETL refresh process, tables receive new incoming records using COPY, and unneeded data (cold data) is removed using DELETE. New rows are added to the unsorted region in a table. Deleted rows are simply marked for deletion.

DELETE does not automatically reclaim the space occupied by the deleted rows. Adding and removing large numbers of rows can therefore cause the unsorted region and the number of deleted blocks to grow. This can degrade the performance of queries executed against these tables.

After an ETL process completes, perform VACUUM to ensure that user queries execute in a consistent manner. The complete list of tables that need VACUUMing can be found using the Amazon Redshift Util’s table_info script.

Use the following approaches to ensure that VACCUM is completed in a timely manner:

  • Use wlm_query_slot_count to claim all the memory allocated in the ETL WLM queue during the VACUUM process.
  • DROP or TRUNCATE intermediate or staging tables, thereby eliminating the need to VACUUM them.
  • If your table has a compound sort key with only one sort column, try to load your data in sort key order. This helps reduce or eliminate the need to VACUUM the table.
  • Consider using time series This helps reduce the amount of data you need to VACUUM.
  • Use ANALYZE to update database statistics

Amazon Redshift uses a cost-based query planner and optimizer using statistics about tables to make good decisions about the query plan for the SQL statements. Regular statistics collection after the ETL completion ensures that user queries run fast, and that daily ETL processes are performant. The Amazon Redshift utility table_info script provides insights into the freshness of the statistics. Keeping the statistics off (pct_stats_off) less than 20% ensures effective query plans for the SQL queries.

4. Perform multiple steps in a single transaction

ETL transformation logic often spans multiple steps. Because commits in Amazon Redshift are expensive, if each ETL step performs a commit, multiple concurrent ETL processes can take a long time to execute.

To minimize the number of commits in a process, the steps in an ETL script should be surrounded by a BEGIN…END statement so that a single commit is performed only after all the transformation logic has been executed. For example, here is an example multi-step ETL script that performs one commit at the end:

Begin
CREATE temporary staging_table;
INSERT INTO staging_table SELECT .. FROM source (transformation logic);
DELETE FROM daily_table WHERE dataset_date =?;
INSERT INTO daily_table SELECT .. FROM staging_table (daily aggregate);
DELETE FROM weekly_table WHERE weekending_date=?;
INSERT INTO weekly_table SELECT .. FROM staging_table(weekly aggregate);
Commit

5. Loading data in bulk

Amazon Redshift is designed to store and query petabyte-scale datasets. Using Amazon S3 you can stage and accumulate data from multiple source systems before executing a bulk COPY operation. The following methods allow efficient and fast transfer of these bulk datasets into Amazon Redshift:

  • Use a manifest file to ingest large datasets that span multiple files. The manifest file is a JSON file that lists all the files to be loaded into Amazon Redshift. Using a manifest file ensures that Amazon Redshift has a consistent view of the data to be loaded from S3, while also ensuring that duplicate files do not result in the same data being loaded more than one time.
  • Use temporary staging tables to hold the data for transformation. These tables are automatically dropped after the ETL session is complete. Temporary tables can be created using the CREATE TEMPORARY TABLE syntax, or by issuing a SELECT … INTO #TEMP_TABLE query. Explicitly specifying the CREATE TEMPORARY TABLE statement allows you to control the DISTRIBUTION KEY, SORT KEY, and compression settings to further improve performance.
  • User ALTER table APPEND to swap data from the staging tables to the target table. Data in the source table is moved to matching columns in the target table. Column order doesn’t matter. After data is successfully appended to the target table, the source table is empty. ALTER TABLE APPEND is much faster than a similar CREATE TABLE AS or INSERT INTO operation because it doesn’t involve copying or moving data.

6. Use UNLOAD to extract large result sets

Fetching a large number of rows using SELECT is expensive and takes a long time. When a large amount of data is fetched from the Amazon Redshift cluster, the leader node has to hold the data temporarily until the fetches are complete. Further, data is streamed out sequentially, which results in longer elapsed time. As a result, the leader node can become hot, which not only affects the SELECT that is being executed, but also throttles resources for creating execution plans and managing the overall cluster resources. Here is an example of a large SELECT statement. Notice that the leader node is doing most of the work to stream out the rows:

Use UNLOAD to extract large results sets directly to S3. After it’s in S3, the data can be shared with multiple downstream systems. By default, UNLOAD writes data in parallel to multiple files according to the number of slices in the cluster. All the compute nodes participate to quickly offload the data into S3.

If you are extracting data for use with Amazon Redshift Spectrum, you should make use of the MAXFILESIZE parameter to and keep files are 150 MB. Similar to item 1 above, having many evenly sized files ensures that Redshift Spectrum can do the maximum amount of work in parallel.

7. Use Redshift Spectrum for ad hoc ETL processing

Events such as data backfill, promotional activity, and special calendar days can trigger additional data volumes that affect the data refresh times in your Amazon Redshift cluster. To help address these spikes in data volumes and throughput, I recommend staging data in S3. After data is organized in S3, Redshift Spectrum enables you to query it directly using standard SQL. In this way, you gain the benefits of additional capacity without having to resize your cluster.

For tips on getting started with and optimizing the use of Redshift Spectrum, see the previous post, 10 Best Practices for Amazon Redshift Spectrum.

8. Monitor daily ETL health using diagnostic queries

Monitoring the health of your ETL processes on a regular basis helps identify the early onset of performance issues before they have a significant impact on your cluster. The following monitoring scripts can be used to provide insights into the health of your ETL processes:

ScriptUse when…Solution
commit_stats.sql – Commit queue statistics from past days, showing largest queue length and queue time firstDML statements such as INSERT/UPDATE/COPY/DELETE operations take several times longer to execute when multiple of these operations are in progressSet up separate WLM queues for the ETL process and limit the concurrency to < 5.
copy_performance.sql –  Copy command statistics for the past days Daily COPY operations take longer to execute• Follow the best practices for the COPY command.
• Analyze data growth with the incoming datasets and consider cluster resize to meet the expected SLA.
table_info.sql – Table skew and unsorted statistics along with storage and key informationTransformation steps take longer to execute• Set up regular VACCUM jobs to address unsorted rows and claim the deleted blocks so that transformation SQL execute optimally.
• Consider a table redesign to avoid data skewness.
v_check_transaction_locks.sql – Monitor transaction locksINSERT/UPDATE/COPY/DELETE operations on particular tables do not respond back in timely manner, compared to when run after the ETLMultiple DML statements are operating on the same target table at the same moment from different transactions. Set up ETL job dependency so that they execute serially for the same target table.
v_get_schema_priv_by_user.sql – Get the schema that the user has access toReporting users can view intermediate tablesSet up separate database groups for reporting and ETL users, and grants access to objects using GRANT.
v_generate_tbl_ddl.sql – Get the table DDLYou need to create an empty table with same structure as target table for data backfillGenerate DDL using this script for data backfill.
v_space_used_per_tbl.sql – monitor space used by individual tablesAmazon Redshift data warehouse space growth is trending upwards more than normal

Analyze the individual tables that are growing at higher rate than normal. Consider data archival using UNLOAD to S3 and Redshift Spectrum for later analysis.

Use unscanned_table_summary.sql to find unused table and archive or drop them.

top_queries.sql – Return the top 50 time consuming statements aggregated by its textETL transformations are taking longer to executeAnalyze the top transformation SQL and use EXPLAIN to find opportunities for tuning the query plan.

There are several other useful scripts available in the amazon-redshift-utils repository. The AWS Lambda Utility Runner runs a subset of these scripts on a scheduled basis, allowing you to automate much of monitoring of your ETL processes.

Example ETL process

The following ETL process reinforces some of the best practices discussed in this post. Consider the following four-step daily ETL workflow where data from an RDBMS source system is staged in S3 and then loaded into Amazon Redshift. Amazon Redshift is used to calculate daily, weekly, and monthly aggregations, which are then unloaded to S3, where they can be further processed and made available for end-user reporting using a number of different tools, including Redshift Spectrum and Amazon Athena.

Step 1:  Extract from the RDBMS source to a S3 bucket

In this ETL process, the data extract job fetches change data every 1 hour and it is staged into multiple hourly files. For example, the staged S3 folder looks like the following:

 [[email protected] ~]$ aws s3 ls s3://<<S3 Bucket>>/batch/2017/07/02/
2017-07-02 01:59:58   81900220 20170702T01.export.gz
2017-07-02 02:59:56   84926844 20170702T02.export.gz
2017-07-02 03:59:54   78990356 20170702T03.export.gz
…
2017-07-02 22:00:03   75966745 20170702T21.export.gz
2017-07-02 23:00:02   89199874 20170702T22.export.gz
2017-07-02 00:59:59   71161715 20170702T23.export.gz

Organizing the data into multiple, evenly sized files enables the COPY command to ingest this data using all available resources in the Amazon Redshift cluster. Further, the files are compressed (gzipped) to further reduce COPY times.

Step 2: Stage data to the Amazon Redshift table for cleansing

Ingesting the data can be accomplished using a JSON-based manifest file. Using the manifest file ensures that S3 eventual consistency issues can be eliminated and also provides an opportunity to dedupe any files if needed. A sample manifest20170702.json file looks like the following:

{
  "entries": [
    {"url":" s3://<<S3 Bucket>>/batch/2017/07/02/20170702T01.export.gz", "mandatory":true},
    {"url":" s3://<<S3 Bucket>>/batch/2017/07/02/20170702T02.export.gz", "mandatory":true},
    …
    {"url":" s3://<<S3 Bucket>>/batch/2017/07/02/20170702T23.export.gz", "mandatory":true}
  ]
}

The data can be ingested using the following command:

SET wlm_query_slot_count TO <<max available concurrency in the ETL queue>>;
COPY stage_tbl FROM 's3:// <<S3 Bucket>>/batch/manifest20170702.json' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' manifest;

Because the downstream ETL processes depend on this COPY command to complete, the wlm_query_slot_count is used to claim all the memory available to the queue. This helps the COPY command complete as quickly as possible.

Step 3: Transform data to create daily, weekly, and monthly datasets and load into target tables

Data is staged in the “stage_tbl” from where it can be transformed into the daily, weekly, and monthly aggregates and loaded into target tables. The following job illustrates a typical weekly process:

Begin
INSERT into ETL_LOG (..) values (..);
DELETE from weekly_tbl where dataset_week = <<current week>>;
INSERT into weekly_tbl (..)
  SELECT date_trunc('week', dataset_day) AS week_begin_dataset_date, SUM(C1) AS C1, SUM(C2) AS C2
	FROM   stage_tbl
GROUP BY date_trunc('week', dataset_day);
INSERT into AUDIT_LOG values (..);
COMMIT;
End;

As shown above, multiple steps are combined into one transaction to perform a single commit, reducing contention on the commit queue.

Step 4: Unload the daily dataset to populate the S3 data lake bucket

The transformed results are now unloaded into another S3 bucket, where they can be further processed and made available for end-user reporting using a number of different tools, including Redshift Spectrum and Amazon Athena.

unload ('SELECT * FROM weekly_tbl WHERE dataset_week = <<current week>>’) TO 's3:// <<S3 Bucket>>/datalake/weekly/20170526/' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';

Summary

Amazon Redshift lets you easily operate petabyte-scale data warehouses on the cloud. This post summarized the best practices for operating scalable ETL natively within Amazon Redshift. I demonstrated efficient ways to ingest and transform data, along with close monitoring. I also demonstrated the best practices being used in a typical sample ETL workload to transform the data into Amazon Redshift.

If you have questions or suggestions, please comment below.

 


About the Author

Thiyagarajan Arumugam is a Big Data Solutions Architect at Amazon Web Services and designs customer architectures to process data at scale. Prior to AWS, he built data warehouse solutions at Amazon.com. In his free time, he enjoys all outdoor sports and practices the Indian classical drum mridangam.

 

Early Challenges: Managing Cash Flow

Post Syndicated from Gleb Budman original https://www.backblaze.com/blog/managing-cash-flow/

Cash flow projection charts

This post by Backblaze’s CEO and co-founder Gleb Budman is the eighth in a series about entrepreneurship. You can choose posts in the series from the list below:

  1. How Backblaze got Started: The Problem, The Solution, and the Stuff In-Between
  2. Building a Competitive Moat: Turning Challenges Into Advantages
  3. From Idea to Launch: Getting Your First Customers
  4. How to Get Your First 1,000 Customers
  5. Surviving Your First Year
  6. How to Compete with Giants
  7. The Decision on Transparency
  8. Early Challenges: Managing Cash Flow

Use the Join button above to receive notification of new posts in this series.

Running out of cash is one of the quickest ways for a startup to go out of business. When you are starting a company the question of where to get cash is usually the top priority, but managing cash flow is critical for every stage in the lifecycle of a company. As a primarily bootstrapped but capital-intensive business, managing cash flow at Backblaze was and still is a key element of our success and requires continued focus. Let’s look at what we learned over the years.

Raising Your Initial Funding

When starting a tech business in Silicon Valley, the default assumption is that you will immediately try to raise venture funding. There are certainly many advantages to raising funding — not the least of which is that you don’t need to be cash-flow positive since you have cash in the bank and the expectation is that you will have a “burn rate,” i.e. you’ll be spending more than you make.

Note: While you’re not expected to be cash-flow positive, that doesn’t mean you don’t have to worry about cash. Cash-flow management will determine your burn rate. Whether you can get to cash-flow breakeven or need to raise another round of funding is a direct byproduct of your cash flow management.

Also, raising funding takes time (most successful fundraising cycles take 3-6 months start-to-finish), and time at a startup is in short supply. Constantly trying to raise funding can take away from product development and pursuing growth opportunities. If you’re not successful in raising funding, you then have to either shut down or find an alternate method of funding the business.

Sources of Funding

Depending on the stage of the company, type of company, and other factors, you may have access to different sources of funding. Let’s list a number of them:

Customers

Sales — the best kind of funding. It is non-dilutive, doesn’t have to be paid back, and is a direct metric of the success of your company.

Pre-Sales — some customers may be willing to pay you for a product in beta, a test, or pre-pay for a product they’ll receive when finished. Pre-Sales income also is great because it shares the characteristics of cash from sales, but you get the cash early. It also can be a good sign that the product you’re building fills a market need. We started charging for Backblaze computer backup while it was still in private beta, which allowed us to not only collect cash from customers, but also test the billing experience and users’ real desire for the service.

Services — if you’re a service company and customers are paying you for that, great. You can effectively scale for the number of hours available in a day. As demand grows, you can add more employees to increase the total number of billable hours.

Note: If you’re a product company and customers are paying you to consult, that can provide much needed cash, and could provide feedback toward the right product. However, it can also distract from your core business, send you down a path where you’re building a product for a single customer, and addict you to a path that prevents you from building a scalable business.

Investors

Yourself — you likely are putting your time into the business, and deferring salary in the process. You may also put your own cash into the business either as an investment or a loan.

Angels — angels are ideal as early investors since they are used to investing in businesses with little to no traction. AngelList is a good place to find them, though finding people you’re connected with through someone that knows you well is best.

Crowdfunding — a component of the JOBS Act permitted entrepreneurs to raise money from nearly anyone since May 2016. The SEC imposes limits on both investors and the companies. This article goes into some depth on the options and sites available.

VCs — VCs are ideal for companies that need to raise at least a few million dollars and intend to build a business that will be worth over $1 billion.

Debt

Friends & Family — F&F are often the first people to give you money because they are investing in you. It’s great to have some early supporters, but it also can be risky to take money from people who aren’t used to the risks. The key advice here is to only take money from people who won’t mind losing it. If someone is talking about using their children’s college funds or borrowing from their 401k, say ‘no thank you’ — even if they’re sure they want to loan you money.

Bank Loans — a variety of loan types exist, but most either require the company to have been operational for a couple years, be able to borrow against money the company has or is making, or be able to get a personal guarantee from the founders whereby their own credit is on the line. Fundera provides a good overview of loan options and can help secure some, but most will not be an option for a brand new startup.

Grants

Government — in some areas there is the potential for government grants to facilitate research. The SBIR program facilitates some such grants.

At Backblaze, we used a number of these options:

• Investors/Yourself
We loaned a cumulative total of a couple hundred thousand dollars to the company and invested our time by going without a salary for a year and a half.
• Customers/Pre-Sales
We started selling the Backblaze service while it was still in beta.
• Customers/Sales
We launched v1.0 and kept selling.
• Investors/Angels
After a year and a half, we raised $370k from 11 angels. All of them were either people whom we knew personally or were a strong recommendation from a mutual friend.
• Debt/Loans
After a couple years we were able to get equipment leases whereby the Storage Pods and hard drives were used as collateral to secure the lease on them.
• Investors/VCs
Ater five years we raised $5m from TMT Investments to add to the balance sheet and invest in growth.

The variety and quantity of sources we used is by no means uncommon.

GAAP vs. Cash

Most companies start tracking financials based on cash, and as they scale they switch to GAAP (Generally Accepted Accounting Principles). Cash is easier to track — we got paid $XXXX and spent $YYY — and as often mentioned, is required for the business to stay alive. GAAP has more subtlety and complexity, but provides a clearer picture of how the business is really doing. Backblaze was on a ‘cash’ system for the first few years, then switched to GAAP. For this post, I’m going to focus on things that help cash flow, not GAAP profitability.

Stages of Cash Flow Management

All-spend

In a pure service business (e.g. solo proprietor law firm), you may have no expenses other than your time, so this stage doesn’t exist. However, in a product business there is a period of time where you are building the product and have nothing to sell. You have zero cash coming in, but have cash going out. Your cash-flow is completely negative and you need funds to cover that.

Sales-generating

Starting to see cash come in from customers is thrilling. I initially had our system set up to email me with every $5 payment we received. You’re making sales, but not covering expenses.

Ramen-profitable

But it takes a lot of $5 payments to pay for servers and salaries, so for a while expenses are likely to outstrip sales. Getting to ramen-profitable is a critical stage where sales cover the business expenses and are “paying enough for the founders to eat ramen.” This extends the runway for a business, but is not completely sustainable, since presumably the founders can’t (or won’t) live forever on a subsistence salary.

Business-profitable

This is the ultimate stage whereby the business is truly profitable, including paying everyone market-rate salaries. A business at this stage is self-sustaining. (Of course, market shifts and plenty of other challenges can kill the business, but cash-flow issues alone will not.)

Note, I’m using the word ‘profitable’ here to mean this is still on a cash-basis.

Backblaze was in the all-spend stage for just over a year, during which time we built the service and hadn’t yet made the service available to customers. Backblaze was in the sales-generating stage for nearly another year before the company was barely ramen-profitable where sales were covering the company expenses and paying the founders minimum wage. (I say ‘barely’ since minimum wage in the SF Bay Area is arguably never subsistence.) It took almost three more years before the company was business-profitable, paying everyone including the founders market-rate.

Cash Flow Forecasting

When raising funding it’s helpful to think of milestones reached. You don’t necessarily need enough cash on day one to last for the next 100 years of the company. Some good milestones to consider are how much cash you need to prove there is a market need, prove you can build a product to meet that need, or get to ramen-profitable.

Two things to consider:

1) Unit Economics (COGS)

If your product is 100% software, this may not be relevant. Once software is built it costs effectively nothing to deliver the product to one customer or one million customers. However, in most businesses there is some incremental cost to provide the product. If you’re selling a hardware device, perhaps you sell it for $100 but it costs you $50 to make it. This is called “COGS” (Cost of Goods Sold).

Many products rely on cloud services where the costs scale with growth. That model works great, but it’s still important to understand what the costs are for the cloud service you use per unit of product you sell.

Support is often done by the founders early-on in a business, but that is another real cost to factor in and estimate on a per-user basis. Taking all of the per unit costs combined, you may charge $10/month/user for your service, but if it costs you $7/month/user in cloud services, you’re only netting $3/month/user.

2) Operating Expenses (OpEx)

These are expenses that don’t scale with the number of product units you sell. Typically this includes research & development, sales & marketing, and general & administrative expenses. Presumably there is a certain level of these functions required to build the product, market it, sell it, and run the organization. You can choose to invest or cut back on these, but you’ll still make the same amount per product unit.

Incremental Net Profit Per Unit

If you’ve calculated your COGS and your unit economics are “upside down,” where the amount you charge is less than that it costs you to provide your service, it’s worth thinking hard about how that’s going to change over time. If it will not change, there is no scale that will make the business work. Presuming you do make money on each unit of product you sell — what is sometimes referred to as “Contribution Margin” — consider how many of those product units you need to sell to cover your operating expenses as described above.

Calculating Your Profit

The math on getting to ramen-profitable is simple:

(Number of Product Units Sold x Contribution Margin) - Operating Expenses = Profit

If your operating expenses include subsistence salaries for the founders and profit > $0, you’re ramen-profitable.

Improving Cash Flow

Having access to sources of cash, whether from selling to customers or other methods, is excellent. But needing less cash gives you more choices and allows you to either dilute less, owe less, or invest more.

There are two ways to improve cash flow:

1) Collect More Cash

The best way to collect more cash is to provide more value to your customers and as a result have them pay you more. Additional features/products/services can allow this. However, you can also collect more cash by changing how you charge for your product. If you have a subscription, changing from charging monthly to yearly dramatically improves your cash flow. If you have a product that customers use up, selling a year’s supply instead of selling them one-by-one can help.

2) Spend Less Cash

Reducing COGS is a fantastic way to spend less cash in a scalable way. If you can do this without harming the product or customer experience, you win. There are a myriad of ways to also reduce operating expenses, including taking sub-market salaries, using your home instead of renting office space, staying focused on your core product, etc.

Ultimately, collecting more and spending less cash dramatically simplifies the process of getting to ramen-profitable and later to business-profitable.

Be Careful (Why GAAP Matters)

A word of caution: while running out of cash will put you out of business immediately, overextending yourself will likely put you out of business not much later. GAAP shows how a business is really doing; cash doesn’t. If you only focus on cash, it is possible to commit yourself to both delivering products and repaying loans in the future in an unsustainable fashion. If you’re taking out loans, watch the total balance and monthly payments you’re committing to. If you’re asking customers for pre-payment, make sure you believe you can deliver on what they’ve paid for.

Summary

There are numerous challenges to building a business, and ensuring you have enough cash is amongst the most important. Having the cash to keep going lets you keep working on all of the other challenges. The frameworks above were critical for maintaining Backblaze’s cash flow and cash balance. Hopefully you can take some of the lessons we learned and apply them to your business. Let us know what works for you in the comments below.

The post Early Challenges: Managing Cash Flow appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Marvellous retrofitted home assistants

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/retrofitted-home-assistants/

As more and more digital home assistants are appearing on the consumer market, it’s not uncommon to see the towering Amazon Echo or sleek Google Home when visiting friends or family. But we, the maker community, are rarely happy unless our tech stands out from the rest. So without further ado, here’s a roundup of some fantastic retrofitted home assistant projects you can recreate and give pride of place in your kitchen, on your bookshelf, or wherever else you’d like to talk to your virtual, disembodied PA.

Google AIY Robot Conversion

Turned an 80s Tomy Mr Money into a little Google AIY / Raspberry Pi based assistant.

Matt ‘Circuitbeard’ Brailsford’s Tomy Mr Money Google AIY Assistant is just one of many home-brew home assistants makers have built since the release of APIs for Amazon Alexa and Google Home. Here are some more…

Teddy Ruxpin

Oh Teddy, how exciting and mysterious you were when I unwrapped you back in the mideighties. With your awkwardly moving lips and twitching eyelids, you were the cream of the crop of robotic toys! How was I to know that during my thirties, you would become augmented with home assistant software and suddenly instil within me a fear unlike any I’d felt before? (Save for my lifelong horror of ET…)

Alexa Ruxpin – Raspberry Pi & Alexa Powered Teddy Bear

Please watch: “DIY Fidget LED Display – Part 1” https://www.youtube.com/watch?v=FAZIc82Duzk -~-~~-~~~-~~-~- There are tons of virtual assistants out on the market: Siri, Ok Google, Alexa, etc. I had this crazy idea…what if I made the virtual assistant real…kinda. I decided to take an old animatronic teddy bear and hack it so that it ran Amazon Alexa.

Several makers around the world have performed surgery on Teddy to install a Raspberry Pi within his stomach and integrate him with Amazon Alexa Voice or Google’s AIY Projects Voice kit. And because these makers are talented, they’ve also managed to hijack Teddy’s wiring to make his lips move in time with his responses to your commands. Freaky…

Speaking of freaky: check out Zack’s Furlexa — an Amazon Alexa Furby that will haunt your nightmares.

Give old tech new life

Devices that were the height of technology when you purchased them may now be languishing in your attic collecting dust. With new and improved versions of gadgets and gizmos being released almost constantly, it is likely that your household harbours a spare whosit or whatsit which you can dismantle and give a new Raspberry Pi heart and purpose.

Take, for example, Martin Mander’s Google Pi intercom. By gutting and thoroughly cleaning a vintage intercom, Martin fashioned a suitable housing the Google AIY Projects Voice kit to create a new home assistant for his house:

1986 Google Pi Intercom

This is a 1986 Radio Shack Intercom that I’ve converted into a Google Home style device using a Raspberry Pi and the Google AIY (Artificial Intelligence Yourself) kit that came free with the MagPi magazine (issue 57). It uses the Google Assistant to answer questions and perform actions, using IFTTT to integrate with smart home accessories and other web services.

Not only does this build look fantastic, it’s also a great conversation starter for any visitors who had a similar device during the eighties.

Also take a look at Martin’s 1970s Amazon Alexa phone for more nostalgic splendour.

Put it in a box

…and then I’ll put that box inside of another box, and then I’ll mail that box to myself, and when it arrives…

A GIF from the emperors new groove - Raspberry Pi Home Assistant

A GIF. A harmless, little GIF…and proof of the comms team’s obsession with The Emperor’s New Groove.

You don’t have to be fancy when it comes to housing your home assistant. And often, especially if you’re working with the smaller people in your household, the results of a simple homespun approach are just as delightful.

Here are Hannah and her dad Tom, explaining how they built a home assistant together and fit it inside an old cigar box:

Raspberry Pi 3 Amazon Echo – The Alexa Kids Build!

My 7 year old daughter and I decided to play around with the Raspberry Pi and build ourselves an Amazon Echo (Alexa). The video tells you about what we did and the links below will take you to all the sites we used to get this up and running.

Also see the Google AIY Projects Voice kit — the cardboard box-est of home assistant boxes.

Make your own home assistant

And now it’s your turn! I challenge you all (and also myself) to create a home assistant using the Raspberry Pi. Whether you decide to fit Amazon Alexa inside an old shoebox or Google Home inside your sister’s Barbie, I’d love to see what you create using the free home assistant software available online.

Check out these other home assistants for Raspberry Pi, and keep an eye on our blog to see what I manage to create as part of the challenge.

Ten virtual house points for everyone who shares their build with us online, either in the comments below or by tagging us on your social media account.

The post Marvellous retrofitted home assistants appeared first on Raspberry Pi.

In the Works – AWS IoT Device Defender – Secure Your IoT Fleet

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/in-the-works-aws-sepio-secure-your-iot-fleet/

Scale takes on a whole new meaning when it comes to IoT. Last year I was lucky enough to tour a gigantic factory that had, on average, one environment sensor per square meter. The sensors measured temperature, humidity, and air purity several times per second, and served as an early warning system for contaminants. I’ve heard customers express interest in deploying IoT-enabled consumer devices in the millions or tens of millions.

With powerful, long-lived devices deployed in a geographically distributed fashion, managing security challenges is crucial. However, the limited amount of local compute power and memory can sometimes limit the ability to use encryption and other forms of data protection.

To address these challenges and to allow our customers to confidently deploy IoT devices at scale, we are working on IoT Device Defender. While the details might change before release, AWS IoT Device Defender is designed to offer these benefits:

Continuous AuditingAWS IoT Device Defender monitors the policies related to your devices to ensure that the desired security settings are in place. It looks for drifts away from best practices and supports custom audit rules so that you can check for conditions that are specific to your deployment. For example, you could check to see if a compromised device has subscribed to sensor data from another device. You can run audits on a schedule or on an as-needed basis.

Real-Time Detection and AlertingAWS IoT Device Defender looks for and quickly alerts you to unusual behavior that could be coming from a compromised device. It does this by monitoring the behavior of similar devices over time, looking for unauthorized access attempts, changes in connection patterns, and changes in traffic patterns (either inbound or outbound).

Fast Investigation and Mitigation – In the event that you get an alert that something unusual is happening, AWS IoT Device Defender gives you the tools, including contextual information, to help you to investigate and mitigate the problem. Device information, device statistics, diagnostic logs, and previous alerts are all at your fingertips. You have the option to reboot the device, revoke its permissions, reset it to factory defaults, or push a security fix.

Stay Tuned
I’ll have more info (and a hands-on post) as soon as possible, so stay tuned!

Jeff;

New- AWS IoT Device Management

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-iot-device-management/

AWS IoT and AWS Greengrass give you a solid foundation and programming environment for your IoT devices and applications.

The nature of IoT means that an at-scale device deployment often encompasses millions or even tens of millions of devices deployed at hundreds or thousands of locations. At that scale, treating each device individually is impossible. You need to be able to set up, monitor, update, and eventually retire devices in bulk, collective fashion while also retaining the flexibility to accommodate varying deployment configurations, device models, and so forth.

New AWS IoT Device Management
Today we are launching AWS IoT Device Management to help address this challenge. It will help you through each phase of the device lifecycle, from manufacturing to retirement. Here’s what you get:

Onboarding – Starting with devices in their as-manufactured state, you can control the provisioning workflow. You can use IoT Device Management templates to quickly onboard entire fleets of devices with a few clicks. The templates can include information about device certificates and access policies.

Organization – In order to deal with massive numbers of devices, AWS IoT Device Management extends the existing IoT Device Registry and allows you to create a hierarchical model of your fleet and to set policies on a hierarchical basis. You can drill-down through the hierarchy in order to locate individual devices. You can also query your fleet on attributes such as device type or firmware version.

Monitoring – Telemetry from the devices is used to gather real-time connection, authentication, and status metrics, which are published to Amazon CloudWatch. You can examine the metrics and locate outliers for further investigation. IoT Device Management lets you configure the log level for each device group, and you can also publish change events for the Registry and Jobs for monitoring purposes.

Remote ManagementAWS IoT Device Management lets you remotely manage your devices. You can push new software and firmware to them, reset to factory defaults, reboot, and set up bulk updates at the desired velocity.

Exploring AWS IoT Device Management
The AWS IoT Device Management Console took me on a tour and pointed out how to access each of the features of the service:

I already have a large set of devices (pressure gauges):

These gauges were created using the new template-driven bulk registration feature. Here’s how I create a template:

The gauges are organized into groups (by US state in this case):

Here are the gauges in Colorado:

AWS IoT group policies allow you to control access to specific IoT resources and actions for all members of a group. The policies are structured very much like IAM policies, and can be created in the console:

Jobs are used to selectively update devices. Here’s how I create one:

As indicated by the Job type above, jobs can run either once or continuously. Here’s how I choose the devices to be updated:

I can create custom authorizers that make use of a Lambda function:

I’ve shown you a medium-sized subset of AWS IoT Device Management in this post. Check it out for yourself to learn more!

Jeff;