It’s the most wonderful time of the year! There’s much mistletoeing, and hearts will be glowing – as will thousands of Raspberry Pi-enabled Christmas light displays around the world.
This morning I have mostly been spending my virtual time by a roadside in snowy Poland, inflicting carols on passers-by. (It turns out that the Polish carols this crib is programmed with rock a lot harder than the ones we listen to in England.) Visit the crib’s website to control it yourself.
We are also suckers for a good Christmas son et lumiere. If you’re looking to make something yourself, LightShow Pi has been around for some years now, and goes from strength to strength. We’ve covered projects built with it in previous years, and it’s still in active development from what we can see, with new features for this Christmas like the ability to address individual RGB pixels. Most of the sound and music displays you’ll see using a Raspberry Pi are running LightShow Pi; it’s got a huge user base, and its online community on Reddit is a great place to get started.
Light display contains over 4,000 lights and 7,800 individual channels. It is controlled by 3 network based lighting controllers. The audio and lighting sequences are sent to the controllers by a Raspberry Pi.
This display from the USA must have taken forever to set up: you’re looking at 4,000 lights and 7,800 channels. Here’s something more domestically proportioned from YouTube user Ken B, showing off LightShow Pi’s microweb user interface, which is perfect for use on your phone.
Demonstration of the microweb interface along with LED only operation using two matrices, lower one cycling.
Scared of the neighbours burning down your outdoor display, or not enough space for a full-size tree? Never fear: The Pi Hut’s 3D Christmas tree, designed by Rachel Rayns, formerly of this parish, is on sale again this year. We particularly loved this adaptation from Blitz City DIY, where Liz (not me, another Liz) RGB-ifies the tree: a great little Christmas electronics project to work through with the kids. Or on your own, because we don’t need to have all our fun vicariously through our children this Christmas. (Repeat ten times.)
The Pi Hut’s Xmas Tree Kit is a fun little soldering kit for the Raspberry Pi. It’s a great kit, but I thought it could do with a bit more color. This is just a quick video to talk about the kit and show off all the RGB goodness.
Any Christmas projects you’d like to share? Let us know in the comments!
Hey folks, Rob here! It’s the last Thursday of the month, and that means it’s time for a brand-new The MagPi. Issue 70 is all about home automation using your favourite microcomputer, the Raspberry Pi.
Home automation in this month’s The MagPi!
Raspberry Pi home automation
We think home automation is an excellent use of the Raspberry Pi, hiding it around your house and letting it power your lights and doorbells and…fish tanks? We show you how to do all of that, and give you some excellent tips on how to add even more automation to your home in our ten-page cover feature.
Upcycle your life
Our other big feature this issue covers upcycling, the hot trend of taking old electronics and making them better than new with some custom code and a tactically placed Raspberry Pi. For this feature, we had a chat with Martin Mander, upcycler extraordinaire, to find out his top tips for hacking your old hardware.
Upcycling is a lot of fun
But wait, there’s more!
If for some reason you want even more content, you’re in luck! We have some fun tutorials for you to try, like creating a theremin and turning a Babbage into an IoT nanny cam. We also continue our quest to make a video game in C++. Our project showcase is headlined by the Teslonda on page 28, a Honda/Tesla car hybrid that is just wonderful.
We review PiBorg’s latest robot
All this comes with our definitive reviews and the community section where we celebrate you, our amazing community! You’re all good beans
An amazing, and practical, Raspberry Pi project
Get The MagPi 70
Issue 70 is available today from WHSmith, Tesco, Sainsbury’s, and Asda. If you live in the US, head over to your local Barnes & Noble or Micro Center in the next few days for a print copy. You can also get the new issue online from our store, or digitally via our Android and iOS apps. And don’t forget, there’s always the free PDF as well.
New subscription offer!
Want to support the Raspberry Pi Foundation and the magazine? We’ve launched a new way to subscribe to the print version of The MagPi: you can now take out a monthly £4 subscription to the magazine, effectively creating a rolling pre-order system that saves you money on each issue.
You can also take out a twelve-month print subscription and get a Pi Zero W plus case and adapter cables absolutely free! This offer does not currently have an end date.
Warning: a GIF used in today’s blog contains flashing images.
Students at the University of Bremen, Germany, have built a wearable camera that records the seconds of vision lost when you blink. Augenblick uses a Raspberry Pi Zero and Camera Module alongside muscle sensors to record footage whenever you close your eyes, producing a rather disjointed film of the sights you miss out on.
Blink and you’ll miss it
The average person blinks up to five times a minute, with each blink lasting 0.5 to 0.8 seconds. These half-seconds add up to about 30 minutes a day. What sights are we losing during these minutes? That is the question asked by students Manasse Pinsuwan and René Henrich when they set out to design Augenblick.
Blinking is a highly invasive mechanism for our eyesight. Every day we close our eyes thousands of times without noticing it. Our mind manages to never let us wonder what exactly happens in the moments that we miss.
Capturing lost moments
For Augenblick, the wearer sticks MyoWare Muscle Sensor pads to their face, and these detect the electrical impulses that trigger blinking.
Two pads are applied over the orbicularis oculi muscle that forms a ring around the eye socket, while the third pad is attached to the cheek as a neutral point.
Biology fact: there are two muscles responsible for blinking. The orbicularis oculi muscle closes the eye, while the levator palpebrae superioris muscle opens it — and yes, they both sound like the names of Harry Potter spells.
The sensor is read 25 times a second. Whenever it detects that the orbicularis oculi is active, the Camera Module records video footage.
Pressing a button on the side of the Augenblick glasses set the code running. An LED lights up whenever the camera is recording and also serves to confirm the correct placement of the sensor pads.
The Pi Zero saves the footage so that it can be stitched together later to form a continuous, if disjointed, film.
Learn more about the Augenblick blink camera
You can find more information on the conception, design, and build process of Augenblickhere in German, with a shorter explanation including lots of photos here in English.
Version 26.1 of the Emacs editor is out. Highlights include a built-in Lisp threading mechanism that provides some concurrency, double buffering when running under X, a redesigned flymake mode, 24-bit color support in text mode, and a systemd unit file.
We’re usually averse to buzzwords at HackSpace magazine, but not this month: in issue 7, we’re taking a deep dive into the Internet of Things.
Internet of Things (IoT)
To many people, IoT is a shady term used by companies to sell you something you already own, but this time with WiFi; to us, it’s a way to make our builds smarter, more useful, and more connected. In HackSpace magazine #7, you can join us on a tour of the boards that power IoT projects, marvel at the ways in which other makers are using IoT, and get started with your first IoT project!
DIY retro computing: this issue, we’re taking our collective hat off to Spencer Owen. He stuck his home-brew computer on Tindie thinking he might make a bit of beer money — now he’s paying the mortgage with his making skills and inviting others to build modules for his machine. And if that tickles your fancy, why not take a crack at our Z80 tutorial? Get out your breadboard, assemble your jumper wires, and prepare to build a real-life computer!
Shameless patriotism: combine Lego, Arduino, and the car of choice for 1960 gold bullion thieves, and you’ve got yourself a groovy weekend project. We proudly present to you one man’s epic quest to add LED lights (controllable via a smartphone!) to his daughter’s LEGO Mini Cooper.
Patriotism intensifies: for the last 200-odd years, the Black Country has been a hotbed of making. Urban Hax, based in Walsall, is the latest makerspace to show off its riches in the coveted Space of the Month pages. Every space has its own way of doing things, but not every space has a portrait of Rob Halford on the wall. All hail!
Diversity: advice on diversity often boils down to ‘Be nice to people’, which might feel more vague than actionable. This is where we come in to help: it is truly worth making the effort to give people of all backgrounds access to your makerspace, so we take a look at why it’s nice to be nice, and at the ways in which one makerspace has put niceness into practice — with great results.
And there’s more!
We also show you how to easily calculate the size and radius of laser-cut gears, use a bank of LEDs to etch PCBs in your own mini factory, and use chemistry to mess with your lunch menu.
All this plus much, much more waits for you in HackSpace magazine issue 7!
Get your copy of HackSpace magazine
If you like the sound of that, you can find HackSpace magazine in WHSmith, Tesco, Sainsbury’s, and independent newsagents in the UK. If you live in the US, check out your local Barnes & Noble, Fry’s, or Micro Center next week. We’re also shipping to stores in Australia, Hong Kong, Canada, Singapore, Belgium, and Brazil, so be sure to ask your local newsagent whether they’ll be getting HackSpace magazine.
If your day has been a little fraught so far, watch this video. It opens with a tableau of methodically laid-out components and then shows them soldered, screwed, and slotted neatly into place. Everything fits perfectly; nothing needs percussive adjustment. Then it shows us glimpses of an AR future just like the one promised in the less dystopian comics and TV programmes of my 1980s childhood. It is all very soothing, and exactly what I needed.
Transform any surface into mixed-reality using Raspberry Pi, a laser projector, and Android Things. Android Experiments – http://experiments.withgoogle.com/android/lantern Lantern project site – http://nordprojects.co/lantern check below to make your own ↓↓↓ Get the code – https://github.com/nordprojects/lantern Build the lamp – https://www.hackster.io/nord-projects/lantern-9f0c28
Creating augmented reality with projection
We’ve seen plenty of Raspberry Pi IoT builds that are smart devices for the home; they add computing power to things like lights, door locks, or toasters to make these objects interact with humans and with their environment in new ways. Nord Projects‘ Lantern takes a different approach. In their words, it:
imagines a future where projections are used to present ambient information, and relevant UI within everyday objects. Point it at a clock to show your appointments, or point to speaker to display the currently playing song. Unlike a screen, when Lantern’s projections are no longer needed, they simply fade away.
Lantern is set up so that you can connect your wireless device to it using Google Nearby. This means there’s no need to create an account before you can dive into augmented reality.
Your own open-source AR lamp
Nord Projects collaborated on Lantern with Google’s Android Things team. They’ve made it fully open-source, so you can find the code on GitHub and also download their parts list, which includes a Pi, an IKEA lamp, an accelerometer, and a laser projector. Build instructions are at hackster.io and on GitHub.
This is a particularly clear tutorial, very well illustrated with photos and GIFs, and once you’ve sourced and 3D-printed all of the components, you shouldn’t need a whole lot of experience to put everything together successfully. Since everything is open-source, though, if you want to adapt it — for example, if you’d like to source a less costly projector than the snazzy one used here — you can do that too.
The instructions walk you through the mechanical build and the wiring, as well as installing Android Things and Nord Projects’ custom software on the Raspberry Pi. Once you’ve set everything up, an accelerometer connected to the Pi’s GPIO pins lets the lamp know which surface it is pointing at. A companion app on your mobile device lets you choose from the mini apps that work on that surface to select the projection you want.
The designers are making several mini apps available for Lantern, including the charmingly named Space Porthole: this uses Processing and your local longitude and latitude to project onto your ceiling the stars you’d see if you punched a hole through to the sky, if it were night time, and clear weather. Wouldn’t you rather look at that than deal with the ant problem in your kitchen or tackle your GitHub notifications?
What would you like to project onto your living environment? Let us know in the comments!
Join us this month to learn about some of the exciting new services and solution best practices at AWS. We also have our first re:Invent 2018 webinar series, “How to re:Invent”. Sign up now to learn more, we look forward to seeing you.
May 30, 2018 | 01:00 PM – 01:45 PM PT – Accelerating Life Sciences with HPC on AWS – Learn how you can accelerate your Life Sciences research workloads by harnessing the power of high performance computing on AWS.
May 22, 2018 | 11:00 AM – 11:45 AM PT – Hybrid Cloud Customer Use Cases on AWS – Learn how customers are leveraging AWS hybrid cloud capabilities to easily extend their datacenter capacity, deliver new services and applications, and ensure business continuity and disaster recovery.
May 31, 2018 | 11:00 AM – 11:45 AM PT – Using AWS IoT for Industrial Applications – Discover how you can quickly onboard your fleet of connected devices, keep them secure, and build predictive analytics with AWS IoT.
May 24, 2018 | 09:00 AM – 09:45 AM PT – Introducing AWS DeepLens – Learn how AWS DeepLens provides a new way for developers to learn machine learning by pairing the physical device with a broad set of tutorials, examples, source code, and integration with familiar AWS services.
May 30, 2018 | 11:00 AM – 11:45 AM PT – Accelerate Productivity by Computing at the Edge – Learn how AWS Snowball Edge support for compute instances helps accelerate data transfers, execute custom applications, and reduce overall storage costs.
Join us as we celebrate the Year of Engineering in the newest issue of Hello World, our magazine for computing and digital making educators.
Inspiring future engineers
We’ve brought together a wide range of experts to share their ideas and advice on how to bring engineering to your classroom — read issue 5 to find out the best ways to inspire the next generation.
Plus we’ve got plenty on GP and Scratch, we answer your latest questions, and we bring you our usual collection of useful features, guides, and lesson plans.
Highlights of issue 5 include:
The bluffers’ guide to putting together a tech-themed school trip
Inclusion, and coding for the visually impaired
Getting students interested in databases
Why copying may not always be a bad thing
How to get Hello World #5
Hello World is available as a free download under a Creative Commons license for everyone in world who is interested in computer science and digital making education. Get the latest issue as a PDF file straight from the Hello World website.
We’re currently offering free print copies of the magazine to serving educators in the UK. This offer is open to teachers, Code Club and CoderDojo volunteers, teaching assistants, teacher trainers, and others who help children and young people learn about computing and digital making. Subscribe to have your free print magazine posted directly to your home, or subscribe digitally — 20000 educators have already signed up to receive theirs!
Get in touch!
You could write for us about your experiences as an educator, and share your advice with the community. Wherever you are in the world, get in touch by emailing our editorial team about your article idea — we would love to hear from you!
Some gaming consoles make it easy to stream to Twitch, some gaming consoles don’t (come on, Nintendo). So for those that don’t, I’ve made this beta version of the “Twitch-O-Matic”. No it doesn’t chop onions or fold your laundry, but what it DOES do is stream anything with HDMI output to your Twitch channel with the simple push of a button!
eSports and online game streaming
Interest in eSports has skyrocketed over the last few years, with viewership numbers in the hundreds of millions, sponsorship deals increasing in value and prestige, and tournament prize funds reaching millions of dollars. So it’s no wonder that more and more gamers are starting to stream live to online platforms in order to boost their fanbase and try to cash in on this growing industry.
Streaming to Twitch
Launched in 2011, Twitch.tv is an online live-streaming platform with a primary focus on video gaming. Users can create accounts to contribute their comments and content to the site, as well as watching live-streamed gaming competitions and broadcasts. With a staggering fifteen million daily users, Twitch is accessible via smartphone and gaming console apps, smart TVs, computers, and tablets. But if you want to stream to Twitch, you may find yourself using third-party software in order to do so. And with more buttons to click and more wires to plug in for older, app-less consoles, streaming can get confusing.
Side note: we Tinkernut
We’ve featured Tinkernut a few times on the Raspberry Pi blog – his tutorials are clear, his projects are interesting and useful, and his live-streamed comment videos for every build are a nice touch to sharing homebrew builds on the internet.
So, yes, we love him. [This is true. Alex never shuts up about him. – Ed.] And since he has over 500K subscribers on YouTube, we’re obviously not the only ones. We wave our Tinkernut flags with pride.
The Raspberry Pi Zero W is connected to the HDMI to CSI adapter via the camera connector, in the same way you’d attach the camera ribbon. Tinkernut uses a standard Raspbian image on an 8GB SD card, with SSH enabled for remote access from his laptop. He uses the simple command Raspivid to test the HDMI connection by recording ten seconds of video footage from his console.
One lead is all you need
Once you have the Pi receiving video from your console, you can connect to Twitch using your Twitch stream key, which you can find by logging in to your account at Twitch.tv. Tinkernut’s tutorial gives you all the commands you need to stream from your Pi.
To up the aesthetic impact of your project, adding buttons and backlights is fairly straightforward.
Pretty LED frills
To run the stream command, Tinketnut uses a button: press once to start the stream, press again to stop. Pressing the button also turns on the LED backlight, so it’s obvious when streaming is in progress.
For the full code and 3D-printable case STL file, head to Tinketnut’s hackster.io project page. And if you’re already using a Raspberry Pi for Twitch streaming, share your build setup with us. Cheers!
AWS CloudFormation allows developers and systems administrators to easily create and manage a collection of related AWS resources (called a CloudFormation stack) by provisioning and updating them in an orderly and predictable way. CloudFormation users can now deploy and manage AWS Batch resources in exactly the same way that they are managing the rest of their AWS infrastructure.
This post highlights the native resources supported in CloudFormation and demonstrates how to create AWS Batch compute environments using CloudFormation. All sample CloudFormation, per-region templates related to this post can be found on the CloudFormation sample template site. The Ohio (us-east-2) Region is used as the example region for the remainder of this post.
AWS Batch Resources
AWS Batch is a managed service that helps you efficiently run batch computing workloads on the AWS Cloud. Users submit jobs to job queues, specifying the application to be run and their jobs’ CPU and memory requirements. AWS Batch is responsible for launching the appropriate quantity and types of instances needed to run your jobs.
AWS Batch removes the undifferentiated heavy lifting of configuring and managing compute infrastructure, allowing you to instead focus on your applications and users. This is demonstrated in the How AWS Batch Works video.
AWS Batch manages the following resources:
A job definition specifies how jobs are to be run—for example, which Docker image to use for your job, how many vCPUs and how much memory is required, the IAM role to be used, and more.
Jobs are submitted to job queues where they reside until they can be scheduled to run on Amazon EC2 instances within a compute environment. An AWS account can have multiple job queues, each with varying priority. This gives you the ability to closely align the consumption of compute resources with your organizational requirements.
Compute environments provision and manage your EC2 instances and other compute resources that are used to run your AWS Batch jobs. Job queues are mapped to one more compute environments and a given environment can also be mapped to one or more job queues. This many-to-many relationship is defined by the compute environment order and job queue priority properties.
The following diagram shows a general overview of how the AWS Batch resources interact.
CloudFormation stack creation and updates
Upon the creation of your stack, an AWS Batch job definition is registered using your CloudFormation template. If a job definition with the same name has already been registered, a new revision is created. On stack updates, any changes to your job definition specifications in the CloudFormation template result in a new revision of that job definition and a deregistration of the previous job definition revision. Stack deletions only result in the deregistration of your job definition, as AWS Batch does not delete job definitions.
At the stack creation, a job queue is created using the template. Any changes to your job queue properties within the stack result in a call to the UpdateJobQueue API action. Similarly, stack deletions result in the deletion of job queues from your AWS Batch compute environment.
CloudFormation creates an AWS Batch compute environment using the properties specified in your template. Stack updates result in updates to your compute environment where possible. If you need to change a parameter that is not supported by the UpdateComputeEnvironment API action, stack updates result in the deletion and re-creation of your compute environment. Upon stack deletion, your compute environment is disabled, and then deleted.
All naming conventions specified by CloudFormation should be followed—especially in the case of resource replacement—or you run the risk of a failed stack changes. For example, all AWS Batch resource property names must be capitalized, and resource names must be changed in the case of resource replacement, as is the case in any CloudFormation stack.
If you do not provide values for ComputeEnvironmentName, JobQueueName, or JobDefinitionName in your template, a pseudo-random name is generated for you using the logical ID that you gave the resource in CloudFormation.
Launching a “Hello World” example stack
Here’s a familiar “Hello World” example of a CloudFormation stack with AWS Batch resources.
This example registers a simple job definition, a job queue that can accept job submissions, and a compute environment that contains the compute resources used to execute your job. The stack template also creates additional AWS resources that are required by AWS Batch:
An IAM service role that gives AWS Batch permissions to take the required actions on your behalf
An IAM ECS instance role
A VPC subnet (though I’ve provided a general template, I suggest that this be a private subnet)
A security group
This stack can easily be deployed in the CloudFormation console, but I provide CLI commands that complete the stack creation for you. Use the Launch stack button or run the following command:
You can monitor the creation of the resources in your CloudFormation stack in the CloudFormation console, on the Events tab:
Confirm the successful creation of your stack by observing a CREATE_COMPLETE status. At this point, you should also be able to view the new resource ARNs on the Outputs tab:
After your stack is successfully created, everything that you need to submit a “hello-world” job is complete.
Make sure to use the accurate job definition name and revision number. You can find the accurate Amazon Resource Name (ARN) on the CloudFormation stack Outputs tab. A pseudo-random resource name is generated for your AWS Batch resources. If you do have an existing hello-world job definition, make sure that you run the command with the job definition revision created by your new CloudFormation stack from the stack outputs.
You can monitor the successful execution of the job in the AWS Batch console under Jobs:
When you are done using this stack and want to delete the resources, run the following command. CloudFormation deregisters the job definition, and deletes the job queue, compute environment, and the rest of the resources in the stack template.
Now that you know the basics of AWS Batch resources, here’s a more complex example.
High– and low-priority job queues with On-Demand and Spot compute environments
This CloudFormation stack creates two job queues with varying priority and two compute environments. You have one On-Demand compute environment and one Spot compute environment with a Spot price at 40% of On-Demand.
The first job queue is higher priority and feeds jobs to both compute environments, while the lower priority job queue only submits jobs for execution to the Spot compute environment.
There are two job definitions, one high-priority job queue and one low-priority job queue. Each job submitted using a given job definition is submitted to a job queue. For example, jobs submitted with an important-production-application job definition are submitted to the high priority job queue, while jobs submitted with a test-application job definition are submitted to the low priority job queue.
This example registers both job definitions and creates your compute environments and job queues. It also creates the VPC, subnet, security group, IAM service role for AWS Batch, ECS instance role, and an IAM Spot Fleet role. Use the Launch stack button or run the following command:
As with any CloudFormation stack, you can update resources for your application’s specific needs. AWS CloudFormation Designer is a graphic tool for creating, viewing, and modifying CloudFormation templates.
Any changes to resource properties that require replacement results in the creation of a new resource to reflect this change, and the deletion of the obsolete resource. Changes to an immutable compute environment or job queue properties results in replacement. Changes to updateable properties update the existing resource. Any changes to job definitions (beyond the name) result in the registration of a new revision of the existing job definition, followed by the deregistration of the previous revision.
Finally, run the following command to delete the CloudFormation stack containing your AWS Batch resources:
In this post, I detailed the steps to create, update with and without replacement, and delete your AWS Batch resources using CloudFormation templates as part of CloudFormation stacks with other AWS service resources. For more information, see the following topics:
We launched AWS Support a full decade ago, with Gold and Silver plans focused on Amazon EC2, Amazon S3, and Amazon SQS. Starting from that initial offering, backed by a small team in Seattle, AWS Support now encompasses thousands of people working from more than 60 locations.
A Quick Look Back Over the years, that offering has matured and evolved in order to meet the needs of an increasingly diverse base of AWS customers. We aim to support you at every step of your cloud adoption journey, from your initial experiments to the time you deploy mission-critical workloads and applications.
We have worked hard to make our support model helpful and proactive. We do our best to provide you with the tools, alerts, and knowledge that will help you to build systems that are secure, robust, and dependable. Here are some of our most recent efforts toward that goal:
Trusted Advisor S3 Bucket Policy Check – AWS Trusted Advisor provides you with five categories of checks and makes recommendations that are designed to improve security and performance. Earlier this year we announced that the S3 Bucket Permissions Check is now free, and available to all AWS users. If you are signed up for the Business or Professional level of AWS Support, you can also monitor this check (and many others) using Amazon CloudWatch Events. You can use this to monitor and secure your buckets without human intervention.
Personal Health Dashboard – This tool provides you with alerts and guidance when AWS is experiencing events that may affect you. You get a personalized view into the performance and availability of the AWS services that underlie your AWS resources. It also generates Amazon CloudWatch Events so that you can initiate automated failover and remediation if necessary.
Well Architected / Cloud Ops Review – We’ve learned a lot about how to architect AWS-powered systems over the years and we want to share everything we know with you! The AWS Well-Architected Framework provide proven, detailed guidance in critical areas including operational excellence, security, reliability, performance efficiency, and cost optimization. You can read the materials online and you can also sign up for the online training course. If you are signed up for Enterprise support, you can also benefit from our Cloud Ops review.
Infrastructure Event Management – If you are launching a new app, kicking off a big migration, or hosting a large-scale event similar to Prime Day, we are ready with guidance and real-time support. Our Infrastructure Event Management team will help you to assess the readiness of your environment and work with you to identify and mitigate risks ahead of time.
The Amazon retail site makes heavy use of AWS. You can read my post, Prime Day 2017 – Powered by AWS, to learn more about the process of preparing to sustain a record-setting amount of traffic and to accept a like number of orders.
Come and Join Us The AWS Support Team is in continuous hiring mode and we have openings all over the world! Here are a couple of highlights:
HackSpace magazine is back with our brand-new issue 6, available for you on shop shelves, in your inbox, and on our website right now.
Inside Hackspace magazine 6
Paper is probably the first thing you ever used for making, and for good reason: in no other medium can you iterate through 20 designs at the cost of only a few pennies. We’ve roped in Rob Ives to show us how to make a barking paper dog with moveable parts and a cam mechanism. Even better, the magazine includes this free paper automaton for you to make yourself. That’s right: free!
At the other end of the scale, there’s the forge, where heat, light, and noise combine to create immutable steel. We speak to Alec Steele, YouTuber, blacksmith, and philosopher, about his amazingly beautiful Damascus steel creations, and about why there’s no difference between grinding a knife and blowing holes in a mountain to build a road through it.
Do it yourself
You’ve heard of reading glasses — how about glasses that read for you? Using a camera, optical character recognition software, and a text-to-speech engine (and of course a Raspberry Pi to hold it all together), reader Andrew Lewis has hacked together his own system to help deal with age-related macular degeneration.
It’s the definition of hacking: here’s a problem, there’s no solution in the shops, so you go and build it yourself!
60 years ago, the cutting edge of home hacking was the transistor radio. Before the internet was dreamt of, the transistor radio made the world smaller and brought people together. Nowadays, the components you need to build a radio are cheap and easily available, so if you’re in any way electronically inclined, building a radio is an ideal excuse to dust off your soldering iron.
If you’re a 12-month subscriber (if you’re not, you really should be), you’ve no doubt been thinking of all sorts of things to do with the Adafruit Circuit Playground Express we gave you for free. How about a sewable circuit for a canvas bag? Use the accelerometer to detect patterns of movement — walking, for example — and flash a series of lights in response. It’s clever, fun, and an easy way to add some programmable fun to your shopping trips.
We’re also making gin, hacking a children’s toy car to unlock more features, and getting started with robot sumo to fill the void left by the cancellation of Robot Wars.
All this, plus an 11-metre tall mechanical miner, in HackSpace magazine issue 6 — subscribe here from just £4 an issue or get the PDF version for free. You can also find HackSpace magazine in WHSmith, Tesco, Sainsbury’s, and independent newsagents in the UK. If you live in the US, check out your local Barnes & Noble, Fry’s, or Micro Center next week. We’re also shipping to stores in Australia, Hong Kong, Canada, Singapore, Belgium, and Brazil, so be sure to ask your local newsagent whether they’ll be getting HackSpace magazine.
The release of pip 10.0 has been announced. Some highlights of this release include the removal of Python 2.6 support, limited PEP 518 support (with more to come), a new “pip config” command, and other improvements.
This column is from The MagPi issue 61. You can download a PDF of the full issue for free, or subscribe to receive the print edition through your letterbox or the digital edition on your tablet. All proceeds from the print and digital editions help the Raspberry Pi Foundation achieve our charitable goals.
The pinned tweet on Dave Akerman’s Twitter account shows a table displaying the various components needed for a high-altitude balloon (HAB) flight. Batteries, leads, a camera and Raspberry Pi, plus an unusually themed payload. The caption reads ‘The Queen, The Duke of York, and my TARDIS”, and sums up Dave’s maker career in a heartbeat.
The Queen, The Duke of York, and my TARDIS 🙂 #UKHAS #RaspberryPi
Though writing software for industrial automation pays the bills, the majority of Dave’s time is spent in the world of high-altitude ballooning and the ever-growing community that encompasses it. And, while he makes some money sending business-themed balloons to near space for the likes of Aardman Animations, Confused.com, and the BBC, Dave is best known in the Raspberry Pi community for his use of the small computer in every payload, and his work as a tutor alongside the Foundation’s staff at Skycademy events.
Dave continues to help others while breaking records and having a good time exploring the atmosphere.
Dave has dedicated many hours and many, many more miles to assist with the Foundation’s Skycademy programme, helping to explore high-altitude ballooning with educators from across the UK. Using a Raspberry Pi and various other pieces of lightweight tech, Dave and Foundation staff member James Robinson explored the incorporation of high-altitude ballooning into education. Through Skycademy, educators were able to learn new skills and take them to the classroom, setting off their own balloons with their students, and recording the results on Raspberry Pis.
Dave’s most recent flight broke a new record. On 13 August 2017, his HAB payload was able to send back the highest images taken by any amateur flight.
But education isn’t the only reason for Dave’s involvement in the HAB community. As with anyone passionate about a specific hobby, Dave strives to break records. The most recent record-breaking flight took place on 13 August 2017, when Dave’s Raspberry Pi Zero HAB sent home the highest images taken by any amateur high-altitude balloon launch: at 43014 metres. No other HAB balloon has provided images from such an altitude, and the lightweight nature of the Pi Zero definitely helped, as Dave went on to mention on Twitter a few days later.
Dave is recognised as being the first person to incorporate a Raspberry Pi into a HAB payload, and continues to break records with the help of the little green board. More recently, he’s been able to lighten the load by using the Raspberry Pi Zero.
When the first Pi made its way to near space, Dave tore the computer apart in order to meet the weight restriction. The Pi in the Sky board was created to add the extra features needed for the flight. Since then, the HAT has experienced a few changes.
The Pi in the Sky board, created specifically for HAB flights.
Dave first fell in love with high-altitude ballooning after coming across the hobby in a video shared on a photographic forum. With a lifelong interest in space thanks to watching the Moon landings as a boy, plus a talent for electronics and photography, it seems a natural progression for him. Throw in his coding skills from learning to program on a Teletype and it’s no wonder he was ready and eager to take to the skies, so to speak, and capture the curvature of the Earth. What was so great about using the Raspberry Pi was the instant gratification he got from receiving images in real time as they were taken during the flight. While other devices could control a camera and store captured images for later retrieval, thanks to the Pi Dave was able to transmit the files back down to Earth and check the progress of his balloon while attempting to break records with a flight.
One of the many commercial flights Dave has organised featured the classic children’s TV character Morph, a creation of the Aardman Animations studio known for Wallace and Gromit. Morph took to the sky twice in his mission to reach near space, and finally succeeded in 2016.
High-altitude ballooning isn’t the only part of Dave’s life that incorporates a Raspberry Pi. Having “lost count” of how many Pis he has running tasks, Dave has also created radio receivers for APRS (ham radio data), ADS-B (aircraft tracking), and OGN (gliders), along with a time-lapse camera in his garden, and he has a few more Pi for tinkering purposes.
This post courtesy of Michael Edge, Sr. Cloud Architect – AWS Professional Services
Applications built using a microservices architecture typically result in a number of independent, loosely coupled microservices communicating with each other, synchronously via their APIs and asynchronously via events. These microservices are often owned by different product teams, and these teams may segregate their resources into different AWS accounts for reasons that include security, billing, and resource isolation. This can sometimes result in the following challenges:
Cross-account deployment: A single pipeline must deploy a microservice into multiple accounts; for example, a microservice must be deployed to environments such as DEV, QA, and PROD, all in separate accounts.
Cross-account lookup: During deployment, a resource deployed into one AWS account may need to refer to a resource deployed in another AWS account.
Cross-account communication: Microservices executing in one AWS account may need to communicate with microservices executing in another AWS account.
In this post, I look at ways to address these challenges using a sample application composed of a web application supported by two serverless microservices. The microservices are owned by different product teams and deployed into different accounts using AWS CodePipeline, AWS CloudFormation, and the Serverless Application Model (SAM). At runtime, the microservices communicate using an event-driven architecture that requires asynchronous, cross-account communication via an Amazon Simple Notification Service (Amazon SNS) topic.
First, look at the sample application I use to demonstrate these concepts. In the following overview diagram, you can see the following:
The entire application consists of three main services:
A Booking microservice, owned by the Booking account.
An Airmiles microservice, owned by the Airmiles account.
A web application that uses the services exposed by both microservices, owned by the Web Channel account.
The Booking microservice creates flight bookings and publishes booking events to an SNS topic.
The Airmiles microservice consumes booking events from the SNS topic and uses the booking event to calculate the airmiles associated with the flight booking. It also supports querying airmiles for a specific flight booking.
The web application allows an end user to make flight bookings, view flights bookings, and view the airmiles associated with a flight booking.
The typical booking flow would be triggered by an end user making a flight booking using the web application, which invokes the Booking microservice via its REST API. The Booking microservice persists the flight booking and publishes the booking event to an SNS topic to enable sharing of the booking with other interested consumers. In this sample application, the Airmiles microservice subscribes to the SNS topic and consumes the booking event, using the booking information to calculate the airmiles. In line with microservices best practices, both the Booking and Airmiles microservices store their information in their own DynamoDB tables, and expose an API (via API Gateway) that is used by the web application.
Before you delve into the details of the sample application, get the source code and deploy it.
Cross-account deployment of Lambda functions using CodePipeline has been previously discussed by my colleague Anuj Sharma in his post, Building a Secure Cross-Account Continuous Delivery Pipeline. This sample application builds upon the solution proposed by Anuj, using some of the same scripts and a similar account structure. To make it feasible for you to deploy the sample application, I’ve reduced the number of accounts needed down to three accounts by consolidating some of the services. In the following diagram, you can see the services used by the sample application require three accounts:
Tools: A central location for the continuous delivery/deployment services such as CodePipeline and AWS CodeBuild. To reduce the number of accounts required by the sample application, also deploy the AWS CodeCommit repositories here, though typically they may belong in a separate Dev account.
Booking: Account for the Booking microservice.
Airmiles: Account for the Airmiles microservice.
Without consolidation, the sample application may require up to 10 accounts: one for Tools, and three accounts each for Booking, Airmiles and Web Application (to support the DEV, QA, and PROD environments).
Account structure for sample application
To follow the rest of this post, clone the repository in step 1 below. To deploy the application on AWS, follow steps 2 and 3:
Follow the instructions in the repository README to build the CodePipeline pipelines and deploy the microservices and web application.
Challenge 1: Cross-account deployment using CodePipeline
Though the Booking pipeline executes in the Tools account, it deploys the Booking Lambda functions into the Booking account.
In the sample application code repository, open the ToolsAcct/code-pipeline.yaml CloudFormation template.
Scroll down to the Pipeline resource and look for the DeployToTest pipeline stage (shown below). There are two AWS Identity and Access Management (IAM) service roles used in this stage that allow cross-account activity. Both of these roles exist in the Booking account:
Under Actions.RoleArn, find the service role assumed by CodePipeline to execute this pipeline stage in the Booking account. The role referred to by the parameter NonProdCodePipelineActionServiceRole allows access to the CodePipeline artifacts in the S3 bucket in the Tools account, and also access to the AWS KMS key needed to encrypt/decrypt the artifacts.
Under Actions.Configuration.RoleArn, find the service role assumed by CloudFormation when it carries out the CHANGE_SET_REPLACE action in the Booking account.
These roles are created in the CloudFormation template NonProdAccount/toolsacct-codepipeline-cloudformation-deployer.yaml.
Challenge 2: Cross-account stack lookup using custom resources
Asynchronous communication is a fairly common pattern in microservices architectures. A publishing microservice publishes an event that consumers may be interested in, without any concern for who those consumers may be.
In the case of the sample application, the publisher is the Booking microservice, publishing a flight booking event onto an SNS topic that exists in the Booking account. The consumer is the Airmiles microservice in the Airmiles account. To enable the two microservices to communicate, the Airmiles microservice must look up the ARN of the Booking SNS topic at deployment time in order to set up a subscription to it.
To enable CloudFormation templates to be reused, you are not hardcoding resource names in the templates. Because you allow CloudFormation to generate a resource name for the Booking SNS topic, the Airmiles microservice CloudFormation template must look up the SNS topic name at stack creation time. It’s not possible to use cross-stack references, as these can’t be used across different accounts.
However, you can use a CloudFormation custom resource to achieve the same outcome. This is discussed in the next section. Using a custom resource to look up stack exports in another stack does not create a dependency between the two stacks, unlike cross-stack references. A stack dependency would prevent one stack being deleted if another stack depended upon it. For more information, see Fn::ImportValue.
Using a Lambda function as a custom resource to look up the booking SNS topic
The sample application uses a Lambda function as a custom resource. This approach is discussed in AWS Lambda-backed Custom Resources. Walk through the sample application and see how the custom resource is used to list the stack export variables in another account and return these values to the calling AWS CloudFormation stack. Examine each of the following aspects of the custom resource:
Deploying the custom resource.
Custom resource IAM role.
A calling CloudFormation stack uses the custom resource.
The custom resource assumes a role in another account.
The custom resource obtains the stack exports from all stacks in the other account.
The custom resource returns the stack exports to the calling CloudFormation stack.
The custom resource handles all the event types sent by the calling CloudFormation stack.
Step1: Deploying the custom resource
The custom Lambda function is deployed using SAM, as are the Lambda functions for the Booking and Airmiles microservices. See the CustomLookupExports resource in Custom/custom-lookup-exports.yml.
Step2: Custom resource IAM role
The CustomLookupExports resource in Custom/custom-lookup-exports.yml executes using the CustomLookupLambdaRole IAM role. This role allows the custom Lambda function to assume a cross account role that is created along with the other cross account roles in the NonProdAccount/toolsacct-codepipeline-cloudformation-deployer.yml. See the resource CustomCrossAccountServiceRole.
Step3: The CloudFormation stack uses the custom resource
The Airmiles microservice is created by the Airmiles CloudFormation template Airmiles/sam-airmile.yml, a SAM template that uses the custom Lambda resource to look up the ARN of the Booking SNS topic. The custom resource is specified by the CUSTOMLOOKUP resource, and the PostAirmileFunction resource uses the custom resource to look up the ARN of the SNS topic and create a subscription to it.
Because the Airmiles Lambda function is going to subscribe to an SNS topic in another account, it must grant the SNS topic in the Booking account permissions to invoke the Airmiles Lambda function in the Airmiles account whenever a new event is published. Permissions are granted by the LambdaResourcePolicy resource.
Step4: The custom resource assumes a role in another account
When the Lambda custom resource is invoked by the Airmiles CloudFormation template, it must assume a role in the Booking account (see Step 2) in order to query the stack exports in that account.
This can be seen in Custom/custom-lookup-exports.py, where the AWS Simple Token Service (AWS STS) is used to obtain a temporary access key to allow access to resources in the account referred to by the environment variable: ‘CUSTOM_CROSS_ACCOUNT_ROLE_ARN’. This environment variable is defined in the Custom/custom-lookup-exports.yml CloudFormation template, and refers to the role created in Step 2.
Step5: The custom resource obtains the stack exports from all stacks in the other account
The function get_exports() in Custom/custom-lookup-exports.py uses the AWS SDK to list all the stack exports in the Booking account.
Step 6: The custom resource returns the stack exports to the calling CloudFormation stack
The calling CloudFormation template, Airmiles/sam-airmile.yml, uses the custom resource to look up a stack export with the name of booking-lambda-BookingTopicArn. This is exported by the CloudFormation template Booking/sam-booking.yml.
Step7: The custom resource handles all the event types sent by the calling CloudFormation stack
Custom resources used in a stack are called whenever the stack is created, updated, or deleted. They must therefore handle CREATE, UPDATE, and DELETE stack events. If your custom resource fails to send a SUCCESS or FAILED notification for each of these stack events, your stack may become stuck in a state such as CREATE_IN_PROGRESS or UPDATE_ROLLBACK_IN_PROGRESS. To handle this cleanly, use the crhelper custom resource helper. This strongly encourages you to handle the CREATE, UPDATE, and DELETE CloudFormation stack events.
Challenge 3: Cross-account SNS subscription
The SNS topic is specified as a resource in the Booking/sam-booking.yml CloudFormation template. To allow an event published to this topic to trigger the Airmiles Lambda function, permissions must be granted on both the Booking SNS topic and the Airmiles Lambda function. This is discussed in the next section.
Permissions on booking SNS topic
SNS topics support resource-based policies, which allow a policy to be attached directly to a resource specifying who can access the resource. This policy can specify which accounts can access a resource, and what actions those accounts are allowed to perform. This is the same approach as used by a small number of AWS services, such as Amazon S3, KMS, Amazon SQS, and Lambda. In Booking/sam-booking.yml, the SNS topic policy allows resources in the Airmiles account (referenced by the parameter NonProdAccount in the following snippet) to subscribe to the Booking SNS topic:
The Airmiles microservice is created by the Airmiles/sam-airmile.yml CloudFormation template. The template uses SAM to specify the Lambda functions together with their associated API Gateway configurations. SAM hides much of the complexity of deploying Lambda functions from you.
For instance, by adding an ‘Events’ resource to the PostAirmileFunction in the CloudFormation template, the Airmiles Lambda function is triggered by an event published to the Booking SNS topic. SAM creates the SNS subscription, as well as the permissions necessary for the SNS subscription to trigger the Lambda function.
However, the Lambda permissions generated automatically by SAM are not sufficient for cross-account SNS subscription, which means you must specify an additional Lambda permissions resource in the CloudFormation template. LambdaResourcePolicy, in the following snippet, specifies that the Booking SNS topic is allowed to invoke the Lambda function PostAirmileFunction in the Airmiles account.
In this post, I showed you how product teams can use CodePipeline to deploy microservices into different AWS accounts and different environments such as DEV, QA, and PROD. I also showed how, at runtime, microservices in different accounts can communicate securely with each other using an asynchronous, event-driven architecture. This allows product teams to maintain their own AWS accounts for billing and security purposes, while still supporting communication with microservices in other accounts owned by other product teams.
Thanks are due to the following people for their help in preparing this post:
Kevin Yung for the great web application used in the sample application
Jay McConnell for crhelper, the custom resource helper used with the CloudFormation custom resources
The data center keeps growing, with well over 500 Petabytes of data under management we needed more systems administrators to help us keep track of all the systems as our operation expands. Our latest systems administrator is Billy! Let’s learn a bit more about him shall we?
What is your Backblaze Title? Sr. Systems Administrator
Where are you originally from? Boston, MA
What attracted you to Backblaze? I’ve read the hard drive articles that were published and was excited to be a part of the company that took the time to do that kind of analysis and share it with the world.
What do you expect to learn while being at Backblaze? I expect that I’ll learn about the problems that arise from a larger scale operation and how to solve them. I’m very curious to find out what they are.
Where else have you worked? I’ve worked for the MIT Math Dept, Google, a social network owned by AOL called Bebo, Evernote, a contractor recommendation site owned by The Home Depot called RedBeacon, and a few others that weren’t as interesting.
Where did you go to school? I started college at The Cooper Union, discovered that Electrical Engineering wasn’t my thing, then graduated from the Computer Science program at Northeastern.
What’s your dream job? Is couch potato a job? I like to solve puzzles and play with toys, which is why I really enjoy being a sysadmin. My dream job is to do pretty much what I do now, but not have to participate in on-call.
Favorite place you’ve traveled? We did a 2 week tour through Europe on our honeymoon. I’d go back to any place there.
Favorite hobby? Reading and listening to music. I spent a stupid amount of money on a stereo, so I make sure it gets plenty of use. I spent much less money on my library card, but I try to utilize it quite a bit as well.
Of what achievement are you most proud? I designed a built a set of shelves for the closet in my kids’ room. Built with hand tools. The only electricity I used was the lights to see what I was doing.
Star Trek or Star Wars? Star Trek: The Next Generation
Coke or Pepsi? Coke!
Favorite food? Pesto. Usually on angel hair, but it also works well on bread, or steak, or a spoon.
Why do you like certain things? I like things that are a little outside the norm, like musical covers and mashups, or things that look like 1 thing but are really something else. Secret compartments are also fun.
Anything else you’d like you’d like to tell us? I’m full of anecdotes and lines from songs and movies and tv shows.
Pesto is delicious! Welcome to the systems administrator team Billy, we’ll keep the fridge stocked with Coke for you!
There’s a new issue of HackSpace magazine on the shelves today, and as usual it’s full of things to make and do!
We love making hardware, and we’d also love to turn this hobby into a way to make a living. So in the hope of picking up a few tips, we spoke to the woman behind Adafruit: Limor Fried, aka Ladyada.
Adafruit has played a massive part in bringing the maker movement into homes and schools, so we’re chuffed to have Limor’s words of wisdom in the magazine.
Raspberry Pi 3B+
As you may have heard, there’s a new Pi in town, and that can only mean one thing for HackSpace magazine: let’s test it to its limits!
The Raspberry Pi 3 Model B+ is faster, better, and stronger, but what does that mean in practical terms for your projects?
Kids are amazing! Their curious minds, untouched by mundane adulthood, come up with crazy stuff that no sensible grown-up would think to build. No sensible grown-up, that is, apart from the engineers behind Kids Invent Stuff, the brilliant YouTube channel that takes children’s inventions and makes them real.
Kids Invent Stuff is the YouTube channel where kids’ invention ideas get made into real working inventions. Learn more about Kids Invent Stuff at www.kidsinventstuff.com Have you seen Connor’s Crazy Car invention? https://youtu.be/4_sF6ZFNzrg Have you seen our Flamethrowing piano?
We spoke to Ruth Amos, entrepreneur, engineer, and one half of the Kids Invent Stuff team.
It shouldn’t just be kids who get to play with fun stuff! This month, in the name of research, we’ve brought a Stirling engine–powered buggy from Shenzhen.
This ingenious mechanical engine is the closest you’ll get to owning a home-brew steam engine without running the risk of having a boiler explode in your face.
In this issue, turn a Dremel multitool into a workbench saw with some wood, perspex, and a bit of laser cutting; make a Starfleet com-badge and pretend you’re Captain Jean-Luc Picard (shaving your hair off not compulsory); add intelligence to builds the easy way with Node-RED; and get stuck into Cheerlights, one of the world’s biggest IoT project.
All this, plus your ultimate guide to blinkenlights, and the only knot you’ll ever need, in HackSpace magazine issue 5.
Individual copies of HackSpace magazine are available in selected stockists across the UK, including Tesco, WHSmith, and Sainsbury’s. They’ll also be making their way across the globe to USA, Canada, Australia, Brazil, Hong Kong, Singapore, and Belgium in the coming weeks, so ask your local retailer whether they’re getting a delivery.
You can also purchase your copy on the Raspberry Pi Press website, and browse our complete collection of other Raspberry Pi publications, such as The MagPi, Hello World, and Raspberry Pi Projects Books.
The GStreamer team has announced a major feature release of the GStreamer cross-platform multimedia framework. Highlights include WebRTC support, experimental support for the next-gen royalty-free AV1 video codec, support for the Secure Reliable Transport (SRT) video streaming protocol, and much more. The release notes contain more details.
Data that describe processes in a spatial context are everywhere in our day-to-day lives and they dominate big data problems. Map data, for instance, whether describing networks of roads or remote sensing data from satellites, get us where we need to go. Atmospheric data from simulations and sensors underlie our weather forecasts and climate models. Devices and sensors with GPS can provide a spatial context to nearly all mobile data.
In this post, we introduce the WIND toolkit, a huge (500 TB), open weather model dataset that’s available to the world on Amazon’s cloud services. We walk through how to access this data and some of the open-source software developed to make it easily accessible. Our solution considers a subset of geospatial data that exist on a grid (raster) and explores ways to provide access to large-scale raster data from weather models. The solution uses foundational AWS services and the Hierarchical Data Format (HDF), a well adopted format for scientific data.
The approach developed here can be extended to any data that fit in an HDF5 file, which can describe sparse and dense vectors and matrices of arbitrary dimensions. This format is already popular within the physical sciences for both experimental and simulation data. We discuss solutions to gridded data storage for a massive dataset of public weather model outputs called the Wind Integration National Dataset (WIND) toolkit. We also highlight strategies that are general to other large geospatial data management problems.
Wind Integration National Dataset
As variable renewable power penetration levels increase in power systems worldwide, the importance of renewable integration studies to ensure continued economic and reliable operation of the power grid is also increasing. The WIND toolkit is the largest freely available grid integration dataset to date.
The WIND toolkit was developed by 3TIER by Vaisala. They were under a subcontract to the National Renewable Energy Laboratory (NREL) to support studies on integration of wind energy into the existing US grid. NREL is a part of a network of national laboratories for the US Department of Energy and has a mission to advance the science and engineering of energy efficiency, sustainable transportation, and renewable power technologies.
The toolkit has been used by consultants, research groups, and universities worldwide to support grid integration studies. Less traditional uses also include resource assessments for wind plants (such as those powering Amazon data centers), and studying the effects of weather on California condor migrations in the Baja peninsula.
The diversity of applications highlights the value of accessible, open public data. Yet, there’s a catch: the dataset is huge. The WIND toolkit provides simulated atmospheric (weather) data at a two-km spatial resolution and five-minute temporal resolution at multiple heights for seven years. The entire dataset is half a petabyte (500 TB) in size and is stored in the NREL High Performance Computing data center in Golden, Colorado. Making this dataset publicly available easily and in a cost-effective manner is a major challenge.
As other laboratories and public institutions work to release their data to the world, they may face similar challenges to those that we experienced. Some prior, well-intentioned efforts to release huge datasets as-is have resulted in data resources that are technically available but fundamentally unusable. They may be stored in an unintuitive format or indexed and organized to support only a subset of potential uses. Downloading hundreds of terabytes of data is often impractical. Most users don’t have access to a big data cluster (or super computer) to slice and dice the data as they need after it’s downloaded.
We aim to provide a large amount of data (50 terabytes) to the public in a way that is efficient, scalable, and easy to use. In many cases, researchers can access these huge cloud-located datasets using the same software and algorithms they have developed for smaller datasets stored locally. Only the pieces of data they need for their individual analysis must be downloaded. To make this work in practice, we worked with the HDF Group and have built upon their forthcoming Highly Scalable Data Service.
In the rest of this post, we discuss how the HSDS software was developed to use Amazon EC2 and Amazon S3 resources to provide convenient and scalable access to these huge geospatial datasets. We describe how the HSDS service has been put to work for the WIND Toolkit dataset and demonstrate how to access it using the h5pyd Python library and the REST API. We conclude with information about our ongoing work to release more ‘open’ datasets to the public using AWS services, and ways to improve and extend the HSDS with newer Amazon services like Amazon ECS and AWS Lambda.
Developing a scalable service for big geospatial data
The HDF5 file format and API have been used for many years and is an effective means of storing large scientific datasets. For example, NASA’s Earth Observing System (EOS) satellites collect more than 16 TBs of data per day using HDF5.
With the rise of the cloud, there are new challenges and opportunities to rethink how HDF5 can be enhanced to work effectively as a component in a cloud-native architecture. For the HDF Group, working with NREL has been a great opportunity to put ideas into practice with a production-size dataset.
An HDF5 file consists of a directed graph of group and dataset objects. Datasets can be thought of as a multidimensional array with support for user-defined metadata tags and compression. Typical operations on datasets would be reading or writing data to a regular subregion (a hyperslab) or reading and writing individual elements (a point selection). Also, group and dataset objects may each contain an arbitrary number of the user-defined metadata elements known as attributes.
Many people have used the HDF library in applications developed or ported to run on EC2 instances, but there are a number of constraints that often prove problematic:
The HDF5 library can’t read directly from HDF5 files stored as S3 objects. The entire file (often many GB in size) would need to be copied to local storage before the first byte can be read. Also, the instance must be configured with the appropriately sized EBS volume)
The HDF library only has access to the computational resources of the instance itself (as opposed to a cluster of instances), so many operations are bottlenecked by the library.
Any modifications to the HDF5 file would somehow have to be synchronized with changes that other instances have made to same file before writing back to S3.
Using a pattern common to many offerings from AWS, the solution to these constraints is to develop a service framework around the HDF data model. Using this model, the HDF Group has created the Highly Scalable Data Service (HSDS) that provides all the functionality that traditionally was provided by the HDF5 library. By using the service, you don’t need to manage your own file volumes, but can just read and write whatever data that you need.
Because the service manages the actual data persistence to a durable medium (S3, in this case), you don’t need to worry about disk management. Simply stream the data you need from the service as you need it. Secondly, putting the functionality behind a service allows some tricks to increase performance (described in more detail later). And lastly, HSDS allows any number of clients to access the data at the same time, enabling HDF5 to be used as a coordination mechanism for multiple readers and writers.
In designing the HSDS architecture, we gave much thought to how to achieve scalability of the HSDS service. For accessing HDF5 data, there are two different types of scaling to consider:
Multiple clients making many requests to the service
Single requests that require a significant amount of data processing
To deal with the first scaling challenge, as with most services, we considered how the service responds as the request rate increases. AWS provides some great tools that help in this regard:
Auto Scaling groups
Elastic Load Balancing load balancers
The ability of S3 to handle large aggregate throughput rates
By using a cluster of EC2 instances behind a load balancer, you can handle different client loads in a cost-effective manner.
The second scaling challenge concerns single requests that would take significant processing time with just one compute node. One example of this from the WIND toolkit would be extracting all the values in the seven-year time span for a given geographic point and dataset.
In HDF5, large datasets are typically stored as “chunks”; that is, a regular partition of the array. In HSDS, each chunk is stored as a binary object in S3. The sequential approach to retrieving the time series values would be for the service to read each chunk needed from S3, extract the needed elements, and go on to the next chunk. In this case, that would involve processing 2557 chunks, and would be quite slow.
Fortunately, with HSDS, you can speed this up quite a bit by exploiting the compute and I/O capabilities of the cluster. Upon receiving the request, the receiving node can use other nodes in the cluster to read different portions of the selection. With multiple nodes reading from S3 in parallel, performance improves as the cluster size increases.
The diagram below illustrates how this works in simplified case of four chunks and four nodes.
This architecture has worked in well in practice. In testing with the WIND toolkit and time series extraction, we observed a request latency of ~60 seconds using four nodes vs. ~5 seconds with 40 nodes. Performance roughly scales with the size of the cluster.
A planned enhancement to this is to use AWS Lambda for the worker processing. This enables 1000-way parallel reads at a reasonable cost, as you only pay for the milliseconds of CPU time used with AWS Lambda.
Public access to atmospheric data using HSDS and AWS
An early challenge in releasing the WIND toolkit data was in deciding how to subset the data for different use cases. In general, few researchers need access to the entire 0.5 PB of data and a great deal of efficiency and cost reduction can be gained by making directed constituent datasets.
NREL grid integration researchers initially extracted a 2-TB subset by selecting 120,000 points where the wind resource seemed appropriate for development. They also chose only those data important for wind applications (100-m wind speed, converted to power), the most interesting locations for those performing grid studies. To support the remaining users who needed more data resolution, we down-sampled the data to a 60-minute temporal resolution, keeping all the other variables and spatial resolution intact. This reduced dataset is 50 TB of data describing 30+ atmospheric variables of data for 7 years at a 60-minute temporal resolution.
Programmatic access is possible using the h5pyd Python library, a distributed analog to the widely used h5py library. Users interact with the datasets (variables) and slice the data from its (time x longitude x latitude) cube form as they see fit.
Examples and use cases are described in a set of Jupyter notebooks and available on GitHub:
Now you have a Jupyter notebook server running on your EC2 server.
From your laptop, create an SSH tunnel:
$ ssh –L 8888:localhost:8888 (IP address of the EC2 server)
Now, you can browse to localhost:8888 using the correct token, and interact with the notebooks as if they were local. Within the directory, there are examples for accessing the HSDS API and plotting wind and weather data using matplotlib.
Controlling access and defraying costs
A final concern is rate limiting and access control. Although the HSDS service is scalable and relatively robust, we had a few practical concerns:
How can we protect from malicious or accidental use that may lead to high egress fees (for example, someone who attempts to repeatedly download the entire dataset from S3)?
How can we keep track of who is using the data both to document the value of the data resource and to justify the costs?
If costs become too high, can we charge for some or all API use to help cover the costs?
To approach these problems, we investigated using Amazon API Gateway and its simplified integration with the AWS Marketplace for SaaS monetization as well as third-party API proxies.
In the end, we chose to use API Umbrella due to its close involvement with http://data.gov. While AWS Marketplace is a compelling option for future datasets, the decision was made to keep this dataset entirely open, at least for now. As community use and associated costs grow, we’ll likely revisit Marketplace. Meanwhile, API Umbrella provides controls for rate limiting and API key registration out of the box and was simple to implement as a front-end proxy to HSDS. Those applications that may want to charge for API use can accomplish a similar strategy using Amazon API Gateway and AWS Marketplace.
Ongoing work and other resources
As NREL and other government research labs, municipalities, and organizations try to share data with the public, we expect many of you will face similar challenges to those we have tried to approach with the architecture described in this post. Providing large datasets is one challenge. Doing so in a way that is affordable and convenient for users is an entirely more difficult goal. Using AWS cloud-native services and the existing foundation of the HDF file format has allowed us to tackle that challenge in a meaningful way.
Dr. Caleb Phillips is a senior scientist with the Data Analysis and Visualization Group within the Computational Sciences Center at the National Renewable Energy Laboratory. Caleb comes from a background in computer science systems, applied statistics, computational modeling, and optimization. His work at NREL spans the breadth of renewable energy technologies and focuses on applying modern data science techniques to data problems at scale.
Dr. Caroline Draxl is a senior scientist at NREL. She supports the research and modeling activities of the US Department of Energy from mesoscale to wind plant scale. Caroline uses mesoscale models to research wind resources in various countries, and participates in on- and offshore boundary layer research and in the coupling of the mesoscale flow features (kilometer scale) to the microscale (tens of meters). She holds a M.S. degree in Meteorology and Geophysics from the University of Innsbruck, Austria, and a PhD in Meteorology from the Technical University of Denmark.
John Readey has been a Senior Architect at The HDF Group since he joined in June 2014. His interests include web services related to HDF, applications that support the use of HDF and data visualization.Before joining The HDF Group, John worked at Amazon.com from 2006–2014 where he developed service-based systems for eCommerce and AWS.
Jordan Perr-Sauer is an RPP intern with the Data Analysis and Visualization Group within the Computational Sciences Center at the National Renewable Energy Laboratory. Jordan hopes to use his professional background in software engineering and his academic training in applied mathematics to solve the challenging problems facing America and the world.
The collective thoughts of the interwebz
The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.