Tag Archives: bit.ly

SoFi, the underwater robotic fish

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/robotic-fish/

With the Greenland shark finally caught on video for the very first time, scientists and engineers are discussing the limitations of current marine monitoring technology. One significant advance comes from the CSAIL team at Massachusetts Institute of Technology (MIT): SoFi, the robotic fish.

A Robotic Fish Swims in the Ocean

More info: http://bit.ly/SoFiRobot Paper: http://robert.katzschmann.eu/wp-content/uploads/2018/03/katzschmann2018exploration.pdf

The untethered SoFi robot

Last week, the Computer Science and Artificial Intelligence Laboratory (CSAIL) team at MIT unveiled SoFi, “a soft robotic fish that can independently swim alongside real fish in the ocean.”

MIT CSAIL underwater fish SoFi using Raspberry Pi

Directed by a Super Nintendo controller and acoustic signals, SoFi can dive untethered to a maximum of 18 feet for a total of 40 minutes. A Raspberry Pi receives input from the controller and amplifies the ultrasound signals for SoFi via a HiFiBerry. The controller, Raspberry Pi, and HiFiBerry are sealed within a waterproof, cast-moulded silicone membrane filled with non-conductive mineral oil, allowing for underwater equalisation.

MIT CSAIL underwater fish SoFi using Raspberry Pi

The ultrasound signals, received by a modem within SoFi’s head, control everything from direction, tail oscillation, pitch, and depth to the onboard camera.

As explained on MIT’s news blog, “to make the robot swim, the motor pumps water into two balloon-like chambers in the fish’s tail that operate like a set of pistons in an engine. As one chamber expands, it bends and flexes to one side; when the actuators push water to the other channel, that one bends and flexes in the other direction.”

MIT CSAIL underwater fish SoFi using Raspberry Pi

Ocean exploration

While we’ve seen many autonomous underwater vehicles (AUVs) using onboard Raspberry Pis, SoFi’s ability to roam untethered with a wireless waterproof controller is an exciting achievement.

“To our knowledge, this is the first robotic fish that can swim untethered in three dimensions for extended periods of time. We are excited about the possibility of being able to use a system like this to get closer to marine life than humans can get on their own.” – CSAIL PhD candidate Robert Katzschmann

As the MIT news post notes, SoFi’s simple, lightweight setup of a single camera, a motor, and a smartphone lithium polymer battery sets it apart from existing bulky AUVs that require large motors or support from boats.

For more in-depth information on SoFi and the onboard tech that controls it, find the CSAIL team’s paper here.

The post SoFi, the underwater robotic fish appeared first on Raspberry Pi.

How to Create an AMI Builder with AWS CodeBuild and HashiCorp Packer – Part 2

Post Syndicated from Heitor Lessa original https://aws.amazon.com/blogs/devops/how-to-create-an-ami-builder-with-aws-codebuild-and-hashicorp-packer-part-2/

Written by AWS Solutions Architects Jason Barto and Heitor Lessa

 
In Part 1 of this post, we described how AWS CodeBuild, AWS CodeCommit, and HashiCorp Packer can be used to build an Amazon Machine Image (AMI) from the latest version of Amazon Linux. In this post, we show how to use AWS CodePipeline, AWS CloudFormation, and Amazon CloudWatch Events to continuously ship new AMIs. We use Ansible by Red Hat to harden the OS on the AMIs through a well-known set of security controls outlined by the Center for Internet Security in its CIS Amazon Linux Benchmark.

You’ll find the source code for this post in our GitHub repo.

At the end of this post, we will have the following architecture:

Requirements

 
To follow along, you will need Git and a text editor. Make sure Git is configured to work with AWS CodeCommit, as described in Part 1.

Technologies

 
In addition to the services and products used in Part 1 of this post, we also use these AWS services and third-party software:

AWS CloudFormation gives developers and systems administrators an easy way to create and manage a collection of related AWS resources, provisioning and updating them in an orderly and predictable fashion.

Amazon CloudWatch Events enables you to react selectively to events in the cloud and in your applications. Specifically, you can create CloudWatch Events rules that match event patterns, and take actions in response to those patterns.

AWS CodePipeline is a continuous integration and continuous delivery service for fast and reliable application and infrastructure updates. AWS CodePipeline builds, tests, and deploys your code every time there is a code change, based on release process models you define.

Amazon SNS is a fast, flexible, fully managed push notification service that lets you send individual messages or fan out messages to large numbers of recipients. Amazon SNS makes it simple and cost-effective to send push notifications to mobile device users or email recipients. The service can even send messages to other distributed services.

Ansible is a simple IT automation system that handles configuration management, application deployment, cloud provisioning, ad-hoc task-execution, and multinode orchestration.

Getting Started

 
We use CloudFormation to bootstrap the following infrastructure:

  • AWS CodeCommit repository: Git repository where the AMI builder code is stored.
  • S3 bucket: Build artifact repository used by AWS CodePipeline and AWS CodeBuild.
  • AWS CodeBuild project: Executes the AWS CodeBuild instructions contained in the build specification file.
  • AWS CodePipeline pipeline: Orchestrates the AMI build process, triggered by new changes in the AWS CodeCommit repository.
  • SNS topic: Notifies subscribed email addresses when an AMI build is complete.
  • CloudWatch Events rule: Defines how the AMI builder should send a custom event to notify an SNS topic.

An AMI Builder launch template is provided for the following regions:

  • N. Virginia (us-east-1)
  • Ireland (eu-west-1)

After launching the CloudFormation template linked here, we will have a pipeline in the AWS CodePipeline console. (A Failed state at this stage simply means we don’t have any data in our newly created AWS CodeCommit Git repository yet.)

Next, we will clone the newly created AWS CodeCommit repository.

If this is your first time connecting to an AWS CodeCommit repository, please see the instructions in our documentation on Setup steps for HTTPS Connections to AWS CodeCommit Repositories.

To clone the AWS CodeCommit repository (console)

  1. From the AWS Management Console, open the AWS CloudFormation console.
  2. Choose the AMI-Builder-Blogpost stack, and then choose Outputs.
  3. Make a note of the Git repository URL.
  4. Use git to clone the repository.

For example: git clone https://git-codecommit.eu-west-1.amazonaws.com/v1/repos/AMI-Builder_repo

To clone the AWS CodeCommit repository (CLI)

# Retrieve CodeCommit repo URL
git_repo=$(aws cloudformation describe-stacks --query 'Stacks[0].Outputs[?OutputKey==`GitRepository`].OutputValue' --output text --stack-name "AMI-Builder-Blogpost")

# Clone repository locally
git clone ${git_repo}

Bootstrap the Repo with the AMI Builder Structure

 
Now that our infrastructure is ready, download all the files and templates required to build the AMI.

Your local Git repo should have the following structure:

.
├── ami_builder_event.json
├── ansible
├── buildspec.yml
├── cloudformation
├── packer_cis.json

Next, push these changes to AWS CodeCommit, and then let AWS CodePipeline orchestrate the creation of the AMI:

git add .
git commit -m "My first AMI"
git push origin master

AWS CodeBuild Implementation Details

 
While we wait for the AMI to be created, let’s see what’s changed in our AWS CodeBuild buildspec.yml file:

...
phases:
  ...
  build:
    commands:
      ...
      - ./packer build -color=false packer_cis.json | tee build.log
  post_build:
    commands:
      - egrep "${AWS_REGION}\:\sami\-" build.log | cut -d' ' -f2 > ami_id.txt
      # Packer doesn't return non-zero status; we must do that if Packer build failed
      - test -s ami_id.txt || exit 1
      - sed -i.bak "s/<<AMI-ID>>/$(cat ami_id.txt)/g" ami_builder_event.json
      - aws events put-events --entries file://ami_builder_event.json
      ...
artifacts:
  files:
    - ami_builder_event.json
    - build.log
  discard-paths: yes

In the build phase, we capture Packer output into a file named build.log. In the post_build phase, we take the following actions:

  1. Look up the AMI ID created by Packer and save its findings to a temporary file (ami_id.txt).
  2. Force AWS CodeBuild to fail if the AMI ID (ami_id.txt) is not found. This is required because Packer doesn’t return a non-zero exit status when the AMI creation process goes wrong, so we have to tell AWS CodeBuild to stop by signalling that an error occurred.
  3. If an AMI ID is found, we update the ami_builder_event.json file and then notify CloudWatch Events that the AMI creation process is complete.
  4. CloudWatch Events publishes a message to an SNS topic. Anyone subscribed to the topic will be notified in email that an AMI has been created.

Lastly, the new artifacts sequence instructs AWS CodeBuild to upload files produced during the build (ami_builder_event.json and build.log) to the S3 bucket specified in the Outputs section of the CloudFormation template. These artifacts can then be used as an input artifact in any later stage in AWS CodePipeline.

For information about customizing the artifacts sequence of the buildspec.yml, see the Build Specification Reference for AWS CodeBuild.

CloudWatch Events Implementation Details

 
CloudWatch Events allows you to extend the AMI builder so that it not only sends email after the AMI has been created, but can also hook up any of the supported targets to react to the AMI builder event. Publishing this event means you can decouple the actions you take after AMI completion from Packer itself and plug in other actions, as you see fit.

For more information about targets in CloudWatch Events, see the CloudWatch Events API Reference.

In this case, CloudWatch Events should receive the following event, match it with a rule we created through CloudFormation, and publish a message to SNS so that you can receive an email.

Example CloudWatch custom event

[
        {
            "Source": "com.ami.builder",
            "DetailType": "AmiBuilder",
            "Detail": "{ \"AmiStatus\": \"Created\"}",
            "Resources": [ "ami-12cd5guf" ]
        }
]

CloudWatch Events rule

{
  "detail-type": [
    "AmiBuilder"
  ],
  "source": [
    "com.ami.builder"
  ],
  "detail": {
    "AmiStatus": [
      "Created"
    ]
  }
}

Example SNS message sent in email

{
    "version": "0",
    "id": "f8bdede0-b9d7...",
    "detail-type": "AmiBuilder",
    "source": "com.ami.builder",
    "account": "<<aws_account_number>>",
    "time": "2017-04-28T17:56:40Z",
    "region": "eu-west-1",
    "resources": ["ami-112cd5guf "],
    "detail": {
        "AmiStatus": "Created"
    }
}
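
For orientation, a rule with this pattern and an SNS target can be expressed in AWS CloudFormation roughly as in the sketch below. This is not the template from the post; the logical IDs and the topic are placeholders, and the topic also needs a policy that allows events.amazonaws.com to publish to it.

Resources:
  AmiBuilderNotificationTopic:       # placeholder SNS topic
    Type: AWS::SNS::Topic
  AmiBuilderEventRule:               # placeholder rule matching the pattern above
    Type: AWS::Events::Rule
    Properties:
      EventPattern:
        source:
          - com.ami.builder
        detail-type:
          - AmiBuilder
        detail:
          AmiStatus:
            - Created
      Targets:
        - Arn: !Ref AmiBuilderNotificationTopic   # Ref on an SNS topic returns its ARN
          Id: NotifyEmailSubscribers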

Packer Implementation Details

 
In addition to the changes in the build specification file, there are differences between the current version of the HashiCorp Packer template (packer_cis.json) and the one used in Part 1.

Variables

  "variables": {
    "vpc": "{{env `BUILD_VPC_ID`}}",
    "subnet": "{{env `BUILD_SUBNET_ID`}}",
         “ami_name”: “Prod-CIS-Latest-AMZN-{{isotime \”02-Jan-06 03_04_05\”}}”
  },
  • ami_name: Prefixes a name used by Packer to tag resources during the Builders sequence.
  • vpc and subnet: Environment variables defined by the CloudFormation stack parameters.

We no longer assume a default VPC is present and instead use the VPC and subnet specified in the CloudFormation parameters. CloudFormation configures the AWS CodeBuild project to use these values as environment variables. They are made available throughout the build process.

That allows for more flexibility should you need to change which VPC and subnet will be used by Packer to launch temporary resources.
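
In the CloudFormation template, this wiring amounts to environment variables on the AWS::CodeBuild::Project resource, along the lines of the sketch below (logical ID and parameter names are assumptions, not the ones from the post’s template):

  AmiBuilderCodeBuildProject:
    Type: AWS::CodeBuild::Project
    Properties:
      ...
      Environment:
        ...
        EnvironmentVariables:
          - Name: BUILD_VPC_ID
            Value: !Ref VpcId          # assumed stack parameter
          - Name: BUILD_SUBNET_ID
            Value: !Ref SubnetId       # assumed stack parameter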

Builders

  "builders": [{
    ...
    "ami_name": “{{user `ami_name`| clean_ami_name}}”,
    "tags": {
      "Name": “{{user `ami_name`}}”,
    },
    "run_tags": {
      "Name": “{{user `ami_name`}}",
    },
    "run_volume_tags": {
      "Name": “{{user `ami_name`}}",
    },
    "snapshot_tags": {
      "Name": “{{user `ami_name`}}",
    },
    ...
    "vpc_id": "{{user `vpc` }}",
    "subnet_id": "{{user `subnet` }}"
  }],

We now have new tag properties (*_tags), a new function (clean_ami_name), and we launch temporary resources in the VPC and subnet specified by the environment variables. AMI names can only contain a certain set of ASCII characters, so if the supplied name deviates from that set (for example, it contains whitespace or slashes), Packer’s clean_ami_name function cleans it up.

For more information, see functions on the HashiCorp Packer website.

Provisioners

  "provisioners": [
    {
        "type": "shell",
        "inline": [
            "sudo pip install ansible"
        ]
    }, 
    {
        "type": "ansible-local",
        "playbook_file": "ansible/playbook.yaml",
        "role_paths": [
            "ansible/roles/common"
        ],
        "playbook_dir": "ansible",
        "galaxy_file": "ansible/requirements.yaml"
    },
    {
      "type": "shell",
      "inline": [
        "rm .ssh/authorized_keys ; sudo rm /root/.ssh/authorized_keys"
      ]
    }
  ]

We used the shell provisioner to apply OS patches in Part 1. Now, we use shell to install Ansible on the target machine, and ansible-local to import, install, and execute Ansible roles that make our target machine conform to our standards.

Lastly, Packer uses shell to remove the temporary SSH keys before it creates an AMI from the target (temporary) EC2 instance.

Ansible Implementation Details

 
Ansible provides OS patching through a custom Common role that can be easily customized for other tasks.

The CIS Benchmark and CloudWatch Logs are implemented through two third-party Ansible roles that are defined in ansible/requirements.yaml, as seen in the Packer template.

The Ansible provisioner uses Ansible Galaxy to download these roles onto the target machine and execute them as instructed by ansible/playbook.yaml.

For information about how these components are organized, see the Playbook Roles and Include Statements in the Ansible documentation.
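
For reference, a Galaxy requirements file for these two roles looks something like the sketch below (role names taken from the playbook that follows; version pins omitted):

---
# Sketch of ansible/requirements.yaml: third-party roles installed from Ansible Galaxy
- src: anthcourtney.cis-amazon-linux
- src: dharrisio.aws-cloudwatch-logs-agent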

The following Ansible playbook (ansible/playbook.yaml) controls the execution order and custom properties:

---
- hosts: localhost
  connection: local
  gather_facts: true    # gather OS info that is made available for tasks/roles
  become: yes           # majority of CIS tasks require root
  vars:
    # CIS Controls whitepaper:  http://bit.ly/2mGAmUc
    # AWS CIS Whitepaper:       http://bit.ly/2m2Ovrh
    cis_level_1_exclusions:
    # 3.4.2 and 3.4.3 effectively blocks access to all ports to the machine
    ## This can break automation; ignoring it as there are stronger mechanisms than that
      - 3.4.2 
      - 3.4.3
    # CloudWatch Logs will be used instead of Rsyslog/Syslog-ng
    ## Same would be true if any other software doesn't support Rsyslog/Syslog-ng mechanisms
      - 4.2.1.4
      - 4.2.2.4
      - 4.2.2.5
    # Autofs is not installed in newer versions, let's ignore
      - 1.1.19
    # Cloudwatch Logs role configuration
    logs:
      - file: /var/log/messages
        group_name: "system_logs"
  roles:
    - common
    - anthcourtney.cis-amazon-linux
    - dharrisio.aws-cloudwatch-logs-agent

Both third-party Ansible roles can be easily configured through variables (vars). We use Ansible playbook variables to exclude CIS controls that don’t apply to our case and to instruct the CloudWatch Logs agent to stream the /var/log/messages log file to CloudWatch Logs.

If you need to add more OS or application logs, you can easily duplicate the playbook and make changes. The CloudWatch Logs agent will ship configured log messages to CloudWatch Logs.
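
For example, shipping one more log file is just another entry under the logs variable; the additional file and log group below are illustrative, not part of the original playbook:

    logs:
      - file: /var/log/messages
        group_name: "system_logs"
      - file: /var/log/secure
        group_name: "security_logs"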

For more information about parameters you can use to further customize third-party roles, download Ansible roles for the Cloudwatch Logs Agent and CIS Amazon Linux from the Galaxy website.

Committing Changes

 
Now that Ansible and CloudWatch Events are configured as part of the build process, committing any changes to the AWS CodeCommit Git repository will trigger a new AMI build process, which can be followed in the AWS CodePipeline console.

When the build is complete, an email will be sent to the email address you provided as a part of the CloudFormation stack deployment. The email serves as notification that an AMI has been built and is ready for use.

Summary

 
We used AWS CodeCommit, AWS CodePipeline, AWS CodeBuild, Packer, and Ansible to build a pipeline that continuously builds new, hardened CIS AMIs. We used Amazon SNS so that email addresses subscribed to an SNS topic are notified upon completion of the AMI build.

By treating our AMI creation process as code, we can iterate and track changes over time. In this way, it’s no different from a software development workflow. With that in mind, software patches, OS configuration, and logs that need to be shipped to a central location are only a git commit away.

Next Steps

 
Here are some ideas to extend this AMI builder:

  • Hook up a Lambda function in CloudWatch Events to update EC2 Auto Scaling configuration upon completion of the AMI build.
  • Use AWS CodePipeline parallel steps to build multiple Packer images.
  • Add a commit ID as a tag for the AMI you created.
  • Create a scheduled Lambda function through CloudWatch Events to clean up old AMIs based on timestamp (name or additional tag).
  • Implement Windows support for the AMI builder.
  • Create a cross-account or cross-region AMI build.

CloudWatch Events allows the AMI builder to decouple AMI configuration and creation, so that you can easily add your own logic using targets (AWS Lambda, Amazon SQS, Amazon SNS) to add events or recycle EC2 instances with the new AMI.

If you have questions or other feedback, feel free to leave it in the comments or contribute to the AMI Builder repo on GitHub.

Making sweet, sweet music with PiSound

Post Syndicated from Jonic original https://www.raspberrypi.org/blog/making-sweet-sweet-music-pisound/

I’d say I am a passable guitarist. Ever since I learnt about the existence of the Raspberry Pi in 2012, I’ve wondered how I could use one as a guitar effects unit. Unfortunately, I’m also quite lazy and have therefore done precisely nothing to make one. Now, though, I no longer have to beat myself up about this. Thanks to the PiSound board from Blokas, musicians can connect all manner of audio gear to their Raspberry Pi, bringing their projects to a whole new level. Essentially, it transforms your Pi into a complete audio workstation! What musician wouldn’t want a piece of that?

PiSound: a soundcard HAT for the Raspberry Pi

Raspberry Pi with PiSound attached

The PiSound in situ: do those dials go all the way to eleven?

PiSound is a HAT for the Raspberry Pi 3 which acts as a souped-up sound card. It allows you to send and receive audio signals from its jacks, and send MIDI input/output signals to compatible devices. It features two 6mm in/out jacks, two standard DIN-5 MIDI in/out sockets, potentiometers for volume and gain, and ‘The Button’ (with emphatic capitals) for activating audio manipulation patches. Following an incredibly successful Indiegogo campaign, the PiSound team is preparing the board for sale later in the year.

Setting the board up was simple, thanks to the excellent documentation on the PiSound site. First, I mounted the board on my Raspberry Pi’s GPIO pins and secured it with the supplied screws. Next, I ran one script in a terminal window on a fresh installation of Raspbian, which downloaded, installed, and set up all the software I needed to get going. All I had to do after that was connect my instruments and get to work creating patches for Pure Data, a popular visual programming interface for manipulating media streams.

PiSound with instruments and computer

Image from Blokas

Get creative with PiSound!

During my testing, I created some simple fuzz, delay, and tremolo guitar effects. The possibilities, though, are as broad as your imagination. I’ve come up with some ideas to inspire you:

  • You could create a web interface for the guitar effects, accessible over a local network on a smartphone or tablet.
  • How about controlling an interactive light show or projected visualisation on stage using the audio characteristics of the guitar signal?
  • Channel your inner Matt Bellamy and rig up some MIDI hardware on your guitar to trigger loops and samples while you play.
  • Use a tilt switch to increase the intensity of an effect when the angle of the guitar’s neck is changed (imagine you’re really going for it during a solo).
  • You could even use the audio input stream as a base for generating other non-audio results.

pisound – Audio & MIDI Interface for your Raspberry Pi

Indiegogo Campaign: https://igg.me/at/pisound More Info: http://www.blokas.io Sounds by Sarukas: http://bit.ly/2myN8lf

Now I have had a taste of what this incredible little board can do, I’m very excited to see what new things it will enable me to do as a performer. It’s compact and practical, too: as the entire thing is about the size of a standard guitar pedal, I could embed it into one of my guitars if I wanted to. Alternatively, I could get creative and design a custom enclosure for it.

Using Sonic Pi with PiSound

Community favourite Sonic Pi will also support the board very soon, as Sam Aaron and Ben Smith ably demonstrated at our fifth birthday party celebrations. This means you don’t even need to be able to play an instrument to make something awesome with this clever little HAT.

The Future of @Sonic_Pi with Sam Aaron & Ben Smith at #PiParty

Uploaded by Alan O’Donohoe on 2017-03-05.

I’m incredibly impressed with the hardware and the support on the PiSound website. It’s going to be my go-to HAT for advanced audio projects, and, when it finally launches later this year, I’ll have all the motivation I need to create the guitar effects unit I’ve always wanted.

Find out more about PiSound over at the Blokas website, and take a deeper look at the tech specs and other information over at the PiSound documentation site.

Disclaimer: I am personally a backer of the Indiegogo campaign, and Blokas very kindly supplied a beta board for this review.

The post Making sweet, sweet music with PiSound appeared first on Raspberry Pi.

How to Pi: Halloween Edition 2016

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/how-to-pi-halloween-edition-2016/

Happy Halloween, one and all. Whether you’ve planned a night of trick-or-treating, watching scary movies, or hiding from costumed children with the lights off, our How to Pi guide should get you ready for the evening’s festivities. Enjoy!

Costumes

This is definitely a Pi Towers favourite. The Disco Ball costume by Wolfie uses a drone battery and Raspberry Pi to create, well, a child-sized human disco ball. The video links on the project page seem to be down; however, all the ingredients needed for the project are listed at Thingiverse, and a walkthrough of the wiring can be seen here. Below, you’ll see the full effect of the costume, and I’m sure we can all agree that we need one here in the office.

Halloween 2016 Disco Ball

Some aerial shots of Serena’s halloween costume we made. It contains 288 full color LEDs, a dual battery system for power, and a Raspberry Pi B2 running the sequence that was created in xLights.

If you feel ‘too cool’ to fit inside a giant disco ball, how about fitting inside a computer… sort of? The Jacket houses a Raspberry Pi with a monitor in the sleeve because, well, why not?

‘The Jacket’ 2.0 My Cyberp…

‘The Jacket’ 2.0 My Cyberpunk inspired jacket was completed just in time for a Halloween party last night. This year’s upgrades added to the EL tape and 5″ LCD, with spikes, a pi zero and an action cam (look for the missing chest spike).

 

Dealing with Trick-or-treaters

Trick or Trivia, the trivia-based Halloween candy dispenser from YouTube maker TheMakersWorkbench, dispenses candy based on correct answers to spooky themed questions. For example, Casper is a friendly what? Select ‘Ghost’ on the touchscreen and receive three pieces of candy. Select an incorrect answer and receive only one.

It’s one of the best ways to give out candy to trick-or-treaters, without having to answer the door or put in any effort whatsoever.

Trick Or Trivia Trivia-Based Halloween Candy Dispenser Servo Demo

This video is a companion video to a project series I am posting on Element14.com. The video demonstrates the candy dispensing system for the Trick or Trivia candy dispenser project. You can find the post that this video accompanies at the following link: http://bit.ly/TrickorTrivia If you like this video, please consider becoming out patron on Patreon.

Or just stop them knocking in the first place with this…

Raspberry Pi Motion Sensor Halloween Trick

A Raspberry Pi running Ubuntu Mate connected to an old laptop screen. I have a motion sensor hidden in the letterbox. When you approach the door it detects you. Next the pi sends a signal to a Wi Fi enabled WeMo switch to turn on the screen.

Scary pranks

When it comes to using a Raspberry Pi to prank people, the team at Circuit-Help have definitely come up with the goods. By using a setup similar to the magic mirror project, they fitted an ultrasonic sensor to display a zombie video within the mirror whenever an unsuspecting soul approaches. Next year’s The Walking Dead-themed Halloween party is sorted!

Haunted Halloween Mirror

This Raspberry Pi Halloween Mirror is perfect for both parties and pranks! http://www.circuit-help.com.ph/haunted-halloween-mirror/

If the zombie mirror isn’t enough, how about some animated portraits for your wall? Here’s Pi Borg’s Moving Eye Halloween portrait. Full instructions here.

Spooky Raspberry Pi controlled Halloween picture

Check out our quick Halloween Project, make your own Raspberry Pi powered spooky portrait! http://www.instructables.com/id/Halloween-painting-with-moving-eyes/

Pumpkins

We’ve seen a flurry of Raspberry Pi pumpkins this year. From light shows to motion-activated noise makers, it’s the year of the pimped-up pumpkin. Here’s Oliver with his entry into the automated pumpkin patch, offering up a motion-activated pumpkin jam-packed with LEDs.

Raspberry Pi Motion Sensor Light Up Pumpkin

Using a Raspberry Pi with a PIR motion sensor and a bunch of NeoPixels to make a scary Halloween Pumpkin

Or get super-fancy and use a couple of Pimoroni Unicorn HATs to create animated pumpkin eyes. Instructions here.

Raspberry Pi Pumpkin LED Matrix Eyes

Inspired by the many Halloween electronics projects we saw last year, we tried our own this year. Source code is on github https://github.com/mirkin/pi-word-clock

Ignore the world and get coding

If you’re one of the many who would rather ignore Halloween, close the curtains, and pretend not to be home, here are some fun, spooky projects to work on this evening. Yes, they’re still Halloween-themed… but c’mon, they’ll be fun regardless!

Halloween Music Light Project – Follow the tutorial at Linux.com to create this awesome and effective musical light show. You can replace the tune for a less Halloweeny experience.

Halloween Music-Light project created using Raspberry Pi and Lightshow project.

Uploaded by Swapnil Bhartiya on 2016-10-12.

Spooky Spot the Difference – Let the Raspberry Pi Foundation team guide you through this fun prank, and use the skills you learn to replace the images for other events and holidays.


Whatever you get up to with a Raspberry Pi this Halloween, make sure to tag us across social media on Facebook, Twitter, Instagram, G+, and Vine. You can also check out our Spooky Pi board on Pinterest.

The post How to Pi: Halloween Edition 2016 appeared first on Raspberry Pi.

Five(ish) awesome RetroPie builds

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/fiveish-awesome-retropie-builds/

If you’ve yet to hear about RetroPie, how’s it going living under that rock?

RetroPie, for the few who are unfamiliar, allows users to play retro video games on their Raspberry Pi or PC. From Alex Kidd to Ecco the Dolphin, Streets of Rage 2 to Cool Spot, nostalgia junkies can get their fill by flashing the RetroPie image to their Pi and plugging in their TV and a couple of USB controllers.

But for many, this simple setup is not enough. Alongside the RetroPie unit, many makers are building incredible cases and modifications to make their creation stand out from the rest.

Here are five of what I believe to be the best RetroPie builds shared on social media:

1. Furniture Builds

If you don’t have the space for an arcade machine, why not incorporate RetroPie into your coffee table or desk?

This ‘Mid-century-ish Retro Games Table’ by Reddit user GuzziGuy fits a screen and custom-made controllers beneath a folding surface, allowing full use of the table when you’re not busy Space Raiding or Mario Karting.

GuzziGuy RetroPie Table

2. Arcade Cabinets

While the arcade cabinet at Pi Towers has seen better days (we have #LukeTheIntern working on it as I type), many of you makers are putting us to shame with your own builds. Whether it be a tabletop version or full 7ft cabinet, more and more RetroPie arcades are popping up, their builders desperate to replicate the sights of our gaming pasts.

One maker, YouTuber Bob Clagett, built his own RetroPie Arcade Cabinet from scratch, documenting the entire process on his channel.

With sensors that start the machine upon your approach, LED backlighting, and cartoon vinyl artwork of his family, it’s easy to see why this is a firm favourite.

Arcade Cabinet build – Part 3 // How-To

Check out how I made this fully custom arcade cabinet, powered by a Raspberry Pi, to play retro games! Subscribe to my channel: http://bit.ly/1k8msFr Get digital plans for this cabinet to build your own!

3. Handheld Gaming

If you’re looking for a more personal gaming experience, or if you simply want to see just how small you can make your build, you can’t go wrong with a handheld gaming console. With the release of the Raspberry Pi Zero, the ability to fit an entire RetroPie setup within the smallest of spaces has become somewhat of a social media maker challenge.

Chase Lambeth used an old Burger King toy and Pi Zero to create one of the smallest RetroPie Gameboys around… and it broke the internet in the process.

Mini Gameboy Chase Lambeth

4. Console Recycling

What better way to play a retro game than via a retro game console? And while I don’t condone pulling apart a working NES or MegaDrive, there’s no harm in cannibalising a deceased unit for the greater good, or using one of many 3D-printable designs to recreate a classic.

Here’s YouTuber DaftMike‘s entry into the RetroPie Hall of Fame: a mini-NES with NFC-enabled cartridges that autoplay when inserted.

Raspberry Pi Mini NES Classic Console

This is a demo of my Raspberry Pi ‘NES Classic’ build. You can see photos, more details and code here: http://www.daftmike.com/2016/07/NESPi.html Update video: https://youtu.be/M0hWhv1lw48 Update #2: https://youtu.be/hhYf5DPzLqg Electronics kits are now available for pre-order, details here: http://www.daftmike.com/p/nespi-electronics-kit.html Build Guide Update: https://youtu.be/8rFBWdRpufo Build Guide Part 1: https://youtu.be/8feZYk9HmYg Build Guide Part 2: https://youtu.be/vOz1-6GqTZc New case design files: http://www.thingiverse.com/thing:1727668 Better Snap Fit Cases!

5. Everything Else

I can’t create a list of RetroPie builds without mentioning the unusual creations that appear on our social media feeds from time to time. And while you may consider putting more than one example in #5 cheating, I say… well, I say pfft.

Example 1 – Sean (from SimpleCove)’s Retro Arcade

It felt wrong to include this within Arcade Cabinets as it’s not really a cabinet. Creating the entire thing from scratch using monitors, wood, and a lot of veneer, the end result could easily have travelled here from the 1940s.

Retro Arcade Cabinet Using A Raspberry Pi & RetroPie

I’ve wanted one of these raspberry pi/retro pi arcade systems for a while but wanted to make a special box to put it in that looked like an antique table top TV/radio. I feel the outcome of this project is exactly that.

Example 2 – the HackerHouse Portable Console… built-in controller… thing

The team at HackerHouse, along with many other makers, decided to incorporate the entire RetroPie build into the controller, allowing you to easily take your gaming system with you without the need for a separate console unit. Following on from the theme of their YouTube channel, they offer a complete tutorial on how to make the controller.

Make a Raspberry Pi Portable Arcade Console (with Retropie)

Find out how to make an easy portable arcade console (cabinet) using a Raspberry Pi. You can bring it anywhere, plug it into any tv, and play all your favorite classic ROMs. This arcade has 4 general buttons and a joystick, but you can also plug in any old usb enabled controller.

Example 3 – Zach’s PiCart

RetroPie inside a NES game cartridge… need I say more?

Pi Cart: a Raspberry Pi Retro Gaming Rig in an NES Cartridge

I put a Raspberry Pi Zero (and 2,400 vintage games) into an NES cartridge and it’s awesome. Powered by RetroPie. I also wrote a step-by-step guide on howchoo and a list of all the materials you’ll need to build your own: https://howchoo.com/g/mti0oge5nzk/pi-cart-a-raspberry-pi-retro-gaming-rig-in-an-nes-cartridge

Here’s a video to help you set up your own RetroPie. What games would you play first? And what other builds have caught your attention online?

The post Five(ish) awesome RetroPie builds appeared first on Raspberry Pi.

Inspired Raspberry Pi Projects at Maker Faire

Post Syndicated from Courtney Lentz original https://www.raspberrypi.org/blog/inspired-raspberry-pi-projects-mfba/

We get to read about and see an abundance of project builds through online channels, but we especially love when we get the opportunity to meet the makers themselves as they share their projects first-hand. That’s why an event like Maker Faire continues to be so successful. It provides a platform and a dedicated space, if only for a weekend, for makers and tinkerers alike to come together and share with other enthusiasts.

Raspberry Pi on Twitter

The team is up and at ’em at @makerfaire! Come say hello, try a Raspberry Pi 3, and grab a sticker. #MakerFaire pic.twitter.com/mjYOiPBKGy

If you didn’t make it to this year’s Bay Area Maker Faire to see the thousands of maker projects, here is a roundup of our favorite Raspberry Pi projects from the weekend.

Flaschen Taschen is a massive video display made out of beer bottles, milk crates, and RGB LED strings. The display is reminiscent of a Lite-Brite (remember those?) only this one is taller than you and a tad more sophisticated. Each bottle is capped with a single addressable RGB LED. The bottoms of the bottles act as lenses for the emitted light. The colors resemble those of a thermal camera, and they move like amoebas under a microscope.

Raspberry Pi on Twitter

This beer bottle video display, #flaschentaschen, is driven by Raspberry Pi and can run up to 160fps! @noisebridge pic.twitter.com/iYrHGhiwDk

The sheer size of the Flaschen Taschen is what initially caught our eye. After we learned the details of its construction we were even more intrigued. The entire display is driven by a Raspberry Pi and some custom circuitry.


The art installation is a great example of upcycling, using everyday items to create something beautiful and thoughtful. The project name is a nod to c-base’s Mate-Light project. Check out their GitHub repository for more details on the design and project documentation, and enjoy this video of the setup from Hackaday.

Video Wall Made with 1575 Corona Beer Bottles and Determination

The members of Noisebridge Hackerspace in San Francisco went all out this year, building a 1,575 pixel display for their booth at Bay Area Maker Faire. The pixels are Corona Beer bottles, 25 to a crate stacked 9 crates wide and 7 crates tall.

While MCM’s enlarged Raspberry Pi may have looked like a prop from the 1989 movie Honey, I Shrunk the Kids, it was also fully functional.

Raspberry Pi on Twitter

Right now, @cortlentz1 is getting our @makerfaire stand ready in San Mateo. She spotted this: IT’S FULLY FUNCTIONAL. pic.twitter.com/8WBYAp0ynW

The Raspberry Pi Infinity+ is ten times the size of a Raspberry Pi Model B. It was made by our friends Michael Castor and Christian Moist over at MCM Electronics, an official distributor of Raspberry Pi.

It’s hard to say what was more captivating: the GPIO header, the USB ports big as one’s head, or the precise detailing of the board’s components illustrated from high-res photos into Adobe Illustrator.

Infinity+ build
But, because they are true maker pros, Michael and Christian were sure to document the complete build process. You can find the detailed BOM and design notes on each of their personal blogs.


Not all makers stand behind a table or in a booth at the faire. Many take to the fairgrounds with projects in hand. You’ll often see the natural congregation of people around makers carrying their projects, who are happy to share the story of their build process again and again as they themselves make their way around the faire.

Maker Faire on Twitter

Maker Faire = magic! Relive the weekend through photos: http://bit.ly/20jhZ1K #MFBA16 pic.twitter.com/yx4FPFuxq2

This was just how we met Jonathan, a young maker, and his father. Jonathan—proudly gripping his homemade Game Boy—stopped by the Raspberry Pi booth, and we are sure happy he did. The Game Boy replicated the classic handheld version but swapped out the matte plastic grey case for a handmade wood enclosure, and Jonathan gave it his own personal touch by adding customised operation buttons.

Raspberry Pi on Twitter

Here’s a wooden Game Boy made by Jonathan and @shuman_projects. Naturally there’s a Raspberry Pi inside! #MakerFaire pic.twitter.com/wAtlnmgtKb

Though the attention to detail and design were impressive, the best part of this project was that it transformed a typically siloed activity on a personal device, turning it into a participatory build for a father and son. That is precisely the sort of making that we love to see happening around the Raspberry Pi.

Thank you to everyone who came to visit us at Maker Faire Bay Area. For those of you who missed out, come say hello to us at a future event. You’ll find members of the Raspberry Pi team at these upcoming events:

The post Inspired Raspberry Pi Projects at Maker Faire appeared first on Raspberry Pi.

Security Risks of Shortened URLs

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2016/04/security_risks_11.html

Shortened URLs, produced by services like bit.ly and goo.gl, can be brute-forced. And searching random shortened URLs yields all sorts of secret documents. Plus, many of them can be edited, and can be infected with malware.

Academic paper. Blog post with lots of detail.

Gone In Six Characters: Short URLs Considered Harmful for Cloud Services (Freedom to Tinker)

Post Syndicated from jake original http://lwn.net/Articles/683880/rss

Over at the Freedom to Tinker blog, guest poster Vitaly Shmatikov, who is a professor at Cornell Tech, writes about his study [PDF] of what URL shortening means for the security and privacy of cloud services.
TL;DR: short URLs produced by bit.ly, goo.gl, and similar services are so short that they can be scanned by brute force. Our scan discovered a large number of Microsoft OneDrive accounts with private documents. Many of these accounts are unlocked and allow anyone to inject malware that will be automatically downloaded to users’ devices. We also discovered many driving directions that reveal sensitive information for identifiable individuals, including their visits to specialized medical facilities, prisons, and adult establishments.

Scaling Writes on Amazon DynamoDB Tables with Global Secondary Indexes

Post Syndicated from Ian Meyers original https://blogs.aws.amazon.com/bigdata/post/Tx3KPZDXIBJEQ4B/Scaling-Writes-on-Amazon-DynamoDB-Tables-with-Global-Secondary-Indexes

Ian Meyers is a Principal Solutions Architect with AWS

Amazon DynamoDB is a fast, flexible, and fully managed NoSQL database service that supports both document and key-value store models that need consistent, single-digit millisecond latency at any scale. In this post, we discuss a technique that can be used with DynamoDB to ensure virtually unlimited scaling when using secondary indexes. We focus on a data structure that provides data as a time series, commonly used for real-time analytics.

Time series tables

A time series table stores a broad range of values like any table, but organizes and sequences the data by a unit of time. This allows us to compare values that occurred in one time period to the values that occurred in another, given a common set of criteria. You would use a time series to answer questions such as ‘what is the revenue this quarter vs. the same quarter last year?’ or ‘what is the busiest hour over the course of a month?’

The time series that we reference today is a common requirement for many customers: ad impressions and click events over time, to facilitate the analysis of referring websites. For this solution, we want to be able to track the referring site URL or ID (referrer), when the event occurred, and how many clicks or impressions occurred over a period.

In DynamoDB, tables are accessed through a variety of mechanisms. Each table contains multiple items, where each item is composed of multiple attributes. For every table, you specify a primary key, which consists of either a hash or “hash and range” attribute.

Primary keys for time series tables

To create our time series table, we create a table with a hash and range primary key, which allows us to look up an item using two discrete values. The hash key for our table is the referrer. The range key is the date and time from within the impression or click event, stored at some downsampled time granularity such as ‘per-minute’ or ‘per-hour’. We downsample for simplicity of looking up specific time values for a referrer, but also because this significantly reduces the size of the hash and range key.

To perform this downsampling, we take a given event time and reduce it to the time period in which it occurred. For example, downsampling the value ‘2015-06-15 10:31:32’ to ‘per-hour’ would result in a value of ‘2015-06-15 10:00:00’, while ‘per-minute’ would give us ‘2015-06-15 10:31:00’.

This data structure allows us to receive an event, and then very efficiently update the aggregation of that event for the time period in which it occurred by the event’s referrer and the downsampled timestamp.
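
As a rough illustration (not code from the post), downsampling a ‘YYYY-MM-DD HH:mm:ss’ event time to the start of its hour or minute can be done by truncating the string:

// Sketch: reduce an event timestamp to its downsampled bucket
var downsample = function(eventTime, granularity) {
    if (granularity === 'per-hour') {
        return eventTime.slice(0, 13) + ':00:00';
    }
    // default: per-minute
    return eventTime.slice(0, 16) + ':00';
};

console.log(downsample('2015-06-15 10:31:32', 'per-hour'));   // '2015-06-15 10:00:00'
console.log(downsample('2015-06-15 10:31:32', 'per-minute')); // '2015-06-15 10:31:00'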

Local secondary indexes

In the same way that we can use the hash/range primary key to access a table, we can create multiple local secondary indexes that use the table’s existing hash key plus another attribute for access. When we create the index, we can select other non-indexed attributes to include, or project, into the index. These indexes are useful when we know the hash key, but want to be able to access data on the basis of multiple different attributes with extremely high performance.

In our example, we might want to find the referrers who created events for a specific page. A local secondary index on the referrer plus the URL would allow us to find those referrers who specifically accessed one page vs. another, out of all of the URLs accessed.

Global secondary indexes

Global secondary indexes give us the ability to create entirely new hash or hash/range indexes for a table from other attributes. As with local secondary indexes, we can choose which table attributes to project into an index. These indexes are useful when we want to access by different attributes than the hash/range key but with the same performance.

In our example, we definitely want to be able to see the events that occurred in a date range; for example, ‘show me new records since time N’. In our time series table, we can create a new global secondary index just on eventTime, which allows us to scan the table efficiently with DynamoDB query expressions such as ‘eventTime >= :querytime’.
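
To make the table shape concrete, a table definition with such an index might look roughly like the sketch below; the table name, index name, and capacity figures are illustrative assumptions rather than values from the post:

// Sketch: time series table keyed on referrer/eventTime, with a GSI on eventTime
var aws = require('aws-sdk');
var dynamoDB = new aws.DynamoDB({ region : process.env['AWS_REGION'] });

dynamoDB.createTable({
    TableName : 'myDynamoTable',
    AttributeDefinitions : [
        { AttributeName : 'referrer',  AttributeType : 'S' },
        { AttributeName : 'eventTime', AttributeType : 'S' }
    ],
    KeySchema : [
        { AttributeName : 'referrer',  KeyType : 'HASH' },
        { AttributeName : 'eventTime', KeyType : 'RANGE' }
    ],
    GlobalSecondaryIndexes : [ {
        IndexName : 'myGsiOnEventTime',
        KeySchema : [ { AttributeName : 'eventTime', KeyType : 'HASH' } ],
        Projection : { ProjectionType : 'KEYS_ONLY' },
        ProvisionedThroughput : { ReadCapacityUnits : 50, WriteCapacityUnits : 550 }
    } ],
    ProvisionedThroughput : { ReadCapacityUnits : 50, WriteCapacityUnits : 550 }
}, function(err, data) {
    if (err) console.log(JSON.stringify(err));
});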

Processing time series data with elastic throughput

To populate this time series table, we must receive the impression and click events, and then downsample and aggregate the data in our time series table. A powerful way to do this, without managing any servers, is to use Amazon Kinesis to stream events and AWS Lambda to process them, as shown in the following graphic.

When we implement a Lambda function for this architecture, we see that we get multiple Amazon Kinesis records in a single function invocation, for a narrow time period as defined by the producer of the data. In this type of application, we want to buffer events in memory up to some threshold before writing to DynamoDB, so that we limit the I/O use on the table.

The required write rate on our table is defined by the number of unique referrers we are provided with when processing data, multiplied by the duration of ‘downsampled time’ for this set of events. For example, if we have an event stream of 2000 events per second and 200 unique referrers, and we receive 5000 events into our Lambda function invocation, then we would expect to require:

((EventsReceived / EventsPerSecond) * UniqueReferrerCount) / NumberOfSecondsDownsampled

((5000 / 2000 = 2.5) * 200 = 500) / 1 = 500 writes/second

We can set provisioned write IOPS on our DynamoDB table to 550 (to give ourselves a bit of headroom), and then scale up and down over time as the event rate changes. However, we don’t just require this write IOPS for the table, but also for the global secondary index on eventTime.

The accidental bottleneck

DynamoDB provisioned throughput is set on a table or index as a whole, and is divided up among the many partitions that store the data. The partition is selected on the basis of the hash key, and DynamoDB offers at most approximately 1,000 writes/second to a single hash key value on a partition.

This won’t be a problem for our main table, because the 500 writes/second are spread over the 200 unique referrer values that make up the hash/range key for a specific downsampled time value. However, our global secondary index only has a single hash key value, eventTime, and so we write to only one partition. This will not achieve the required 500 writes/second on a given eventTime value. With the global secondary index as defined, we observe write throttling and timeouts, and because global secondary indexes are written asynchronously, we also observe an increased latency between the write to the table and the update of the secondary index.

No matter how high we provision the write IOPS on the index, we will always see throttling because we are focusing our writes to a single key (eventTime), and thus a single partition.

Addressing write bottlenecks with scattering

This problem can be solved by introducing a write pattern called scattering. Later, we review how we can re-gather this data at query time or in an asynchronous process to give us a simple model for reads.

We avoided a write bottleneck on our table because the writes are distributed across the 200 unique referrers to our site. We can use this same principle to remove the bottleneck on our global secondary index. Instead of creating the index on eventTime alone, we can convert it to a hash/range index on a new attribute ‘scatteredValue’ plus ‘eventTime’. scatteredValue is a synthetic column into which we write a random number between 0 and 99 every time we create or update a record.

This leading random value in the index enables DynamoDB to spread the writes over a larger number of partitions. We’re writing 100 unique values, so DynamoDB can scale to 100 partitions. This means that we can achieve 1000 * cardinality of scatter value (100) = 100,000 writes/second for a single eventTime value. This is much better!

If we needed even more writes/second, then we could increase to 1000 unique scatteredValue entries (0-999). Our table writes now execute without any throttling, and data in our global secondary index does not lag significantly behind the table.
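
A write following the scatter pattern might look roughly like the sketch below; only the attribute names (referrer, eventTime, scatteredValue, eventCount) come from the post, while the table name and values are illustrative:

// Sketch: aggregate one downsampled event and refresh scatteredValue with a
// random 0-99 value so the scatteredValue/eventTime GSI spreads writes across
// many partitions
var aws = require('aws-sdk');
var dynamoDB = new aws.DynamoDB({ region : process.env['AWS_REGION'] });

var scatteredValue = Math.floor(Math.random() * 100);

dynamoDB.updateItem({
    TableName : 'myDynamoTable',
    Key : {
        referrer  : { S : 'example-referrer.com' },
        eventTime : { S : '2015-06-15 10:00:00' }    // downsampled timestamp
    },
    UpdateExpression : 'SET scatteredValue = :sv ADD eventCount :c',
    ExpressionAttributeValues : {
        ':sv' : { N : '' + scatteredValue },
        ':c'  : { N : '5' }                          // events buffered for this bucket
    }
}, function(err, data) {
    if (err) console.log(JSON.stringify(err));
});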

Gathering scattered records together

We originally created the global secondary index on eventTime so that we could ask a query such as ‘what aggregate events happened after time N?’ Now that the index is on scatteredValue/eventTime, we can’t just query the eventTime attribute. However, we can take advantage of the DynamoDB parallel scan feature to easily gather all the relevant records together.

To do this, we create multiple worker threads in our application, each of which scans several scatteredValue entries and applies an expression to eventTime. If we use 10 parallel workers, then worker 0 scans scatteredValue between 0 and 9, worker 1 scans values between 10 and 19, and so on, until worker 9 scans scatteredValue between 90 and 99. This can be visualized in the following graphic:

These workers can then provide a unique list of hash/range key values on referrer/eventTime (which are always projected into a global secondary index) or, if we decided to project eventCount into the index, we could simply use this value directly with an aggregation.

For those who would like to implement a parallel gather reader, an example of this implemented in Node.js is available in the appendix.

Summary

Global secondary indexes offer a powerful mechanism to allow you to query complex data structures, but in certain cases they can result in write bottlenecks that cannot be addressed by simply increasing the write IOPS on the DynamoDB table or index. By using a random value as the hash key of a global secondary index, we can evenly spread the write load across multiple DynamoDB partitions, and then we can use this index at query time via the parallel Scan API. This approach offers us the ability to scale writes on a table with a complex index structure to any required write rate.

Further Resources

For more information about using this type of data structure, you can review Amazon Kinesis Aggregators (https://github.com/awslabs/amazon-kinesis-aggregators), a framework designed for automatic time series analysis of data streamed from Amazon Kinesis. In this codebase, DynamoDataStore (http://bit.ly/1PLLJzv) and DynamoQueryEngine (http://bit.ly/1EduDZS) implement the scatter/gather pattern, and are available for re-use in other AWS applications.

If you have questions or suggestions, please leave a comment below.

Appendix 1: Sample code to query time series data from DynamoDB

var region = process.env['AWS_REGION'];

var async = require('async');
var aws = require('aws-sdk');
aws.config.update({
    region : region
});
var dynamoDB = new aws.DynamoDB({
    apiVersion : '2012-08-10',
    region : region
});

// generate an array of integers between start and start+count
var range = function(start, count) {
    return Array.apply(0, Array(count)).map(function(element, index) {
        return index + start;
    });
};

// function to query DynamoDB for a specific date time value and scatter prefix
var gatherQueryDDB = function(dateTimeValue, index, callback) {
    var params = {
        TableName : "myDynamoTable",
        IndexName : "myGsiOnEventTime",
        KeyConditionExpression : "scatterPrefix = :index and eventTime = :date",
        ExpressionAttributeValues : {
            ":index" : {
                N : '' + index
            },
            ":date" : {
                S : dateTimeValue
            }
        }
    };
    dynamoDB.query(params, function(err, data) {
        if (err) {
            callback(err);
        } else {
            callback(null, data.Items);
        }
    });
};

// function which prints the results of the worker scan
var printResults = function(err, results) {
    if (err) {
        console.log(JSON.stringify(err));
        process.exit(-1);
    } else {
        results.map(function(item) {
            if (item && item.length > 0) {
                console.log(JSON.stringify(item));
            }
        });
    }
};

/* set this range to the scatter prefix size */
var scatteredValues = range(0, 100);

/* set this value to how many concurrent reads to do against the table */
var concurrentReads = 20;

async.mapLimit(scatteredValues, concurrentReads, gatherQueryDDB.bind(undefined, "date query value in yyyy-MM-dd HH:mm:ss format"), printResults);

 

Related:

Powering Gaming Applications with DynamoDB

Scaling Writes on Amazon DynamoDB Tables with Global Secondary Indexes

Post Syndicated from Ian Meyers original https://blogs.aws.amazon.com/bigdata/post/Tx3KPZDXIBJEQ4B/Scaling-Writes-on-Amazon-DynamoDB-Tables-with-Global-Secondary-Indexes

Ian Meyers is a Principal Solutions Architect with AWS

Amazon DynamoDB is a fast, flexible, and fully managed NoSQL database service that supports both document and key-value store models that need consistent, single-digit millisecond latency at any scale. In this post, we discuss a technique that can be used with DynamoDB to ensure virtually unlimited scaling when using secondary indexes. We focus on a data structure that provides data as a time series, commonly used for real-time analytics.

Time series tables

A time series table stores a broad range of values like any table, but organizes and sequences the data by a unit of time. This allows us to compare values that occurred in one time period to the values that occurred in another, given a common set of criteria. You would use a time series to answer questions such as ‘what is the revenue this quarter vs. the same quarter last year?’ or ‘what is the busiest hour over the course of a month?’

The time series that we reference today is a common requirement for many customers: ad impressions and click events over time, to facilitate the analysis of referring websites. For this solution, we want to be able to track the referring site URL or ID (referrer), when the event occurred, and how many clicks or impressions occurred over a period.

In DynamoDB, tables are accessed through a variety of mechanisms. Each table contains multiple items, where each item is composed of multiple attributes. For every table, you specify a primary key, which consists of either a hash or “hash and range” attribute.

Primary keys for time series tables

To create our time series table, we create a table with a hash and range primary key, which allows us to look up an item using two discrete values. The hash key for our table is the referrer. The range key is the date and time from within the impression or click event, stored at some downsampled time granularity such as ‘per-minute’ or ‘per-hour’. We downsample for simplicity of looking up specific time values for a referrer, but also because this significantly reduces the size of the hash and range key.

To perform this downsampling, we take a given event time and reduce it to the time period in which it occurred. For example, downsampling the value ‘2015-15-06 10:31:32’ to ‘per-hour’ would result in a value of ‘2015-15-06 10:00:00’, while ‘per-minute’ would give us ‘2015-15-06 10:31:00’.

This data structure allows us to receive an event, and then very efficiently update the aggregation of that event for the time period in which it occurred by the event’s referrer and the downsampled timestamp.

Local secondary indexes

In the same way that we can use the hash/range primary key to access a table, we can create multiple local secondary indexes that use the table’s existing hash key plus another attribute for access. When we create the index, we can select other non-indexed attributes to include, or project, into the index. These indexes are useful when we know the hash key, but want to be able to access data on the basis of multiple different attributes with extremely high performance.

In our example, we might want to find the referrers who created events for a specific page. A local secondary index on the referrer plus the URL would allow us to find those referrers who specifically accessed one page vs. another, out of all of the URLs accessed.

Global secondary indexes

Global secondary indexes give us the ability to create entirely new hash or hash/range indexes for a table from other attributes. As with local secondary indexes, we can choose which table attributes to project into an index. These indexes are useful when we want to access by different attributes than the hash/range key but with the same performance.

In our example, we definitely want to be able to see the events that occurred in a date range; for example, ‘show me new records since time N’. In our time series table, we can create a new global secondary index just on eventTime, which allows us to scan the index efficiently with DynamoDB expressions such as ‘eventTime >= :querytime’.
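A hedged sketch of adding such an index to the existing table with the UpdateTable API follows; the index name, projection, and capacity figures are placeholders, not values from the original design. As the next sections explain, this single-hash-key version of the index is the one that turns into a write bottleneck.

var aws = require('aws-sdk');
var dynamoDB = new aws.DynamoDB({ apiVersion : '2012-08-10', region : process.env['AWS_REGION'] });

dynamoDB.updateTable({
    TableName : 'eventTable',
    AttributeDefinitions : [
        { AttributeName : 'eventTime', AttributeType : 'S' }
    ],
    GlobalSecondaryIndexUpdates : [{
        Create : {
            IndexName : 'eventTime-index',
            KeySchema : [
                { AttributeName : 'eventTime', KeyType : 'HASH' }  // single hash key
            ],
            // the table key attributes (referrer, eventTime) are always projected
            Projection : { ProjectionType : 'KEYS_ONLY' },
            ProvisionedThroughput : { ReadCapacityUnits : 10, WriteCapacityUnits : 10 }
        }
    }]
}, function(err, data) {
    if (err) {
        console.log(JSON.stringify(err));
    }
});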

Processing time series data with elastic throughput

To populate this time series table, we must receive the impression and click events, and then downsample and aggregate the data in our time series table. A powerful way to do this, without managing any servers, is to use Amazon Kinesis to stream events and AWS Lambda to process them, as shown in the following graphic.

When we implement a Lambda function for this architecture, we see that we get multiple Amazon Kinesis records in a single function invocation, for a narrow time period as defined by the producer of the data. In this type of application, we want to buffer events in memory up to some threshold before writing to DynamoDB, so that we limit the I/O use on the table.
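As a minimal sketch of that buffering step, assuming for illustration that each Kinesis record carries a JSON payload with referrer and eventTime fields, and reusing the hypothetical table and attribute names from the earlier sketches:

var aws = require('aws-sdk');
var dynamoDB = new aws.DynamoDB({ apiVersion : '2012-08-10' });

exports.handler = function(event, context) {
    // buffer events in memory, keyed by referrer plus downsampled time
    // (assumes referrer values do not contain the '|' separator)
    var aggregates = {};

    event.Records.forEach(function(record) {
        var payload = JSON.parse(new Buffer(record.kinesis.data, 'base64').toString());
        // downsample an ISO timestamp such as '2015-06-15T10:31:32Z' to per-minute
        var period = payload.eventTime.substring(0, 16).replace('T', ' ') + ':00';
        var key = payload.referrer + '|' + period;
        aggregates[key] = (aggregates[key] || 0) + 1;
    });

    // one aggregated write per referrer/period, rather than one write per event
    var keys = Object.keys(aggregates);
    if (keys.length === 0) {
        return context.done();
    }
    var remaining = keys.length;
    keys.forEach(function(key) {
        var parts = key.split('|');
        dynamoDB.updateItem({
            TableName : 'eventTable',
            Key : {
                referrer  : { S : parts[0] },
                eventTime : { S : parts[1] }
            },
            UpdateExpression : 'ADD eventCount :count',
            ExpressionAttributeValues : {
                ':count' : { N : '' + aggregates[key] }
            }
        }, function(err) {
            if (err) {
                console.log(JSON.stringify(err));
            }
            if (--remaining === 0) {
                context.done();
            }
        });
    });
};

Each invocation then performs at most one write per unique referrer and downsampled period, which is what keeps the required write rate bounded, as described next.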

The required write rate on our table is determined by the number of unique referrers we see when processing the data, multiplied by the time span the batch of events covers, divided by the number of seconds in the downsampled period. For example, if we have an event stream of 2000 events per second and 200 unique referrers, and we receive 5000 events in a single Lambda function invocation, then we would expect to require:

((EventsReceived / EventsPerSecond) * UniqueReferrerCount) / NumberOfSecondsDownsampled

((5000 / 2000) * 200) / 1 = (2.5 * 200) / 1 = 500 writes/second

We can set provisioned write IOPS on our DynamoDB table to 550 (to give ourselves a bit of headroom), and then scale up and down over time as the event rate changes. However, we don’t just require this write IOPS for the table, but also for the global secondary index on eventTime.
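For illustration, both capacities can be raised (or later lowered) in a single UpdateTable call; the table and index names and the read capacity figures below are the placeholder values used in the earlier sketches.

var aws = require('aws-sdk');
var dynamoDB = new aws.DynamoDB({ apiVersion : '2012-08-10', region : process.env['AWS_REGION'] });

dynamoDB.updateTable({
    TableName : 'eventTable',
    // 500 writes/second required, plus a little headroom
    ProvisionedThroughput : { ReadCapacityUnits : 10, WriteCapacityUnits : 550 },
    GlobalSecondaryIndexUpdates : [{
        Update : {
            IndexName : 'eventTime-index',
            // the index receives the same write volume as the table
            ProvisionedThroughput : { ReadCapacityUnits : 10, WriteCapacityUnits : 550 }
        }
    }]
}, function(err, data) {
    if (err) {
        console.log(JSON.stringify(err));
    }
});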

The accidental bottleneck

DynamoDB provisioned throughput is set on a table or index as a whole, and is divided among the many partitions that store the data. The partition is selected on the basis of the hash key, and DynamoDB offers at most approximately 1000 writes/second to a single hash key value on a partition.

This won’t be a problem for our main table, because the 500 writes/second are spread across the 200 unique referrer hash key values for a given downsampled time value. However, our global secondary index is keyed only on eventTime, so all of the events for a given downsampled period share a single hash key value and are written to a single partition. This will not achieve the required 500 writes/second on a given eventTime value. With the global secondary index as defined, we observe write throttling and timeouts, and because global secondary indexes are written asynchronously, we also observe an increased latency between the write to the table and the update of the secondary index.

No matter how high we provision the write IOPS on the index, we will always see throttling because we are focusing our writes to a single key (eventTime), and thus a single partition.

Addressing write bottlenecks with scattering

This problem can be solved by introducing a write pattern called scattering. Later, we review how we can re-gather this data at query time or in an asynchronous process to give us a simple model for reads.

We avoided a write bottleneck on our table because the writes are distributed across the 200 unique referrers to our site. We can use this same principle to remove the bottleneck on our global secondary index. Instead of creating the index on eventTime alone, we can convert it to a hash/range index on a new attribute ‘scatteredValue’ plus ‘eventTime’. scatteredValue is a synthetic attribute into which we write a random number between 0 and 99 every time we create or update a record.

This leading random value in the index enables DynamoDB to spread the writes over a larger number of partitions. Because we write 100 unique scatter values, DynamoDB can distribute the index across up to 100 partitions, which means we can achieve 1000 writes/second * the cardinality of the scatter value (100) = 100,000 writes/second for a single eventTime value. This is much better!

If we needed even more writes/second, then we could increase to 1000 unique scatteredValue entries (0-999). Our table writes now execute without any throttling, and data in our global secondary index does not lag significantly behind the table.
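A rough sketch of the amended write, building on the earlier Lambda example, might look like the following. The writeAggregate helper and the numeric attribute type are our own choices for illustration; note that the appendix query code names this scatter attribute scatterPrefix.

var aws = require('aws-sdk');
var dynamoDB = new aws.DynamoDB({ apiVersion : '2012-08-10', region : process.env['AWS_REGION'] });

var scatterCardinality = 100;  // use 1000 (0-999) if even more write throughput is needed

// write one aggregated count for a referrer and downsampled period, setting a
// random scatter value so that writes to the scatteredValue/eventTime index are
// spread across up to 100 partitions
function writeAggregate(referrer, downsampledTime, count, callback) {
    var scatteredValue = Math.floor(Math.random() * scatterCardinality);

    dynamoDB.updateItem({
        TableName : 'eventTable',
        Key : {
            referrer  : { S : referrer },
            eventTime : { S : downsampledTime }
        },
        UpdateExpression : 'SET scatteredValue = :scatter ADD eventCount :count',
        ExpressionAttributeValues : {
            ':scatter' : { N : '' + scatteredValue },
            ':count'   : { N : '' + count }
        }
    }, callback);
}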

Gathering scattered records together

We originally created the global secondary index on eventTime so that we could ask a query such as ‘what aggregate events happened after time N?’ Now that the index is keyed on scatteredValue/eventTime, we can’t query on the eventTime attribute alone. However, we can apply a scatter-gather read, issuing one query per scatteredValue in parallel, to easily gather all the relevant records together.

To do this, we create multiple worker threads in our application, each of which covers several scatteredValue entries and applies an expression to the eventTime. If we use 10 parallel workers, then worker 0 covers scatteredValue between 0 and 9, worker 1 covers values between 10 and 19, and so on, until worker 9 covers scatteredValue between 90 and 99. This is visualized in the following graphic:

These workers can then provide a unique list of hash/range key values on referrer/eventTime (which are always projected into a global secondary index) or, if we decided to project eventCount into the index, we could simply use this value directly with an aggregation.

For those who would like to implement a parallel gather reader, an example of this implemented in Node.js is available in the appendix.

Summary

Global secondary indexes offer a powerful mechanism to allow you to query complex data structures, but in certain cases they can result in write bottlenecks that cannot be addressed by simply increasing the write IOPS on the DynamoDB table or index. By using a random value as the hash key of a global secondary index, we can evenly spread the write load across multiple DynamoDB partitions, and then use this index at query time by querying each scatter value in parallel. This approach gives us the ability to scale writes on a table with a complex index structure to any required write rate.

Further Resources

For more information about using this type of data structure, you can review Amazon Kinesis Aggregators (https://github.com/awslabs/amazon-kinesis-aggregators), a framework designed for automatic time series analysis of data streamed from Amazon Kinesis. In this codebase, DynamoDataStore (http://bit.ly/1PLLJzv) and DynamoQueryEngine (http://bit.ly/1EduDZS) implement the scatter/gather pattern, and are available for re-use in other AWS applications.

If you have questions or suggestions, please leave a comment below.

Appendix 1: Sample code to query time series data from DynamoDB

var region = process.env['AWS_REGION'];

var async = require('async');
var aws = require('aws-sdk');
aws.config.update({
    region : region
});
var dynamoDB = new aws.DynamoDB({
    apiVersion : '2012-08-10',
    region : region
});

// generate an array of integers between start and start+count
var range = function(start, count) {
    return Array.apply(0, Array(count)).map(function(element, index) {
        return index + start;
    });
};

// function to query DynamoDB for a specific date time value and scatter prefix
var gatherQueryDDB = function(dateTimeValue, index, callback) {
    var params = {
        TableName : "myDynamoTable",
        IndexName : "myGsiOnEventTime",
        KeyConditionExpression : "scatterPrefix = :index and eventTime = :date",
        ExpressionAttributeValues : {
            ":index" : {
                N : '' + index
            },
            ":date" : {
                S : dateTimeValue
            }
        }
    };
    dynamoDB.query(params, function(err, data) {
        if (err) {
            callback(err);
        } else {
            callback(null, data.Items);
        }
    });
};

// function which prints the results of the worker scan
var printResults = function(err, results) {
    if (err) {
        console.log(JSON.stringify(err));
        process.exit(-1);
    } else {
        results.map(function(item) {
            if (item && item.length > 0) {
                console.log(JSON.stringify(item));
            }
        });
    }
};

/* set this range to the scatter prefix size */
var scatteredValues = range(0, 100);

/* set this value to how many concurrent reads to do against the table */
var concurrentReads = 20;

async.mapLimit(scatteredValues, concurrentReads, gatherQueryDDB.bind(undefined, "date query value in yyyy-MM-dd HH:mm:ss format"), printResults);

 

Related:

Powering Gaming Applications with DynamoDB