Never let it be said that some makers won’t jump in at the deep end for their ambitious experiments with the Raspberry Pi. When Ievgenii Tkachenko fancied a challenge, he sought to go where few had gone before by creating an underwater drone, successfully producing a working prototype that he’s now hard at work refining.
Inspired by watching inventors on the Discovery Channel, Ievgenii has learned much from his endeavour. “For me it was a significant engineering challenge,” he says, and while he has ended up submerging himself within a process of trial-and-error, the results so far have been impressive.
Pi dive
The project began with a loose plan in Ievgenii’s head. “I knew what I should have in the project as a minimum: motions, lights, camera, and a gyroscope inside the device and smartphone control outside,” he explains. “Pretty simple, but I didn’t have a clue what equipment I would be able to use for the drone, and I was limited by finances.”
Bearing that in mind, one of his first moves was to choose a Raspberry Pi 3B, which he says was perfect for controlling the motors, diodes, and gyroscope while sending video streams from a camera and receiving commands from a control device.
The Raspberry Pi 3 sits in the housing and connects to a LiPo battery that also powers the LEDs and motors
“I was really surprised that this small board has a fully functional UNIX-based OS and that software like the Node.js server can be easily installed,” he tells us. “It has control input and output pins and there are a lot of libraries. With an Ethernet port and wireless LAN and a camera, it just felt plug-and-play. I couldn’t find a better solution.”
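To illustrate how simply the GPIO side of a build like this can be driven from Python (a sketch of the general approach rather than Ievgenii’s actual code, which the article suggests is built around a Node.js server), gpiozero can generate the servo-style PWM signal an ESC expects and dim a flashlight LED through a PWM pin. The pin numbers and pulse widths below are assumptions for illustration only.

```python
from gpiozero import Servo, PWMLED
from time import sleep

# Hypothetical wiring: one ESC signal wire on GPIO 17, flashlight driver on GPIO 27.
# Most hobby ESCs expect 1-2 ms pulses at 50 Hz, which is what Servo produces.
thruster = Servo(17, min_pulse_width=1/1000, max_pulse_width=2/1000)
flashlight = PWMLED(27)

thruster.value = 0          # arm the ESC at neutral throttle
sleep(2)
flashlight.value = 0.8      # 80% brightness via PWM, as a pulse driver would
thruster.value = 0.4        # gentle forward thrust (value range is -1 to 1)
sleep(5)
thruster.value = 0          # back to neutral before shutting down
```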
The LEDs are attached to radiators to prevent overheating, and a pulse driver is used for flashlight control
Working with a friend, Ievgenii sought to create suitable housing for the components, which included a twin twisted-pair wire suitable for transferring data underwater, an electric motor, an electronic speed control, an LED together with a pulse driver, and a battery. Four motors were attached to the outside of the housing, efforts were made to ensure it was waterproof, and the drone was tested in a bath and out on a lake.
Streaming video
With a WiFi router on the shore connected to the Raspberry Pi via RJ45 connectors and an Ethernet cable, Ievgenii developed an Android application to connect to the Raspberry Pi by address and port (“as an Android developer, I’m used to working with the platform”). This also allowed movement to be controlled via the touchscreen, although he says a gamepad for Android can also be used. When it’s up and running, the Pi streams a video from the camera to the app — “live video streaming is not simple, and I spent a lot of time on the solution” — but the wired connection means the drone can only currently travel as far as the cable length allows.
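The control link itself can be as simple as a small TCP server on the Pi that the Android app connects to by address and port. The sketch below is a guess at the general shape rather than the project’s real protocol; the port number and the “FWD 0.5”-style text commands are invented for illustration.

```python
import socket

HOST, PORT = '0.0.0.0', 5005   # hypothetical port the Android app connects to

def handle_command(line):
    """Map a text command like 'FWD 0.5' or 'LIGHT 0.8' to hardware actions."""
    parts = line.strip().split()
    if not parts:
        return
    cmd = parts[0].upper()
    value = float(parts[1]) if len(parts) > 1 else 0.0
    print('would set', cmd, 'to', value)   # here you would drive the ESCs/LEDs

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((HOST, PORT))
    server.listen(1)
    conn, addr = server.accept()           # one controller at a time
    with conn, conn.makefile() as stream:
        for line in stream:                # newline-delimited commands
            handle_command(line)
```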
The camera was placed in this transparent waterproof case attached to the front of the waterproof housing
In that sense, it’s not perfect. “It’s also hard to handle the drone, and it needs to be enhanced with an additional controls board and a few more electromotors for smooth movement,” Ievgenii admits. But as well as wanting to base the project on fast and reliable C++ code and make use of a USB 4K camera, he can see the future potential and he feels it will swim rather than sink.
“Similar drones are used for boat inspections, and they can also be used by rescue squads or for scientific purposes,” he points out. “They can be used to discover a vast marine world without training and risks too. In fact, now that I understand the Raspberry Pi, I know I can create almost anything, from a radio electronic toy car to a smart home.”
The MagPi magazine
This article was lovingly borrowed from the latest issue of The MagPi magazine. Pick up your copy of issue 80 from your local stockist, online, or by downloading the free PDF.
If you’re a subscriber to HackSpace magazine, you’ll already know all about issue 10. For the rest of you who’ve yet to subscribe, issue 10 is out today!
Build a drone
Ever since Icarus flew too close to the sun, man has dreamed of flight. Thanks to brushless motors, cheaper batteries than ever before, and smaller, more powerful microcontrollers, pretty much anyone with the right know-how can build their own drone. Discover the crucial steps you need to get right; find the right motors, propellers, and chassis; then get out there while the weather is still good and soar like a PCB eagle.
Rocket-launching robot
If you prefer to keep your remote-controlled vehicles on the ground, we have an inspiring tale of how one maker combined a miniature strandbeest with our other great obsession (fire, obviously) to create a unique firework launcher. Guy Fawkes would surely be pleased.
Hardware hacking for the environment
In less frivolous project news, we’re reporting from the Okavango Delta in Botswana, where open hardware, open data, and the hard work of volunteers are giving ecologists more information about this essential wetland region. Makers are bringing science out of labs and classrooms, and putting it into the hands of citizen scientists who want to understand and protect their local environment – that’s something we should be proud of.
PCBs win prizes
The Hackaday Prize: the Academy Awards of open hardware. Enter your project today and you stand a chance of winning $50,000. The competition is fierce, so before you do, read our interview with Stephen Tranovich. Stephen is the Technical Community Lead at the Hackaday Prize and decides who gets the chance to win the glittering prizes. Learn from their words!
Food
Our editor Ben loves to eat, so this month he’s been eating lamb kebabs cooked in his home-made tandoor. This ancient cooking method is used all over the Indian subcontinent, and imparts a unique flavour with its combination of heat and steam. Best of all, you can make your own tandoor oven with a Dremel and a few plant pots.
Tutorials
Add push notifications to your letterbox (so your dog doesn’t eat your new passport), write a game for an Arduino, add a recharging pocket to a bag so you can Instagram on the go, and learn everything there is to know about capacitors. All this and more, in HackSpace magazine issue 10!
Get your copy of HackSpace magazine
If you like the sound of this month’s content, you can find HackSpace magazine in WHSmith, Tesco, Sainsbury’s, and independent newsagents in the UK. If you live in the US, check out your local Barnes & Noble, Fry’s, or Micro Center next week. We’re also shipping to stores in Australia, Hong Kong, Canada, Singapore, Belgium, and Brazil, so be sure to ask your local newsagent whether they’ll be getting HackSpace magazine. And if you’d rather try before you buy, you can always download the free PDF.
Subscribe now
“Subscribe now” may not be subtle as a marketing message, but we really think you should. You’ll get the magazine early, plus a lovely physical paper copy, which has really good battery life.
Oh, and twelve-month print subscribers get an Adafruit Circuit Playground Express loaded with inputs and sensors and ready for your next project. Tempted?
Earlier this year on 3 and 4 March, communities around the world held Raspberry Jam events to celebrate Raspberry Pi’s sixth birthday. We sent out special birthday kits to participating Jams — it was amazing to know the kits would end up in the hands of people in parts of the world very far from Raspberry Pi HQ in Cambridge, UK.
The Raspberry Jam Camer team: Damien Doumer, Eyong Etta, Loïc Dessap and Lionel Sichom, aka Lionel Tellem
Preparing for the #PiParty
One birthday kit went to Yaoundé, the capital of Cameroon. There, a team of four students in their twenties — Lionel Sichom (aka Lionel Tellem), Eyong Etta, Loïc Dessap, and Damien Doumer — were organising Yaoundé’s first Jam, called Raspberry Jam Camer, as part of the Raspberry Jam Big Birthday Weekend. The team knew one another through their shared interests and skills in electronics, robotics, and programming. Damien explains in his blog post about the Jam that they planned ahead for several activities for the Jam based on their own projects, so they could be confident of having a few things that would definitely be successful for attendees to do and see.
Show-and-tell at Raspberry Jam Cameroon
Loïc presented a Raspberry Pi–based, Android app–controlled robot arm that he had built, and Lionel coded a small video game using Scratch on Raspberry Pi while the audience watched. Damien demonstrated the possibilities of Windows 10 IoT Core on Raspberry Pi, showing how to install it, how to use it remotely, and what you can do with it, including building a simple application.
Loïc showcases the prototype robot arm he built
There was lots more too, with others discussing their own Pi projects and talking about the possibilities Raspberry Pi offers, including a Pi-controlled drone and car. Cake was a prevailing theme of the Raspberry Jam Big Birthday Weekend around the world, and Raspberry Jam Camer made sure they didn’t miss out.
Yay, birthday cake!!
A big success
Most visitors to the Jam were secondary school students, while others were university students and graduates. The majority were unfamiliar with Raspberry Pi, but all wanted to learn about Raspberry Pi and what they could do with it. Damien comments that the fact most people were new to Raspberry Pi made the event more interactive rather than creating any challenges, because the visitors were all interested in finding out about the little computer. The Jam was an all-round success, and the team was pleased with how it went:
What I liked the most was that we sensitized several people about the Raspberry Pi and what one can be capable of with such a small but powerful device. — Damien Doumer
The Jam team rounded off the event by announcing that this was the start of a Raspberry Pi community in Yaoundé. They hope that they and others will be able to organise more Jams and similar events in the area to spread the word about what people can do with Raspberry Pi, and to help them realise their ideas.
Raspberry Jam Camer gets the thumbs-up
The Raspberry Pi community in Cameroon
In a French-language interview about their Jam, the team behind Raspberry Jam Camer said they’d like programming to become the third official language of Cameroon, after French and English; their aim is to popularise programming and digital making across Cameroonian society. Neither of these fields is very familiar to most people in Cameroon, but both are very well aligned with the country’s ambitions for development. The team is conscious of the difficulties around the emergence of information and communication technologies in the Cameroonian context; in response, they are seizing the opportunities Raspberry Pi offers to give children and young people access to modern and constantly evolving technology at low cost.
Thanks to Lionel, Eyong, Damien, and Loïc, and to everyone who helped put on a Jam for the Big Birthday Weekend! Remember, anyone can start a Jam at any time — and we provide plenty of resources to get you started. Check out the Guidebook, the Jam branding pack, our specially made Jam activities online (in multiple languages), printable worksheets, and more.
What happens when you combine the Internet of Things, Machine Learning, and Edge Computing? Before I tell you, let’s review each one and discuss what AWS has to offer.
Internet of Things (IoT) – Devices that connect the physical world and the digital one. The devices, often equipped with one or more types of sensors, can be found in factories, vehicles, mines, fields, homes, and so forth. Important AWS services include AWS IoT Core, AWS IoT Analytics, AWS IoT Device Management, and Amazon FreeRTOS, along with others that you can find on the AWS IoT page.
Machine Learning (ML) – Systems that can be trained using an at-scale dataset and statistical algorithms, and used to make inferences from fresh data. At Amazon we use machine learning to drive the recommendations that you see when you shop, to optimize the paths in our fulfillment centers, to fly drones, and much more. We support leading open source machine learning frameworks such as TensorFlow and MXNet, and make ML accessible and easy to use through Amazon SageMaker. We also provide Amazon Rekognition for images and for video, Amazon Lex for chatbots, and a wide array of language services for text analysis, translation, speech recognition, and text to speech.
Edge Computing – The power to have compute resources and decision-making capabilities in disparate locations, often with intermittent or no connectivity to the cloud. AWS Greengrass builds on AWS IoT, giving you the ability to run Lambda functions and keep device state in sync even when not connected to the Internet.
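As a rough sketch of what “running Lambda functions at the edge” looks like in practice, the Python function below uses the Greengrass Core SDK to publish telemetry on a local topic; the topic name and payload fields are made up, and the function is assumed to be deployed as part of a Greengrass group rather than invoked in the cloud.

```python
import json
import greengrasssdk

# The client talks to the local Greengrass core; when the device is offline,
# messages destined for AWS IoT are queued and forwarded once connectivity returns.
client = greengrasssdk.client('iot-data')

def function_handler(event, context):
    # Hypothetical payload arriving from a local sensor or device
    reading = {
        'device': event.get('device', 'pump-1'),
        'temp_c': event.get('temp_c'),
    }
    client.publish(topic='factory/telemetry', payload=json.dumps(reading))
    return {'status': 'queued'}
```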
ML Inference at the Edge
Today I would like to toss all three of these important new technologies into a blender! You can now perform Machine Learning inference at the edge using AWS Greengrass. This allows you to use the power of the AWS cloud (including fast, powerful instances equipped with GPUs) to build, train, and test your ML models before deploying them to small, low-powered, intermittently-connected IoT devices running in those factories, vehicles, mines, fields, and homes that I mentioned.
Here are a few of the many ways that you can put Greengrass ML Inference to use:
Precision Farming – With an ever-growing world population and unpredictable weather that can affect crop yields, the opportunity to use technology to increase yields is immense. Intelligent devices that are literally in the field can process images of soil, plants, pests, and crops, taking local corrective action and sending status reports to the cloud.
Physical Security – Smart devices (including the AWS DeepLens) can process images and scenes locally, looking for objects, watching for changes, and even detecting faces. When something of interest or concern arises, the device can pass the image or the video to the cloud and use Amazon Rekognition to take a closer look.
Industrial Maintenance – Smart, local monitoring can increase operational efficiency and reduce unplanned downtime. The monitors can run inference operations on power consumption, noise levels, and vibration to flag anomalies, predict failures, and detect faulty equipment.
Greengrass ML Inference Overview
There are several different aspects to this new AWS feature. Let’s take a look at each one:
Machine Learning Models – Precompiled TensorFlow and MXNet libraries, optimized for production use on the NVIDIA Jetson TX2 and Intel Atom devices, and development use on 32-bit Raspberry Pi devices. The optimized libraries can take advantage of GPU and FPGA hardware accelerators at the edge in order to provide fast, local inferences.
Model Building and Training – The ability to use Amazon SageMaker and other cloud-based ML tools to build, train, and test your models before deploying them to your IoT devices. To learn more about SageMaker, read Amazon SageMaker – Accelerated Machine Learning.
Model Deployment – SageMaker models can (if you give them the proper IAM permissions) be referenced directly from your Greengrass groups. You can also make use of models stored in S3 buckets. You can add a new machine learning resource to a group with a couple of clicks in the console.
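To give a feel for what the deployed side can look like, here is a minimal sketch of a long-lived Greengrass Lambda that loads an MXNet model from the local path where the group’s machine learning resource is mounted and publishes its predictions on a local topic. The model name, resource path, input shape, and topic are all assumptions, not values from the feature’s documentation.

```python
import json
import mxnet as mx
import numpy as np
import greengrasssdk

MODEL_DIR = '/ml/squeezenet'        # hypothetical local mount point of the ML resource
INPUT_SHAPE = (1, 3, 224, 224)      # assumed input size for an image classifier

iot = greengrasssdk.client('iot-data')

# Load the checkpoint once at startup; inference then happens entirely on-device.
sym, arg_params, aux_params = mx.model.load_checkpoint(MODEL_DIR + '/squeezenet_v1.1', 0)
model = mx.mod.Module(symbol=sym, context=mx.cpu(), label_names=None)
model.bind(for_training=False, data_shapes=[('data', INPUT_SHAPE)])
model.set_params(arg_params, aux_params, allow_missing=True)

def classify(image_array):
    """Run a forward pass on a preprocessed image and publish the top class locally."""
    model.forward(mx.io.DataBatch([mx.nd.array(image_array)]))
    probs = model.get_outputs()[0].asnumpy().squeeze()
    top = int(np.argmax(probs))
    iot.publish(topic='camera/inference',
                payload=json.dumps({'class': top, 'prob': float(probs[top])}))

def function_handler(event, context):
    # A long-lived Greengrass Lambda can ignore incoming events; classification is
    # driven by whatever captures frames on the device and calls classify().
    return
```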
We’re making with a purpose in issue 3 of HackSpace magazine. Not only are we discovering ways in which 3D printing is helping to save resources — and in some cases lives — in the developing world, we’re also going all out with recycling. While others might be content with separating their glass and plastic waste, we’re going much, much further by making useful things out of discarded old bits of rubbish you can find at your local scrapyard.
Hackspaces
We’re going to Cheltenham Hackspace to learn how to make a leather belt, to Liverpool to discover the ways in which an open-source design and some bits and bobs from IKEA are protecting our food supply, and we also take a peek through the doors of Nottingham Hackspace.
Tutorials
The new issue also has the most tutorials you’ll have seen anywhere since…well, since HackSpace magazine issue 2! Guides to 3D-printing on fabric, Arduino programming, and ESP8266 hacking are all to be found in issue 3. Plus, we’ve come up with yet another way to pipe numbers from the internet into big, red, glowing boxes — it’s what LEDs were made for.
With the addition of racing drones, an angry reindeer, and an intelligent toaster, we think we’ve definitely put together an issue you’ll enjoy.
Get your copy
The physical copy of HackSpace magazine is available at all good UK newsagents today, and you can order it online from the Raspberry Pi Press store wherever you are based. Moreover, you can download the free PDF version from our website. And if you’ve read our first two issues and enjoyed what you’ve seen, be sure to subscribe!
Write for us
Are you working on a cool project? Do you want to share your skills with the world, inspire others, and maybe show off a little? HackSpace magazine wants your article! Send an outline of your piece to us, and we’ll get back to you about including it in a future issue.
Researchers at Ben Gurion University in Beer Sheva, Israel have built a proof-of-concept system for counter-surveillance against spy drones that demonstrates a clever, if not exactly simple, way to determine whether a certain person or object is under aerial surveillance. They first generate a recognizable pattern on whatever subject — a window, say — someone might want to guard from potential surveillance. Then they remotely intercept a drone’s radio signals to look for that pattern in the streaming video the drone sends back to its operator. If they spot it, they can determine that the drone is looking at their subject.
In other words, they can see what the drone sees, pulling out their recognizable pattern from the radio signal, even without breaking the drone’s encrypted video.
The details have to do with the way drone video is compressed:
The researchers’ technique takes advantage of an efficiency feature streaming video has used for years, known as “delta frames.” Instead of encoding video as a series of raw images, it’s compressed into a series of changes from the previous image in the video. That means when a streaming video shows a still object, it transmits fewer bytes of data than when it shows one that moves or changes color.
Security researchers have shown in recent work that this compression feature can reveal key information about the content of the video to someone who’s intercepting the streaming data, even when the data is encrypted.
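The principle can be demonstrated with a few lines of analysis. Given per-interval byte counts sniffed from the encrypted video stream and the on/off “watermark” pattern flashed at the subject over the same intervals, a normalised cross-correlation close to 1 suggests the drone is looking at the subject. This is an illustrative sketch of the idea, not the researchers’ code, and the threshold is arbitrary.

```python
import numpy as np

def pattern_detected(bytes_per_interval, flash_pattern, threshold=0.5):
    """Return (detected, score) for equal-length traffic and flash sequences."""
    traffic = np.asarray(bytes_per_interval, dtype=float)
    pattern = np.asarray(flash_pattern, dtype=float)
    # Subtract the mean so the steady background bitrate does not dominate;
    # what matters is whether the *changes* line up with our flashes.
    traffic -= traffic.mean()
    pattern -= pattern.mean()
    score = np.dot(traffic, pattern) / (
        np.linalg.norm(traffic) * np.linalg.norm(pattern) + 1e-9)
    return score > threshold, score

# Toy example: the bitrate jumps whenever the flash is on, so the score is high.
detected, score = pattern_detected(
    [900, 2400, 950, 2500, 980, 2450], [0, 1, 0, 1, 0, 1])
print(detected, round(score, 2))
```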
Another new year brings with it thoughts of setting goals and targets. Thankfully, there is a new issue of Hello World packed with practical advice to set you on the road to success.
Hello World is our magazine about computing and digital making for educators, and it’s a collaboration between the Raspberry Pi Foundation and Computing at School, which is part of the British Computer Society.
In issue 4, our international panel of educators and experts recommends approaches to continuing professional development in computer science education.
Approaches to professional development, and much more
With recommendations for more professional development in the Royal Society’s report, and government funding to support this, our cover feature explores some successful approaches. In addition, the issue is packed with other great resources, guides, features, and lesson plans to support educators.
Highlights include:
The Royal Society: After the Reboot — learn about the latest report and its findings about computing education
The Cyber Games — a new programme looking for the next generation of security experts
Engaging Students with Drones
Digital Literacy: Lost in Translation?
Object-oriented Coding with Python
Get your copy of Hello World 4
Hello World is available as a free Creative Commons download for anyone around the world who is interested in computer science and digital making education. You can get the latest issue as a PDF file straight from the Hello World website.
Thanks to the very generous sponsorship of BT, we are able to offer free print copies of the magazine to serving educators in the UK. It’s for teachers, Code Club volunteers, teaching assistants, teacher trainers, and others who help children and young people learn about computing and digital making. So remember to subscribe to have your free print magazine posted directly to your home — 6000 educators have already signed up to receive theirs!
Could you write for Hello World?
By sharing your knowledge and experience of working with young people to learn about computing, computer science, and digital making in Hello World, you will help inspire others to get involved. You will also help bring the power of digital making to more and more educators and learners.
The computing education community is full of people who lend their experience to help colleagues. Contributing to Hello World is a great way to take an active part in this supportive community, and you’ll be adding to a body of free, open-source learning resources that are available for anyone to use, adapt, and share. It’s also a tremendous platform to broadcast your work: Hello World digital versions alone have been downloaded more than 50000 times!
At Opensource.com, Mike Bursell looks at blockchain security from the angle of trust. Unlike cryptocurrencies, which are typically pseudonymous, other kinds of blockchains will require mapping users to real-life identities; that raises the trust issue.
“What’s really interesting is that, if you’re thinking about moving to a permissioned blockchain or distributed ledger with permissioned actors, then you’re going to have to spend some time thinking about trust. You’re unlikely to be using a proof-of-work system for making blocks—there’s little point in a permissioned system—so who decides what comprises a “valid” block that the rest of the system should agree on? Well, you can rotate around some (or all) of the entities, or you can have a random choice, or you can elect a small number of über-trusted entities. Combinations of these schemes may also work.
If these entities all exist within one trust domain, which you control, then fine, but what if they’re distributors, or customers, or partners, or other banks, or manufacturers, or semi-autonomous drones, or vehicles in a commercial fleet? You really need to ensure that the trust relationships that you’re encoding into your implementation/deployment truly reflect the legal and IRL [in real life] trust relationships that you have with the entities that are being represented in your system.
And the problem is that, once you’ve deployed that system, it’s likely to be very difficult to backtrack, adjust, or reset the trust relationships that you’ve designed.”
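To make the choices Bursell lists concrete, the toy sketch below shows three ways a permissioned network might decide who seals the next block: rotating through the members, a reproducible pseudo-random pick, or a small elected committee. The entity names and seed are invented, and none of this is tied to any particular ledger implementation.

```python
import hashlib

# Hypothetical permissioned participants
VALIDATORS = ['bank-a', 'bank-b', 'manufacturer-c', 'fleet-operator-d']

def rotating_sealer(block_height):
    """Round-robin: every member takes a turn sealing a block."""
    return VALIDATORS[block_height % len(VALIDATORS)]

def random_sealer(block_height, shared_seed='network-seed'):
    """A 'random' pick that every participant can recompute and verify."""
    digest = hashlib.sha256(f'{shared_seed}:{block_height}'.encode()).hexdigest()
    return VALIDATORS[int(digest, 16) % len(VALIDATORS)]

def committee(trusted=('bank-a', 'bank-b')):
    """A small set of uber-trusted entities signs every block."""
    return list(trusted)

for height in range(4):
    print(height, rotating_sealer(height), random_sealer(height))
```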
ZDNet is reporting about another data leak, this one from the US Army’s Intelligence and Security Command (INSCOM), which is also within the NSA.
The disk image, when unpacked and loaded, is a snapshot of a hard drive dating back to May 2013 from a Linux-based server that forms part of a cloud-based intelligence sharing system, known as Red Disk. The project, developed by INSCOM’s Futures Directorate, was slated to complement the Army’s so-called distributed common ground system (DCGS), a legacy platform for processing and sharing intelligence, surveillance, and reconnaissance information.
[…]
Red Disk was envisioned as a highly customizable cloud system that could meet the demands of large, complex military operations. The hope was that Red Disk could provide a consistent picture from the Pentagon to deployed soldiers in the Afghan battlefield, including satellite images and video feeds from drones trained on terrorists and enemy fighters, according to a Foreign Policy report.
[…]
Red Disk was a modular, customizable, and scalable system for sharing intelligence across the battlefield, like electronic intercepts, drone footage and satellite imagery, and classified reports, for troops to access with laptops and tablets on the battlefield. Markings found in several directories imply the disk is “top secret,” and restricted from being shared with foreign intelligence partners.
A couple of points. One, this isn’t particularly sensitive. It’s an intelligence distribution system under development. It’s not raw intelligence. Two, this doesn’t seem to be classified data. Even the article hedges, using the unofficial term “highly sensitive.” Three, it doesn’t seem that Chris Vickery, the researcher who discovered the data, has published it.
Chris Vickery, director of cyber risk research at security firm UpGuard, found the data and informed the government of the breach in October. The storage server was subsequently secured, though its owner remains unknown.
We can’t believe that there are just a few days left before re:Invent 2017. If you are attending this year, you’ll want to check out our Big Data sessions! The Big Data and Machine Learning categories are bigger than ever. As in previous years, you can find these sessions in various tracks, including Analytics & Big Data, Deep Learning Summit, Artificial Intelligence & Machine Learning, Architecture, and Databases.
We have great sessions from organizations and companies like Vanguard, Cox Automotive, Pinterest, Netflix, FINRA, Amtrak, AmazonFresh, Sysco Foods, Twilio, American Heart Association, Expedia, Esri, Nextdoor, and many more. All sessions are recorded and made available on YouTube. In addition, all slide decks from the sessions will be available on SlideShare.net after the conference.
This post highlights the sessions that will be presented as part of the Analytics & Big Data track, as well as relevant sessions from other tracks like Architecture, Artificial Intelligence & Machine Learning, and IoT. If you’re interested in Machine Learning sessions, don’t forget to check out our Guide to Machine Learning at re:Invent 2017.
This year’s session catalog contains the following breakout sessions.
Raju Gulabani, VP of Database, Analytics, and AI at AWS, will discuss the evolution of database and analytics services in AWS, the new database and analytics services and features we launched this year, and our vision for continued innovation in this space. We are witnessing an unprecedented growth in the amount of data collected, in many different forms. Storage, management, and analysis of this data require database services that scale and perform in ways not possible before. AWS offers a collection of database and other data services—including Amazon Aurora, Amazon DynamoDB, Amazon RDS, Amazon Redshift, Amazon ElastiCache, Amazon Kinesis, and Amazon EMR—to process, store, manage, and analyze data. In this session, we provide an overview of AWS database and analytics services and discuss how customers are using these services today.
Deep dive customer use cases
ABD401 – How Netflix Monitors Applications in Near Real-Time with Amazon Kinesis Thousands of services work in concert to deliver millions of hours of video streams to Netflix customers every day. These applications vary in size, function, and technology, but they all make use of the Netflix network to communicate. Understanding the interactions between these services is a daunting challenge both because of the sheer volume of traffic and the dynamic nature of deployments. In this session, we first discuss why Netflix chose Kinesis Streams to address these challenges at scale. We then dive deep into how Netflix uses Kinesis Streams to enrich network traffic logs and identify usage patterns in real time. Lastly, we cover how Netflix uses this system to build comprehensive dependency maps, increase network efficiency, and improve failure resiliency. From this session, you will learn how to build a real-time application monitoring system using network traffic logs and get real-time, actionable insights.
In this session, learn how Nextdoor replaced their home-grown data pipeline based on a topology of Flume nodes with a completely serverless architecture based on Kinesis and Lambda. By making these changes, they improved both the reliability of their data and the delivery times of billions of records of data to their Amazon S3–based data lake and Amazon Redshift cluster. Nextdoor is a private social networking service for neighborhoods.
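As a generic sketch of the serverless pattern this session describes (not Nextdoor’s actual code), a Lambda function triggered by a Kinesis stream can decode the incoming records and hand them to Kinesis Firehose, which takes care of batching them into S3 and loading Redshift. The stream and delivery-stream names are placeholders.

```python
import base64
import boto3

firehose = boto3.client('firehose')
DELIVERY_STREAM = 'events-to-s3-and-redshift'   # hypothetical Firehose delivery stream

def handler(event, context):
    """Triggered by a Kinesis stream; forwards each record to Firehose."""
    records = []
    for record in event.get('Records', []):
        payload = base64.b64decode(record['kinesis']['data'])
        records.append({'Data': payload + b'\n'})   # newline-delimit for Redshift COPY
    if records:
        # Assumes fewer than 500 records per invocation (the Firehose batch limit)
        firehose.put_record_batch(
            DeliveryStreamName=DELIVERY_STREAM,
            Records=records)
    return {'forwarded': len(records)}
```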
ABD205 – Taking a Page Out of Ivy Tech’s Book: Using Data for Student Success Data speaks. Discover how Ivy Tech, the nation’s largest singly accredited community college, uses AWS to gather, analyze, and take action on student behavioral data for the betterment of over 3,100 students. This session outlines the process from inception to implementation across the state of Indiana and highlights how Ivy Tech’s model can be applied to your own complex business problems.
ABD207 – Leveraging AWS to Fight Financial Crime and Protect National Security Banks aren’t known to share data and collaborate with one another. But that is exactly what the Mid-Sized Bank Coalition of America (MBCA) is doing to fight digital financial crime—and protect national security. Using the AWS Cloud, the MBCA developed a shared data analytics utility that processes terabytes of non-competitive customer account, transaction, and government risk data. The intelligence produced from the data helps banks increase the efficiency of their operations, cut labor and operating costs, and reduce false positive volumes. The collective intelligence also allows greater enforcement of Anti-Money Laundering (AML) regulations by helping members detect internal risks—and identify the challenges to detecting these risks in the first place. This session demonstrates how the AWS Cloud supports the MBCA to deliver advanced data analytics, provide consistent operating models across financial institutions, reduce costs, and strengthen national security.
ABD208 – Cox Automotive Empowered to Scale with Splunk Cloud & AWS and Explores New Innovation with Amazon Kinesis Firehose In this session, learn how Cox Automotive is using Splunk Cloud for real-time visibility into its AWS and hybrid environments to achieve near-instantaneous MTTI, reduce auction incidents by 90%, and proactively predict outages. We also introduce a highly anticipated capability that allows you to ingest, transform, and analyze data in real time using Splunk and Amazon Kinesis Firehose to gain valuable insights from your cloud resources. It’s now quicker and easier than ever to gain access to analytics-driven infrastructure monitoring using Splunk Enterprise & Splunk Cloud.
ABD209 – Accelerating the Speed of Innovation with a Data Sciences Data & Analytics Hub at Takeda Historically, silos of data, analytics, and processes across functions, stages of development, and geography created a barrier to R&D efficiency. Gathering the right data necessary for decision-making was challenging due to issues of accessibility, trust, and timeliness. In this session, learn how Takeda is undergoing a transformation in R&D to increase the speed-to-market of high-impact therapies to improve patient lives. The Data and Analytics Hub was built, with Deloitte, to address these issues and support the efficient generation of data insights for functions such as clinical operations, clinical development, medical affairs, portfolio management, and R&D finance. In the AWS hosted data lake, this data is processed, integrated, and made available to business end users through data visualization interfaces, and to data scientists through direct connectivity. Learn how Takeda has achieved significant time reductions—from weeks to minutes—to gather and provision data that has the potential to reduce cycle times in drug development. The hub also enables more efficient operations and alignment to achieve product goals through cross functional team accountability and collaboration due to the ability to access the same cross domain data.
ABD210 – Modernizing Amtrak: Serverless Solution for Real-Time Data Capabilities As the nation’s only high-speed intercity passenger rail provider, Amtrak needs to know critical information to run their business, such as: Who’s onboard any train at any time? How are booking and revenue trending? Amtrak was faced with unpredictable and often slow response times from existing databases, ranging from seconds to hours; existing booking and revenue dashboards were spreadsheet-based and manual; multiple copies of data were stored in different repositories, lacking integration and consistency; and operations and maintenance (O&M) costs were relatively high. Join us as we demonstrate how Deloitte and Amtrak successfully went live with a cloud-native operational database and analytical datamart for near-real-time reporting in under six months. We highlight the specific challenges and the modernization of architecture on an AWS native Platform as a Service (PaaS) solution. The solution includes cloud-native components such as AWS Lambda for microservices, Amazon Kinesis and AWS Data Pipeline for moving data, Amazon S3 for storage, Amazon DynamoDB for a managed NoSQL database service, and Amazon Redshift for near-real time reports and dashboards. Deloitte’s solution enabled “at scale” processing of 1 million transactions/day and up to 2K transactions/minute. It provided flexibility and scalability, largely eliminated the need for system management, and dramatically reduced operating costs. Moreover, it laid the groundwork for decommissioning legacy systems, anticipated to save at least $1M over 3 years.
ABD211 – Sysco Foods: A Journey from Too Much Data to Curated Insights In this session, we detail Sysco’s journey from a company focused on hindsight-based reporting to one focused on insights and foresight. For this shift, Sysco moved from multiple data warehouses to an AWS ecosystem, including Amazon Redshift, Amazon EMR, AWS Data Pipeline, and more. As the team at Sysco worked with Tableau, they gained agile insight across their business. Learn how Sysco decided to use AWS, how they scaled, and how they became more strategic with the AWS ecosystem and Tableau.
ABD217 – From Batch to Streaming: How Amazon Flex Uses Real-time Analytics to Deliver Packages on Time Reducing the time to get actionable insights from data is important to all businesses, and customers who employ batch data analytics tools are exploring the benefits of streaming analytics. Learn best practices to extend your architecture from data warehouses and databases to real-time solutions. Learn how to use Amazon Kinesis to get real-time data insights and integrate them with Amazon Aurora, Amazon RDS, Amazon Redshift, and Amazon S3. The Amazon Flex team describes how they used streaming analytics in their Amazon Flex mobile app, used by Amazon delivery drivers to deliver millions of packages each month on time. They discuss the architecture that enabled the move from a batch processing system to a real-time system, overcoming the challenges of migrating existing batch data to streaming data, and how to benefit from real-time analytics.
ABD218 – How EuroLeague Basketball Uses IoT Analytics to Engage Fans IoT and big data have made their way out of industrial applications, general automation, and consumer goods, and are now a valuable tool for improving consumer engagement across a number of industries, including media, entertainment, and sports. The low cost and ease of implementation of AWS analytics services and AWS IoT have allowed AGT, a leader in IoT, to develop their IoTA analytics platform. Using IoTA, AGT brought a tailored solution to EuroLeague Basketball for real-time content production and fan engagement during the 2017-18 season. In this session, we take a deep dive into how this solution is architected for secure, scalable, and highly performant data collection from athletes, coaches, and fans. We also talk about how the data is transformed into insights and integrated into a content generation pipeline. Lastly, we demonstrate how this solution can be easily adapted for other industries and applications.
ABD222 – How to Confidently Unleash Data to Meet the Needs of Your Entire Organization Where are you on the spectrum of IT leaders? Are you confident that you’re providing the technology and solutions that consistently meet or exceed the needs of your internal customers? Do your peers at the executive table see you as an innovative technology leader? Innovative IT leaders understand the value of getting data and analytics directly into the hands of decision makers, and into their own. In this session, Daren Thayne, Domo’s Chief Technology Officer, shares how innovative IT leaders are helping drive a culture change at their organizations. See how transformative it can be to have real-time access to all of the data that is relevant to YOUR job (including a complete view of your entire AWS environment), as well as understand how it can help you lead the way in applying that same pattern throughout your entire company.
ABD303 – Developing an Insights Platform – Sysco’s Journey from Disparate Systems to Data Lake and Beyond Sysco has nearly 200 operating companies across its multiple lines of business throughout the United States, Canada, Central/South America, and Europe. As the global leader in food services, Sysco identified the need to streamline the collection, transformation, and presentation of data produced by the distributed units and systems, into a central data ecosystem. Sysco’s Business Intelligence and Analytics team addressed these requirements by creating a data lake with scalable analytics and query engines leveraging AWS services. In this session, Sysco will outline their journey from a hindsight reporting focused company to an insights driven organization. They will cover solution architecture, challenges, and lessons learned from deploying a self-service insights platform. They will also walk through the design patterns they used and how they designed the solution to provide predictive analytics using Amazon Redshift Spectrum, Amazon S3, Amazon EMR, AWS Glue, Amazon Elasticsearch Service and other AWS services.
ABD309 – How Twilio Scaled Its Data-Driven Culture As a leading cloud communications platform, Twilio has always been strongly data-driven. But as headcount and data volumes grew—and grew quickly—they faced many new challenges. One-off, static reports work when you’re a small startup, but how do you support a growth stage company to a successful IPO and beyond? Today, Twilio’s data team relies on AWS and Looker to provide data access to 700 colleagues. Departments have the data they need to make decisions, and cloud-based scale means they get answers fast. Data delivers real-business value at Twilio, providing a 360-degree view of their customer, product, and business. In this session, you hear firsthand stories directly from the Twilio data team and learn real-world tips for fostering a truly data-driven culture at scale.
ABD310 – How FINRA Secures Its Big Data and Data Science Platform on AWS FINRA uses big data and data science technologies to detect fraud, market manipulation, and insider trading across US capital markets. As a financial regulator, FINRA analyzes highly sensitive data, so information security is critical. Learn how FINRA secures its Amazon S3 Data Lake and its data science platform on Amazon EMR and Amazon Redshift, while empowering data scientists with tools they need to be effective. In addition, FINRA shares AWS security best practices, covering topics such as AMI updates, micro segmentation, encryption, key management, logging, identity and access management, and compliance.
ABD331 – Log Analytics at Expedia Using Amazon Elasticsearch Service Expedia uses Amazon Elasticsearch Service (Amazon ES) for a variety of mission-critical use cases, ranging from log aggregation to application monitoring and pricing optimization. In this session, the Expedia team reviews how they use Amazon ES and Kibana to analyze and visualize Docker startup logs, AWS CloudTrail data, and application metrics. They share best practices for architecting a scalable, secure log analytics solution using Amazon ES, so you can add new data sources almost effortlessly and get insights quickly.
ABD316 – American Heart Association: Finding Cures to Heart Disease Through the Power of Technology Combining disparate datasets and making them accessible to data scientists and researchers is a prevalent challenge for many organizations, not just in healthcare research. American Heart Association (AHA) has built a data science platform using Amazon EMR, Amazon Elasticsearch Service, and other AWS services that corrals multiple datasets and enables advanced research on phenotype and genotype datasets, aimed at curing heart diseases. In this session, we present how AHA built this platform and the key challenges they addressed with the solution. We also provide a demo of the platform, and leave you with suggestions and next steps so you can build similar solutions for your use cases.
ABD319 – Tooling Up for Efficiency: DIY Solutions @ Netflix At Netflix, we have traditionally approached cloud efficiency from a human standpoint, whether it be in-person meetings with the largest service teams or manually flipping reservations. Over time, we realized that these manual processes are not scalable as the business continues to grow. Therefore, in the past year, we have focused on building out tools that allow us to make more insightful, data-driven decisions around capacity and efficiency. In this session, we discuss the DIY applications, dashboards, and processes we built to help with capacity and efficiency. We start at the ten thousand foot view to understand the unique business and cloud problems that drove us to create these products, and discuss implementation details, including the challenges encountered along the way. Tools discussed include Picsou, the successor to our AWS billing file cost analyzer; Libra, an easy-to-use reservation conversion application; and cost and efficiency dashboards that relay useful financial context to 50+ engineering teams and managers.
ABD312 – Deep Dive: Migrating Big Data Workloads to AWS Customers are migrating their analytics, data processing (ETL), and data science workloads running on Apache Hadoop, Spark, and data warehouse appliances from on-premises deployments to AWS in order to save costs, increase availability, and improve performance. AWS offers a broad set of analytics services, including solutions for batch processing, stream processing, machine learning, data workflow orchestration, and data warehousing. This session will focus on identifying the components and workflows in your current environment and providing the best practices to migrate these workloads to the right AWS data analytics product. We will cover services such as Amazon EMR, Amazon Athena, Amazon Redshift, Amazon Kinesis, and more. We will also feature Vanguard, an American investment management company based in Malvern, Pennsylvania, with over $4.4 trillion in assets under management. Ritesh Shah, Sr. Program Manager for Cloud Analytics Program at Vanguard, will describe how they orchestrated their migration to AWS analytics services, including Hadoop and Spark workloads to Amazon EMR. Ritesh will highlight the technical challenges they faced and overcame along the way, as well as share common recommendations and tuning tips to accelerate the time to production.
ABD402 – How Esri Optimizes Massive Image Archives for Analytics in the Cloud Petabyte-scale archives of satellite, plane, and drone imagery continue to grow exponentially. They mostly exist as semi-structured data, but they are only valuable when accessed and processed by a wide range of products for both visualization and analysis. This session provides an overview of how ArcGIS indexes and structures data so that any part of it can be quickly accessed, processed, and analyzed by reading only the minimum amount of data needed for the task. In this session, we share best practices for structuring and compressing massive datasets in Amazon S3, so they can be analyzed efficiently. We also review a number of different image formats, including GeoTIFF (used for the Public Datasets on AWS program, Landsat on AWS), cloud optimized GeoTIFF, MRF, and CRF as well as different compression approaches to show the effect on processing performance. Finally, we provide examples of how this technology has been used to help image processing and analysis for the response to Hurricane Harvey.
ABD329 – A Look Under the Hood – How Amazon.com Uses AWS Services for Analytics at Massive Scale Amazon’s consumer business continues to grow, and so does the volume of data and the number and complexity of the analytics done in support of the business. In this session, we talk about how Amazon.com uses AWS technologies to build a scalable environment for data and analytics. We look at how Amazon is evolving the world of data warehousing with a combination of a data lake and parallel, scalable compute engines such as Amazon EMR and Amazon Redshift.
ABD327 – Migrating Your Traditional Data Warehouse to a Modern Data Lake In this session, we discuss the latest features of Amazon Redshift and Redshift Spectrum, and take a deep dive into its architecture and inner workings. We share many of the recent availability, performance, and management enhancements and how they improve your end user experience. You also hear from 21st Century Fox, who presents a case study of their fast migration from an on-premises data warehouse to Amazon Redshift. Learn how they are expanding their data warehouse to a data lake that encompasses multiple data sources and data formats. This architecture helps them tie together siloed business units and get actionable 360-degree insights across their consumer base.
MCL202 – Ally Bank & Cognizant: Transforming Customer Experience Using Amazon Alexa Given the increasing popularity of natural language interfaces such as Voice as User technology or conversational artificial intelligence (AI), Ally® Bank was looking to interact with customers by enabling direct transactions through conversation or voice. They also needed to develop a capability that allows third parties to connect to the bank securely for information sharing and exchange, using oAuth, an authentication protocol seen as the future of secure banking technology. Cognizant’s Architecture team partnered with Ally Bank’s Enterprise Architecture group and identified the right product for oAuth integration with Amazon Alexa and third-party technologies. In this session, we discuss how building products with conversational AI helps Ally Bank offer an innovative customer experience; increase retention through improved data-driven personalization; increase the efficiency and convenience of customer service; and gain deep insights into customer needs through data analysis and predictive analytics to offer new products and services.
MCL317 – Orchestrating Machine Learning Training for Netflix Recommendations At Netflix, we use machine learning (ML) algorithms extensively to recommend relevant titles to our 100+ million members based on their tastes. Everything on the member home page is an evidence-driven, A/B-tested experience that we roll out backed by ML models. These models are trained using Meson, our workflow orchestration system. Meson distinguishes itself from other workflow engines by handling more sophisticated execution graphs, such as loops and parameterized fan-outs. Meson can schedule Spark jobs, Docker containers, bash scripts, gists of Scala code, and more. Meson also provides a rich visual interface for monitoring active workflows and inspecting execution logs. It has a powerful Scala DSL for authoring workflows as well as a REST API. In this session, we focus on how Meson trains recommendation ML models in production, and how we have re-architected it to scale up for the growing need for broad ETL applications within Netflix. As a driver for this change, we have had to evolve the persistence layer for Meson. We talk about how we migrated from Cassandra to Amazon RDS backed by Amazon Aurora.
MCL350 – Humans vs. the Machines: How Pinterest Uses Amazon Mechanical Turk’s Worker Community to Improve Machine Learning Ever since the term “crowdsourcing” was coined in 2006, it’s been a buzzword for technology companies and social institutions. In the technology sector, crowdsourcing is instrumental for verifying machine learning algorithms, which, in turn, improves the user’s experience. In this session, we explore how Pinterest adapted to an increased reliance on human evaluation to improve their product, with a focus on how they’ve integrated with Mechanical Turk’s platform. This presentation is aimed at engineers, analysts, program managers, and product managers who are interested in how companies rely on Mechanical Turk’s human evaluation platform to better understand content and improve machine learning algorithms. The discussion focuses on the analysis and product decisions related to building a high quality crowdsourcing system that takes advantage of Mechanical Turk’s powerful worker community.
ABD201 – Big Data Architectural Patterns and Best Practices on AWS In this session, we simplify big data processing as a data bus comprising various stages: collect, store, process, analyze, and visualize. Next, we discuss how to choose the right technology in each stage based on criteria such as data structure, query latency, cost, request rate, item size, data volume, durability, and so on. Finally, we provide reference architectures, design patterns, and best practices for assembling these technologies to solve your big data problems at the right cost.
ABD202 – Best Practices for Building Serverless Big Data Applications Serverless technologies let you build and scale applications and services rapidly without the need to provision or manage servers. In this session, we show you how to incorporate serverless concepts into your big data architectures. We explore the concepts behind and benefits of serverless architectures for big data, looking at design patterns to ingest, store, process, and visualize your data. Along the way, we explain when and how you can use serverless technologies to streamline data processing, minimize infrastructure management, and improve agility and robustness and share a reference architecture using a combination of cloud and open source technologies to solve your big data problems. Topics include: use cases and best practices for serverless big data applications; leveraging AWS technologies such as Amazon DynamoDB, Amazon S3, Amazon Kinesis, AWS Lambda, Amazon Athena, and Amazon EMR; and serverless ETL, event processing, ad hoc analysis, and real-time analytics.
ABD206 – Building Visualizations and Dashboards with Amazon QuickSight Just as a picture is worth a thousand words, a visual is worth a thousand data points. A key aspect of our ability to gain insights from our data is to look for patterns, and these patterns are often not evident when we simply look at data in tables. The right visualization will help you gain a deeper understanding in a much quicker timeframe. In this session, we will show you how to quickly and easily visualize your data using Amazon QuickSight. We will show you how you can connect to data sources, generate custom metrics and calculations, create comprehensive business dashboards with various chart types, and set up filters and drill-downs to slice and dice the data.
ABD203 – Real-Time Streaming Applications on AWS: Use Cases and Patterns To win in the marketplace and provide differentiated customer experiences, businesses need to be able to use live data in real time to facilitate fast decision making. In this session, you learn common streaming data processing use cases and architectures. First, we give an overview of streaming data and AWS streaming data capabilities. Next, we look at a few customer examples and their real-time streaming applications. Finally, we walk through common architectures and design patterns of top streaming data use cases.
ABD213 – How to Build a Data Lake with AWS Glue Data Catalog As data volumes grow and customers store more data on AWS, they often have valuable data that is not easily discoverable and available for analytics. The AWS Glue Data Catalog provides a central view of your data lake, making data readily available for analytics. We introduce key features of the AWS Glue Data Catalog and its use cases. Learn how crawlers can automatically discover your data, extract relevant metadata, and add it as table definitions to the AWS Glue Data Catalog. We will also explore the integration between AWS Glue Data Catalog and Amazon Athena, Amazon EMR, and Amazon Redshift Spectrum.
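A crawler like the one this session describes can be created and kicked off with a few boto3 calls; the crawler name, role ARN, database, and S3 path below are placeholders rather than anything from the talk, and the Glue database is assumed to exist already.

```python
import boto3

glue = boto3.client('glue')

# Hypothetical values: an IAM role Glue can assume and the bucket prefix to crawl
ROLE_ARN = 'arn:aws:iam::123456789012:role/GlueCrawlerRole'
DATA_PATH = 's3://example-data-lake/raw/clickstream/'

glue.create_crawler(
    Name='clickstream-crawler',
    Role=ROLE_ARN,
    DatabaseName='data_lake',
    Targets={'S3Targets': [{'Path': DATA_PATH}]},
)
glue.start_crawler(Name='clickstream-crawler')

# Once the crawler finishes, the discovered tables are queryable from
# Athena, EMR, or Redshift Spectrum via the shared Data Catalog.
for table in glue.get_tables(DatabaseName='data_lake')['TableList']:
    print(table['Name'])
```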
ABD214 – Real-time User Insights for Mobile and Web Applications with Amazon Pinpoint With customers demanding relevant and real-time experiences across a range of devices, digital businesses are looking to gather user data at scale, understand this data, and respond to customer needs instantly. This requires tools that can record large volumes of user data in a structured fashion, and then instantly make this data available to generate insights. In this session, we demonstrate how you can use Amazon Pinpoint to capture user data in a structured yet flexible manner. Further, we demonstrate how this data can be set up for instant consumption using services like Amazon Kinesis Firehose and Amazon Redshift. We walk through example data based on real world scenarios, to illustrate how Amazon Pinpoint lets you easily organize millions of events, record them in real-time, and store them for further analysis.
ABD223 – IT Innovators: New Technology for Leveraging Data to Enable Agility, Innovation, and Business Optimization Companies of all sizes are looking for technology to efficiently leverage data and their existing IT investments to stay competitive and understand where to find new growth. Regardless of where companies are in their data-driven journey, they face greater demands for information by customers, prospects, partners, vendors and employees. All stakeholders inside and outside the organization want information on-demand or in “real time”, available anywhere on any device. They want to use it to optimize business outcomes without having to rely on complex software tools or human gatekeepers to relevant information. Learn how IT innovators at companies such as MasterCard, Jefferson Health, and TELUS are using Domo’s Business Cloud to help their organizations more effectively leverage data at scale.
ABD301 – Analyzing Streaming Data in Real Time with Amazon Kinesis Amazon Kinesis makes it easy to collect, process, and analyze real-time, streaming data so you can get timely insights and react quickly to new information. In this session, we present an end-to-end streaming data solution using Kinesis Streams for data ingestion, Kinesis Analytics for real-time processing, and Kinesis Firehose for persistence. We review in detail how to write SQL queries using streaming data and discuss best practices to optimize and monitor your Kinesis Analytics applications. Lastly, we discuss how to estimate the cost of the entire system.
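On the ingestion side of a pipeline like this, producers simply put records onto a Kinesis stream; downstream, a Kinesis Analytics application runs SQL over them and Firehose persists the results. The short producer sketch below is generic, with an invented stream name and payload, not code from the session.

```python
import json
import time
import boto3

kinesis = boto3.client('kinesis')
STREAM = 'clickstream-events'          # hypothetical Kinesis stream

def send_event(user_id, action):
    """Put one JSON event on the stream, partitioned by user."""
    kinesis.put_record(
        StreamName=STREAM,
        Data=json.dumps({'user': user_id, 'action': action, 'ts': time.time()}).encode('utf-8'),
        PartitionKey=str(user_id))

for i in range(10):
    send_event(user_id=i % 3, action='page_view')
```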
ABD302 – Real-Time Data Exploration and Analytics with Amazon Elasticsearch Service and Kibana In this session, we use Apache web logs as an example and show you how to build an end-to-end analytics solution. First, we cover how to configure an Amazon ES cluster and ingest data using Amazon Kinesis Firehose. We look at best practices for choosing instance types, storage options, shard counts, and index rotations based on the throughput of incoming data. Then we demonstrate how to set up a Kibana dashboard and build custom dashboard widgets. Finally, we review approaches for generating custom, ad-hoc reports.
ABD304 – Best Practices for Data Warehousing with Amazon Redshift & Redshift Spectrum Most companies are overrun with data, yet they lack critical insights to make timely and accurate business decisions. They are missing the opportunity to combine large amounts of new, unstructured big data that resides outside their data warehouse with trusted, structured data inside their data warehouse. In this session, we take an in-depth look at how modern data warehousing blends and analyzes all your data, inside and outside your data warehouse without moving the data, to give you deeper insights to run your business. We will cover best practices on how to design optimal schemas, load data efficiently, and optimize your queries to deliver high throughput and performance.
ABD305 – Design Patterns and Best Practices for Data Analytics with Amazon EMR Amazon EMR is one of the largest Hadoop operators in the world, enabling customers to run ETL, machine learning, real-time processing, data science, and low-latency SQL at petabyte scale. In this session, we introduce you to Amazon EMR design patterns such as using Amazon S3 instead of HDFS, taking advantage of both long and short-lived clusters, and other Amazon EMR architectural best practices. We talk about lowering cost with Auto Scaling and Spot Instances, and security best practices for encryption and fine-grained access control. Finally, we dive into some of our recent launches to keep you current on our latest features.
ABD307 – Deep Analytics for Global AWS Marketing Organization To meet the needs of the global marketing organization, the AWS marketing analytics team built a scalable platform that allows the data science team to deliver custom econometric and machine learning models for end user self-service. To meet data security standards, we use end-to-end data encryption and different AWS services such as Amazon Redshift, Amazon RDS, Amazon S3, Amazon EMR with Apache Spark and Auto Scaling. In this session, you see real examples of how we have scaled and automated critical analysis, such as calculating the impact of marketing programs like re:Invent and prioritizing leads for our sales teams.
ABD311 – Deploying Business Analytics at Enterprise Scale with Amazon QuickSight One of the biggest tradeoffs customers usually make when deploying BI solutions at scale is agility versus governance. Large-scale BI implementations with the right governance structure can take months to design and deploy. In this session, learn how you can avoid making this tradeoff using Amazon QuickSight. Learn how to easily deploy Amazon QuickSight to thousands of users using Active Directory and Federated SSO, while securely accessing your data sources in Amazon VPCs or on-premises. We also cover how to control access to your datasets, implement row-level security, create scheduled email reports, and audit access to your data.
ABD315 – Building Serverless ETL Pipelines with AWS Glue Organizations need to gain insight and knowledge from a growing number of Internet of Things (IoT) devices, APIs, clickstreams, and unstructured and log data sources. However, organizations are also often limited by legacy data warehouses and ETL processes that were designed for transactional data. In this session, we introduce key ETL features of AWS Glue and cover common use cases ranging from scheduled nightly data warehouse loads to near real-time, event-driven ETL flows for your data lake. We discuss how to build scalable, efficient, and serverless ETL pipelines using AWS Glue. Additionally, Merck will share how they built an end-to-end ETL pipeline for their application release management system, and launched it in production in less than a week using AWS Glue.
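As a rough illustration of the kind of serverless ETL the session covers, the following is a hedged sketch of a Glue PySpark job, assuming a crawler has already cataloged a source table; the database, table, and S3 path names are invented for the example rather than taken from the session.

```python
# Minimal AWS Glue job sketch (PySpark): read a cataloged table, drop null
# fields, and write Parquet to S3. All names below are placeholders.
import sys
from awsglue.transforms import DropNullFields
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Source table assumed to have been registered by a Glue crawler.
raw = glue_context.create_dynamic_frame.from_catalog(
    database="sales_db", table_name="raw_orders")

cleaned = DropNullFields.apply(frame=raw)

# Write curated output back to the data lake in a columnar format.
glue_context.write_dynamic_frame.from_options(
    frame=cleaned,
    connection_type="s3",
    connection_options={"path": "s3://example-datalake/curated/orders/"},
    format="parquet",
)
job.commit()
```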
ABD318 – Architecting a data lake with Amazon S3, Amazon Kinesis, and Amazon Athena Learn how to architect a data lake where different teams within your organization can publish and consume data in a self-service manner. As organizations aim to become more data-driven, data engineering teams have to build architectures that can cater to the needs of diverse users – from developers, to business analysts, to data scientists. Each of these user groups employs different tools, has different data needs, and accesses data in different ways. In this talk, we will dive deep into assembling a data lake using Amazon S3, Amazon Kinesis, Amazon Athena, Amazon EMR, and AWS Glue. The session will feature Mohit Rao, Architect and Integration lead at Atlassian, the maker of products such as JIRA, Confluence, and Stride. First, we will look at a couple of common architectures for building a data lake. Then we will show how Atlassian built a self-service data lake, where any team within the company can publish a dataset to be consumed by a broad set of users.
Companies have valuable data that they may not be analyzing due to the complexity, scalability, and performance issues of loading the data into their data warehouse. However, with the right tools, you can extend your analytics to query data in your data lake—with no loading required. Amazon Redshift Spectrum extends the analytic power of Amazon Redshift beyond data stored in your data warehouse to run SQL queries directly against vast amounts of unstructured data in your Amazon S3 data lake. This gives you the freedom to store your data where you want, in the format you want, and have it available for analytics when you need it. Join a discussion with AWS solution architects to ask questions.
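To make the "no loading required" idea concrete, here is a small, assumption-laden sketch of querying S3-resident data from Redshift via Spectrum using psycopg2; the cluster endpoint, credentials, IAM role, and table names are placeholders rather than anything from the discussion above.

```python
# Illustrative only: query S3 data in place through Redshift Spectrum.
# Endpoint, credentials, IAM role, and table names are all placeholders.
import psycopg2

conn = psycopg2.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
    port=5439, dbname="analytics", user="awsuser", password="example")
conn.autocommit = True
cur = conn.cursor()

# Map an external (Spectrum) schema onto an existing data catalog database.
cur.execute("""
    CREATE EXTERNAL SCHEMA IF NOT EXISTS spectrum
    FROM DATA CATALOG DATABASE 'datalake_db'
    IAM_ROLE 'arn:aws:iam::123456789012:role/ExampleSpectrumRole'
""")

# Join a local warehouse table with raw S3 data without loading it into Redshift.
cur.execute("""
    SELECT c.region, COUNT(*) AS clicks
    FROM spectrum.click_events AS c          -- external table over S3
    JOIN dim_customers AS d ON d.user_id = c.user_id
    GROUP BY c.region
""")
print(cur.fetchall())
conn.close()
```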
ABD330 – Combining Batch and Stream Processing to Get the Best of Both Worlds Today, many architects and developers are looking to build solutions that integrate batch and real-time data processing, and deliver the best of both approaches. Lambda architecture (not to be confused with the AWS Lambda service) is a design pattern that leverages both batch and real-time processing within a single solution to meet the latency, accuracy, and throughput requirements of big data use cases. Come join us for a discussion on how to implement Lambda architecture (batch, speed, and serving layers) and best practices for data processing, loading, and performance tuning.
ABD335 – Real-Time Anomaly Detection Using Amazon Kinesis Amazon Kinesis Analytics offers a built-in machine learning algorithm that you can use to easily detect anomalies in your VPC network traffic and improve security monitoring. Join us for an interactive discussion on how to stream your VPC Flow Logs to Amazon Kinesis Streams and identify anomalies using Kinesis Analytics.
ABD339 – Deep Dive and Best Practices for Amazon Athena Amazon Athena is an interactive query service that enables you to process data directly from Amazon S3 without the need for infrastructure. Since its launch at re:Invent 2016, several organizations have adopted Athena as the central tool to process all their data. In this talk, we dive deep into the most common use cases, including working with other AWS services. We review the best practices for creating tables and partitions and performance optimizations. We also dive into how Athena handles security, authorization, and authentication. Lastly, we hear from a customer who has reduced costs and improved time to market by deploying Athena across their organization.
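For readers who have not tried Athena, the following hedged sketch shows one common pattern: submitting an ad-hoc query with boto3 and polling for the result. The database, table, partition columns, and results bucket are illustrative placeholders, not details from the session.

```python
# Illustrative only: run an ad-hoc Athena query over partitioned S3 data.
# Database, table, and the results bucket below are made up for the example.
import time
import boto3

athena = boto3.client("athena", region_name="us-east-1")

qid = athena.start_query_execution(
    QueryString="""
        SELECT status, COUNT(*) AS hits
        FROM weblogs.access_logs          -- assumed to be partitioned by year/month
        WHERE year = '2017' AND month = '11'
        GROUP BY status
    """,
    QueryExecutionContext={"Database": "weblogs"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)["QueryExecutionId"]

# Poll until the query finishes, then print the result rows (first row is the header).
while True:
    state = athena.get_query_execution(QueryExecutionId=qid)["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state == "SUCCEEDED":
    for row in athena.get_query_results(QueryExecutionId=qid)["ResultSet"]["Rows"]:
        print([col.get("VarCharValue") for col in row["Data"]])
```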
We look forward to meeting you at re:Invent 2017!
About the Author
Roy Ben-Alta is a solution architect and principal business development manager at Amazon Web Services in New York. He focuses on Data Analytics and ML Technologies, working with AWS customers to build innovative data-driven products.
In August, four US Senators introduced a bill designed to improve Internet of Things (IoT) security. The IoT Cybersecurity Improvement Act of 2017 is a modest piece of legislation. It doesn’t regulate the IoT market. It doesn’t single out any industries for particular attention, or force any companies to do anything. It doesn’t even modify the liability laws for embedded software. Companies can continue to sell IoT devices with whatever lousy security they want.
What the bill does do is leverage the government’s buying power to nudge the market: any IoT product that the government buys must meet minimum security standards. It requires vendors to ensure that devices can not only be patched, but are patched in an authenticated and timely manner; don’t have unchangeable default passwords; and are free from known vulnerabilities. It’s about as low a security bar as you can set, and that it will considerably improve security speaks volumes about the current state of IoT security. (Full disclosure: I helped draft some of the bill’s security requirements.)
The bill would also modify the Computer Fraud and Abuse and the Digital Millennium Copyright Acts to allow security researchers to study the security of IoT devices purchased by the government. It’s a far narrower exemption than our industry needs. But it’s a good first step, which is probably the best thing you can say about this legislation.
However, it’s unlikely this first step will even be taken. I am writing this column in August, and have no doubt that the bill will have gone nowhere by the time you read it in October or later. If hearings are held, they won’t matter. The bill won’t have been voted on by any committee, and it won’t be on any legislative calendar. The odds of this bill becoming law are zero. And that’s not just because of current politics — I’d be equally pessimistic under the Obama administration.
But the situation is critical. The Internet is dangerous — and the IoT gives it not just eyes and ears, but also hands and feet. Security vulnerabilities, exploits, and attacks that once affected only bits and bytes now affect flesh and blood.
Markets, as we’ve repeatedly learned over the past century, are terrible mechanisms for improving the safety of products and services. It was true for automobile, food, restaurant, airplane, fire, and financial-instrument safety. The reasons are complicated, but basically, sellers don’t compete on safety features because buyers can’t efficiently differentiate products based on safety considerations. The race-to-the-bottom mechanism that markets use to minimize prices also minimizes quality. Without government intervention, the IoT remains dangerously insecure.
The US government has no appetite for intervention, so we won’t see serious safety and security regulations, a new federal agency, or better liability laws. We might have a better chance in the EU. Depending on how the General Data Protection Regulation on data privacy pans out, the EU might pass a similar security law in 5 years. No other country has a large enough market share to make a difference.
Sometimes we can opt out of the IoT, but that option is becoming increasingly rare. Last year, I tried and failed to purchase a new car without an Internet connection. In a few years, it’s going to be nearly impossible to not be multiply connected to the IoT. And our biggest IoT security risks will stem not from devices we have a market relationship with, but from everyone else’s cars, cameras, routers, drones, and so on.
We can try to shop our ideals and demand more security, but companies don’t compete on IoT safety — and we security experts aren’t a large enough market force to make a difference.
We need a Plan B, although I’m not sure what that is. E-mail me if you have any ideas.
This essay previously appeared in the September/October issue of IEEE Security & Privacy.
This very interesting essay looks at the future of military robotics and finds many analogs in nature:
Imagine a low-cost drone with the range of a Canada goose, a bird that can cover 1,500 miles in a single day at an average speed of 60 miles per hour. Planet Earth profiled a single flock of snow geese, birds that make similar marathon journeys, albeit slower. The flock of six-pound snow geese was so large it formed a sky-darkening cloud 12 miles long. How would an aircraft carrier battlegroup respond to an attack from millions of aerial kamikaze explosive drones that, like geese, can fly hundreds of miles? A single aircraft carrier costs billions of dollars, and the United States relies heavily on its ten aircraft carrier strike groups to project power around the globe. But as military robots match more capabilities found in nature, some of the major systems and strategies upon which U.S. national security currently relies — perhaps even the fearsome aircraft carrier strike group — might experience the same sort of technological disruption that the smartphone revolution brought about in the consumer world.
Technological advances change the world. That’s partly because of what they are, but even more because of the social changes they enable. New technologies upend power balances. They give groups new capabilities, increased effectiveness, and new defenses. The Internet decades have been a never-ending series of these upendings. We’ve seen existing industries fall and new industries rise. We’ve seen governments become more powerful in some areas and less in others. We’ve seen the rise of a new form of governance: a multi-stakeholder model where skilled individuals can have more power than multinational corporations or major governments.
Among the many power struggles, there is one type I want to particularly highlight: the battles between the nimble individuals who start using a new technology first, and the slower organizations that come along later.
In general, the unempowered are the first to benefit from new technologies: hackers, dissidents, marginalized groups, criminals, and so on. When they first encountered the Internet, it was transformative. Suddenly, they had access to technologies for dissemination, coordination, organization, and action — things that were impossibly hard before. This can be incredibly empowering. In the early decades of the Internet, we saw it in the rise of Usenet discussion forums and special-interest mailing lists, in how the Internet routed around censorship, and how Internet governance bypassed traditional government and corporate models. More recently, we saw it in the SOPA/PIPA debate of 2011-12, the Gezi protests in Turkey and the various “color” revolutions, and the rising use of crowdfunding. These technologies can invert power dynamics, even in the presence of government surveillance and censorship.
But that’s just half the story. Technology magnifies power in general, but the rates of adoption are different. Criminals, dissidents, the unorganized — all outliers — are more agile. They can make use of new technologies faster, and can magnify their collective power because of it. But when the already-powerful big institutions finally figured out how to use the Internet, they had more raw power to magnify.
This is true for both governments and corporations. We now know that governments all over the world are militarizing the Internet, using it for surveillance, censorship, and propaganda. Large corporations are using it to control what we can do and see, and the rise of winner-take-all distribution systems only exacerbates this.
This is the fundamental tension at the heart of the Internet, and information-based technology in general. The unempowered are more efficient at leveraging new technology, while the powerful have more raw power to leverage. These two trends lead to a battle between the quick and the strong: the quick who can make use of new power faster, and the strong who can make use of that same power more effectively.
This battle is playing out today in many different areas of information technology. You can see it in the security vs. surveillance battles between criminals and the FBI, or dissidents and the Chinese government. You can see it in the battles between content pirates and various media organizations. You can see it where social-media giants and Internet-commerce giants battle against new upstarts. You can see it in politics, where the newer Internet-aware organizations fight with the older, more established, political organizations. You can even see it in warfare, where a small cadre of military can keep a country under perpetual bombardment — using drones — with no risk to the attackers.
This battle is fundamental to Cory Doctorow’s new novel Walkaway. Our heroes represent the quick: those who have checked out of traditional society, and thrive because easy access to 3D printers enables them to eschew traditional notions of property. Their enemy is the strong: the traditional government institutions that exert their power mostly because they can. This battle rages through most of the book, as the quick embrace ever-new technologies and the strong struggle to catch up.
It’s easy to root for the quick, both in Doctorow’s book and in the real world. And while I’m not going to give away Doctorow’s ending — and I don’t know enough to predict how it will play out in the real world — right now, trends favor the strong.
Centralized infrastructure favors traditional power, and the Internet is becoming more centralized. This is true at the endpoints, where companies like Facebook, Apple, Google, and Amazon control much of how we interact with information. It's also true in the middle, where companies like Comcast increasingly control how information gets to us. It's true in countries like Russia and China, which increasingly legislate their own national agenda onto their pieces of the Internet. And it's even true in countries like the US and the UK, which increasingly legislate more government surveillance capabilities.
At the 1996 World Economic Forum, cyber-libertarian John Perry Barlow issued his “Declaration of the Independence of Cyberspace,” telling the assembled world leaders and titans of industry: “You have no moral right to rule us, nor do you possess any methods of enforcement that we have true reason to fear.” Many of us believed him a scant 20 years ago, but today those words ring hollow.
But if history is any guide, these things are cyclic. In another 20 years, even newer technologies — both the ones Doctorow focuses on and the ones no one can predict — could easily tip the balance back in favor of the quick. Whether that will result in more of a utopia or a dystopia depends partly on these technologies, but even more on the social changes resulting from these technologies. I’m short-term pessimistic but long-term optimistic.
What do you do when you have almost half a petabyte (PB) of data? That’s the situation in which Michael Oskierko finds himself. He’s a self-proclaimed digital pack rat who’s amassed more than 390 terabytes (TB) total, and it’s continuing to grow.
Based in Texas, Michael Oskierko is a financial analyst by day. But he’s set up one of the biggest personal data warehouses we’ve seen. The Oskierko family has a huge collection of photos, videos, documents and more – much more than most of us. Heck, more data than many companies have.
How Did It Get Like This?
“There was a moment when we were pregnant with our second child,” Michael explained. “I guess it was a nesting instinct. I was looking at pictures of our first child and played them back on a 4K monitor. It was grainy and choppy.”
Disappointed with the quality of those early images, he vowed to store future memories in a pristine state. “I got a DSLR that took great pictures and saved everything in RAW format. That’s about 30 MB per image right there.”
Michael says he now has close to 1 million photos (from many different devices, not just the DSLR) and about 200,000 videos stored in their original formats. The video footage from his drone alone occupies about 300 GB.
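A quick back-of-envelope calculation shows how the photo numbers alone account for a meaningful slice of the 390 TB, under the simplifying assumption that every shot were a 30 MB RAW file:

```python
# Rough check on the photo figures quoted above (decimal units).
photos = 1_000_000        # "close to 1 million photos"
raw_mb_per_photo = 30     # "about 30 MB per image" in RAW format

photo_tb = photos * raw_mb_per_photo / 1_000_000   # MB -> TB
print(f"If every shot were RAW: ~{photo_tb:.0f} TB")  # ~30 TB of the 390 TB total
```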
The Oskierkos are also avid music listeners: iTunes counts 707 days’ worth of music in their library at present. Michael keeps Green Day’s entire library on heavy rotation, with a lot of other alternative rock a few clicks away. His wife’s musical tastes are quite broad, ranging from ghetto rap to gospel. They’re also avid audiobook listeners, and it all adds up: Dozens more TB of shared storage space dedicated to audio files.
What’s more, he’s kept very careful digital records of stuff that otherwise might have gotten tossed to the curbside years ago. “I have every single note, test, project, and assignment from 7th grade through graduate school scanned and archived,” he tells us. He’s even scanned his textbooks from high school and college!
“I started cutting these up and scanning the pages before the nifty ‘Scan to PDF’ was a real widespread option and duplexing scanners were expensive,” he said.
One of the biggest uses of space isn’t something that Michael needs constant access to, but he’s happy to have when the need arises. As a hobbyist programmer who works in multiple languages and on different platforms, Michael maintains a library of uncompressed disk images (ISOs) which he uses as needed.
When you have this much storage, it’s silly to get greedy with it. Michael operates his sprawling setup as a personal cloud for his family members, as well.
“I have a few hosted websites, and everyone in my family has a preconfigured FTP client to connect to my servers,” he said.
Bargain Hunting For Big Storage
How do you get 390 TB without spending a mint? Michael says it’s all about finding the right deals. The whole thing got started when a former boss asked if Michael would be interested in buying the assets of his shuttered computer repair business. Michael ended up with an inventory of parts which he’s successfully scavenged into the beginning of his 390 TB digital empire.
He’s augmented and improved that over time, evolving his digital library across six distinct storage systems that he uses to maintain all of his family’s personal data. He keeps an eye out wherever he can for good deals.
“There are a few IT support and service places I pass by on my daily commute to work,” he said. He stops in periodically to check if they’re blowing out inventory. eBay and other online auction sites are great places for him to find deals.
“I just bought 100 1 TB drives from a guy on eBay for $4 each,” he said.
Michael has outgrown and retired a bunch of devices over the years as his storage empire has grown, but he keeps an orderly collection of parts and supplies for when he has to make some repairs.
How To Manage Large Directories: Keep It Simple
“I thoroughly enjoy data archiving and organizing,” Michael said. Perhaps a massive understatement. While he’s looked at Digital Asset Management (DAM) software and other tools to manage his ever-growing library, Michael prefers a more straightforward approach to figuring out what’s where. His focus is on a simplified directory structure.
“I would have to say I spend about 2 hours a week just going through files and sorting things out but it’s fun for me,” Michael said. “There are essentially five top-level directories.”
Documents, installs, disk images, music, and a general storage directory comprise the highest hierarchy. “I don’t put files in folders with other folders,” he explained. “The problem I run into is figuring out where to go for old archives that are spread across multiple machines.”
How To Back Up That Much Data
Even though he has a high-speed fiber optic connection to the Internet, Michael doesn’t want to use it all for backup. So, much of his local backup and duplication is done using cloning and Windows’ built-in Xcopy tool, which he manages using home-grown batch files.
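The article doesn't show Michael's actual scripts, but the general approach is easy to picture; the sketch below mirrors the xcopy-plus-batch-file idea from Python, with made-up source and destination paths.

```python
# Illustrative sketch only: mirrors the home-grown xcopy batch-file approach
# described above. Source and destination paths are invented for the example.
import subprocess

JOBS = [
    (r"D:\Photos", r"\\nas01\backup\Photos"),
    (r"D:\DroneFootage", r"\\nas01\backup\DroneFootage"),
]

for src, dst in JOBS:
    # /E copy subdirectories (including empty ones), /D copy only newer files,
    # /Y suppress overwrite prompts, /I treat the destination as a directory.
    subprocess.run(["xcopy", src, dst, "/E", "/D", "/Y", "/I"], check=True)
```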
Michael also relies on Backblaze Personal Backup for mission-critical data on his family’s personal systems. “I recommend it to everyone I talk to,” he said.
In addition to loads of available local storage for backups, three of Michael’s personal computers back up to Backblaze. He makes them accessible to family members who want the peace of mind of cloud-based backup. He’s also set up Backblaze for his father-in-law’s business and his mother’s personal computer.
“I let Backblaze do all the heavy lifting,” he said. “If you ever have a failure, Backblaze will have a copy we can restore.”
Thanks from all of us at Backblaze for spreading the love, Michael!
What’s Next?
The 390 TB is spread across six systems, which has led to some logistical difficulties for Michael, like remembering to power up the right one to get what he needs (he doesn’t typically run everything at once, to conserve electricity).
“Sometimes I have to sit there and think, ‘Where did I store my drone footage,’” Michael said.
To simplify things, Michael is trying to consolidate his setup. And to that end, he recently acquired a decommissioned Storage Pod from Backblaze. He plans to populate the 45-bay Pod with the largest hard drives he can afford, which should make storing all that data simpler and more efficient.
Well, as soon as he can find a great deal on 8 TB and 10 TB drives, anyway. Keep checking eBay, Michael, and stay in touch! We can’t wait to see what your Storage Pod will look like in action!
This column is from The MagPi issue 53. You can download a PDF of the full issue for free, or subscribe to receive the print edition in your mailbox or the digital edition on your tablet. All proceeds from the print and digital editions help the Raspberry Pi Foundation achieve its charitable goals.
Let’s Robot streams twice a week, Tuesdays and Thursdays, and allows the general public to control a team of robots within an interactive set, often consisting of mazes, clues, challenges, and even the occasional foe. Users work together via the Twitch.tv platform, sending instructions to the robots in order to navigate their terrain and complete the set objectives.
Let’s Robot aims to change the way we interact with television, putting the viewer in the driving seat.
Aylobot, the first robot of the project, boasts a LEGO body, while Ninabot, effectively its 2.0 upgrade, has a gripper that allows more interaction from users. Both robots have their own cameras streaming to Twitch, so those in control can see what they’re up to on a more personal level; several new additions have since joined the robot team, each with their own unique skill.
Twice a week, the robots are controlled by the viewers, allowing them the chance to complete tasks such as force-feeding the intern, attempting to write party invitations, and battling in boss fights.
Jillian Ogle
Let’s Robot is the brainchild of Jillian Ogle, who originally set out to make “the world’s first interactive live show using telepresence robots collaboratively controlled by the audience”. However, Jill discovered quite quickly that the robots needed to complete the project simply didn’t exist to the standard required… and so Let’s Robot was born.
After researching various components for the task, Jill decided upon the Raspberry Pi, and it’s this small SBC that now exists within the bodies of Aylobot, Ninabot, and the rest of the Let’s Robot family.
“Post-Its I drew for our #LetsRobot subscribers. We put these in the physical sets made for the robots. I still have a lot more to draw…”
In her previous life, Jill worked in art and game design, including a role as art director for Playdom, a subsidiary of Disney Interactive; she moved on to found Aylo Games in 2013 and Let’s Robot in 2015. The hardware side of the builds has been something of a recently discovered skill, with Jill admitting, “Anything I know about hardware I’ve picked up in the last two years while developing this project.”
“This was my first ever drone flight, live on #twitch. I think it went well. #letsrobot #robot…” – Jillian Ogle (@letsjill) on Instagram
Social media funtimes
More recently, as Let’s Robot continues to grow, Jill can be found sharing the antics of the robots across social media, documenting their quests – such as the hilarious attempt to create party invites and the more recent Hillarybot vs Trumpbot balloon head battle, where robots with extendable pin-mounted arms fight to pop each other’s head.
“Last night was the robot presidential debate, and here is an early version of candidate #Trump bot….” – Jillian Ogle (@letsjill) on Instagram
Gotta catch ’em all
Alongside the robots, Jill has created several other projects that both add to the interactive experience of Let’s Robot and comment on other elements of social trends out in the world. Most notably, there is the Pokémon Go Robot, originally a robot arm that would simulate the throw of an on-screen Poké Ball. It later grew wheels and took to the outside world, hunting down its pocket monster prey.
Originally sitting on a desk, the Pokémon Go Robot earned itself a new upgrade, gaining the body of a rover to allow it to handle the terrain of the outside world. Paired with the Livestream Goggles, viewers can join in the fun.
It’s also worth noting other builds, such as the WiFi Livestream Goggles that Jill can be seen sporting across several social media posts. The goggles, with a Pi camera fitted between the wearer’s eyes, allow viewers to witness Jill’s work from her perspective. It’s a great build, especially given how open the Let’s Robot team are about their continued work and progression.
The WiFi-enabled helmet lets viewers see what Jill sees, offering a new perspective alongside the Let’s Robot bots. The Raspberry Pi camera fits perfectly between the eyes, giving the viewer a true eye-level view. She also created internet-controlled LED eyebrows… see the video!
And finally, one project we are eager to see completed is the ‘in production’ Pi-powered transparent HUD. By incorporating refractive acrylic, Jill aims to create a see-through display that allows her to read user comments via the Twitch live-stream chat, without having to turn her eyes to a separate monitor.
Since the publication of this article in The MagPi magazine, Jill and the Let’s Robot team have continued to grow their project. There are some interesting and exciting developments ahead – we’ll cover their progress in a future blog.
Researchers have configured two computers to talk to each other using a laser and a scanner.
Scanners work by detecting reflected light on their glass pane. The light creates a charge that the scanner translates into binary, which gets converted into an image. But scanners are sensitive to any changes of light in a room — even when paper is on the glass pane or when the light source is infrared — which changes the charges that get converted to binary. This means signals can be sent through the scanner by flashing light at its glass pane using either a visible light source or an infrared laser that is invisible to human eyes.
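The researchers' exact modulation scheme isn't described here, but the core idea, encoding bits as timed light pulses that the compromised scanner can threshold back into data, can be sketched in a few lines. Everything below (the bit period, the set_laser callback, the example payload) is a toy illustration rather than the published attack.

```python
# Toy sketch of the idea only: encode a short command as timed on/off light
# pulses aimed at the scanner's glass. Pulse timing is an arbitrary choice,
# not the scheme used by the researchers.
import time

BIT_PERIOD = 0.5   # seconds per bit (illustrative)

def send_bits(message: bytes, set_laser) -> None:
    """Drive a laser (or any light source) on/off once per bit, MSB first."""
    for byte in message:
        for i in range(7, -1, -1):
            bit = (byte >> i) & 1
            set_laser(bool(bit))      # light on = 1, light off = 0
            time.sleep(BIT_PERIOD)
    set_laser(False)

# The receiving malware would repeatedly trigger scans and threshold the
# brightness of the captured images to recover the same bit sequence.
send_bits(b"run", set_laser=lambda on: print("laser", "ON" if on else "OFF"))
```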
There are a couple of caveats to the attack — the malware to decode the signals has to already be installed on a system on the network, and the lid on the scanner has to be at least partially open to receive the light. It’s not unusual for workers to leave scanner lids open after using them, however, and an attacker could also pay a cleaning crew or other worker to leave the lid open at night.
The setup is that there’s malware on the computer connected to the scanner, and that computer isn’t on the Internet. This technique allows an attacker to communicate with that computer. For extra coolness, the laser can be mounted on a drone.