Vice has a long article about how the US military buys commercial location data worldwide.
The U.S. military is buying the granular movement data of people around the world, harvested from innocuous-seeming apps, Motherboard has learned. The most popular app among a group Motherboard analyzed connected to this sort of data sale is a Muslim prayer and Quran app that has more than 98 million downloads worldwide. Others include a Muslim dating app, a popular Craigslist app, an app for following storms, and a “level” app that can be used to help, for example, install shelves in a bedroom.
This isn’t new, this isn’t just data of non-US citizens, and this isn’t just the US military. We have lots of instances where the government buys data that it cannot legally collect itself.
“My problem with contact tracing apps is that they have absolutely no value,” Bruce Schneier, a privacy expert and fellow at the Berkman Klein Center for Internet & Society at Harvard University, told BuzzFeed News. “I’m not even talking about the privacy concerns, I mean the efficacy. Does anybody think this will do something useful? … This is just something governments want to do for the hell of it. To me, it’s just techies doing techie things because they don’t know what else to do.”
I haven’t blogged about this because I thought it was obvious. But from the tweets and emails I have received, it seems not.
This is a classic identification problem, and efficacy depends on two things: false positives and false negatives.
False positives: Any app will have a precise definition of a contact: let’s say it’s less than six feet for more than ten minutes. The false positive rate is the percentage of contacts that don’t result in transmissions. This will be because of several reasons. One, the app’s location and proximity systems — based on GPS and Bluetooth — just aren’t accurate enough to capture every contact. Two, the app won’t be aware of any extenuating circumstances, like walls or partitions. And three, not every contact results in transmission; the disease has some transmission rate that’s less than 100% (and I don’t know what that is).
False negatives: This is the rate at which the app fails to register a contact when an infection occurs. This, too, has several causes. One, errors in the app’s location and proximity systems. Two, transmissions that occur from people who don’t have the app (even Singapore didn’t get above a 20% adoption rate for the app). And three, not every transmission is a result of that precisely defined contact; the virus sometimes travels further.
Assume you take the app out grocery shopping with you and it subsequently alerts you of a contact. What should you do? It’s not accurate enough for you to quarantine yourself for two weeks. And without ubiquitous, cheap, fast, and accurate testing, you can’t confirm the app’s diagnosis. So the alert is useless.
Similarly, assume you take the app out grocery shopping and it doesn’t alert you of any contact. Are you in the clear? No, you’re not. You actually have no idea if you’ve been infected.
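A rough way to see why both error rates matter is to plug numbers into Bayes' rule. The sketch below is not based on any real app. The infection rate, the chance of an alert when you were infected, and the chance of an alert when you were not are all invented for illustration.

```python
def infection_probability(prior, p_alert_if_infected, p_alert_if_not):
    """Return (P(infected | alert), P(infected | no alert)) via Bayes' rule."""
    p_alert = prior * p_alert_if_infected + (1 - prior) * p_alert_if_not
    after_alert = prior * p_alert_if_infected / p_alert
    after_silence = prior * (1 - p_alert_if_infected) / (1 - p_alert)
    return after_alert, after_silence


# Hypothetical numbers, chosen only for illustration:
# 1% of shoppers are infected on a trip, the app alerts 20% of those
# (low adoption, missed contacts), and it also alerts 5% of the uninfected.
after_alert, after_silence = infection_probability(0.01, 0.20, 0.05)
print(f"infected given alert:    {after_alert:.3f}")
print(f"infected given no alert: {after_silence:.3f}")
```

Under these made-up numbers, an alert moves your chance of infection from 1% to roughly 4%, and silence moves it to roughly 0.8%. Neither result is strong enough to act on.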
The end result is an app that doesn’t work. People will post their bad experiences on social media, and people will read those posts and realize that the app is not to be trusted. That loss of trust is even worse than having no app at all.
It has nothing to do with privacy concerns. The idea that contact tracing can be done with an app, and not human health professionals, is just plain dumb.
At Grab, thousands of bookings happen daily via the Grab app. The driver phones and GPS devices enable us to collect large-scale GPS trajectories.
Apart from the time and location of the object, GPS trajectories are also characterised by other parameters such as speed, heading, the area and distance covered during travel, and the travel time. Thus, the trajectory patterns from users’ GPS data are a valuable source of information for a wide range of urban applications, such as solving transportation problems, predicting traffic, and developing sound urban plans.
Currently, creating and maintaining GPS datasets is a herculean task because it is costly and laborious. As a result, most of the GPS datasets available on the market today have poor coverage or contain outdated information. They cover only a small area of a city, have low sampling rates, and carry little contextual information about the GPS pings, often lacking accuracy level, bearing, and speed. Although over a dozen mapping communities are engaged in collecting GPS trajectory datasets, a significant amount of data cleaning and pre-processing is still required before they can be used.
To overcome the shortfalls in the existing datasets, we built Grab-Posisi, the first GPS trajectory dataset of Southeast Asia. Posisi means “position” in Bahasa. The data was collected from Grab drivers’ phones while in transit. Posisi substantially improves mapping productivity in two ways: it helps add major arterial roads in regions where existing maps have poor coverage, and it incrementally improves coverage in regions where major roads are already mapped.
What’s inside the dataset
The whole Grab-Posisi dataset contains in total 84K trajectories that consist of more than 80 million GPS pings and cover over 1 million km. The average trajectory length is 11.94 km and the average duration per trip is 21.50 minutes.
The data were collected very recently in April 2019 with a 1 second sampling rate, which is the highest amongst all the publicly available datasets. It also has richer contextual information, including the accuracy level, bearing and speed. The accuracy level is important because GPS measurements are noisy and the true location can be anywhere inside a circle centred at the reported location with a radius equal to the accuracy level. The bearing is the horizontal direction of travel, measured in degrees relative to true north. Finally, the speed is reported in meters/second over ground.
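To make these three fields concrete, here is a minimal sketch of how a single ping might be represented and used. The field names are illustrative and are not the dataset's actual column names, the bearing is assumed to be measured clockwise from true north, and the coordinates are invented.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


@dataclass
class Ping:
    # Hypothetical field names, not the dataset's real schema.
    lat: float          # degrees
    lon: float          # degrees
    accuracy_m: float   # radius of the uncertainty circle, metres
    bearing_deg: float  # direction of travel, assumed clockwise from true north
    speed_mps: float    # speed over ground, metres/second


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def could_be_at(ping, lat, lon):
    """True if (lat, lon) lies inside the ping's accuracy circle."""
    return haversine_m(ping.lat, ping.lon, lat, lon) <= ping.accuracy_m


def velocity_components(ping):
    """Split speed into (east, north) components in metres/second."""
    theta = math.radians(ping.bearing_deg)
    return ping.speed_mps * math.sin(theta), ping.speed_mps * math.cos(theta)


ping = Ping(lat=1.3000, lon=103.8000, accuracy_m=10.0, bearing_deg=90.0, speed_mps=12.0)
print(could_be_at(ping, 1.30005, 103.8000))  # candidate about 5.6 m to the north
print(could_be_at(ping, 1.30020, 103.8000))  # candidate about 22 m to the north
print(velocity_components(ping))             # heading due east
```

The first candidate falls inside the 10 m accuracy circle and the second does not, which is the kind of check a downstream algorithm can use to rule out implausible true locations.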
As the GPS trajectories were collected from Grab drivers’ phones while in transit, we labelled each trajectory by phone device type (Android or iOS). This is the first dataset which differentiates such device information. Furthermore, we also label the trajectories by driving mode (Car or Motorcycle).
All drivers’ personal information is encrypted and the real start/end locations are removed within the dataset.
Each trajectory is serialised in a file in Apache Parquet format. The whole dataset size is around 2 GB. Each GPS ping is associated with values for a trajectory ID, latitude, longitude, timestamp (UTC), accuracy level, bearing and speed. The GPS sampling rate is 1 second, which is the highest among all the existing open source datasets. Table 1 shows a sample of the dataset.
Figure 1a shows the spatial coverage of the dataset in Singapore. Compared with the GPS datasets available in the market that only cover a specific area of a city, the Grab-Posisi dataset encompasses almost the whole island of Singapore. Figure 1b depicts the GPS density in Singapore. Red represents high density while green represents low density. Expressways in Singapore are clearly visible because of their dense GPS pings.
Figure 2a illustrates that the Grab-Posisi dataset encloses not only central Jakarta but also extends to external highways. Figure 2b depicts the GPS density of cars in Jakarta. Compared with Singapore, trips in Jakarta are spread out in all different areas, not just concentrated on highways.
Applications of Grab-Posisi
The following are some of the applications of the Grab-Posisi dataset.
On Map Inference
The traditional method used in updating road networks in maps is time-consuming and labour-intensive. That’s why maps might have important roads missing and real-time traffic conditions might be unavailable. To address this problem, we can use GPS trajectories in reconstructing road networks automatically.
A number of map generation algorithms can be applied to infer both map topology and road attributes. Figure 3b shows a snippet of the map inferred from our GPS trajectories (Figure 3a) using one of these algorithms. As you can see from the blue dots, the skeleton of the underlying map is inferred correctly, although some sections of the inferred road are disconnected, and the roundabout in the bottom right corner is not a smooth curve.
Figure 3a. Raw GPS trajectories
Figure 3b. Inferred Map
On Map Matching
Map matching refers to the task of automatically determining the route a driver has travelled on a digital map, given a sequence of raw and noisy GPS points. Correcting raw GPS data is important for many location-based applications such as navigation, tracking, and the road attribute detection mentioned above. The accuracy levels provided in the Grab-Posisi dataset can be of great use in addressing this issue.
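As a rough illustration of how the accuracy level can help, the sketch below snaps a single point to the nearest road segment, but only if that segment lies within the point's accuracy radius. This is a deliberately naive geometric matcher and not the method used for the dataset. Practical map matchers usually also consider the whole sequence of points. Coordinates are assumed to be already projected onto a local grid in metres, and the road names are hypothetical.

```python
import math


def project_onto_segment(p, a, b):
    """Return the closest point to p on segment a-b (all (x, y) in metres)."""
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return a
    # Fraction along the segment, clamped so the result stays on it.
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def snap_to_road(p, accuracy_m, roads):
    """Snap p to the nearest road within its accuracy radius.

    roads maps a road name to a (start, end) segment. Returns
    (road_name, snapped_point), or None if no road is close enough.
    """
    best = None
    for name, (a, b) in roads.items():
        q = project_onto_segment(p, a, b)
        d = math.dist(p, q)
        if d <= accuracy_m and (best is None or d < best[0]):
            best = (d, name, q)
    return None if best is None else (best[1], best[2])


# Two hypothetical roads on a local metre grid.
roads = {
    "main_st": ((0.0, 0.0), (100.0, 0.0)),
    "side_st": ((50.0, 0.0), (50.0, 100.0)),
}
print(snap_to_road((30.0, 4.0), 10.0, roads))   # a ping 4 m off main_st
print(snap_to_road((30.0, 40.0), 10.0, roads))  # too far from either road
```

A ping with a small accuracy radius can be snapped with confidence, while one that is far from every road relative to its radius is better left unmatched than forced onto the wrong segment.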
On Traffic Detection and Forecast
In addition to the inference of a static digital map, the Grab-Posisi GPS dataset can also be used to perform real-time traffic forecasting, which is very important for congestion detection, flow control, route planning, and navigation. Some examples of the fundamental indicators that are mostly used to monitor the current status of traffic conditions include the average speed, volume, and density in each road segment. These variables can be computed based on drivers’ GPS trajectories and can be used to predict the future traffic conditions.
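As a small sketch of how such indicators could be derived, the code below aggregates pings into an average speed and a vehicle count per road segment. The tuple layout and segment IDs are assumptions made for illustration. In particular, each ping is assumed to have already been assigned to a road segment by a map-matching step.

```python
from collections import defaultdict


def segment_stats(pings):
    """Aggregate pings into per-segment average speed and vehicle count.

    Each ping is (trajectory_id, segment_id, speed_mps); the segment_id is
    assumed to come from an earlier map-matching step.
    """
    speeds = defaultdict(list)
    vehicles = defaultdict(set)
    for trajectory_id, segment_id, speed_mps in pings:
        speeds[segment_id].append(speed_mps)
        vehicles[segment_id].add(trajectory_id)
    return {
        seg: {
            "avg_speed_mps": sum(v) / len(v),
            "volume": len(vehicles[seg]),  # distinct trajectories seen
        }
        for seg, v in speeds.items()
    }


# Hypothetical pings for two trajectories on two segments.
pings = [
    ("t1", "seg_a", 12.0),
    ("t1", "seg_a", 14.0),
    ("t2", "seg_a", 10.0),
    ("t2", "seg_b", 3.0),
]
stats = segment_stats(pings)
print(stats)
```

Computing the same statistics over successive time windows would give a time series per segment, which is the usual input to a forecasting model.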
On Mode Detection
Transportation mode detection refers to the task of identifying the travel mode of a user (some examples of transportation mode include walk, bike, car, bus, etc.). The GPS trajectories in our dataset are associated with rich attributes including GPS accuracy, bearing, and speed in addition to the latitude and longitude of geo-coordinates, which can be used to develop mode detection models. Our dataset also labels each trajectory as collected from a car or a motorcycle, which can be used to verify the performance of those models.
The real-world GPS trajectories of people reveal realistic travel patterns and demands, which can be of great help for city planning. As there are some realistic constraints faced by governments such as budget limitations and construction inconvenience, it is important to incorporate both the planning authorities’ requirements and the realistic travel demands mined from trajectories for intelligent city planning. For example, the trajectories of cars can provide suggestions on how to schedule highway constructions. The trajectories of motorcycles can help the government to choose the optimal locations to construct motorcycle lanes for safety concerns.
Want to access our dataset?
The Grab-Posisi dataset offers great value and is a significant resource to the community for benchmarking and revisiting existing technologies.
If you want to access our dataset for research purposes, email [email protected] with the following details:
Your Name and contact details
Your potential usage of the dataset
When using the Grab-Posisi dataset, please cite the following paper:
Huang, X., Yin, Y., Lim, S., Wang, G., Hu, B., Varadarajan, J., … & Zimmermann, R. (2019, November). Grab-Posisi: An Extensive Real-Life GPS Trajectory Dataset in Southeast Asia. In Proceedings of the 3rd ACM SIGSPATIAL International Workshop on Prediction of Human Mobility (pp. 1-10). DOI: https://doi.org/10.1145/3356995.3364536
Note: You cannot use Grab-Posisi dataset for commercial purposes.
Long article on the manipulation of GPS in Shanghai. It seems not to be some Chinese military program, but ships that are stealing sand.
The Shanghai “crop circles,” which somehow spoof each vessel to a different false location, are something new. “I’m still puzzled by this,” says Humphreys. “I can’t get it to work out in the math. It’s an interesting mystery.” It’s also a mystery that raises the possibility of potentially deadly accidents.
“Captains and pilots have become very dependent on GPS, because it has been historically very reliable,” says Humphreys. “If it claims to be working, they rely on it and don’t double-check it all that much.”
On June 5 this year, the Run 5678, a river cargo ship, tried to overtake a smaller craft on the Huangpu, about five miles south of the Bund. The Run avoided the small ship but plowed right into the New Glory (Chinese name: Tong Yang Jingrui), a freighter heading north.
Provides an overview of the hardware and software needed to put together a home-made Chartplotter with its own GPS and AIS receiver. Cost for this project was about $350 US in 2019.
The entire build cost approximately $350. It incorporates a Raspberry Pi 3 Model B+, dAISy AIS receiver HAT, USB GPS module, and touchscreen display, all hooked up to his boat.
Perfect for navigating the often foggy San Francisco Bay, the chartplotter allows James to track the position, speed, and direction of major vessels in the area, superimposed over high-quality NOAA nautical charts.
A year ago, the Norwegian Consumer Council published an excellent security analysis of children’s GPS-connected smart watches. The security was terrible. Not only could parents track the children, anyone else could also track the children.
A recent analysis checked if anything had improved after that torrent of bad press. Short answer: no.
Guess what: a train wreck. Anyone could access the entire database, including real time child location, name, parents details etc. Not just Gator watches either — the same back end covered multiple brands and tens of thousands of watches
The Gator web backend was passing the user level as a parameter. Changing that value to another number gave super admin access throughout the platform. The system failed to validate that the user had the appropriate permission to take admin control!
This means that an attacker could get full access to all account information and all watch information. They could view any user of the system and any device on the system, including its location. They could manipulate everything and even change users’ emails/passwords to lock them out of their watch.
In fairness, upon our reporting of the vulnerability to them, Gator got it fixed in 48 hours.
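The flaw described above is a case of trusting a privilege level supplied by the client. Here is a minimal sketch of the pattern and of the fix. Every name and value in it is hypothetical, and it bears no relation to Gator's actual code.

```python
# Server-side record of each user's level; names and numbers are hypothetical.
ADMIN_LEVEL = 0
USER_LEVELS = {"alice": 2, "root": ADMIN_LEVEL}
ALL_WATCHES = ["watch-1", "watch-2"]


class Forbidden(Exception):
    pass


def list_all_watches_vulnerable(session_user, params):
    # BUG: the privilege level comes from the request itself, so any
    # caller can claim admin rights by changing one parameter.
    if params.get("user_level") == ADMIN_LEVEL:
        return ALL_WATCHES
    raise Forbidden(session_user)


def list_all_watches_fixed(session_user, params):
    # The level is looked up server-side from the authenticated session;
    # whatever the client sends in params is ignored.
    if USER_LEVELS.get(session_user) == ADMIN_LEVEL:
        return ALL_WATCHES
    raise Forbidden(session_user)


# An ordinary user tampers with the parameter.
tampered = {"user_level": ADMIN_LEVEL}
print(list_all_watches_vulnerable("alice", tampered))  # access granted
try:
    list_all_watches_fixed("alice", tampered)
except Forbidden:
    print("fixed handler refused the request")
```

The point is only that authorization decisions have to be made from state the server controls, never from a value the client can edit.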
This is a lesson in the limits of naming and shaming: publishing vulnerabilities in an effort to get companies to improve their security. If a company is specifically named, it is likely to improve the specific vulnerability described. But that is unlikely to translate into improved security practices in the future. If an industry, or product category, is named generally, nothing is likely to happen. This is one of the reasons I am a proponent of regulation.
Carputers! Fabrice Aneche is documenting his ongoing build, which equips an older (2011) car with some of the features a 2018 model might have: thus far, a reversing camera (bought off the shelf, with a modified GUI to show the date and the camera’s output built with Qt and Golang), GPS and offline route guidance.
We’re not sure how the car got through that little door there.
It was back in 2013, when the Raspberry Pi had been on the market for about a year, that we started to see carputer projects emerge. They tended to be focussed in two directions: in-car entertainment, and on-board diagnostics (OBD). We ended up hiring the wonderful Martin O’Hanlon, who wrote up the first OBD project we came across, just this year. Being featured on this blog can change your life, I tell you.
In the last five years, the Pi’s evolved: you’re now working with a lot more processing power, there’s onboard WiFi, and far more peripherals which can be useful in a…vehicular context are available. Consequently, the flavour of the car projects we’re seeing has changed somewhat, with navigation systems and cameras much more visible. Fabrice’s is one of the best examples we’ve found.
Night-view navigation system
GPS is all very well, but you, the human person driver, will want directions at every turn. So Fabrice wrote a user interface to serve up live maps and directions, mostly in Qt5 and QML (he’s got some interesting discussion on his website about why he stopped using X11, which turned out to be too slow for his needs). All the non-QML work is done in Go. It’s all open-source, and on GitHub, if you’d like to contribute or roll your own project. He’s also worked over the Linux GPS daemons, found them lacking, and has produced his own:
…the Linux gps daemons are using obscure and over complicated protocols so I’ve decided to write my own gps daemon in Go using a gRPC stream interface. You can find it here.
I’m also not satisfied with the map matching of OSRM for real time display, I may rewrite one using mbmatch.
We’ll be keeping an eye on this project; given how much clever has gone into it already, we’re pretty sure that Fabrice will be adding new features. Thanks Fabrice!
The goal of this project was to drop a glider from the edge of space using a high altitude weather balloon. The glider is entirely homemade and uses the opensource Pixhawk flight controller + a Raspberry Pi Zero to disconnect at the desired altitude and fly to a predetermined landing location.
Here at Pi Towers, we thoroughly enjoy the link between high-altitude balloon (HAB) enthusiasts and the Raspberry Pi community, from Dave Akerman’s first attempt at sending a Raspberry Pi to near-space, to our own Skycademy programme training educators in high-altitude ballooning. HABs and the Pi go together like macaroni and cheese, peanut butter and jelly, chips and gravy…you get the idea.
The RaptorTech glider
The RaptorTech team equipped their glider with a Pixhawk flight controller and the small $5 Raspberry Pi Zero to control the time point when the glider disconnects from the HAB, and to allow the glider to autonomously navigate back to a specific landing site.
They made the glider out of foam core and coroplast, with a covering of tape to waterproof the body. Inside it were two cameras, two servos, the Raspberry Pi Zero, and the Pixhawk flight controller with added GPS tracker (in case the glider got lost on the way home). Handwarmers protected the electronics from freezing at high altitude.
The Raspberry Pi Zero ran a Python script to control the Pixhawk. At take-off, the Zero set the controller into manual mode to keep the glider from trying to fly off toward its final destination. When the glider reached a pre-determined altitude, the Zero disconnected the glider from the HAB by setting off a solid state relay to burn through the connecting wire. Then the Pi started up the flight controller to direct the glider home. You can find the code for this process here.
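The sequence described above is essentially a two-state machine. This sketch is not the RaptorTech code and does not use any real Pixhawk API. The hardware class is a made-up stand-in that just records commands, and the altitude readings are invented.

```python
from enum import Enum, auto


class Phase(Enum):
    ASCENT = auto()    # attached to the balloon, controller held in manual
    RELEASED = auto()  # wire burned, autopilot flying home


class FakeHardware:
    """Hypothetical stand-in that records commands instead of driving hardware."""

    def __init__(self):
        self.log = []

    def set_manual_mode(self):
        self.log.append("manual")

    def fire_release_relay(self):
        self.log.append("relay")

    def start_autopilot(self):
        self.log.append("auto")


def run_mission(altitudes_m, release_altitude_m, hw):
    """Step through altitude readings and release once the target is reached."""
    phase = Phase.ASCENT
    hw.set_manual_mode()  # keep the glider from steering while still tethered
    for altitude in altitudes_m:
        if phase is Phase.ASCENT and altitude >= release_altitude_m:
            hw.fire_release_relay()  # burn through the connecting wire
            hw.start_autopilot()     # hand over to the flight controller
            phase = Phase.RELEASED
    return phase


hw = FakeHardware()
final_phase = run_mission([500, 4000, 9000, 10000, 9500], 10000, hw)
print(final_phase, hw.log)
```

Tracking the phase explicitly means the relay fires only once, even though later readings may still be above the release altitude.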
All systems go
Due to time limitations and weather restrictions, the RaptorTech team had to drop their glider from 10km instead of 30km as they’d planned. They were pleased to report the safe, successful return of their glider to about 10m from the pre-set landing point.
If you’d like to follow the adventures of RaptorTech, check out their Facebook page. You can also follow them on YouTube and on their website for more RC plane-based mayhem.
A note from Dave Akerman: “It’s worth pointing out that not only do all HAB flights need permission but that such permission would normally ONLY be for payloads being dropped by parachute. Free-flying gliders, planes, drones etc. are not allowed without specific permission. My understanding, from a HABber in the USA (where this flight was), is that the FAA will not provide such permission. In any case, before dropping anything from a HAB without a parachute, get specific permission first.”
The German charity Save Nemo works to protect coral reefs, and they are developing Nemo-Pi, an underwater “weather station” that monitors ocean conditions. Right now, you can vote for Save Nemo in the Google.org Impact Challenge.
The organisation says there are two major threats to coral reefs: divers, and climate change. To make diving safer for reefs, Save Nemo installs buoy anchor points where diving tour boats can anchor without damaging corals in the process.
In addition, they provide dos and don’ts for how to behave on a reef dive.
To monitor the effects of climate change, and to help divers decide whether conditions are right at a reef while they’re still on shore, Save Nemo is also in the process of perfecting Nemo-Pi.
This Raspberry Pi-powered device is made up of a buoy, a solar panel, a GPS device, a Pi, and an array of sensors. Nemo-Pi measures water conditions such as current, visibility, temperature, carbon dioxide and nitrogen oxide concentrations, and pH. It also uploads its readings live to a public webserver.
The Save Nemo team is currently doing long-term tests of Nemo-Pi off the coast of Thailand and Indonesia. They are also working on improving the device’s power consumption and durability, and testing prototypes with the Raspberry Pi Zero W.
The web dashboard showing live Nemo-Pi data
Save Nemo aims to install a network of Nemo-Pis at shallow reefs (up to 60 metres deep) in South East Asia. Then diving tour companies can check the live data online and decide day-to-day whether tours are feasible. This will lower the impact of humans on reefs and help the local flora and fauna survive.
A healthy coral reef
Nemo-Pi data may also be useful for groups lobbying for reef conservation, and for scientists and activists who want to shine a spotlight on the awful effects of climate change on sea life, such as coral bleaching caused by rising water temperatures.
A bleached coral reef
Vote now for Save Nemo
If you want to help Save Nemo in their mission today, vote for them to win the Google.org Impact Challenge:
Click “Abstimmen” in the footer of the page to vote
Click “JA” in the footer to confirm
Voting is open until 6 June. You can also follow Save Nemo on Facebook or Twitter. We think this organisation is doing valuable work, and that their projects could be expanded to reefs across the globe. It’s fantastic to see the Raspberry Pi being used to help protect ocean life.
Spencer Ackerman has this interesting story about a guy assigned to crack down on unauthorized White House leaks. It’s necessarily light on technical details, so I thought I’d write up some guesses, either as a guide for future reporters asking questions, or for people who want to better know the risks when leaking information.
It should come as no surprise that your work email and phone are already monitored. They can get every email you’ve sent or received, even if you’ve deleted it. They can get every text message you’ve sent or received, the metadata of every phone call sent or received, and so forth.
To a lesser extent, this also applies to your well-known personal phone and email accounts. Law enforcement can get the metadata (which includes text messages) for these things without a warrant. In the above story, the person doing the investigation wasn’t law enforcement, but I’m not sure that’s a significant barrier if they can pass things on to the Secret Service or something.
The danger here isn’t that you used these things to leak, it’s that you’ve used these things to converse with the reporter before you made the decision to leak. That’s what happened in the Reality Winner case: she communicated with The Intercept before she allegedly leaked a printed document to them via postal mail. While it wasn’t conclusive enough to convict her, the innocent emails certainly put the investigators on her trail.
The path to leaking often starts this way: innocent actions before the decision to leak was made that will come back to haunt the person afterwards. That includes emails. That also includes Google searches. That includes websites you visit (like this one). I’m not sure how to solve this, except that if you’ve been in contact with The Intercept, and then you decide to leak, send it to anybody but The Intercept.
By the way, the other thing that caught Reality Winner is the records they had of her accessing files and printing them on a printer. Depending where you work, they may have a record of every file you’ve accessed, every intranet page you visited. Because of the way printers put secret dots on documents, investigators know precisely which printer produced the document leaked to The Intercept, and when.
Photographs suffer the same problem: your camera and phone tag the photographs with GPS coordinates and the time the photograph was taken, as well as information about the camera. This accidentally exposed John McAfee’s hiding location when Vice took pictures of him a few years ago. Some people leak by taking pictures of the screen; use a camera without GPS for this (meaning, a really old camera you bought from a pawnshop).
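To give a sense of how precise that tag is: EXIF stores latitude and longitude as degrees, minutes, and seconds, each as a rational number, plus a hemisphere letter. Here’s a minimal sketch of turning those values into map coordinates. The function name and the sample values are mine, purely for illustration, and real code would first have to read the tags out of the file with an image library.

```python
from fractions import Fraction


def exif_gps_to_decimal(dms, ref):
    """Convert EXIF-style (degrees, minutes, seconds) rationals to decimal degrees.

    ref is the hemisphere letter: "N"/"S" for latitude, "E"/"W" for longitude.
    """
    degrees, minutes, seconds = (Fraction(n, d) for n, d in dms)
    value = degrees + minutes / 60 + seconds / 3600
    if ref in ("S", "W"):
        value = -value  # southern and western hemispheres are negative
    return float(value)


# Illustrative values, stored as (numerator, denominator) pairs.
lat = exif_gps_to_decimal(((40, 1), (26, 1), (4620, 100)), "N")
lon = exif_gps_to_decimal(((79, 1), (58, 1), (5580, 100)), "W")
print(round(lat, 6), round(lon, 6))
```

Seconds stored to a tenth of an arcsecond pin a location down to a few meters, which is more than enough to identify a building.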
These examples should impress upon you the dangers of not understanding technology. As soon as you do something to evade surveillance you know about, you may get caught by surveillance you don’t know about.
If you nonetheless want to continue forward, the next step may be to get a “burner phone”. You can get an adequate Android “prepaid” phone for cash at the local Walmart, electronics store, or phone store.
There are some problems with such phones, though. They can often be tracked back to the store that sold them, and the store will have security cameras that record you making the purchase. License plate readers and GPS tracking on your existing phone may also place you at that Walmart.
I don’t know how to resolve these problems. Perhaps the best approach is to grow a beard and, on the last day of your vacation, color your hair, take a long bike/metro ride (without your existing phone) to a store many miles away and pick up a phone, then shave and change your hair color back again. I don’t know; there’s a good chance any lame attempt you or I might think of has already been seen by law enforcement, so they are likely ahead of you. Maybe ask your local drug dealer where they get their burner phones, and if they can sell you one. Of course, that just means when they get caught for drug dealing, they can reduce their sentence by giving up the middle class person who bought a phone from them.
Lastly, they may age out old security videos, so simply waiting six months before using the phone might work. That means prepaying for an entire year.
Note that I’m not going to link to examples of cheap burner phones on this page. Web browsers will sometimes prefetch some information from links in a webpage, so simply including links in this page can condemn you as having interest in burner phones. You are already in enough trouble for having visited this web page.
Burner phones have GPS. Newer technology, like the latest Android LTE phones, has pretty accurate GPS that the police can query (without a warrant). If you take the phone home and turn it on, they’ll then be able to trace back the phone to your home. Carrying the phone around with you has the same problem, with the phone’s location correlating with your existing phone (which presumably you also carry) or credit card receipts. Rumors are that Petraeus was partly brought down by tracking locations where he used his credit card, namely, matching the hotel he was in with Internet address information.
Older phones that support 3G or even 2G have poorer GPS capabilities. They’ll still locate you to the nearest cell tower, but not as accurately to your exact location.
A better strategy than a burner phone would be a burner laptop computer used with WiFi. You can get a cheap one for $200 at Amazon.com. My favorites are the 11 inch ones with a full sized keyboard and Windows 10. Better yet, get an older laptop for cash from a pawn shop.
You can install chat apps on it, like “Signal Desktop”, “Wire Desktop”, or “WhatsApp”, that will allow you to securely communicate. Or use “Discord”, which isn’t really encrypted, but it’s popular among gamers and therefore less likely to stand out. You can sit in a bar with free WiFi and a USB headset and talk to reporters without having a phone. If the reporter you want to leak to doesn’t have those apps (either on their own laptop or phone) then you don’t want to talk to them.
Needless to say, don’t cross the streams. Don’t log onto your normal accounts like Facebook. If you create fake Facebook accounts, don’t follow the same things. Better yet, configure your browser to discard all information (especially “cookies”) every time you log off, so you can’t be tracked. Install ad blockers, or use the “Brave” web browser, to remove even more trackers. A common trick among hackers is to change the “theme” to a red background, as a constant subliminal reminder that you are using your dangerous computer, and never to do anything that identifies the real you.
Put tape over the camera. I’m not sure it’s a really big danger, but put tape over the camera. If they infect you enough to get your picture, they’ve also infected you enough to record any audio on your computer. Remember that proper encryption is end-to-end (they can’t eavesdrop in transit), but if they hack the ends (your laptop, or the reporter’s) they can still record the audio.
Note that when your burner laptop is in “sleep” mode, it can still be talking to the local wifi. Before taking it home, make sure it’s off. Go into the settings and configure it so that when the lid is closed, the computer is turned completely off.
It goes without saying: don’t use that burner laptop from home. Luckily, free wifi is everywhere, so the local cafe, bar, or library can be used.
The next step is to also use a VPN or Tor to mask your Internet address. If there’s an active investigation into the reporter, they’ll get the metadata, the Internet address of the bar/cafe you are coming from. A good VPN provider or especially Tor will stop this. Remember that these providers increase latency, making phone calls a bit harder, but they are a lot safer.
Remember that Ross Ulbricht (owner of dark website market Silk Road) was caught in a library. They’d traced back his Internet address and grabbed his laptop out of his hands. Having it turn off (off off, not sleep off) when the lid is closed is one way to reduce this risk. Configuring your web browser to flush all cookies and passwords on restart is another. If they catch you in mid conversation with your secret contact, though, they’ll at least be able to hear your side of the conversation, and know who you are talking to.
The best measure, though it takes some learning, is “Tails live”. It’s a Linux distribution preconfigured with Tor and various secure chat apps that’ll boot from the USB or SD card. When you turn off the computer, nothing will be saved, so there will be no evidence saved to the disk for investigators to retrieve later.
While we are talking about Tor, it should be noted that many news organizations (NYTimes, Washington Post, The Intercept, etc.) support “SecureDrop”, accessed only through Tor, for receiving anonymous tips. A burner laptop running Tails, used from a bar, is likely your most secure way of doing things.
The point of this post was not to provide a howto guide, but to discuss many of the technological issues involved. In a story about White House people investigating leaks, I’d like to see something in this technological direction. I’d like to know exactly how they were investigating leaks. Certainly, they were investigating all work computers, accounts, and phones. Were they also able to get to non-work computers, accounts, phones? Did they have law enforcement powers? What could they do about burner phones and laptops?
In any case, if you do want a howto guide, the discussion above should put some fear into you about how easily you can inadvertently make a mistake.
Data that describe processes in a spatial context are everywhere in our day-to-day lives and they dominate big data problems. Map data, for instance, whether describing networks of roads or remote sensing data from satellites, get us where we need to go. Atmospheric data from simulations and sensors underlie our weather forecasts and climate models. Devices and sensors with GPS can provide a spatial context to nearly all mobile data.
In this post, we introduce the WIND toolkit, a huge (500 TB), open weather model dataset that’s available to the world on Amazon’s cloud services. We walk through how to access this data and some of the open-source software developed to make it easily accessible. Our solution considers a subset of geospatial data that exist on a grid (raster) and explores ways to provide access to large-scale raster data from weather models. The solution uses foundational AWS services and the Hierarchical Data Format (HDF), a well adopted format for scientific data.
The approach developed here can be extended to any data that fit in an HDF5 file, which can describe sparse and dense vectors and matrices of arbitrary dimensions. This format is already popular within the physical sciences for both experimental and simulation data. We discuss solutions to gridded data storage for a massive dataset of public weather model outputs called the Wind Integration National Dataset (WIND) toolkit. We also highlight strategies that are general to other large geospatial data management problems.
Wind Integration National Dataset
As variable renewable power penetration levels increase in power systems worldwide, the importance of renewable integration studies to ensure continued economic and reliable operation of the power grid is also increasing. The WIND toolkit is the largest freely available grid integration dataset to date.
The WIND toolkit was developed by 3TIER by Vaisala. They were under a subcontract to the National Renewable Energy Laboratory (NREL) to support studies on integration of wind energy into the existing US grid. NREL is a part of a network of national laboratories for the US Department of Energy and has a mission to advance the science and engineering of energy efficiency, sustainable transportation, and renewable power technologies.
The toolkit has been used by consultants, research groups, and universities worldwide to support grid integration studies. Less traditional uses also include resource assessments for wind plants (such as those powering Amazon data centers), and studying the effects of weather on California condor migrations in the Baja peninsula.
The diversity of applications highlights the value of accessible, open public data. Yet, there’s a catch: the dataset is huge. The WIND toolkit provides simulated atmospheric (weather) data at a two-km spatial resolution and five-minute temporal resolution at multiple heights for seven years. The entire dataset is half a petabyte (500 TB) in size and is stored in the NREL High Performance Computing data center in Golden, Colorado. Making this dataset publicly available easily and in a cost-effective manner is a major challenge.
As other laboratories and public institutions work to release their data to the world, they may face similar challenges to those that we experienced. Some prior, well-intentioned efforts to release huge datasets as-is have resulted in data resources that are technically available but fundamentally unusable. They may be stored in an unintuitive format or indexed and organized to support only a subset of potential uses. Downloading hundreds of terabytes of data is often impractical. Most users don’t have access to a big data cluster (or supercomputer) to slice and dice the data as they need after it’s downloaded.
We aim to provide a large amount of data (50 terabytes) to the public in a way that is efficient, scalable, and easy to use. In many cases, researchers can access these huge cloud-located datasets using the same software and algorithms they have developed for smaller datasets stored locally. Only the pieces of data they need for their individual analysis must be downloaded. To make this work in practice, we worked with the HDF Group and have built upon their forthcoming Highly Scalable Data Service.
In the rest of this post, we discuss how the HSDS software was developed to use Amazon EC2 and Amazon S3 resources to provide convenient and scalable access to these huge geospatial datasets. We describe how the HSDS service has been put to work for the WIND Toolkit dataset and demonstrate how to access it using the h5pyd Python library and the REST API. We conclude with information about our ongoing work to release more ‘open’ datasets to the public using AWS services, and ways to improve and extend the HSDS with newer Amazon services like Amazon ECS and AWS Lambda.
Developing a scalable service for big geospatial data
The HDF5 file format and API have been used for many years and are an effective means of storing large scientific datasets. For example, NASA’s Earth Observing System (EOS) satellites collect more than 16 TB of data per day using HDF5.
With the rise of the cloud, there are new challenges and opportunities to rethink how HDF5 can be enhanced to work effectively as a component in a cloud-native architecture. For the HDF Group, working with NREL has been a great opportunity to put ideas into practice with a production-size dataset.
An HDF5 file consists of a directed graph of group and dataset objects. Datasets can be thought of as a multidimensional array with support for user-defined metadata tags and compression. Typical operations on datasets would be reading or writing data to a regular subregion (a hyperslab) or reading and writing individual elements (a point selection). Also, group and dataset objects may each contain an arbitrary number of the user-defined metadata elements known as attributes.
Many people have used the HDF library in applications developed or ported to run on EC2 instances, but there are a number of constraints that often prove problematic:
The HDF5 library can’t read directly from HDF5 files stored as S3 objects. The entire file (often many GB in size) would need to be copied to local storage before the first byte can be read. Also, the instance must be configured with an appropriately sized EBS volume.
The HDF library only has access to the computational resources of the instance itself (as opposed to a cluster of instances), so many operations are bottlenecked by the library.
Any modifications to the HDF5 file would somehow have to be synchronized with changes that other instances have made to the same file before writing back to S3.
Using a pattern common to many offerings from AWS, the solution to these constraints is to develop a service framework around the HDF data model. Using this model, the HDF Group has created the Highly Scalable Data Service (HSDS) that provides all the functionality that traditionally was provided by the HDF5 library. By using the service, you don’t need to manage your own file volumes, but can just read and write whatever data that you need.
Because the service manages the actual data persistence to a durable medium (S3, in this case), you don’t need to worry about disk management. Simply stream the data you need from the service as you need it. Secondly, putting the functionality behind a service allows some tricks to increase performance (described in more detail later). And lastly, HSDS allows any number of clients to access the data at the same time, enabling HDF5 to be used as a coordination mechanism for multiple readers and writers.
In designing the HSDS architecture, we gave much thought to how to achieve scalability of the HSDS service. For accessing HDF5 data, there are two different types of scaling to consider:
Multiple clients making many requests to the service
Single requests that require a significant amount of data processing
To deal with the first scaling challenge, as with most services, we considered how the service responds as the request rate increases. AWS provides some great tools that help in this regard:
Auto Scaling groups
Elastic Load Balancing load balancers
The ability of S3 to handle large aggregate throughput rates
By using a cluster of EC2 instances behind a load balancer, you can handle different client loads in a cost-effective manner.
The second scaling challenge concerns single requests that would take significant processing time with just one compute node. One example of this from the WIND toolkit would be extracting all the values in the seven-year time span for a given geographic point and dataset.
In HDF5, large datasets are typically stored as “chunks”; that is, a regular partition of the array. In HSDS, each chunk is stored as a binary object in S3. The sequential approach to retrieving the time series values would be for the service to read each chunk needed from S3, extract the needed elements, and go on to the next chunk. In this case, that would involve processing 2557 chunks, and would be quite slow.
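To make the chunk count concrete, here is a minimal sketch of the arithmetic, not the HSDS implementation. It assumes hourly data, a seven-year span containing two leap days (2,557 days), and chunks that each cover 24 time steps at the grid point of interest. The chunk length is an illustrative assumption, chosen because it reproduces the 2557 figure; the function names are hypothetical.

```python
def chunks_touched(n_steps, chunk_len):
    """Number of chunks along the time axis that a full time series crosses."""
    # Ceiling division: a partially filled last chunk still has to be read.
    return -(-n_steps // chunk_len)


def locate(t, chunk_len):
    """Map a time index to (chunk number, offset inside that chunk)."""
    return t // chunk_len, t % chunk_len


# Assumed layout: 7 years with 2 leap days, hourly steps, 24 steps per chunk.
DAYS = 7 * 365 + 2
N_STEPS = DAYS * 24

if __name__ == "__main__":
    print(chunks_touched(N_STEPS, 24))
    print(locate(25, 24))
```

Under these assumptions, a sequential reader makes one object fetch per chunk, which is why the time series request is slow on a single node.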
Fortunately, with HSDS, you can speed this up quite a bit by exploiting the compute and I/O capabilities of the cluster. Upon receiving the request, the receiving node can use other nodes in the cluster to read different portions of the selection. With multiple nodes reading from S3 in parallel, performance improves as the cluster size increases.
The diagram below illustrates how this works in a simplified case of four chunks and four nodes.
This architecture has worked well in practice. In testing with the WIND toolkit and time series extraction, we observed a request latency of ~60 seconds using four nodes vs. ~5 seconds with 40 nodes. Performance roughly scales with the size of the cluster.
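The fan-out idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: threads in one process stand in for the nodes of the cluster, and `read_chunk` is a hypothetical placeholder that fabricates values instead of fetching a chunk object from S3. It is not how HSDS itself is written.

```python
from concurrent.futures import ThreadPoolExecutor


def read_chunk(chunk_id):
    """Hypothetical stand-in for fetching one chunk and extracting values."""
    # A real worker would read the chunk object from storage and slice it.
    return [chunk_id * 10 + i for i in range(3)]


def parallel_read(chunk_ids, n_workers=4):
    """Read all chunks concurrently and stitch the pieces back in order."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() yields results in input order, so pieces line up with chunk_ids.
        parts = list(pool.map(read_chunk, chunk_ids))
    return [value for part in parts for value in part]


if __name__ == "__main__":
    print(parallel_read([0, 1, 2, 3]))
```

Because each chunk read is independent, adding workers shortens the wall-clock time until another resource (network, storage throughput) becomes the limit.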
A planned enhancement to this is to use AWS Lambda for the worker processing. This enables 1000-way parallel reads at a reasonable cost, as you only pay for the milliseconds of CPU time used with AWS Lambda.
Public access to atmospheric data using HSDS and AWS
An early challenge in releasing the WIND toolkit data was in deciding how to subset the data for different use cases. In general, few researchers need access to the entire 0.5 PB of data and a great deal of efficiency and cost reduction can be gained by making directed constituent datasets.
NREL grid integration researchers initially extracted a 2-TB subset by selecting 120,000 points where the wind resource seemed appropriate for development. They kept only the data most important for wind applications (100-m wind speed, converted to power) at the locations of greatest interest to those performing grid studies. To support the remaining users who needed more data resolution, we down-sampled the data to a 60-minute temporal resolution, keeping all the other variables and spatial resolution intact. This reduced dataset is 50 TB and describes 30+ atmospheric variables for 7 years at a 60-minute temporal resolution.
Programmatic access is possible using the h5pyd Python library, a distributed analog to the widely used h5py library. Users interact with the datasets (variables) and slice the data from its (time x longitude x latitude) cube form as they see fit.
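As a rough sketch of what this slicing looks like, the helper below pulls a time series for one grid cell using the (time x longitude x latitude) ordering described above. The domain path and dataset name in the usage note are hypothetical placeholders, and `open_remote_dataset` assumes that h5pyd is installed and configured with an HSDS endpoint.

```python
def read_time_series(dataset, t_start, t_stop, lon_idx, lat_idx):
    """Return the values for one grid cell over the steps [t_start, t_stop)."""
    # Only this slice is requested from the service, not the whole cube.
    return dataset[t_start:t_stop, lon_idx, lat_idx]


def open_remote_dataset(domain, name):
    """Open one dataset (variable) served by HSDS. Requires h5pyd."""
    import h5pyd  # imported lazily so read_time_series works without it

    f = h5pyd.File(domain, "r")
    return f[name]
```

For example, `ds = open_remote_dataset("/some/domain.h5", "some_variable")` followed by `read_time_series(ds, 0, 24, 100, 200)` would request 24 values for a single cell. Both names must be replaced with a real domain and variable.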
Examples and use cases are described in a set of Jupyter notebooks and available on GitHub:
Now you have a Jupyter notebook server running on your EC2 server.
From your laptop, create an SSH tunnel:
$ ssh -L 8888:localhost:8888 (IP address of the EC2 server)
Now, you can browse to localhost:8888 using the correct token, and interact with the notebooks as if they were local. Within the directory, there are examples for accessing the HSDS API and plotting wind and weather data using matplotlib.
Controlling access and defraying costs
A final concern is rate limiting and access control. Although the HSDS service is scalable and relatively robust, we had a few practical concerns:
How can we protect from malicious or accidental use that may lead to high egress fees (for example, someone who attempts to repeatedly download the entire dataset from S3)?
How can we keep track of who is using the data both to document the value of the data resource and to justify the costs?
If costs become too high, can we charge for some or all API use to help cover the costs?
To approach these problems, we investigated using Amazon API Gateway and its simplified integration with the AWS Marketplace for SaaS monetization as well as third-party API proxies.
In the end, we chose to use API Umbrella due to its close involvement with http://data.gov. While AWS Marketplace is a compelling option for future datasets, the decision was made to keep this dataset entirely open, at least for now. As community use and associated costs grow, we’ll likely revisit Marketplace. Meanwhile, API Umbrella provides controls for rate limiting and API key registration out of the box and was simple to implement as a front-end proxy to HSDS. Those applications that may want to charge for API use can accomplish a similar strategy using Amazon API Gateway and AWS Marketplace.
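For readers unfamiliar with rate limiting, one common generic technique is a token bucket. The sketch below is only an illustration of that technique; it is not a description of how API Umbrella or Amazon API Gateway implement their limits, and the class and parameter names are hypothetical.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity` requests, refilled at `rate` per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so an initial burst is allowed
        self.clock = clock
        self.last = clock()

    def allow(self, cost=1):
        """Return True and spend tokens if the request fits in the budget."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A front-end proxy could keep one bucket per API key, which ties the limit to the registration step and also yields a per-user usage record.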
Ongoing work and other resources
As NREL and other government research labs, municipalities, and organizations try to share data with the public, we expect many of you will face similar challenges to those we have tried to approach with the architecture described in this post. Providing large datasets is one challenge. Doing so in a way that is affordable and convenient for users is an entirely more difficult goal. Using AWS cloud-native services and the existing foundation of the HDF file format has allowed us to tackle that challenge in a meaningful way.
Dr. Caleb Phillips is a senior scientist with the Data Analysis and Visualization Group within the Computational Sciences Center at the National Renewable Energy Laboratory. Caleb comes from a background in computer science systems, applied statistics, computational modeling, and optimization. His work at NREL spans the breadth of renewable energy technologies and focuses on applying modern data science techniques to data problems at scale.
Dr. Caroline Draxl is a senior scientist at NREL. She supports the research and modeling activities of the US Department of Energy from mesoscale to wind plant scale. Caroline uses mesoscale models to research wind resources in various countries, and participates in on- and offshore boundary layer research and in the coupling of the mesoscale flow features (kilometer scale) to the microscale (tens of meters). She holds a M.S. degree in Meteorology and Geophysics from the University of Innsbruck, Austria, and a PhD in Meteorology from the Technical University of Denmark.
John Readey has been a Senior Architect at The HDF Group since he joined in June 2014. His interests include web services related to HDF, applications that support the use of HDF and data visualization. Before joining The HDF Group, John worked at Amazon.com from 2006–2014 where he developed service-based systems for eCommerce and AWS.
Jordan Perr-Sauer is an RPP intern with the Data Analysis and Visualization Group within the Computational Sciences Center at the National Renewable Energy Laboratory. Jordan hopes to use his professional background in software engineering and his academic training in applied mathematics to solve the challenging problems facing America and the world.
This column is from The MagPi issue 59. You can download a PDF of the full issue for free, or subscribe to receive the print edition through your letterbox or the digital edition on your tablet. All proceeds from the print and digital editions help the Raspberry Pi Foundation achieve our charitable goals.
“Hey, world!” Estefannie exclaims, a wide grin across her face as the camera begins to roll for another YouTube tutorial video. With a growing number of followers and wonderful support from her fans, Estefannie is building a solid reputation as an online maker, creating unique, fun content accessible to all.
It’s as if she was born into performing and making for an audience, but this fun, enjoyable journey to social media stardom came not from a desire to be in front of the camera, but rather as a unique approach to her own learning. While studying, Estefannie decided the best way to confirm her knowledge of a subject was to create an educational video explaining it. If she could teach a topic successfully, she knew she’d retained the information. And so her YouTube channel, Estefannie Explains It All, came into being.
Her first videos featured pages of notes with voice-over explanations of data structure and algorithm analysis. Then she moved in front of the camera, and expanded her skills in the process.
But YouTube isn’t her only outlet. With nearly 50,000 followers, Estefannie’s Instagram game is strong, adding to an increasing number of female coders taking to the platform. Across her Instagram grid, you’ll find insights into her daily routine, from programming on location for work to behind-the-scenes troubleshooting as she begins to create another tutorial video. It’s hard work, with content creation for both Instagram and YouTube forever on her mind as she continues to work and progress successfully as a software engineer.
As a thank you to her Instagram fans for helping her reach 10,000 followers, Estefannie created a free game for Android and iOS called Gravitris: imagine Tetris with balance issues!
Estefannie was born and raised in Mexico, with ambitions to become a graphic designer and animator. However, a documentary on coding at Pixar, and the beauty of Merida’s hair in Brave, opened her mind to the opportunities of software engineering in animation. She altered her career path, moved to the United States, and switched to a Computer Science course.
With a constant desire to make and to learn, Estefannie combines her software engineering profession with her hobby to create fun, exciting content for YouTube.
While studying, Estefannie started a Computer Science Girls Club at the University of Houston, Texas, and she found herself eager to put more time and effort into the movement to increase the percentage of women in the industry. The club was a success, and still is to this day. While Estefannie has handed over the reins, she’s still very involved in the cause.
Through her YouTube videos, Estefannie continues the theme of inclusion, with every project offering a warm sense of approachability for all, regardless of age, gender, or skill. From exploring Scratch and Makey Makey with her young niece and nephew to creating her own Disney ‘Made with Magic’ backpack for a trip to Disney World, Florida, Estefannie’s videos are essentially a documentary of her own learning process, produced so viewers can learn with her — and learn from her mistakes — to create their own tech wonders.
Estefannie’s automated gingerbread house project was a labour of love, with electronics, wires, and candy strewn across both her living room and kitchen for weeks before completion. While she already was a skilled programmer, the world of physical digital making was still fairly new for Estefannie. Having ditched her hot glue gun in favour of a soldering iron in a previous video, she continued to experiment and try out new, interesting techniques that are now second nature to many members of the maker community. With the gingerbread house, Estefannie was able to research and apply techniques such as light controls, servos, and app making, although the latter was already firmly within her skill set. The result? A fun video of ups and downs that resulted in a wonderful, festive treat. She even gave her holiday home its own solar panel!
And that’s just the beginning of her adventures with Pi…but we won’t spoil her future plans by telling you what’s coming next. Sorry! However, since this article was written last year, Estefannie has released a few more Pi-based project videos, plus some awesome interviews and live-streams with other members of the maker community such as Simone Giertz. She even made us an awesome video for our Raspberry Pi YouTube channel! So be sure to check out her latest releases.
While many wonderful maker videos show off a project without much explanation, or expect a certain level of skill from viewers hoping to recreate the project, Estefannie’s videos exist almost within their own category. We can’t wait to see where Estefannie Explains It All goes next!
The court in Luxembourg has ruled that Uber is a transport company and provides taxi services. This is a problem not only for Uber, but for all more modern ways of providing a transport service, including decentralized options (for example via blockchain, despite all my skepticism toward public ones).
For these business models to be admissible on the market, whether Uber, Lyft, or even TaxiMe and TaxiStars, which taxi companies distrust and are trying to push out in favor of their own apps, the legislation has to allow them. While before this ruling Uber operated in what was (according to them) a gray zone of unregulated business, it is now clear that this is not the case. And although Uber is the most popular example, it is not the brightest one: the company is losing money and is far from a saint. So everything below should not be read as “how to legalize Uber”, but as how not to restrict urban transport to “yellow cars with signs, lights, and taximeters”.
As I wrote some time ago, regulations can be made smartly, so that they do not restrict technological business models the regulators have not thought of. Unfortunately, the Road Transport Act is quite outdated and certainly does not allow anything other than a car with a taximeter with fiscal memory that you can hail on the street. In addition, the regime set out in the law is quite burdensome even for existing carriers. First, the carrier must be registered. Then every driver takes exams and receives a taxi driver certificate. But this certificate is not enough: the driver must also obtain a permit from the municipality, which reviews the certificate and the registration of the carrier through which the service will be provided. Last but not least, the law provides for municipal councils to set a maximum number of taxis, as well as their allocation among the registered carriers. This last provision sounds like a rather anti-market measure and would certainly restrict some more innovative models.
Because of all this, I decided to write a bill to amend and supplement the Road Transport Act. While writing the draft, I saw that Estonia has already done something like this, with a quite similar approach. The main goals are:
Distinguishing taxis that you can hail on the street from those that you can take only through a dispatch system (whether a mobile app or some other way does not matter)
Allowing distance to be measured, and accordingly reported to the National Revenue Agency (NAP), by means other than a taximeter (for example GPS plus a system integrated with NAP’s, as Estonia did some time ago)
Easing the registration regime by removing the municipal permit: the municipalities in which a given vehicle operates are entered by the carrier in the register of the Executive Agency Automobile Administration, which is also the source of information on the local tax due.
Keeping the tax on taxi transport payable to the municipalities
Keeping the requirements for roadworthiness, vehicle age, psychological fitness, and absence of criminal convictions for drivers
Following the principles of e-government: retrieving vehicle data from the traffic police (KAT) register and allowing applications to be submitted electronically, including in an automated way, so that carriers can integrate their internal fleet management systems with the central register. Removing the obligation for taxi drivers to carry documents (such as certificates) and checking them electronically instead
Removing centralized exams and training and replacing them with training materials (de facto transferring responsibility for driver training to the carriers, who in any case have an interest in their drivers not being incompetent)
Removing the municipality’s ability to determine the size of the market and to allocate the participants in it
The last two points are aspirational, but in my view important in principle. Here is the text itself, with reasoning for each paragraph:
Act to Amend and Supplement the Road Transport Act
§1. In Art. 12a the following amendments and supplements are made:
1. In para. 1, item 5 is amended as follows: “Data on the motor vehicles with which the carrier performs the transport: a) registration number; b) whether the vehicle will perform taxi transport only when called through a dispatch system; c) the municipalities in which the motor vehicle will perform transport”
2. Para. 2 is repealed;
3. A new para. 6 is created: “(6) Applications for entry and for changes of circumstances in the register may be submitted in an automated manner and electronically under the Electronic Governance Act”
4. A new para. 7 is created: “(7) Circumstances regarding the registered vehicles, specified in the ordinance under para. 5, are retrieved automatically on the basis of the registration number from the national register of road vehicles under the Electronic Governance Act”
5. A new para. 8 is created: “(8) The Executive Agency Automobile Administration performs automated checks for paid tax on taxi transport of passengers and deletes from the register the motor vehicles for which the tax has not been paid for the relevant year”
6. A new para. 9 is created: “(9) The requirements for the external appearance of vehicles registered to perform taxi transport only when called through a dispatch system may differ from those for other vehicles”
7. A new para. 10 is created: “(10) Vehicles registered to perform taxi transport only when called through a dispatch system are not allowed to wait at places designated for waiting taxis”
Reasoning: The register must contain up-to-date information on the vehicles used for taxi transport. It must be possible to change it electronically, through integration of the carrier’s information system with the register. Entering only the registration number of the vehicles is sufficient; the remaining data should be retrieved (when needed) from the national register of vehicles at the Ministry of Interior, following the once-only data collection principle laid down in the Electronic Governance Act. The restriction on vehicle age at first registration as a carrier is also removed; the important requirement is that vehicles not be above a certain age (Art. 24). Removing this restriction allows the carrier’s “fleet” to change dynamically. Because of the provisions of Art. 24a repealed below, the register at the Executive Agency Automobile Administration (IAAA) is also to record the municipalities in which the carrier’s vehicles operate. This is necessary in view of payment of the tax on taxi transport of passengers. An important distinction is introduced between vehicles performing taxi transport: those that provide the service only when called through a dispatch system (which includes mobile apps and both centralized and distributed dispatch systems) and others, which can be hailed on the roadway or taken from places designated for this. Vehicles that will not be hailed on the road are explicitly allowed not to follow the uniform appearance (e.g. “Taxi” sign, yellow color, etc.), but they are not allowed to wait at the so-called taxi ranks.
§2. In Art. 24 the following amendments and supplements are made:
1. In para. 1, after the words “electronic taximeter with fiscal memory” the words “or by other means allowing accurate measurement of distance and reporting to the tax administration” are added, and the words “after issuance of a permit for taxi transport of passengers” are replaced with the words “after entry in the registers under Art. 12, para. 2 and under para. 3, item 5”.
2. In para. 3, item 5 is amended as follows: “Is entered in a register of drivers performing taxi transport, kept by the chair of the Executive Agency Automobile Administration”
3. Para. 4 is amended as follows: “The head of the relevant regional unit of the Executive Agency Automobile Administration enters the persons who meet the requirements under para. 3, items 1-4 and have declared that they have familiarized themselves with the training information specified in the ordinance under Art. 12a, para. 5. The entry is renewed every 5 years upon application by the driver.”
4. Para. 5 is repealed.
5. Para. 6 is amended as follows: “The procedure for entry and renewal of entry in the register of drivers performing taxi transport, and for proving compliance with the requirements under para. 3, items 1-4, is determined by the ordinance under Art. 12a, para. 5, with the circumstances needed to prove the requirements being collected ex officio”
6. In para. 17 the words “revokes by order the certificate of a driver of a light taxi vehicle” are replaced with the words “deletes the entry of a driver performing taxi transport”
Reasoning: Carrying a certificate is unnecessary, given that the supervisory authorities have real-time electronic access to the register. For this reason the certificate requirement is replaced with the existence of an entry in the register. Centralized training is not a good mechanism for informing drivers (as is evident in practice), but it does create administrative burden. Training and exams are replaced by a declaration (possible electronically) by the driver that they have familiarized themselves with the training materials. These materials can be text or video lessons. By introducing electronic services, drivers will be able to apply for entry in the register remotely and easily. The possibility of using technologies alternative to the taximeter with fiscal memory, such as GPS devices, is introduced. An ordinance will set the conditions for integrating the distance measured by these devices, and the corresponding price, with the tax administration.
§3. In Art. 24a the following amendments and supplements are made:
1. Para. 1 is amended as follows: “A driver entered in the register of drivers performing taxi transport has the right to perform it with any vehicle entered in the register under Art. 12, para. 2, within the municipalities for which the entry is valid”
2. Paras. 2-9 are repealed
3. Para. 10 is amended as follows: “Administrative bodies have no right to set limits on the number of taxi vehicles operating in the territory of a given municipality”
4. Para. 11 is amended as follows: “Municipal councils may set minimum and maximum prices for taxi transport of passengers per kilometer traveled and per minute of waiting under the relevant tariff, valid for the territory of the relevant municipality”
Reasoning: Additional administrative procedures beyond the registration of the carrier and of the driver are unnecessary administrative burden. Control of the taxi market by the municipal council, including the number of vehicles and their allocation among carriers, is a potential source of corruption and hinders competition. The register under Art. 12, para. 2 collects information on which municipality the taxi vehicles operate in. A single vehicle is allowed to operate in more than one municipality, which is applicable, for example, in resort complexes.
§4. In Art. 24b, after the words “taximeters” the words “or the other permissible technological means” are added
Reasoning: An ordinance also sets the conditions for the use and reporting of other technological means, for example GPS devices.
§5. In Art. 95 the following supplements are made:
1. In para. 1, after the words “taximeter” the words “or another permissible technological means of measuring distance” are added
2. In para. 2 a new item 3 is created: “3. provides taxi services in a municipality for which the vehicle being driven is not registered in the register under Art. 12, para. 2”
§6. In Art. 96, para. 4, after the words “taximeter” the words “or another permissible technological means of measuring distance” are added
Разбира се, по-сложната част ще бъде коригирането на наредбите след това, включително намирането на начин за признаване на GPS координатите – ясно е, че както такситата имат „помпички“, така и GPS-ите на телефоните могат да бъдат „лъгани“.
Това е само предложение, на база на което да започне обсъждане. Далеч съм от мисълта, че мога да измисля решение на всички проблеми за един следобед. Нямам законодателна инициатива и не мога да го внеса, а и някои от точките може да не са приемливи за таксиметровия бранш, т.е. да трябва да се търсят компромиси. Все пак смятам, че допускането на повече технологични начини за осъществяване на таксиметрова услуга е добър за пазара и за клиентите.
Today, a guest post: Alasdair Davies, co-founder of Naturebytes, ZSL London’s Conservation Technology Specialist and Shuttleworth Foundation Fellow, shares the work of the Arribada Initiative. The project uses the Raspberry Pi Zero and camera module to follow the journey of green sea turtles. The footage captured from the backs of these magnificent creatures is just incredible – prepare to be blown away!
Footage from the new Arribada PS-C (pit-stop camera) video tag recently trialled on the island of Principe in unison with the Principe Trust. Engineered by Institute IRNAS (http://irnas.eu/) for the Arribada Initiative (http://blog.arribada.org/).
Access to affordable, open and customisable conservation technologies in the animal tracking world is often limited. I’ve been a conservation technologist for the past ten years, co-founding Naturebytes and working at ZSL London Zoo, and this was a problem that continued to frustrate me. It was inherently expensive to collect valuable data that was necessary to inform policy, to designate marine protected areas, or to identify threats to species.
In March this year, I got a supercharged opportunity to break through these barriers by becoming a Shuttleworth Foundation Fellow, meaning I had the time and resources to concentrate on cracking the problem. The Arribada Initiative was founded, and ten months later, the open source Arribada PS-C green sea turtle tag was born. The video above was captured two weeks ago in the waters of Principe Island, West Africa.
On route to Principe island with 10 second gen green sea #turtle tags for testing. This version has a video & accelerometer payload for behavioural studies, plus a nice wireless charging carry case made by @institute_irnas @ShuttleworthFdn
The tag comprises a Raspberry Pi Zero W sporting the Raspberry Pi camera module, a PiRA power management board, two lithium-ion cells, and a rather nice enclosure. It was built in unison with Institute IRNAS, and there’s a nice user-friendly wireless charging case to make it easy for the marine guards to replace the tags after their voyages at sea. When a tag is returned to one of the docking stations in the case, we use resin.io to manage it, download videos, and configure the tag remotely.
The tags can also be configured to take video clips at timed intervals, meaning we can now observe the presence of marine litter, plastic debris, before/after changes to the ocean environment due to nearby construction, pollution, and other threats.
Discarded fishing nets are lethal to sea turtles, so using this new tag at scale – now finally possible, as the Raspberry Pi Zero helps to drive down costs dramatically whilst retaining excellent video quality – offers real value to scientists in the field. Next year we will be releasing an optimised, affordable GPS version.
To make this all possible we had to devise a quicker method of attaching the tag to the sea turtles too, so we came up with the “pit-stop” technique (which is what the PS in the name “Arribada PS-C” stands for). Just as a Formula 1 car would visit the pits to get its tyres changed, we literally switch out the tags on the beach when nesting females return, replacing them with freshly charged tags by using a quick-release base plate.
About 6 days left now until the first tagged nesting green sea #turtles return using our latest “pit-stop” removeable / replaceable tag method. Counting down the days @arribada_i @institute_irnas
To implement the system we first epoxy the base plate to the turtle, which minimises any possible stress to the turtles as the method is quick. Once the epoxy has dried we attach the tag. When the turtle has completed its nesting cycle (they visit the beach to lay eggs three to four times in a single season, every 10–14 days on average), we simply remove the base plate to complete the field work.
If you’d like to watch more wonderful videos of the green sea turtles’ adventures, there’s an entire YouTube playlist available here. And to keep up to date with the initiative, be sure to follow Arribada and Alasdair on Twitter.
Allow your robots to join in the fun this Christmas with a round of Channel 4’s Countdown. https://www.rosietheredrobot.com/2017/12/tea-minus-30.html
Rosie the Red Robot
First, a little bit of backstory. Challenged by his eldest daughter to build a robot, technology-loving Alan got to work building Rosie.
I became (unusually) determined. I wanted to show her what can be done… and the how can be learnt later. After all, there is nothing more exciting and encouraging than seeing technology come alive. Move. Groove. Quite literally.
Originally, Rosie had a Raspberry Pi 3 brain controlling ultrasonic sensors and motors via Python. From there, she has evolved into something much grander, and Alan has documented her upgrades on the Rosie the Red Robot blog. Using GPS trackers and a Raspberry Pi camera module, she became Rosie Patrol, a rolling, walking, interactive bot; then, with further upgrades, the Tea Minus 30 project came to be. Which brings us back to Countdown.
T(ea) minus 30
In case it hasn’t been a big part of your life up until now, Countdown is one of the longest-running television shows in history, and occupies a special place in British culture. Contestants take turns to fill a board with nine randomly selected vowels and consonants, before battling the Countdown clock to find the longest word they can in the space of 30 seconds.
I’ve had quite a few requests to show just the Countdown clock for use in school activities/own games etc., so here it is! Enjoy! It’s a brand new version too, using the 2010 Office package.
There’s a numbers round involving arithmetic, too – but for now, we’re going to focus on letters and words, because that’s where Rosie’s skills shine.
Using an online resource, Alan created a dataset of the ten thousand most common English words.
Many words, listed in order of common-ness. Alan wrote a Python script to order them alphabetically and by length
Next, Alan wrote a Python script to select nine letters at random, then search the word list to find all the words that could be spelled using only these letters. He used the randint function to select letters from a pre-loaded alphabet, and introduced a requirement to include at least two vowels among the nine letters.
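The letter-picking and word-matching steps described above could look roughly like the sketch below. This is not Alan's actual script: the function names, the stand-in word list, and the choice to respect how many times each letter was drawn are all assumptions made for illustration.

```python
import random
from collections import Counter

VOWELS = "aeiou"
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def pick_letters(count=9, min_vowels=2, rng=random):
    """Draw `count` random letters, redrawing until at least `min_vowels` are vowels."""
    while True:
        # randint is inclusive at both ends, so 0..25 covers the whole alphabet.
        letters = [ALPHABET[rng.randint(0, len(ALPHABET) - 1)] for _ in range(count)]
        if sum(letter in VOWELS for letter in letters) >= min_vowels:
            return letters


def can_spell(word, letters):
    """True if `word` uses each available letter no more often than it was drawn."""
    available = Counter(letters)
    needed = Counter(word)
    return all(available[ch] >= n for ch, n in needed.items())


def longest_words(letters, word_list):
    """Return every matching word, longest first (ties broken alphabetically)."""
    matches = [w for w in word_list if can_spell(w, letters)]
    return sorted(matches, key=lambda w: (-len(w), w))


if __name__ == "__main__":
    # Stand-in for the ten-thousand-word dataset (hypothetical sample).
    sample_words = ["tea", "eat", "rate", "tear", "treat", "robot"]
    letters = pick_letters()
    print("Letters:", " ".join(letters))
    print("Matches:", longest_words(letters, sample_words))
```

Counting letters with `Counter` keeps the check simple: a word like "treat" is rejected when only one "t" was drawn.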
Words that match the available letters are displayed on the screen.
Putting it all together
With the basic game-play working, it was time to bring the project to life. For this, Alan used Rosie’s camera module, along with optical character recognition (OCR) and text-to-speech capabilities.
Alan writes, “Here’s a very amateurish drawing to brainstorm our idea. Let’s call it a design as it makes it sound like we know what we’re doing.”
Alan’s script has Rosie take a photo of the TV screen during the Countdown letters round, then perform OCR using the Google Cloud Vision API to detect the nine letters contestants have to work with. Next, Rosie runs Alan’s code to check the letters against the ten-thousand-word dataset, converts text to speech with Python gTTS, and finally speaks her highest-scoring word via omxplayer.
You can follow the adventures of Rosie the Red Robot on her blog, or follow her on Twitter. And if you’d like to build your own Rosie, Alan has provided code and tutorials for his projects too. Thanks, Alan!
The trick in accurately tracking a person with this method is finding out what kind of activity they’re performing. Whether they’re walking, driving a car, or riding in a train or airplane, it’s pretty easy to figure out when you know what you’re looking for.
The sensors can determine how fast a person is traveling and what kind of movements they make. Moving at a slow pace in one direction indicates walking. Going a little bit quicker but turning at 90-degree angles means driving. Faster yet, we’re in train or airplane territory. Those are easy to figure out based on speed and air pressure.
After the app determines what you’re doing, it uses the information it collects from the sensors. The accelerometer relays your speed, the magnetometer tells your relation to true north, and the barometer offers up the air pressure around you and compares it to publicly available information. It checks in with The Weather Channel to compare air pressure data from the barometer to determine how far above sea level you are. Google Maps and data offered by the US Geological Survey Maps provide incredibly detailed elevation readings.
Once it has gathered all of this information and determined the mode of transportation you’re currently taking, it can then begin to narrow down where you are. For flights, four algorithms begin to estimate the target’s location and narrow down the possibilities until the error rate hits zero.
If you’re driving, it can be even easier. The app knows the time zone you’re in based on the information your phone has provided to it. It then accesses information from your barometer and magnetometer and compares it to information from publicly available maps and weather reports. After that, it keeps track of the turns you make. With each turn, the possible locations whittle down until it pinpoints exactly where you are.
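To make the "each turn whittles down the possibilities" idea concrete, here is a toy sketch. It is not the researchers' algorithm: the grid-style road map, the turn encoding (`S`, `L`, `R`), and all function names are hypothetical, and a real attack would also use the heading, timing, and pressure data described above.

```python
HEADINGS = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # N, E, S, W as (dx, dy)
TURNS = {"S": 0, "R": 1, "L": -1}  # straight, right, left as heading shifts


def build_roads(segments):
    """Store each two-way road segment between adjacent intersections."""
    roads = set()
    for a, b in segments:
        roads.add((a, b))
        roads.add((b, a))
    return roads


def step(state, turn, roads):
    """Apply one observed turn; return the new state, or None if no road allows it."""
    (x, y), heading = state
    new_heading = (heading + TURNS[turn]) % 4
    dx, dy = HEADINGS[new_heading]
    nxt = (x + dx, y + dy)
    if ((x, y), nxt) in roads:
        return (nxt, new_heading)
    return None


def narrow(roads, turns):
    """Start from every (intersection, heading) pair; keep those consistent with each turn."""
    nodes = {a for a, _ in roads}
    candidates = {(n, h) for n in nodes for h in range(4)}
    history = [len(candidates)]
    for t in turns:
        moved = (step(c, t, roads) for c in candidates)
        candidates = {s for s in moved if s is not None}
        history.append(len(candidates))
    return candidates, history


if __name__ == "__main__":
    # A small made-up map: one city block with a dead-end street attached.
    roads = build_roads([
        ((0, 0), (1, 0)), ((1, 0), (2, 0)), ((2, 0), (2, 1)),
        ((2, 1), (1, 1)), ((1, 1), (1, 0)),
    ])
    final, history = narrow(roads, ["S", "S", "L", "L"])
    print("Candidates remaining after each turn:", history)
    print("Possible final states:", final)
```

Even on this tiny map, the set of consistent positions shrinks quickly, because most starting points have no road matching the observed sequence of turns.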
To demonstrate how accurate it is, researchers did a test run in Philadelphia. It only took 12 turns before the app knew exactly where the car was.
This is a good example of how powerful synthesizing information from disparate data sources can be. We spend too much time worried about individual data collection systems, and not enough about analysis techniques of those systems.
Today we’re launching Amazon Time Sync Service, a time synchronization service delivered over Network Time Protocol (NTP) which uses a fleet of redundant satellite-connected and atomic clocks in each region to deliver a highly accurate reference clock. This service is provided at no additional charge and is immediately available in all public AWS regions to all instances running in a VPC.
You can access the service via the link local 169.254.169.123 IP address. This means you don’t need to configure external internet access and the service can be securely accessed from within your private subnets.
Chrony is a different implementation of NTP than what ntpd uses and it’s able to synchronize the system clock faster and with better accuracy than ntpd. I’d recommend using Chrony unless you have a legacy reason to use ntpd.
Installing and configuring chrony on Amazon Linux is as simple as:
sudo yum erase ntp*
sudo yum -y install chrony
sudo service chronyd start
Alternatively, just modify your existing NTP config by adding the line server 169.254.169.123 prefer iburst.
On Windows you can run the following commands in PowerShell or a command prompt:
net stop w32time
w32tm /config /syncfromflags:manual /manualpeerlist:"169.254.169.123"
w32tm /config /reliable:yes
net start w32time
Time is hard. Science, and society, measure time with respect to the International Celestial Reference Frame (ICRF), which is computed using long baseline interferometry of distant quasars, GPS satellite orbits, and laser ranging of the moon (cool!). Irregularities in Earth’s rate of rotation cause UTC to drift from time with respect to the ICRF. To address this clock drift the International Earth Rotation and Reference Systems Service (IERS) occasionally introduces an extra second into UTC to keep it within 0.9 seconds of real time.
Leap seconds are known to cause application errors and this can be a concern for many savvy developers and systems administrators. The 169.254.169.123 clock smooths out leap seconds over some period of time (commonly called leap smearing), which makes it easy for your applications to deal with leap seconds.
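As a rough illustration of what leap smearing means, the sketch below spreads one inserted second linearly across a window centred on the leap instant. The 24-hour window, the linear shape, and the function name are assumptions for the example; the post above does not specify how the service actually smears.

```python
from datetime import datetime, timedelta, timezone

SMEAR_WINDOW = timedelta(hours=24)  # assumed window length, not stated in the post


def smear_offset(now, leap_instant, window=SMEAR_WINDOW):
    """Fraction of the extra second (0.0 to 1.0) already absorbed at `now`.

    The smear is linear and centred on `leap_instant`, the moment the
    inserted second would otherwise take effect.
    """
    start = leap_instant - window / 2
    elapsed = (now - start) / window  # timedelta / timedelta gives a float
    return min(1.0, max(0.0, elapsed))


if __name__ == "__main__":
    leap = datetime(2017, 1, 1, tzinfo=timezone.utc)  # example leap instant
    for hours in (-12, -6, 0, 6, 12):
        t = leap + timedelta(hours=hours)
        print(t.isoformat(), f"{smear_offset(t, leap):.2f} s absorbed")
```

Because the clock is slowed by a tiny amount throughout the window, applications never see a repeated or 61st second.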
This timely update should provide immediate benefits to anyone previously relying on an external time synchronization service.
When James Puderer moved to Lima, Peru, his roadside runs left a rather nasty taste in his mouth. Hit by the pollution from old diesel cars in the area, he decided to monitor the air quality in his new city using Raspberry Pis and the abundant taxis as his tech carriers.
With the onboard tech, the device collects data on longitude, latitude, humidity, temperature, pressure, and airborne particle count, feeding it back to an Android Things datalogger. This data is then pushed to Google IoT Core, where it can be remotely accessed.
Next, the data is processed by Google Dataflow and turned into a BigQuery table. Users can then visualize the collected measurements. And while James uses Google Maps to analyse his data, there are many tools online that will allow you to organise and study your figures depending on what final result you’re hoping to achieve.
James hopped in a taxi and took his monitor on the road, collecting results throughout the journey
James has provided the complete build process, including all tech ingredients and code, on his Hackster.io project page, and urges makers to create their own air quality monitor for their local area. He also plans on building upon the existing design by adding a 12V power hookup for connecting to the taxi, functioning lights within the sign, and companion apps for drivers.
Sensing the world around you
We’ve seen a wide variety of Raspberry Pi projects using sensors to track the world around us, such as Kasia Molga’s Human Sensor costume series, which reacts to air pollution by lighting up, and Clodagh O’Mahony’s Social Interaction Dress, which she created to judge how conversation and physical human interaction can be scored and studied.
Kasia Molga’s Human Sensor — a collection of hi-tech costumes that react to air pollution within the wearer’s environment.
Many people also build their own Pi-powered weather stations, or use the Raspberry Pi Oracle Weather Station, to measure and record conditions in their towns and cities from the roofs of schools, offices, and homes.
Have you incorporated sensors into your Raspberry Pi projects? Share your builds in the comments below or via social media by tagging us.
I play Pokémon Go. (There, I’ve admitted it.) One of the interesting aspects of the game I’ve been watching is how the game’s publisher, Niantic, deals with cheaters.
There are three basic types of cheating in Pokémon Go. The first is botting, where a computer plays the game instead of a person. The second is spoofing, which is faking GPS to convince the game that you’re somewhere you’re not. These two cheats are often used together — and you see the results in the many high-level accounts for sale on the Internet. The third type of cheating is the use of third-party apps like trackers to get extra information about the game.
None of this would matter if everyone played independently. The only reason any player cares about whether other players are cheating is that there is a group aspect of the game: gym battling. Everyone’s enjoyment of that part of the game is affected by cheaters who can pretend to be where they’re not, especially if they have lots of powerful Pokémon that they collected effortlessly.
Niantic has been trying to deal with this problem since the game debuted, mostly by banning accounts when it detects cheating. Its initial strategy was basic — algorithmically detecting impossibly fast travel between physical locations or super-human amounts of playing, and then banning those accounts — with limited success. The limiting factor in all of this is false positives. While Niantic wants to stop cheating, it doesn’t want to block or limit any legitimate players. This makes it a very difficult problem, and contributes to the balance in the attacker/defender arms race.
Recently, Niantic implemented two new anti-cheating measures. The first is machine learning to detect cheaters. About this, we know little. The second is to limit the functionality of cheating accounts rather than ban them outright, making it harder for cheaters to know when they’ve been discovered.
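The "impossibly fast travel" check can be sketched in a few lines. This is only an illustration of the general idea, not Niantic's method: the `Checkin` record, the speed threshold, and the function names are all hypothetical.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0
MAX_SPEED_KMH = 1000.0  # hypothetical threshold, roughly airliner cruising speed


@dataclass
class Checkin:
    timestamp: float  # seconds since the epoch
    lat: float        # degrees
    lon: float        # degrees


def haversine_km(a, b):
    """Great-circle distance between two check-ins, in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a.lat, a.lon, b.lat, b.lon))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def impossible_jumps(checkins, max_speed_kmh=MAX_SPEED_KMH):
    """Return consecutive pairs whose implied speed exceeds the threshold."""
    ordered = sorted(checkins, key=lambda c: c.timestamp)
    flagged = []
    for prev, cur in zip(ordered, ordered[1:]):
        hours = (cur.timestamp - prev.timestamp) / 3600
        distance = haversine_km(prev, cur)
        # Same-instant check-ins at different places are treated as impossible.
        too_fast = hours > 0 and distance / hours > max_speed_kmh
        if (hours == 0 and distance > 0) or too_fast:
            flagged.append((prev, cur))
    return flagged
```

The threshold is where the false-positive trade-off shows up: set it too low and a player who lands from a flight and opens the game gets flagged; set it too high and a spoofer who paces their jumps slips through.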
“This is may very well be the beginning of Niantic’s machine learning approach to active bot countering,” user Dronpes writes on The Silph Road subreddit. “If the parameters for a shadowban are constantly adjusted server-side, as they can now easily be, then Niantic’s machine learning engineers can train their detection (classification) algorithms in ever-improving, ever more aggressive ways, and botters will constantly be forced to re-evaluate what factors may be triggering the detection.”
One of the expected future features in the game is trading. Creating a market for rare or powerful Pokémon would add a huge additional financial incentive to cheat. Unless Niantic can effectively prevent botting and spoofing, it’s unlikely to implement that feature.
Cheating detection in virtual reality games is going to be a constant problem as these games become more popular, especially if there are ways to monetize the results of cheating. This means that cheater detection will continue to be a critical component of these games’ success. Anything Niantic learns in Pokémon Go will be useful in whatever games come next.
Mystic, level 39 — if you must know.
And, yes, I know the game works by tracking your location. I’m all right with that. As I repeatedly say, Internet privacy is all about trade-offs.