Tag Archives: google maps

Facebook and Cambridge Analytica

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/03/facebook_and_ca.html

In the wake of the Cambridge Analytica scandal, news articles and commentators have focused on what Facebook knows about us. A lot, it turns out. It collects data from our posts, our likes, our photos, things we type and delete without posting, and things we do while not on Facebook and even when we’re offline. It buys data about us from others. And it can infer even more: our sexual orientation, political beliefs, relationship status, drug use, and other personality traits — even if we didn’t take the personality test that Cambridge Analytica developed.

But for every article about Facebook’s creepy stalker behavior, thousands of other companies are breathing a collective sigh of relief that it’s Facebook and not them in the spotlight. Because while Facebook is one of the biggest players in this space, there are thousands of other companies that spy on and manipulate us for profit.

Harvard Business School professor Shoshana Zuboff calls it “surveillance capitalism.” And as creepy as Facebook is turning out to be, the entire industry is far creepier. It has existed in secret far too long, and it’s up to lawmakers to force these companies into the public spotlight, where we can all decide if this is how we want society to operate and — if not — what to do about it.

There are 2,500 to 4,000 data brokers in the United States whose business is buying and selling our personal data. Last year, Equifax was in the news when hackers stole personal information on 150 million people, including Social Security numbers, birth dates, addresses, and driver’s license numbers.

You certainly didn’t give it permission to collect any of that information. Equifax is one of those thousands of data brokers, most of them you’ve never heard of, selling your personal information without your knowledge or consent to pretty much anyone who will pay for it.

Surveillance capitalism takes this one step further. Companies like Facebook and Google offer you free services in exchange for your data. Google’s surveillance isn’t in the news, but it’s startlingly intimate. We never lie to our search engines. Our interests and curiosities, hopes and fears, desires and sexual proclivities, are all collected and saved. Add to that the websites we visit that Google tracks through its advertising network, our Gmail accounts, our movements via Google Maps, and what it can collect from our smartphones.

That phone is probably the most intimate surveillance device ever invented. It tracks our location continuously, so it knows where we live, where we work, and where we spend our time. It’s the first and last thing we check in a day, so it knows when we wake up and when we go to sleep. We all have one, so it knows who we sleep with. Uber used just some of that information to detect one-night stands; your smartphone provider and any app you allow to collect location data know a lot more.

Surveillance capitalism drives much of the internet. It’s behind most of the “free” services, and many of the paid ones as well. Its goal is psychological manipulation, in the form of personalized advertising to persuade you to buy something or do something, like vote for a candidate. And while the individualized profile-driven manipulation exposed by Cambridge Analytica feels abhorrent, it’s really no different from what every company wants in the end. This is why all your personal information is collected, and this is why it is so valuable. Companies that can understand it can use it against you.

None of this is new. The media has been reporting on surveillance capitalism for years. In 2015, I wrote a book about it. Back in 2010, the Wall Street Journal published an award-winning two-year series about how people are tracked both online and offline, titled “What They Know.”

Surveillance capitalism is deeply embedded in our increasingly computerized society, and if the extent of it came to light there would be broad demands for limits and regulation. But because this industry can largely operate in secret, only occasionally exposed after a data breach or investigative report, we remain mostly ignorant of its reach.

This might change soon. In 2016, the European Union passed the comprehensive General Data Protection Regulation, or GDPR. The details of the law are far too complex to explain here, but some of the things it mandates are that personal data of EU citizens can only be collected and saved for “specific, explicit, and legitimate purposes,” and only with explicit consent of the user. Consent can’t be buried in the terms and conditions, nor can it be assumed unless the user opts in. This law will take effect in May, and companies worldwide are bracing for its enforcement.

Because pretty much all surveillance capitalism companies collect data on Europeans, this will expose the industry like nothing else. Here’s just one example. In preparation for this law, PayPal quietly published a list of over 600 companies it might share your personal data with. What will it be like when every company has to publish this sort of information, and explicitly explain how it’s using your personal data? We’re about to find out.

In the wake of this scandal, even Mark Zuckerberg said that his industry probably should be regulated, although he’s certainly not wishing for the sorts of comprehensive regulation the GDPR is bringing to Europe.

He’s right. Surveillance capitalism has operated without constraints for far too long. And advances in both big data analysis and artificial intelligence will make tomorrow’s applications far creepier than today’s. Regulation is the only answer.

The first step to any regulation is transparency. Who has our data? Is it accurate? What are they doing with it? Who are they selling it to? How are they securing it? Can we delete it? I don’t see any hope of Congress passing a GDPR-like data protection law anytime soon, but it’s not too far-fetched to demand laws requiring these companies to be more transparent in what they’re doing.

One of the responses to the Cambridge Analytica scandal is that people are deleting their Facebook accounts. It’s hard to do right, and doesn’t do anything about the data that Facebook collects about people who don’t use Facebook. But it’s a start. The market can put pressure on these companies to reduce their spying on us, but it can only do that if we force the industry out of its secret shadows.

This essay previously appeared on CNN.com.

EDITED TO ADD (4/2): Slashdot thread.

It's Not Just Facebook: Thousands of Companies Are Tracking You

Post Syndicated from Григор original http://www.gatchev.info/blog/?p=2127


Today I came across a CNN article by Bruce Schneier, an IT security expert whom I respect enormously. After brief deliberation I decided to translate it and post it here. Yes, it is a copyright violation, but it is so important, and so necessary for everyone to know, that I am willing to take the risk.

—-

Awakened by the Cambridge Analytica scandal, news articles and commentators have focused on what Facebook knows about us. It turns out to be a lot. It collects data from our posts, our likes, our photos. From the things we start typing but give up on posting. Even from things we do while not logged into it, or while we're offline. It buys information about us from other companies. And from all that it can infer much more about us: our sexual orientation, our political views, whether we're in a relationship, what medications we take, and many personality traits. Even if we never filled in the personality test that Cambridge Analytica developed. (The test through which that company siphoned off our data and our friends' data. – Григор)

But with every article describing how creepily Facebook tracks us, thousands of other companies breathe a sigh of relief that those articles discuss Facebook and not them. Because while it's true that Facebook is one of the biggest players in this game, thousands of other companies also spy on us and manipulate us for their own financial gain.

Harvard Business School professor Shoshana Zuboff calls this "surveillance capitalism." As frightening as Facebook is turning out to be, this entire industry is far more frightening. It has already existed in secret for far too long, and it is high time lawmakers brought these companies out in front of the public. So that we can all decide whether this is how we want our society to operate, and if not, what should be done about it.

In the United States there are between 2,500 and 4,000 brokers whose business is buying and selling our personal data. A year ago Equifax made the news when hackers stole from it the personal information of 150 million people, including their Social Security numbers, birth dates, addresses, and driver's license numbers. (That is, absolutely everything that identifies a person in the US. – Григор)

You most certainly never gave that company permission to collect your personal information. Equifax is one of those thousands of data-broker firms, most of which you have never even heard of, which sell your personal information without your knowledge or consent to anyone who pays.

Surveillance capitalism takes things a step further. Companies like Google and Facebook offer you free services in exchange for your information. Google's surveillance doesn't make the news, but it is startlingly intimate. We never lie to the search engines we use. Our interests and curiosities, our hopes and fears, our desires and sexual attractions: all of it is collected and stored. Add to that the tracking of which websites we visit, which Google does through its advertising network, our Gmail accounts, our movements via Google Maps, and whatever it can gather from our smartphones.

The phone is probably the most effective tracking device ever created. It continuously tracks where we are, so it knows where we live, where we work, and where we spend our time. It is the first and last thing we look at during the day, so it knows when we wake up and when we fall asleep. We all have one, so it also knows who we sleep with. Uber used part of this information to detect one-night stands; your mobile carrier and every app on your phone with access to location services know far more.

Surveillance capitalism sustains no small part of the Internet. It is behind most of the "free" services, and behind many paid ones too. Its goal is to manipulate you psychologically, for example through targeted advertising, in order to persuade you to buy something or do something, such as vote for a particular candidate. The mass manipulation based on individual profiles that Cambridge Analytica was caught carrying out may sound repugnant, but it is what every company ultimately strives to achieve. That is exactly why your personal information is collected, and that is what makes it valuable. Companies that understand it can use it against you.

None of this is new. The media have been reporting on surveillance capitalism for years. In 2015 I wrote a book on the subject. Back in 2010, the Wall Street Journal published an award-winning two-year series of articles on how people are tracked online and offline, titled "What They Know."

Surveillance capitalism is deeply embedded in our increasingly computerized society, and if its full extent came to light, there would be mass pressure for limits and regulation. But because this industry operates mostly in secret, with only the occasional data theft or investigative article leaking out, most of us remain unaware of how far it reaches.

This may change soon. In 2016 the European Union adopted the sweeping General Data Protection Regulation (GDPR). The details of this law are too complex to explain here. Some of the things it mandates are that the personal data of European citizens may be collected and stored only for "specific, explicitly stated, and legitimate purposes," and only with the explicit consent of the user. Consent cannot be buried in the terms of use (EULA) of this or that service, nor can it be assumed to have been given unless the user explicitly opts in. The law comes into force in May, and companies around the world are preparing to comply with it.

Since practically every surveillance-capitalism company collects data on Europeans, this will expose the industry like nothing before. Here is just one example. In preparation for this law, PayPal quietly published a list of more than 600 companies with which your information may be shared. What will happen when every company has to publish this kind of information and explain explicitly how it uses our personal data? It looks like we are about to find out.

When this scandal erupted, even Mark Zuckerberg said that his industry should probably be regulated. He was hardly wishing, though, for the kind of serious, principled regulation that the GDPR is bringing into force in Europe.

He is right. Surveillance capitalism has operated unchecked for far too long. And advances both in the analysis of large volumes of personal data and in artificial intelligence will make its future applications far more frightening than today's. The only remedy is regulation.

The first step toward any regulation is transparency. Who has our data? Is it accurate? What are they doing with it? Are they selling it? How do they protect it? Can we make them delete it? I see no hope of the US Congress passing a GDPR-like data protection law in the foreseeable future, but it is not unthinkable to introduce laws that oblige these companies to be more transparent about what they are doing.

One of the consequences of the Cambridge Analytica scandal is that some people deleted their Facebook accounts. That is hard to do properly, and it does not erase the data Facebook collects about people who have no account with it. But it is a start of sorts. The market can put pressure on these companies to limit their tracking of us, but only if we force this industry out of the shadows and into the light.

—-

I have written on this subject myself, back in 2012: here and here. I don't know whether I helped anyone understand anything back then; judging by the comments under those posts, probably not. I don't know whether I will help anyone now either. But I feel obliged to try.

Judging from my own observations, not a single one of the online companies that let you delete your information actually deletes it. I know of cases where information from "deleted" accounts was later found in packages of personal data sold on. I also know of companies that claim they store no personal information at all, yet in fact store, and sell to anyone solvent and discreet, every bit of it they manage to get hold of. So I am skeptical that regulation will achieve all that much.

There is, however, something we can do for ourselves: simply stop using whichever of these data collectors' services we can do without. Life is hard without a mobile phone, but do you really need accounts on every social network you have ever heard of? And so on.

I hope that after this scandal, the projects for a decentralized, privacy-preserving alternative to Facebook will get a new push. I also hope that at least that 1% of people who are not perfectly spherical idiots in a vacuum will start taking some measures to protect themselves and their endeavors.

Tracking People Without GPS

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/12/tracking_people_5.html

Interesting research:

The trick in accurately tracking a person with this method is finding out what kind of activity they’re performing. Whether they’re walking, driving a car, or riding in a train or airplane, it’s pretty easy to figure out when you know what you’re looking for.

The sensors can determine how fast a person is traveling and what kind of movements they make. Moving at a slow pace in one direction indicates walking. Going a little bit quicker but turning at 90-degree angles means driving. Faster yet, we’re in train or airplane territory. Those are easy to figure out based on speed and air pressure.

After the app determines what you’re doing, it uses the information it collects from the sensors. The accelerometer relays your speed, the magnetometer tells your relation to true north, and the barometer offers up the air pressure around you and compares it to publicly available information. It checks in with The Weather Channel to compare air pressure data from the barometer to determine how far above sea level you are. Google Maps and data offered by the US Geological Survey Maps provide incredibly detailed elevation readings.

Once it has gathered all of this information and determined the mode of transportation you’re currently taking, it can then begin to narrow down where you are. For flights, four algorithms begin to estimate the target’s location and narrow down the possibilities until the error rate hits zero.

If you’re driving, it can be even easier. The app knows the time zone you’re in based on the information your phone has provided to it. It then accesses information from your barometer and magnetometer and compares it to information from publicly available maps and weather reports. After that, it keeps track of the turns you make. With each turn, the possible locations whittle down until it pinpoints exactly where you are.

To demonstrate how accurate it is, researchers did a test run in Philadelphia. It only took 12 turns before the app knew exactly where the car was.

This is a good example of how powerful synthesizing information from disparate data sources can be. We spend too much time worried about individual data collection systems, and not enough about analysis techniques of those systems.
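
For readers curious what the first step might look like in code, here is a toy sketch of the activity-classification heuristic described in the excerpt above. The thresholds and the function itself are invented for illustration and are not taken from the paper.

    # Toy classifier: guess the mode of transport from speed and turning behaviour.
    def classify_activity(speed_m_s, turn_angles_deg):
        sharp_turns = sum(1 for a in turn_angles_deg if 80 <= abs(a) <= 100)
        if speed_m_s < 3:
            return "walking"
        if speed_m_s < 40 and sharp_turns > 0:
            return "driving"      # street grids produce roughly 90-degree turns
        if speed_m_s < 90:
            return "train"        # the excerpt notes air pressure also helps here
        return "airplane"

    print(classify_activity(1.2, [5, -10]))        # walking
    print(classify_activity(15.0, [90, -88, 92]))  # driving
    print(classify_activity(200.0, []))            # airplane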

Research paper.

GDPR – A Practical Guide For Developers

Post Syndicated from Bozho original https://techblog.bozho.net/gdpr-practical-guide-developers/

You’ve probably heard about GDPR, the new European data protection regulation that applies to practically everyone. Especially if you are working in a big company, it’s most likely that there’s already a process for getting your systems in compliance with the regulation.

The regulation is basically a law that must be followed in all European countries, but it also applies to non-EU companies that have users or customers in the EU. So that’s most companies. I will not write yet another “12 facts about GDPR” or “7 myths about GDPR” post/whitepaper, as those are often aimed at managers or legal people. Instead, I’ll focus on what GDPR means for developers.

Why am I qualified to do that? A few reasons – I was an advisor to the deputy prime minister of an EU country, and because of that I’ve both been exposed to legislation and written some myself. I’m familiar with the “legalese” and with how the regulatory framework operates in general. I’m also a privacy advocate, and I’ve written about GDPR-related topics in the past, i.e. “before it was cool” (protecting sensitive data, the right to be forgotten). And finally, I’m currently working on a project that (among other things) aims to help with covering some GDPR aspects.

I’ll try to be a bit more comprehensive this time and cover as many aspects of the regulation that concern developers as I can. And while developers will mostly be concerned with how the systems they are working on have to change, it’s not unlikely that a less informed manager will storm in one day in late spring, realize GDPR comes into force tomorrow, and ask, “what should we do to get our system/website compliant?”

The rights of the user/client (referred to as “data subject” in the regulation) that I think are relevant for developers are: the right to erasure (the right to be forgotten/deleted from the system), the right to restriction of processing (you still keep the data, but mark it as “restricted” and don’t touch it without further consent by the user), the right to rectification (the ability to get personal data fixed), the right to be informed (getting human-readable information, rather than long terms and conditions), the right of access (the user should be able to see all the data you have about them), and the right to data portability (the user should be able to get a machine-readable dump of their data).

Additionally, the relevant basic principles are: data minimization (don’t collect more data than necessary) and integrity and confidentiality (all the security measures you can think of to protect the data, plus measures to guarantee that it has not been inappropriately modified).

Even further, the regulation requires certain processes to be in place within an organization (of more than 250 employees or if a significant amount of data is processed), and those include keeping a record of all types of processing activities carried out, including transfers to processors (3rd parties), which includes cloud service providers. None of the other requirements of the regulation have an exception depending on the organization size, so “I’m small, GDPR does not concern me” is a myth.

It is important to know what “personal data” is. Basically, it’s every piece of data that can be used to uniquely identify a person or data that is about an already identified person. It’s data that the user has explicitly provided, but also data that you have collected about them from either 3rd parties or based on their activities on the site (what they’ve been looking at, what they’ve purchased, etc.)

Having said that, I’ll list a number of features that will have to be implemented and some hints on how to do that, followed by some do’s and don’ts.

  • “Forget me” – you should have a method that takes a userId and deletes all personal data about that user (in case it has been collected on the basis of consent, and not due to contract enforcement or legal obligation). It is actually useful for integration tests to have that feature (to clean up after the test), but it may be hard to implement depending on the data model. In a regular data model, deleting a record may be easy, but some foreign keys may be violated. That means you have two options – either make sure you allow nullable foreign keys (for example an order usually has a reference to the user that made it, but when the user requests their data be deleted, you can set the userId to null; see the sketch after this list), or make sure you delete all related data (e.g. via cascades). This may not be desirable, e.g. if the order is used to track available quantities or for accounting purposes. It’s a bit trickier for event-sourcing data models, or in extreme cases, ones that include some sort of blockchain/hash chain/tamper-evident data structure. With event sourcing you should be able to remove a past event and re-generate intermediate snapshots. For blockchain-like structures – be careful what you put in there and avoid putting personal data of users. There is an option to use a chameleon hash function, but that’s suboptimal. Overall, you must constantly think of how you can delete the personal data. And “our data model doesn’t allow it” isn’t an excuse.
  • Notify 3rd parties for erasure – deleting things from your system is one thing, but you are also obligated to inform all third parties that you have pushed that data to. So if you have sent personal data to, say, Salesforce, Hubspot, Twitter, or any cloud service provider, you should call an API of theirs that allows for the deletion of personal data. If you are such a provider, obviously, your “forget me” endpoint should be exposed. Calling the 3rd party APIs to remove data is not the full story, though. You also have to make sure the information does not appear in search results. Now, that’s tricky, as Google doesn’t have an API for removal, only a manual process. Fortunately, it’s only about public profile pages that are crawlable by Google (and other search engines, okay…), but you still have to take measures. Ideally, you should make the deleted profile’s page return a 404 HTTP status, so that it gets dropped from search engines’ indexes.
  • Restrict processing – in your admin panel where there’s a list of users, there should be a button “restrict processing”. The user settings page should also have that button. When clicked (after reading the appropriate information), it should mark the profile as restricted. That means it should no longer be visible to the back-office staff, or publicly. You can implement that with a simple “restricted” flag in the users table and a few if-clauses here and there.
  • Export data – there should be another button – “export data”. When clicked, the user should receive all the data that you hold about them. What exactly that data includes depends on the particular use case. Usually it’s at least the data that you delete with the “forget me” functionality, but it may include additional data (e.g. the orders the user has made may not be deleted, but they should be included in the dump). The structure of the dump is not strictly defined, but my recommendation would be to reuse schema.org definitions as much as possible, for either JSON or XML. If the data is simple enough, a CSV/XLS export would also be fine. Sometimes data export can take a long time, so the button can trigger a background process, which then notifies the user via email when their data is ready (Twitter, for example, does that already – you can request all your tweets and get them after a while).
  • Allow users to edit their profile – this seems an obvious rule, but it isn’t always followed. Users must be able to fix all data about them, including data that you have collected from other sources (e.g. using a “login with Facebook” you may have fetched their name and address). Rule of thumb – all the fields in your “users” table should be editable via the UI. Technically, rectification can be done via a manual support process, but that’s normally more expensive for a business than just having the form to do it. There is one other scenario, however: when you’ve obtained the data from other sources (i.e. the user hasn’t provided their details to you directly). In that case there should still be a page where they can identify themselves somehow (via email and/or SMS confirmation) and get access to the data about them.
  • Consent checkboxes – this is in my opinion the biggest change that the regulation brings. “I accept the terms and conditions” would no longer be sufficient to claim that the user has given their consent for processing their data. So, for each particular processing activity there should be a separate checkbox on the registration (or user profile) screen. You should keep these consent checkboxes in separate columns in the database (one way to model them is shown in the sketch after this list), and let the users withdraw their consent (by unchecking these checkboxes from their profile page – see the previous point). Ideally, these checkboxes should come directly from the register of processing activities (if you keep one). Note that the checkboxes should not be preselected, as this does not count as “consent”.
  • Re-request consent – if the consent users have given was not clear (e.g. if they simply agreed to terms & conditions), you’d have to re-obtain that consent. So prepare functionality for mass-emailing your users, asking them to go to their profile page and check the checkboxes for the personal data processing activities that you carry out.
  • “See all my data” – this is very similar to the “Export” button, except the data should be displayed in the regular UI of the application rather than in an XML/JSON export. For example, Google Maps shows you your location history – all the places that you’ve been to. It is a good implementation of the right of access. (Though Google is very far from perfect where privacy is concerned.)
  • Age checks – you should ask for the user’s age, and if the user is a child (below 16), you should ask for parent permission. There’s no clear way to do that, but my suggestion is to introduce a flow where the child specifies the email of a parent, who can then confirm. Obviously, children will just cheat with their birthdate, or provide a fake parent email, but you will most likely have done your job according to the regulation (this is one of the “wishful thinking” aspects of the regulation).
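
To make a few of the items above more concrete (“Forget me”, “Restrict processing”, and the consent columns), here is a minimal sketch using SQLite and the Python standard library. The table layout, the column names, and the choice to keep orders but null out their user reference are illustrative assumptions, not a prescribed schema.

    # Minimal sketch (SQLite + stdlib only) of three of the features above:
    # per-purpose consent columns, a "restrict processing" flag, and a
    # "forget me" operation that nulls out foreign keys instead of breaking them.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE users (
        id         INTEGER PRIMARY KEY,
        email      TEXT,
        full_name  TEXT,
        restricted INTEGER NOT NULL DEFAULT 0,              -- "restrict processing" flag
        consent_marketing_email INTEGER NOT NULL DEFAULT 0, -- one column per processing activity
        consent_analytics       INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE orders (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),               -- nullable on purpose
        total   REAL
    );
    """)

    def restrict_processing(user_id):
        """Mark the profile as restricted; application queries must honour this flag."""
        conn.execute("UPDATE users SET restricted = 1 WHERE id = ?", (user_id,))
        conn.commit()

    def forget_me(user_id):
        """Erase personal data collected on the basis of consent.

        Orders are kept (e.g. for accounting), but their link to the person is
        severed by nulling the foreign key; the user row itself is deleted.
        """
        conn.execute("UPDATE orders SET user_id = NULL WHERE user_id = ?", (user_id,))
        conn.execute("DELETE FROM users WHERE id = ?", (user_id,))
        conn.commit()

    # Usage example
    conn.execute("INSERT INTO users (id, email, full_name, consent_analytics) VALUES (1, 'a@example.com', 'Alice', 1)")
    conn.execute("INSERT INTO orders (id, user_id, total) VALUES (10, 1, 42.0)")
    forget_me(1)
    print(conn.execute("SELECT user_id, total FROM orders").fetchall())  # [(None, 42.0)]

In a real system the same operations would sit behind authenticated endpoints, and the “forget me” handler would also trigger the third-party notifications described above.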

Now some “do’s”, which are mostly about the technical measures needed to protect personal data. They may be more “ops” than “dev”, but often the application also has to be extended to support them. I’ve listed most of what I could think of in a previous post.

  • Encrypt the data in transit. That means that communication between your application layer and your database (or your message queue, or whatever component you have) should be over TLS. The certificates could be self-signed (and possibly pinned), or you could have an internal CA. Different databases have different configurations; just google “X encrypted connections”. Some databases need gossiping among the nodes – that should also be configured to use encryption.
  • Encrypt the data at rest – this again depends on the database (some offer table-level encryption), but it can also be done at the machine level, e.g. using LUKS. The private key can be stored in your infrastructure, or in some cloud service like AWS KMS.
  • Encrypt your backups – kind of obvious
  • Implement pseudonymisation – the most obvious use case is when you want to use production data on the test/staging servers. You should change the personal data to some “pseudonym” so that people cannot be identified. You can do the same when you push data out for machine-learning purposes (to third parties or not). Technically, that could mean that your User object has a “pseudonymize” method which applies hash+salt/bcrypt/PBKDF2 to the fields that can be used to identify a person (see the sketch after this list).
  • Protect data integrity – this is a very broad thing, and could simply mean “have authentication mechanisms for modifying data”. But you can do something more, even as simple as a checksum, or a more complicated solution (like the one I’m working on). It depends on the stakes, on the way data is accessed, on the particular system, etc. The checksum can be in the form of a hash of all the data in a given database record, which should be updated each time the record is updated through the application (the sketch after this list includes a minimal example). It isn’t a strong guarantee, but it is at least something.
  • Have your GDPR register of processing activities in something other than Excel – Article 30 says that you should keep a record of all the types of activities that you use personal data for. That sounds like bureaucracy, but it may be useful – you will be able to link certain aspects of your application with that register (e.g. the consent checkboxes, or your audit trail records). It wouldn’t take much time to implement a simple register, but the business requirements for that should come from whoever is responsible for the GDPR compliance. But you can advise them that having it in Excel won’t make it easy for you as a developer (imagine having to fetch the Excel file internally, so that you can parse it and implement a feature). Such a register could be a microservice/small application deployed separately in your infrastructure.
  • Log access to personal data – every read operation on a personal data record should be logged, so that you know who accessed what and for what purpose
  • Register all API consumers – you shouldn’t allow anonymous API access to personal data. I’d say you should request the organization name and contact person for each API user upon registration, and add those to the data processing register. Note: some have treated Article 30 as a requirement to keep an audit log. I don’t think it is saying that – instead it requires companies with 250+ employees to keep a register of the types of processing activities (i.e. what you use the data for). There are other articles in the regulation that imply that keeping an audit log is a best practice (for protecting the integrity of the data, as well as for making sure it hasn’t been processed without a valid reason).
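
As a rough illustration of the pseudonymisation and checksum points above, here is a small sketch; the field names, the keyed-hash pseudonym, and the environment-variable key are assumptions made for the example, not requirements of the regulation.

    # Sketch of two of the measures above: pseudonymising identifying fields before
    # data leaves production (e.g. for a staging copy), and keeping a per-record
    # checksum as a weak integrity signal.
    import hashlib
    import hmac
    import json
    import os

    PSEUDONYM_KEY = os.environ.get("PSEUDONYM_KEY", "change-me").encode()

    def pseudonymise(value):
        """Replace an identifying value with a stable keyed hash (HMAC-SHA256)."""
        return hmac.new(PSEUDONYM_KEY, value.encode(), hashlib.sha256).hexdigest()[:16]

    def pseudonymise_user(record):
        out = dict(record)
        for field in ("email", "full_name", "phone"):  # fields that identify a person
            if field in out:
                out[field] = pseudonymise(out[field])
        return out

    def record_checksum(record):
        """Hash of the whole record; recompute and store it on every legitimate update."""
        canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode()).hexdigest()

    user = {"id": 1, "email": "a@example.com", "full_name": "Alice", "plan": "pro"}
    print(pseudonymise_user(user)["email"])  # stable pseudonym, not the real address
    print(record_checksum(user))             # store alongside the row, verify on read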

Finally, some “don’ts”.

  • Don’t use data for purposes that the user hasn’t agreed with – that’s supposed to be the spirit of the regulation. If you want to expose a new API to a new type of clients, or you want to use the data for some machine learning, or you decide to add ads to your site based on users’ behaviour, or sell your database to a 3rd party – think twice. I would imagine your register of processing activities could have a button to send notification emails to users to ask them for permission when a new processing activity is added (or if you use a 3rd party register, it should probably give you an API). So upon adding a new processing activity (and adding that to your register), mass email all users from whom you’d like consent.
  • Don’t log personal data – getting rid of personal data from log files (especially if they are shipped to a 3rd party service) can be tedious or even impossible. So log just identifiers if needed (a small example of redacting log messages follows this list). And make sure old log files are cleaned up, just in case.
  • Don’t put fields on the registration/profile form that you don’t need – it’s always tempting to add as many fields as the usability person/designer will agree to, but unless you absolutely need the data for delivering your service, you shouldn’t collect it. Names you should probably always collect, but unless you are delivering something, a home address or phone number is unnecessary.
  • Don’t assume 3rd parties are compliant – you are responsible if there’s a data breach in one of the 3rd parties (e.g. “processors”) to which you send personal data. So before you send data via an API to another service, make sure they have at least a basic level of data protection. If they don’t, raise a flag with management.
  • Don’t assume having ISO XXX makes you compliant – information security standards and even personal data standards are a good start and will probably cover 70% of what the regulation requires, but they are not sufficient – most of the things listed above are not covered in any of those standards.
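
And one small example for the “don’t log personal data” point: a logging filter that redacts anything that looks like an email address before it reaches the handlers. The regular expression and its placement are illustrative; a real system will likely need broader rules (names, phone numbers, internal identifiers).

    # Sketch: redact email addresses from log messages using the stdlib logging module.
    import logging
    import re

    EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

    class RedactPersonalData(logging.Filter):
        def filter(self, record):
            record.msg = EMAIL_RE.sub("<redacted-email>", str(record.msg))
            if record.args:
                record.args = tuple(EMAIL_RE.sub("<redacted-email>", str(a)) for a in record.args)
            return True

    logger = logging.getLogger("app")
    logger.addHandler(logging.StreamHandler())
    logger.addFilter(RedactPersonalData())
    logger.setLevel(logging.INFO)

    logger.info("password reset requested for a@example.com")
    # prints: password reset requested for <redacted-email>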

Overall, the purpose of the regulation is to make you take conscious decisions when processing personal data. It imposes best practices in a legal way. If you follow the above advice and design your data model, storage, data flows, and API calls with data protection in mind, then you shouldn’t worry about the huge fines that the regulation prescribes – they are for extreme cases, like Equifax. Regulators (data protection authorities) will most likely have some checklists you’d have to fit into, but if you follow best practices, that shouldn’t be an issue.

I think all of the above features can be implemented in a few weeks by a small team. Be suspicious when a big vendor offers you a generic plug-and-play “GDPR compliance” solution. GDPR is not just about the technical aspects listed above – it does have organizational/process implications. But also be suspicious if a consultant claims GDPR is complicated. It’s not – it relies on a few basic principles that are in fact best practices anyway. Just don’t ignore them.

The post GDPR – A Practical Guide For Developers appeared first on Bozho's tech blog.

Using taxis to monitor air quality in Peru

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/air-quality-peru/

When James Puderer moved to Lima, Peru, his roadside runs left a rather nasty taste in his mouth. Hit by the pollution from old diesel cars in the area, he decided to monitor the air quality in his new city using Raspberry Pis and the abundant taxis as his tech carriers.

Taxi Datalogger – Assembly

How to assemble the enclosure for my Taxi Datalogger project: https://www.hackster.io/james-puderer/distributed-air-quality-monitoring-using-taxis-69647e

Sensing air quality in Lima

Luckily for James, almost all taxis in Lima are equipped with the standard hollow vinyl roof sign seen in the video above, which makes them ideal for hacking.

Using a Raspberry Pi alongside various Adafruit tech including the BME280 Temperature/Humidity/Pressure Sensor and GPS Antenna, James created a battery-powered retrofit setup that fits snugly into the vinyl sign.

The schematic of the air quality monitor tech inside the taxi sign

With the onboard tech, the device collects data on longitude, latitude, humidity, temperature, pressure, and airborne particle count, feeding it back to an Android Things datalogger. This data is then pushed to Google IoT Core, where it can be remotely accessed.
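
James’s datalogger itself runs on Android Things, so the following Python loop is only a rough sketch of the kind of record such a device might assemble and publish; the field names follow the list above, and publish() is a stand-in for the MQTT client that talks to Google IoT Core.

    # Sketch: gather one reading per minute and publish it as JSON.
    import json
    import time

    def read_sensors():
        # Placeholder values; on the real device these come from the GPS antenna,
        # the BME280, and the particulate sensor.
        return {"lat": -12.0464, "lon": -77.0428, "temperature_c": 22.4,
                "humidity_pct": 78.0, "pressure_hpa": 1012.6, "particle_count": 35}

    def publish(topic, payload):
        print("publish to {}: {}".format(topic, payload))  # stand-in for the IoT Core client

    while True:
        reading = read_sensors()
        reading["timestamp"] = int(time.time())
        publish("/devices/taxi-001/events", json.dumps(reading))
        time.sleep(60)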

Next, the data is processed by Google Dataflow and turned into a BigQuery table. Users can then visualize the collected measurements. And while James uses Google Maps to analyse his data, there are many tools online that will allow you to organise and study your figures depending on what final result you’re hoping to achieve.

A heat map of James' local area showing air quality

James hopped in a taxi and took his monitor on the road, collecting results throughout the journey

James has provided the complete build process, including all tech ingredients and code, on his Hackster.io project page, and urges makers to create their own air quality monitor for their local area. He also plans on building upon the existing design by adding a 12V power hookup for connecting to the taxi, functioning lights within the sign, and companion apps for drivers.

Sensing the world around you

We’ve seen a wide variety of Raspberry Pi projects using sensors to track the world around us, such as Kasia Molga’s Human Sensor costume series, which reacts to air pollution by lighting up, and Clodagh O’Mahony’s Social Interaction Dress, which she created to judge how conversation and physical human interaction can be scored and studied.

Human Sensor

Kasia Molga’s Human Sensor — a collection of hi-tech costumes that react to air pollution within the wearer’s environment.

Many people also build their own Pi-powered weather stations, or use the Raspberry Pi Oracle Weather Station, to measure and record conditions in their towns and cities from the roofs of schools, offices, and homes.

Have you incorporated sensors into your Raspberry Pi projects? Share your builds in the comments below or via social media by tagging us.

The post Using taxis to monitor air quality in Peru appeared first on Raspberry Pi.