Tag Archives: Opinions

Privacy expectations and the connected home

Post Syndicated from Matthew Garrett original https://mjg59.dreamwidth.org/50229.html

Traditionally, devices that were tied to logins tended to indicate that in some way – turn on someone’s xbox and it’ll show you their account name, run Netflix and it’ll ask which profile you want to use. The increasing prevalence of smart devices in the home changes that, in ways that may not be immediately obvious to the majority of people. You can configure a Philips Hue with wall-mounted dimmers, meaning that someone unfamiliar with the system may not recognise that it’s a smart lighting system at all. Without any actively malicious intent, you end up with a situation where the account holder is able to infer whether someone is home without that person necessarily having any idea that that’s possible. A visitor who uses an Amazon Echo is not necessarily going to know that it’s tied to somebody’s Amazon account, and even if they do they may not know that the log (and recorded audio!) of all interactions is available to the account holder. And someone grabbing an egg out of your fridge is almost certainly not going to think that your smart egg tray will trigger an immediate notification on the account owner’s phone that they need to buy new eggs.

Things get even more complicated when there’s multiple account support. Google Home supports multiple users on a single device, using voice recognition to determine which queries should be associated with which account. But the account that was used to initially configure the device remains as the fallback, with unrecognised voices ending up logged to it. If a voice is misidentified, the query may end up being logged to an unexpected account.

There’s some interesting questions about consent and expectations of privacy here. If someone sets up a smart device in their home then at some point they’ll agree to the manufacturer’s privacy policy. But if someone else makes use of the system (by pressing a lightswitch, making a spoken query or, uh, picking up an egg), have they consented? Who has the social obligation to explain to them that the information they’re producing may be stored elsewhere and visible to someone else? If I use an Echo in a hotel room, who has access to the Amazon account it’s associated with? How do you explain to a teenager that there’s a chance that when they asked their Home for contact details for an abortion clinic, it ended up in their parent’s activity log? Who’s going to be the first person divorced for claiming that they were vegan but having been the only person home when an egg was taken out of the fridge?

To be clear, I’m not arguing against the design choices involved in the implementation of these devices. In many cases it’s hard to see how the desired functionality could be implemented without this sort of issue arising. But we’re gradually shifting to a place where the data we generate is not only available to corporations who probably don’t care about us as individuals, it’s also becoming available to people who own the more private spaces we inhabit. We have social norms against bugging our houseguests, but we have no social norms that require us to explain to them that there’ll be a record of every light that they turn on or off. This feels like it’s going to end badly.

(Thanks to Nikki Everett for conversations that inspired this post)

(Disclaimer: while I work for Google, I am not involved in any of the products or teams described in this post and my opinions are my own rather than those of my employer)


Fix Your Crawler

Post Syndicated from Bozho original https://techblog.bozho.net/fix-your-crawler/

Every now and then I open the admin panel of my blog hosting and ban a few IPs (after I’ve tried messaging their abuse email, if I find one). It is always IPs that are generating tons of requests (and traffic) – most likely running some home-made crawler. In some cases the IPs belong to an actual service that captures and provides content, in other cases it’s just a scraper for unknown reasons.

I don’t want to ban IPs, especially because that same IP may be reassigned to a legitimate user (or network) in the future. But they are increasing my hosting usage, which in turn leads to the hosting provider suggesting an upgrade in the plan. And this is not about me, I’m just an example – tons of requests to millions of sites are … useless.

My advice (and plea) is this – please fix your crawlers. Or scrapers. Or whatever you prefer to call that thing that programmatically goes on websites and gets their content.

How? First, reuse an existing crawler. No need to make something new (unless there’s a very specific use-case). A good intro and comparison can be seen here.

Second, make your crawler “polite” (the “politeness” property in the article above). Here’s a good overview on how to be polite, including respect for robots.txt. Existing implementations most likely have politeness options, but you may have to configure them.
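
To make “politeness” concrete, here’s a minimal sketch in plain Java (java.net.http only) of what honoring robots.txt rules, a crawl delay and a descriptive User-Agent could look like. The robots.txt parsing below is deliberately naive (it ignores user-agent groups and wildcards) and all names and defaults are illustrative assumptions; in practice you’d configure the politeness options of an existing crawler or use a proper robots.txt parser.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

/** Illustrative "polite" fetcher: identifies itself, honors Disallow rules and Crawl-delay. */
public class PoliteFetcher {

    // Descriptive User-Agent so site owners can find and contact you
    private static final String USER_AGENT =
            "ExampleCrawler/1.0 (+https://example.com/crawler-info)";

    private final HttpClient client = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(10))
            .build();

    private final List<String> disallowedPrefixes = new ArrayList<>();
    private long crawlDelayMillis = 5_000; // conservative default between requests

    /** Fetch and (very naively) parse robots.txt. */
    public void loadRobotsTxt(String baseUrl) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/robots.txt"))
                .header("User-Agent", USER_AGENT)
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            return; // no robots.txt -> keep the conservative defaults
        }
        for (String line : response.body().split("\n")) {
            String trimmed = line.trim();
            if (trimmed.toLowerCase().startsWith("disallow:")) {
                String path = trimmed.substring("disallow:".length()).trim();
                if (!path.isEmpty()) {
                    disallowedPrefixes.add(path);
                }
            } else if (trimmed.toLowerCase().startsWith("crawl-delay:")) {
                crawlDelayMillis = Long.parseLong(
                        trimmed.substring("crawl-delay:".length()).trim()) * 1000;
            }
        }
    }

    /** Fetch a page only if robots.txt allows it, waiting the crawl delay first. */
    public String fetchIfAllowed(String baseUrl, String path) throws Exception {
        for (String prefix : disallowedPrefixes) {
            if (path.startsWith(prefix)) {
                return null; // respect Disallow
            }
        }
        Thread.sleep(crawlDelayMillis); // never hammer the site
        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + path))
                .header("User-Agent", USER_AGENT)
                .GET()
                .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```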

Here I’d suggest another option – set a dynamic crawl rate per website that depends on how often the content is updated. My blog updates 3 times a month – no need to crawl it more than once or twice a day. TechCrunch updates many times a day; it’s probably a good idea to crawl it way more often. I don’t have a formula, but you can come up with one that ends up crawling different sites with periods between 2 minutes and 1 day.
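
As an illustration of that idea (the numbers and names below are my own assumptions, not a prescribed formula), a per-site interval could be derived from the gaps between previously observed new items and clamped between two minutes and one day:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

/** Illustrative dynamic crawl scheduling: crawl roughly as often as the site publishes. */
public class CrawlScheduler {

    private static final Duration MIN_INTERVAL = Duration.ofMinutes(2);
    private static final Duration MAX_INTERVAL = Duration.ofDays(1);

    /**
     * Derive the next crawl interval from the timestamps of new items observed on
     * previous crawls (assumed sorted oldest-first). Crawl at about half the average
     * gap between new items, but never more often than every 2 minutes and at least
     * once a day.
     */
    public Duration nextInterval(List<Instant> recentItemTimestamps) {
        if (recentItemTimestamps.size() < 2) {
            return MAX_INTERVAL; // not enough history -> be conservative
        }
        Duration total = Duration.between(
                recentItemTimestamps.get(0),
                recentItemTimestamps.get(recentItemTimestamps.size() - 1));
        Duration averageGap = total.dividedBy(recentItemTimestamps.size() - 1);
        Duration candidate = averageGap.dividedBy(2);

        if (candidate.compareTo(MIN_INTERVAL) < 0) {
            return MIN_INTERVAL;
        }
        if (candidate.compareTo(MAX_INTERVAL) > 0) {
            return MAX_INTERVAL;
        }
        return candidate;
    }
}
```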

Third, don’t “scrape” the content if a better protocol is supported. Many content websites have RSS – use that instead of the HTML of the page. If not, make use of sitemaps. If the WebSub protocol gains traction, you can avoid the crawling/scraping entirely and get notified on new content.
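
Most sites that offer RSS or Atom advertise it in the page head via standard feed autodiscovery, so a scraper can check for a feed before falling back to HTML. A small sketch using jsoup (the class itself is made up for illustration):

```java
import java.io.IOException;
import java.util.Optional;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

/** Illustrative feed autodiscovery: prefer RSS/Atom over scraping the HTML. */
public class FeedDiscovery {

    /** Returns the advertised RSS/Atom feed URL of a page, if any. */
    public Optional<String> discoverFeed(String pageUrl) throws IOException {
        Document doc = Jsoup.connect(pageUrl)
                .userAgent("ExampleCrawler/1.0 (+https://example.com/crawler-info)")
                .get();
        // <link rel="alternate" type="application/rss+xml" href="..."> (or atom+xml)
        Element feedLink = doc.selectFirst(
                "link[rel=alternate][type=application/rss+xml], "
              + "link[rel=alternate][type=application/atom+xml]");
        return Optional.ofNullable(feedLink).map(link -> link.attr("abs:href"));
    }
}
```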

Finally, make sure your crawler/scraper is identifiable via its User-Agent header. You can supply your service name or web address in it to make it easier for website owners to find you and complain in case you’ve misconfigured something.

I guess it also makes sense to check whether a service like import.io, ScrapingHub, WrapAPI or GetData fits your use case, instead of reinventing the wheel.

No matter what your use case or approach is, please make sure you don’t put unnecessary pressure on others’ websites.


Popular Danish Torrent Tracker Shuts Down After Hack

Post Syndicated from Ernesto original https://torrentfreak.com/popular-danish-torrent-tracker-shuts-down-after-hack-180102/

Torrent sites come in all shapes and sizes, but generally speaking there’s a clear divide between private and public sites.

The latter include the likes of The Pirate Bay and are open to anyone, while private trackers require an account to gain access.

Because many of these closed communities also enforce ratio requirements and other rules, they can log quite a bit of data. This generally isn’t the type of information users would like to see out on the streets, but such leaks are no rarity.

In recent days the Danish torrent tracker Hounddawgs.org also ran into some issues. Out of the blue, the site’s 40,000 users received a message signed by ‘Anonymous’ stating that it had been hacked.

Hacked?

The hacker also noted that everyone had been promoted to “staff” but soon after the site went dark. It eventually returned with a message from the operator, accusing another private torrent site of ‘messing around.’

“We’re sorry, but due to server maintenance, we’ll be offline for a little while. Some kiddies from another Danish torrent site don’t like to share users so they found a way to mess a little with the site,” the notice read.

“No harm has been done, and we will be back up as soon as we have found the error and corrected it.”

The message seemed reassuring, but at the same time, a partially redacted file with usernames, emails, and IP-addresses started to circulate.

As a result, the rumor mill went into full swing, and people reported that other accounts where they had used the same information were being compromised. The Hounddawgs operators maintained, however, that allegations of a full database breach were false.

The site’s staff posted a new message refuting the hacking claims. At the same time, they also announced that the site would remain offline indefinitely.

Hounddawgs’ operators say they started the site as a counter-movement to the “tyranny” of other Danish trackers. However, these other trackers allegedly didn’t like the newcomer and fought back, up to a point where Hounddawgs decided to throw in the towel.

Hounddawgs’ message (translated)

Private tracker feuds are by no means new. They’re as old as private trackers. And while there are plenty of opinions, since most of it takes place behind closed doors, the truth is often hard to find.

After the site’s operators said their goodbyes, pointing users to the new infinity-t.org tracker, the alleged hacker responded once more, this time posting over 20 gigabytes of data said to be the full database and the site’s code.

“But how is that possible? The superheroes of the world, the people behind Hounddawgs, clearly stated on their frontpage that no database was leaked, so how could I possibly have it?” the hacker posted.

“They are lying! Like they have done for years, they don’t care one bit for their users,” the message adds, noting that the server was minimally secured.

The leaked files do indeed include site code and a database, which several people claim to be legitimate. The operators of Hounddawgs also changed their earlier tune. In a message posted on the site yesterday, they apologized for not dealing with the security issues.

“It has NEVER been our intention to hurt any of you, and we were very happy with all the good users we had. We chose to close the site as a precaution, but unfortunately too late,” they write.

The site was running on the Gazelle script which logs quite a bit of data by default, including users’ IP-addresses. With this info out in the open, many users fear that anti-piracy groups may use the logs to identify individual pirates.

While it’s unlikely that copyright holders will pursue casual sharers based on leaked files, it’s never a pleasant thought to have one’s IP-addresses and other information leaked.

Although the local anti-piracy group, RettighedsAlliancen, might not spring into action right away, it won’t mind seeing the second largest tracker in Denmark go offline.


GDPR – A Practical Guide For Developers

Post Syndicated from Bozho original https://techblog.bozho.net/gdpr-practical-guide-developers/

You’ve probably heard about GDPR. The new European data protection regulation that applies practically to everyone. Especially if you are working in a big company, it’s most likely that there’s already a process for getting your systems in compliance with the regulation.

The regulation is basically a law that must be followed in all European countries (but also applies to non-EU companies that have users in the EU). In this particular case, it applies to companies that are not registered in Europe, but have European customers. So that’s most companies. I will not go into yet another “12 facts about GDPR” or “7 myths about GDPR” post/whitepaper, as those are often aimed at managers or legal people. Instead, I’ll focus on what GDPR means for developers.

Why am I qualified to do that? A few reasons – I was an advisor to the deputy prime minister of an EU country, and because of that I’ve both been exposed to legislation and written some myself. I’m familiar with the “legalese” and how the regulatory framework operates in general. I’m also a privacy advocate and I’ve been writing about GDPR-related stuff in the past, i.e. “before it was cool” (protecting sensitive data, the right to be forgotten). And finally, I’m currently working on a project that (among other things) aims to help with covering some GDPR aspects.

I’ll try to be a bit more comprehensive this time and cover as many aspects of the regulation that concern developers as I can. And while developers will mostly be concerned about how the systems they are working on have to change, it’s not unlikely that a less informed manager storms in in late spring, realizing GDPR is going to be in force tomorrow, asking “what should we do to get our system/website compliant”.

The rights of the user/client (referred to as “data subject” in the regulation) that I think are relevant for developers are: the right to erasure (the right to be forgotten/deleted from the system), the right to restriction of processing (you still keep the data, but mark it as “restricted” and don’t touch it without further consent by the user), the right to rectification (the ability to get personal data fixed), the right to be informed (getting human-readable information, rather than long terms and conditions), the right of access (the user should be able to see all the data you have about them), and the right to data portability (the ability to export one’s data – the user should be able to get a machine-readable dump of their data).

Additionally, the relevant basic principles are: data minimization (one should not collect more data than necessary), integrity and confidentiality (all security measures to protect data that you can think of + measures to guarantee that the data has not been inappropriately modified).

Even further, the regulation requires certain processes to be in place within an organization (of more than 250 employees or if a significant amount of data is processed), and those include keeping a record of all types of processing activities carried out, including transfers to processors (3rd parties), which includes cloud service providers. None of the other requirements of the regulation have an exception depending on the organization size, so “I’m small, GDPR does not concern me” is a myth.

It is important to know what “personal data” is. Basically, it’s every piece of data that can be used to uniquely identify a person or data that is about an already identified person. It’s data that the user has explicitly provided, but also data that you have collected about them from either 3rd parties or based on their activities on the site (what they’ve been looking at, what they’ve purchased, etc.)

Having said that, I’ll list a number of features that will have to be implemented and some hints on how to do that, followed by some do’s and don’ts.

  • “Forget me” – you should have a method that takes a userId and deletes all personal data about that user (in case they have been collected on the basis of consent, and not due to contract enforcement or legal obligation). It is actually useful for integration tests to have that feature (to clean up after the test), but it may be hard to implement depending on the data model. In a regular data model, deleting a record may be easy, but some foreign keys may be violated. That means you have two options – either make sure you allow nullable foreign keys (for example an order usually has a reference to the user that made it, but when the user requests his data be deleted, you can set the userId to null), or make sure you delete all related data (e.g. via cascades). This may not be desirable, e.g. if the order is used to track available quantities or for accounting purposes. It’s a bit trickier for event-sourcing data models, or in extreme cases, ones that include some sort of blockchain/hash chain/tamper-evident data structure. With event sourcing you should be able to remove a past event and re-generate intermediate snapshots. For blockchain-like structures – be careful what you put in there and avoid putting personal data of users. There is an option to use a chameleon hash function, but that’s suboptimal. Overall, you must constantly think of how you can delete the personal data. And “our data model doesn’t allow it” isn’t an excuse. A minimal sketch of such a deletion method is shown after this list.
  • Notify 3rd parties for erasure – deleting things from your system may be one thing, but you are also obligated to inform all third parties that you have pushed that data to. So if you have sent personal data to, say, Salesforce, Hubspot, twitter, or any cloud service provider, you should call an API of theirs that allows for the deletion of personal data. If you are such a provider, obviously, your “forget me” endpoint should be exposed. Calling the 3rd party APIs to remove data is not the full story, though. You also have to make sure the information does not appear in search results. Now, that’s tricky, as Google doesn’t have an API for removal, only a manual process. Fortunately, it’s only about public profile pages that are crawlable by Google (and other search engines, okay…), but you still have to take measures. Ideally, you should make the personal data page return a 404 HTTP status, so that it can be removed.
  • Restrict processing – in your admin panel where there’s a list of users, there should be a button “restrict processing”. The user settings page should also have that button. When clicked (after reading the appropriate information), it should mark the profile as restricted. That means it should no longer be visible to the backoffice staff, or publicly. You can implement that with a simple “restricted” flag in the users table and a few if-clauses here and there.
  • Export data – there should be another button – “export data”. When clicked, the user should receive all the data that you hold about them. What exactly is that data – depends on the particular use case. Usually it’s at least the data that you delete with the “forget me” functionality, but may include additional data (e.g. the orders the user has made may not be deleted, but should be included in the dump). The structure of the dump is not strictly defined, but my recommendation would be to reuse schema.org definitions as much as possible, for either JSON or XML. If the data is simple enough, a CSV/XLS export would also be fine. Sometimes data export can take a long time, so the button can trigger a background process, which would then notify the user via email when his data is ready (Twitter, for example, does that already – you can request all your tweets and you get them after a while).
  • Allow users to edit their profile – this seems an obvious rule, but it isn’t always followed. Users must be able to fix all data about them, including data that you have collected from other sources (e.g. using a “login with Facebook” you may have fetched their name and address). Rule of thumb – all the fields in your “users” table should be editable via the UI. Technically, rectification can be done via a manual support process, but that’s normally more expensive for a business than just having the form to do it. There is one other scenario, however, when you’ve obtained the data from other sources (i.e. the user hasn’t provided their details to you directly). In that case there should still be a page where they can identify somehow (via email and/or SMS confirmation) and get access to the data about them.
  • Consent checkboxes – this is in my opinion the biggest change that the regulation brings. “I accept the terms and conditions” would no longer be sufficient to claim that the user has given their consent for processing their data. So, for each particular processing activity there should be a separate checkbox on the registration (or user profile) screen. You should keep these consent checkboxes in separate columns in the database, and let the users withdraw their consent (by unchecking these checkboxes from their profile page – see the previous point). Ideally, these checkboxes should come directly from the register of processing activities (if you keep one). Note that the checkboxes should not be preselected, as this does not count as “consent”.
  • Re-request consent – if the consent users have given was not clear (e.g. if they simply agreed to terms & conditions), you’d have to re-obtain that consent. So prepare a functionality for mass-emailing your users to ask them to go to their profile page and check all the checkboxes for the personal data processing activities that you have.
  • “See all my data” – this is very similar to the “Export” button, except data should be displayed in the regular UI of the application rather than an XML/JSON format. For example, Google Maps shows you your location history – all the places that you’ve been to. It is a good implementation of the right to access. (Though Google is very far from perfect when privacy is concerned)
  • Age checks – you should ask for the user’s age, and if the user is a child (below 16), you should ask for parent permission. There’s no clear way to do that, but my suggestion is to introduce a flow, where the child should specify the email of a parent, who can then confirm. Obviously, children will just cheat with their birthdate, or provide a fake parent email, but you will most likely have done your job according to the regulation (this is one of the “wishful thinking” aspects of the regulation).
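
As a rough sketch of the “forget me” bullet above – the table and column names are made up, and whether you null out references or cascade-delete depends on your data model and on what you must legally retain – a Spring-style service could look something like this:

```java
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

/**
 * Illustrative "right to erasure" implementation. Assumes consent-based data only;
 * data kept for legal/contractual reasons (e.g. invoices) is handled separately.
 */
@Service
public class ErasureService {

    private final JdbcTemplate jdbcTemplate;

    public ErasureService(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    @Transactional
    public void forgetUser(long userId) {
        // Detach records we must keep (e.g. orders used for accounting) via a nullable FK
        jdbcTemplate.update("UPDATE orders SET user_id = NULL WHERE user_id = ?", userId);

        // Delete data that exists only because of the user's consent
        jdbcTemplate.update("DELETE FROM user_preferences WHERE user_id = ?", userId);
        jdbcTemplate.update("DELETE FROM activity_log WHERE user_id = ?", userId);

        // Finally remove the personal data itself
        jdbcTemplate.update("DELETE FROM users WHERE id = ?", userId);

        // Not shown: notifying 3rd-party processors (next bullet) that this user is erased
    }
}
```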

Now some “do’s”, which are mostly about the technical measures needed to protect personal data. They may be more “ops” than “dev”, but often the application also has to be extended to support them. I’ve listed most of what I could think of in a previous post.

  • Encrypt the data in transit. That means that communication between your application layer and your database (or your message queue, or whatever component you have) should be over TLS. The certificates could be self-signed (and possibly pinned), or you could have an internal CA. Different databases have different configurations, just google “X encrypted connections”. Some databases need gossiping among the nodes – that should also be configured to use encryption.
  • Encrypt the data at rest – this again depends on the database (some offer table-level encryption), but can also be done on machine-level. E.g. using LUKS. The private key can be stored in your infrastructure, or in some cloud service like AWS KMS.
  • Encrypt your backups – kind of obvious
  • Implement pseudonymisation – the most obvious use-case is when you want to use production data for the test/staging servers. You should change the personal data to some “pseudonym”, so that the people cannot be identified. When you push data for machine learning purposes (to third parties or not), you can also do that. Technically, that could mean that your User object can have a “pseudonymize” method which applies hash+salt/bcrypt/PBKDF2 for some of the data that can be used to identify a person (a minimal sketch is shown after this list).
  • Protect data integrity – this is a very broad thing, and could simply mean “have authentication mechanisms for modifying data”. But you can do something more, even as simple as a checksum, or a more complicated solution (like the one I’m working on). It depends on the stakes, on the way data is accessed, on the particular system, etc. The checksum can be in the form of a hash of all the data in a given database record, which should be updated each time the record is updated through the application. It isn’t a strong guarantee, but it is at least something.
  • Have your GDPR register of processing activities in something other than Excel – Article 30 says that you should keep a record of all the types of activities that you use personal data for. That sounds like bureaucracy, but it may be useful – you will be able to link certain aspects of your application with that register (e.g. the consent checkboxes, or your audit trail records). It wouldn’t take much time to implement a simple register, but the business requirements for that should come from whoever is responsible for the GDPR compliance. But you can advise them that having it in Excel won’t make it easy for you as a developer (imagine having to fetch the excel file internally, so that you can parse it and implement a feature). Such a register could be a microservice/small application deployed separately in your infrastructure.
  • Log access to personal data – every read operation on a personal data record should be logged, so that you know who accessed what and for what purpose
  • Register all API consumers – you shouldn’t allow anonymous API access to personal data. I’d say you should request the organization name and contact person for each API user upon registration, and add those to the data processing register. Note: some have treated article 30 as a requirement to keep an audit log. I don’t think it is saying that – instead it requires 250+ companies to keep a register of the types of processing activities (i.e. what you use the data for). There are other articles in the regulation that imply that keeping an audit log is a best practice (for protecting the integrity of the data as well as to make sure it hasn’t been processed without a valid reason)
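
To make the pseudonymisation bullet above a bit more concrete, here is a minimal sketch using PBKDF2 from the standard javax.crypto API. The parameters and field choices are illustrative assumptions; the point is only that identifying values are replaced by a salted one-way derivation before the data leaves production:

```java
import java.security.spec.KeySpec;
import java.util.Base64;

import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

/** Illustrative pseudonymisation of identifying fields before using data outside production. */
public class UserPseudonymizer {

    private static final int ITERATIONS = 65_536;
    private static final int KEY_LENGTH_BITS = 256;

    private final byte[] salt; // kept secret and separate from the pseudonymised data

    public UserPseudonymizer(byte[] salt) {
        this.salt = salt;
    }

    /** Replaces a value (e.g. an email or a name) with a salted PBKDF2 derivation of it. */
    public String pseudonymize(String value) {
        try {
            KeySpec spec = new PBEKeySpec(value.toCharArray(), salt, ITERATIONS, KEY_LENGTH_BITS);
            SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
            byte[] derived = factory.generateSecret(spec).getEncoded();
            return Base64.getEncoder().encodeToString(derived);
        } catch (Exception e) {
            throw new IllegalStateException("Pseudonymisation failed", e);
        }
    }
}
```

A User object’s “pseudonymize” method would then simply run the name, email and similar identifying fields through something like this before the record is copied to a staging database or handed to a machine-learning pipeline.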

Finally, some “don’ts”.

  • Don’t use data for purposes that the user hasn’t agreed with – that’s supposed to be the spirit of the regulation. If you want to expose a new API to a new type of clients, or you want to use the data for some machine learning, or you decide to add ads to your site based on users’ behaviour, or sell your database to a 3rd party – think twice. I would imagine your register of processing activities could have a button to send notification emails to users to ask them for permission when a new processing activity is added (or if you use a 3rd party register, it should probably give you an API). So upon adding a new processing activity (and adding that to your register), mass email all users from whom you’d like consent.
  • Don’t log personal data – getting rid of the personal data from log files (especially if they are shipped to a 3rd party service) can be tedious or even impossible. So log just identifiers if needed. And make sure old log files are cleaned up, just in case.
  • Don’t put fields on the registration/profile form that you don’t need – it’s always tempting to just throw in as many fields as the usability person/designer agrees on, but unless you absolutely need the data for delivering your service, you shouldn’t collect it. Names you should probably always collect, but unless you are delivering something, a home address or phone is unnecessary.
  • Don’t assume 3rd parties are compliant – you are responsible if there’s a data breach in one of the 3rd parties (e.g. “processors”) to which you send personal data. So before you send data via an API to another service, make sure they have at least a basic level of data protection. If they don’t, raise a flag with management.
  • Don’t assume having ISO XXX makes you compliant – information security standards and even personal data standards are a good start and they will probably cover 70% of what the regulation requires, but they are not sufficient – most of the things listed above are not covered in any of those standards.

Overall, the purpose of the regulation is to make you take conscious decisions when processing personal data. It imposes best practices in a legal way. If you follow the above advice and design your data model, storage, data flow, API calls with data protection in mind, then you shouldn’t worry about the huge fines that the regulation prescribes – they are for extreme cases, like Equifax for example. Regulators (data protection authorities) will most likely have some checklists into which you’d have to somehow fit, but if you follow best practices, that shouldn’t be an issue.

I think all of the above features can be implemented in a few weeks by a small team. Be suspicious when a big vendor offers you a generic plug-and-play “GDPR compliance” solution. GDPR is not just about the technical aspects listed above – it does have organizational/process implications. But also be suspicious if a consultant claims GDPR is complicated. It’s not – it relies on a few basic principles that are in fact best practices anyway. Just don’t ignore them.


What’s the Best Solution for Managing Digital Photos and Videos?

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/discovering-best-solution-for-photo-video-backup/

Digital Asset Management (DAM)

If you have spent any time, as we have, talking to photographers and videographers about how they back up and archive their digital photos and videos, then you know that there’s no one answer or solution that users have discovered to meet their needs.

Based on what we’ve heard, visual media artists are still searching for the best combination of software, hardware, and cloud storage to preserve their media, and to be able to search, retrieve, and reuse that media as easily as possible.

Yes, there are a number of solutions out there, and some users have created combinations of hardware, software, and services to meet their needs, but we have met few who claim to be satisfied with their solution for digital asset management (DAM), or expect that they will be using the same solution in just a year or two.

We’d like to open a dialog with professionals and serious amateurs to learn more about what you’re doing, what you’d like to do, and how Backblaze might fit into that solution.

We have a bit of cred in this field, as we currently have hundreds of petabytes of digital media files in our data centers from users of Backblaze Backup and Backblaze B2 Cloud Storage. We want to make our cloud services as useful as possible for photographers and videographers.

Tell Us Both Your Current Solution and Your Dream Solution

To get started, we’d love to hear from you about how you’re managing your photos and videos. Whether you’re an amateur or a professional, your experiences are valuable and will help us understand how to provide the best cloud component of a digital asset management solution.

Here are some questions to consider:

  • Are you using direct-attached drives, NAS (Network-Attached Storage), or offline storage for your media?
  • Do you use the cloud for media you’re actively working on?
  • Do you back up or archive to the cloud?
  • Do you have a catalog or record of the media that you’ve archived that you use to search and retrieve it?
  • What’s different about how you work in the field (or traveling) versus how you work in a studio (or at home)?
  • What software and/or hardware currently works for you?
  • What’s the biggest impediment to working in the way you’d really like to?
  • How could the cloud work better for you?

Please Contribute Your Ideas

To contribute, please answer the following two questions in the comments below or send an email to [email protected]. Please comment or email your response by December 22, 2017.

  1. How are you currently backing up your digital photos, video files, and/or file libraries/catalogs? Do you have a backup system that uses attached drives, a local network, the cloud, or offline storage media? Does it work well for you?
  2. Imagine your ideal digital asset backup setup. What would it look like? Don’t be constrained by current products, technologies, brands, or solutions. Invent a technology or product if you wish. Describe an ideal system that would work the way you want it to.

We know you have opinions about managing photos and videos. Bring them on!

We’re soliciting answers far and wide from amateurs and experts, weekend video makers and well-known professional photographers. We have a few amateur and professional photographers and videographers here at Backblaze, and they are contributing their comments, as well.

Once we have gathered all the responses, we’ll write a post on what we learned about how people are currently working and what they would do if anything were possible. Look for that post after the beginning of the year.

Don’t Miss Future Posts on Media Management

We don’t want you to miss our future posts on photography, videography, and digital asset management. To receive email notices of blog updates (and no spam, we promise), enter your email address above using the Join button at the top of the page.

Come Back on Thursday for our Photography Post (and a Special Giveaway, too)

This coming Thursday we’ll have a blog post about the different ways that photographers and videographers are currently managing their digital media assets.

Plus, you’ll have the chance to win a valuable hardware/software combination for digital media management that I am sure you will appreciate. (You’ll have to wait until Thursday to find out what the prize is, but it has a total value of over $700.)

Past Posts on Photography, Videography, and Digital Asset Management

We’ve written a number of blog posts about photos, videos, and managing digital assets. We’ve posted links to some of them below.

Four Tips To Help Photographers and Videographers Get The Most From B2

How to Back Up Your Mac’s Photos Library

How To Back Up Your Flickr Library

Getting Video Archives Out of Your Closet

B2 Cloud Storage Roundup

Backing Up Photos While Traveling

Backing up photos while traveling – feedback

Should I Use an External Drive for Backup?

How to Connect your Synology NAS to B2


The Problem Solver

Post Syndicated from Bozho original https://techblog.bozho.net/the-problem-solver/

I’ll start this post with a quote:

Good developers are good problem solvers. They turn each task into a series of problems they have to solve. They don’t necessarily know how to solve them in advance, but they have their toolbox of approaches, shortcuts and other tricks that lead to the solution. I have outlined one such set of steps for identifying problems, but you can’t easily formalize the problem-solving approach.

But is turning a task into a set of problems really a good idea? Programming can be seen as a creative exercise, rather than a problem solving one – you think, you ponder, you deliberate, then you make something out of nothing and it’s beautiful, because it works. And sometimes programming is that, but that is almost always interrupted by a series of problems that stop you from getting the task completed. That process is best visualized with the following short video:

That’s because most things in software break. They either break because there are unknowns, or because of a lot of unsuspected edge cases, or because the abstraction that we use leaks, or because the tools that we use are poorly documented or have poor APIs/UIs, or simply because of bugs. Or in many cases – all of the above.

So inevitably, we have to learn to solve problems. And solving them quickly and properly is in fact, one might argue, the most important skill when doing software. One should learn, though, not to just patch things up with duct tape, but to come up with the best possible solution with the constraints at hand. The library that you are using is missing a feature you really need? Ideally, you should propose the feature and wait for it to be implemented. Too often that’s not an option. Quick and dirty fix – copy-paste a bunch of code. Proper, elegant solution – use design patterns to adapt the library to your needs, or come up with a generic (but not time-wasting) way of patching libraries. Or there’s a memory leak? Just launch a bigger instance? No. Spend a week live-profiling the application? Too slow. Figure out how to simulate the leaking scenario in a local setup and fix it in a day? Sounds ideal, but it’s not trivial.

Sometimes there are not too many problems and development goes smoothly. Then the good problem solver identifies problems proactively – this implementation is slow, this is too memory-consuming, this is overcomplicated and should be refactored. And these can (and should) be small steps that don’t interfere with the development process, rather than leaving you two days deep in refactoring for no apparent reason. The skill is to know the limit between gradual improvement and spotting problems before they occur, and wasting time on problems that don’t exist or that you’ll never hit.

And finally, solving problems is not a solo exercise. In fact I think one of the most important aspects of problem solving is answering questions. If you want to be a good developer, you have to answer the questions of others. Your colleagues in most cases, but sometimes – total strangers on Stackoverflow. I myself found that answering Stackoverflow questions actually turned me into a better problem solver – I could solve others’ problems in a limited time, with limited information. So in many cases I was the go-to person on the team when a problem arose, even though I wasn’t the most senior or the most familiar with the project. But one could reasonably expect that I’d be able to figure out a proper solution quickly. And then the loop goes on – you answer more questions and get better at problem solving, and so on, and so forth. By the way, we shouldn’t assume we are good unless we are able to solve others’ problems in addition to ours.

Problem-solving is a transferable skill. We might not be developers forever, but our approach to problems, the tenacity in fixing them, and the determination to get things done properly, is useful in many contexts. You could, in fact, view each task, not just programming ones, as a problem-solving exercise. And having the confidence that you can fix it, even though you have never encountered it before, is often priceless.

What’s my ultimate point? We should see ourselves as problem solvers and constantly improve our problem solving toolbox. Which, among other things, includes helping others. Otherwise we are tied to our knowledge of a particular technology or stack, and that’s frankly boring.


How to read newspapers

Post Syndicated from Robert Graham original http://blog.erratasec.com/2017/11/how-to-read-newspapers.html

News articles don’t contain the information you think. Instead, they are written according to a formula, and that formula is as much about distorting/hiding information as it is about revealing it.

A good example is the following. I claimed hate-crimes aren’t increasing. The tweet below tries to disprove me, by citing a news article that claims the opposite:

But the data behind this article tells a very different story than the words.

Every November, the FBI releases its hate-crime statistics for the previous year. They’ve been doing this every year for a long time. When they do so, various news organizations grab the data and write a quick story around it.

By “story” I mean a story. Raw numbers don’t interest people, so the writer instead has to wrap them in a narrative that does interest people. That’s what the writer has done in the above story, leading with the fact that hate crimes have increased.

But is this increase meaningful? What do the numbers actually say?

To answer this, I went to the FBI’s website, the source of this data, grabbed the numbers for the last 20 years, and graphed them in Excel, producing the following graph:

As you can see, there is no significant rise in hate crimes. Indeed, the latest numbers are about 20% below the average for the last two decades, despite a tiny increase in the last couple of years. Statistically/scientifically, there is no change, but you’ll never read that in a news article, because it’s boring and readers won’t pay attention. You’ll only get a “news story” that weaves a narrative that interests the reader.

So back to the original tweet exchange. The person used the news story to disprove my claim, but going to the underlying data, it only supports my claim that hate crimes are going down, not up – the small increases of the past couple of years are insignificant compared to the larger decreases of the last two decades.

So that’s the point of this post: news stories are deceptive. You have to double-check the data they are based upon, pay less attention to the narrative they weave, and even less attention to the title designed to grab your attention.

Anyway, as a side note, I’d like to apologize for being human. The snark/sarcasm of the tweet above gives me extra pleasure in proving them wrong :).

I Still Prefer Eclipse Over IntelliJ IDEA

Post Syndicated from Bozho original https://techblog.bozho.net/still-prefer-eclipse-intellij-idea/

Over the years I’ve observed an inevitable shift from Eclipse to IntelliJ IDEA. Last year they were almost equal in usage, and I have the feeling things are swaying even more towards IDEA.

IDEA is like the iPhone of IDEs – its users tell you that “you will feel how much better it is once you get used to it”, “are you STILL using Eclipse??”, “IDEA is so much better, I thought everyone has switched”, etc.

I’ve been using mostly Eclipse for the past 12 years, but in some cases I did use IDEA – when I was writing Scala, when I was writing Android, and most recently – when Eclipse failed to be ready for the Java 9 release, so after half a day of trying to get it working, I just switched to IDEA until Eclipse finally gets a working Java 9 version (with Maven and the rest of the stuff).

But I will get back to Eclipse again, soon. And I still prefer it. Not just because of all the key combinations I’ve internalized (you can reuse those in IDEA), but because there are still things I find worse in IDEA. Of course, IDEA has so much more cool features like code improvement suggestions and actually working plugins for everything. But at least some of the problems I see have to do with the more basic development workflow and experience. And you can’t compensate for those with sugarcoating. So here they are:

  • Projects are not automatically built (by default), so you can end up with compilation errors that you don’t see until you open a non-compiling file or run a build. And turning the autobuild on makes my machine crawl. I know I need an upgrade, but that’s not the point – not having “build on change” was a huge surprise to me the first time I tried IDEA. I recently complained about that on twitter and it turns out “it’s a feature”. The rationale seems to be that if you use refactoring, that shouldn’t happen. Well, there are dozens of cases when it does happen. Refactoring by adding a method parameter, by changing the type of a parameter, by removing a parameter (where the IDE can’t infer which parameter is removed based on the types), by changing return types. Also, a change in maven/gradle dependencies may introduce compilation issues that you don’t get to see. This is not a reasonable default at all, and I think the performance issues are the only reason it’s still the default. I think this makes the experience much worse.
  • You can have only one project per screen. Maybe there are those small companies with greenfield projects where you only need one. But I’ve never been in a situation, where you don’t at least occasionally need a separate project. Be it an “experiments” one, a “tools” one, or whatever. And no, multi-module maven projects (which IDEA handles well) are not sufficient. So each time you need to step out of your main project, you launch another screen. Apart from the bad usability, it’s double the memory, double the fun.
  • Speaking of memory, it seems to be taking more memory than Eclipse. I don’t have representative benchmarks of that, and I know that my 8 GB RAM home machine is way too small for development nowadays, but still.
  • It feels less responsive and clunky. There is some minor delay that I can’t define well, but “I feel it”. I read somewhere that they were excessively repainting the screen elements, so that might be the explanation. Eclipse feels smoother (I know that’s not a proper argument, but I can’t be more precise)
  • Due to some extra cleverness, I have “unused methods” and “never assigned fields” all around the project. It uses spring, so these methods and fields are controller methods and autowired fields. Maybe some spring plugin would take care of that, but spring is not the only framework that uses reflection. Even getters and setters on POJOs get the unused warnings. What’s the problem with those warnings? That warnings are devalued. They don’t mean anything now. There isn’t a “yellow” indicator on the class either, so you don’t actually see the amount of warnings you have. Eclipse displays warnings better, and the false positives are much less.
  • The call hierarchy is slightly worse. But since that’s the most important IDE feature for me (alongside refactoring), it matters. It doesn’t give you the call hierarchy of default constructors that are not explicitly defined. Also, from what I’ve seen IDEA users don’t often use the call hierarchy feature. “Find usage” I think predates the call hierarchy, and is also much more visible through the UI, so some of the IDEA users don’t even know what a call hierarchy is. And repeatedly do “find usage”. That’s only partly the IDE’s fault.
  • No search in the output console. Come on, why do I have an IDE where I have to copy the output and paste it in a text editor in order to search? Now, to clarify, the console does have search. But when I run my (spring-boot) application, it outputs stuff in a panel at the bottom that is not the console and doesn’t have search.
  • CTRL+arrows by default jumps over whole words, and not camel cased words. This is configurable, but is yet another odd default. You almost always want to be able to traverse your variables word by word (in camel case), rather than skipping over the whole variable (method/class) name.
  • A few years ago when I used it for Scala, the project never actually compiled. But I guess that’s more Scala’s fault than of the IDE

Apart from the first two, the rest are not major issues, I agree. But they add up. Ultimately, it’s a matter of personal choice whether you can turn a blind eye to these issues. But I’m getting back to Eclipse again. At some point I will propose improvements in the IntelliJ IDEA backlog and will check it again in a few years, I guess.


Book Author Trolled Pirates With Fake Leak to Make a Point

Post Syndicated from Ernesto original https://torrentfreak.com/book-author-trolled-pirates-with-fake-leak-to-make-a-point-171104/

When it comes to how piracy affects sales, there are thousands of different opinions. This applies to music, movies, software and many other digital products, including ebooks.

When we interviewed Paulo Coelho nearly ten years ago, he pointed out how piracy helped him to sell more books. While a lot has changed since then, he still sees the benefits of piracy today.

However, for many other authors, piracy is a menace. They cringe at the sight of their book being shared online and believe that hurts their bottom line. This includes Maggie Stiefvater, who’s known for The Raven Cycle books, among others.

This week she responded to a tweet from a self-confessed pirate, stating that piracy got the box set of the Raven Cycle canceled. As is usual on social media, it quickly turned into a mess.

Instead of debating the controversial issue indefinitely in 140 character tweets, Stiefvater did what authors do best. She put her thoughts on paper. In a Tumblr post, she countered the belief that piracy doesn’t hurt authors and that pirates wouldn’t pay for a book anyway.

The story shared by Stiefvater isn’t hypothetical, it’s real-world experience. She had noticed that the third book in the Raven Cycle wasn’t doing as well as earlier editions. While this is not uncommon for a series, the sales drop was not equal across all formats, but mostly driven by a lack of eBook sales.

While her publisher wasn’t certain that piracy was to blame, Stiefvater was convinced it played an important role. After all, the interest in her book tours was growing and there was plenty of talk about the books online as well. So when the publisher said that the print run of her new book the Raven King would be cut in half compared to a previous release, she came up with a plan.

Instead of trying to take all pirated copies down following the new release, she created her own, with help from her brother. But one with a twist.

“It was impossible to take down every illegal pdf; I’d already seen that. So we were going to do the opposite. We created a pdf of the Raven King. It was the same length as the real book, but it was just the first four chapters over and over again,” Stiefvater writes.

“I knew we wouldn’t be able to hold the fort for long — real versions would slowly get passed around by hand through forum messaging — but I told my brother: I want to hold the fort for one week. Enough to prove a point. Enough to show everyone that this is no longer 2004. This is the smart phone generation, and a pirated book sometimes is a lost sale.”

And so it happened. When the book came out April last year, customized pirated copies were planted all over the Internet by the author’s brother. People were stumbling all over them, making it near impossible to find a real pirated copy.

“He uploaded dozens and dozens and dozens of these pdfs of The Raven King. You couldn’t throw a rock without hitting one of his pdfs. We sailed those epub seas with our own flag shredding the sky.”

This paid off. Many people could only find the “troll” copies and saw no other option than to buy the real deal.

“The effects were instant. The forums and sites exploded with bewildered activity. Fans asked if anyone had managed to find a link to a legit pdf. Dozens of posts appeared saying that since they hadn’t been able to find a pdf, they’d been forced to hit up Amazon and buy the book.”

As a result, the first print of the book sold out in two days. Stiefvater was on tour and at some stores she visited, the books were no longer available. The publisher had to print more and more until… the inevitable happened.

“Then the pdfs hit the forums and e-sales sagged and it was business as usual, but it didn’t matter: I’d proven the point. Piracy has consequences,” Stiefvater writes, summarizing the moral of her story.

While this is unlikely to change the minds of undeterred pirates, it might strike a chord with some people.

Of course Stiefvater’s anecdote is no better than Coelho’s, who argued the opposite in the past. Perhaps the real takeaway is that piracy doesn’t have any fixed effects and it certainly can’t be captured in one-liners either. It’s a complex puzzle of dozens of constantly changing factors, which will likely never be solved.

Maggie Stiefvater’s full Tumblr post is a recommended read and can be found here, or below.

http://maggie-stiefvater.tumblr.com/post/166952028861/ive-decided-to-tell-you-guys-a-story-about


Blockchain? It’s All Greek To Me…

Post Syndicated from Bozho original https://techblog.bozho.net/blockchain-its-all-greek-to-me/

The blockchain hype is huge, the ICO craze (“Coindike”) is generating millions if not billions of “funding” for businesses that claim to revolutionize basically anything.

I’ve been following all of that for a while. I got my first (and only) Bitcoin several years ago, I know how the technology works, I’ve implemented the data structure part, I’ve tried (with varying success) to install an Ethereum wallet since almost as soon as Ethereum appeared, and I’ve read and subscribed to newsletters about dozens of projects and new cryptocurrencies, including storj.io, siacoin, namecoin, etc. I would say I’m at least above average in terms of knowledge on how the cryptocurrencies, blockchain, smart contracts, EVM, proof-of-whatever operate. And I’ve voiced my concerns about the technology in general.

Now it’s rant time.

I’ve been reading whitepapers of various projects, I’ve been to various meetups and talks, I’ve been reading the professed future applications of the blockchain, and I have to admit – it’s all Greek to me. I have no clue what these people are talking about. And why would all of that make any sense. I still think I’m not clever enough to understand the upcoming revolution, but there’s also a cynical side of me that says “this is all a scam”.

Why “X on the blockchain” somehow makes it magical and superior to a good old centralized solution? No, spare me the cliches about “immutable ledger”, “lack of central authority” and the likes. These are the phrases that a person learns after reading literally one article about blockchain. Have you actually written anything apart from a complex-sounding whitepaper or a hello-world smart contract? Do you really know how the overlay network works, how the economic incentives behind that network work, how all the cryptography works? Maybe there are many, many people that indeed know that and they know it better than me and are thus able to imagine the business case behind “X on the blockchain”.

I can’t. I can’t see why it would be useful to abandon a centralized database that you can query in dozens of ways, test easily and scale trivially in favour of a clunky write-only, low-throughput, hard-to-debug privacy nightmare that is any public blockchain. And how do you imagine gaining a substantial userbase with an ecosystem where the Windows client for the 2nd most popular blockchain (Ethereum) has been so buggy that I (a software engineer) couldn’t get it to work and sync the whole chain? And why would building a website on top of that clunky, user-unfriendly database have any benefit over a centralized competitor?

Do we all believe that the huge datacenters with guaranteed power backups, regular hardware and network checks, regular backups and overall – guaranteed redundancy – will somehow be beaten by a few thousand machines hosting software that has the sole purpose of guaranteeing integrity? Bitcoin has 10 thousand nodes. Ethereum has 22 thousand nodes. And while these nodes are probably very well GPU-equipped, they aren’t supercomputers. Amazon’s AWS has a million servers. How’s that for a comparison? And why would anyone take seriously 22 thousand non-servers? Or even 220 thousand, if we believe in some inevitable growth.

Don’t get me wrong, the technology is really cool. The way tamper-evident data structures (hash chains) were combined with a consensus algorithm, an overlay network and a financial incentive is really awesome. When you add a distributed execution environment, it gets even cooler. But is it suitable for literally everything? I fail to see how.

I’m sure I’m missing something. The fact that many of those whitepapers sound increasingly like Greek to me might hint that I’m just a dumb developer and those enlightened people are really onto something huge. I guess time will tell.

But I happen to be living in a country that saw a transition to capitalism in the years of my childhood. And there were a lot of scams and ponzi schemes that people believed in. Because they didn’t know how capitalism works, how the market works. I’m seeing some similarities – we have no idea how the digital realm really works, and so a lot of scams are bound to appear, until we as a society learn the basics.

Until then – enjoy your ICO, enjoy your tokens, enjoy your big-player competitor with practically the same business model, only on a worse database.

And I hope that after the smoke of hype and fraud clears, we’ll be able to enjoy the true benefits of the blockchain innovation.


Of Course Atlus Hit RPCS3’s Patreon Page Over Persona 5

Post Syndicated from Andy original https://torrentfreak.com/of-course-atlus-hit-rpcs3s-patreon-page-over-persona-5-170927/

For the uninitiated, RPCS3 is an open-source Sony PlayStation 3 emulator for PC. This growing and brilliant piece of code was publicly released in 2012 and since then has been under constant development thanks to a decent-sized team of programmers and other contributors.

While all emulation has its challenges, emulating a relatively recent piece of hardware such as Playstation 3 is a massive undertaking. As a result, RPCS3 needs funding. This it achieves through its Patreon page, which currently receives support from 675 patrons to the tune of $3,000 per month.

There’s little doubt that there are plenty of people out there who want the project to succeed. Yesterday, however, things took a turn for the worse when RPCS3 attracted the negative attention of Atlus, the developer behind the utterly beautiful RPG, Persona 5.

According to the RPCS3 team, Atlus filed a DMCA takedown notice with Patreon requesting the removal of the entire RPCS3 page after the team promoted the fact that Persona 5 would be compatible with the under-development emulator.

“The PS3 emulator itself is not infringing on our copyrights and trademarks; however, no version of the P5 game should be playable on this platform; and [the RPCS3] developers are infringing on our IP by making such games playable,” Atlus told Patreon.

Fortunately for everyone involved, Patreon did not storm in and remove the entire page, not least since the page itself didn’t infringe on Atlus’ IP rights. However, Atlus was not happy with the response and attempted to negotiate with the fund-raising platform, noting that in order for Persona 5 to work, the user would have to circumvent the game’s DRM protections.

The RPCS3 team, on the other hand, believe they’re on solid ground, noting that where their main developers live, it is legal to make personal copies of legally purchased games. They concede it may not be legal for everyone, but in any event, that would be irrelevant to the DMCA notice filed against their Patreon page. Indeed, trying to take down an entire fundraiser with a DMCA notice was a significant overreach under the circumstances.

According to a statement from the team, ultimately a decision was taken to proceed with caution. In order to avoid a full takedown of their Patreon page, all mentions of Persona 5 were removed from both the fund-raiser and the main RPCS3 site yesterday.

The RPCS3 team noted that they had no idea why Atlus targeted their project, but an announcement from the developer later shone a little light on the issue.

“We believe that our fans best experience our titles (like Persona 5) on the actual platforms for which they are developed. We don’t want their first experiences to be framerate drops, or crashes, or other issues that can crop up in emulation that we have not personally overseen,” Atlus explained.

While some gamers expressed negative opinions over Atlus’ undoubtedly overbroad actions yesterday, it’s difficult to argue with the developer’s main point. Emulators can be beautiful things but there is no doubt that in many instances they don’t recreate the gaming experience perfectly. Indeed, in some cases when things don’t go to plan, the results can be pretty horrible.

That being said, for whatever reason Atlus has chosen not to release a PC version of this popular title so, as many hardcore emulator fans will tell you (this one included), that’s a bit of a red rag to a bull. The company suggests that it might remedy that situation in the future though, so maybe that’s some consolation.

In the meantime, there’s a significant backlash against Atlus and what it attempted to do to the RPCS3 project and its fund-raising efforts. Some people are threatening never to buy an Atlus game ever again, for example, and that’s their prerogative.

But really – is anyone truly surprised that Atlus reacted in the way it did?

While Persona 5 isn’t available on PC yet, this isn’t an out-of-print game from 1982 that’s about to disappear into the black hole of time because there’s no hardware to play it on. This is a game created for relatively current hardware (bang up to date if you include the PS4 version) that was released in April 2017 in the United States, just a handful of months ago.

As such, none of the usual ‘moral’ motivations for emulating games on other platforms exist for Persona 5 and for that reason alone, the decision to heavily mention it in RPCS3 fund-raising efforts was bound to backfire. It doesn’t matter whether emulation or dumping of ROMs is legal in some regions, any company can be expected to wade in when someone threatens their business model.

The stark reality is that when they do, entire projects can be put at risk. In this case, Patreon stepped in to save the day but it could’ve been a lot worse. Martyring the whole project for one game would’ve been a disaster for the team and the public. All that being said, Atlus is unlikely to come out of this on top.

“Whatever people may wish, there’s no way to stop any playable game from being executed on the emulator,” the RPCS3 team note.

“Blacklisting the game? RPCS3 is open-source, any attempt would easily be reversed. Attempting to take down the project? At the time of this post, this and many other games were already playable to their full extent, and again, RPCS3 is and will always be an open-source project.”

The bottom line here is that Atlus’ actions may have left a bit of a bad taste in the mouths of some gamers, but even the most hardcore emulator fan shouldn’t be surprised the company went for the throat on a game so fresh. That being said, there are lessons to be learned.

Atlus could’ve spoken quietly to RPCS3 first, but chose not to. RPCS3, on the other hand, will probably be a little bit more strategic with future game compatibility announcements, given what’s just happened. In the long term, that will help them, since it will ensure longevity for the project.

RPCS3 is needed, there’s no doubt about that, but its true value will only be felt when the PS3 has been consigned to history. At that point people will understand why it was worth all the effort – and the occasional hiccup.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Julia Reda MEP Likened to Nazi in Sweeping Anti-Pirate Rant

Post Syndicated from Andy original https://torrentfreak.com/julia-reda-mep-likened-to-nazi-in-sweeping-anti-pirate-rant-170926/

The debate over copyright and enforcement thereof is often polarized, with staunch supporters on one side, objectors firmly on the other, and never the twain shall meet.

As a result, there have been some heated battles over the years, with pro-copyright bodies accusing pirates of theft and pirates accusing pro-copyright bodies of monopolistic tendencies. While neither claim is particularly pleasant, they have become staples of this prolonged war of words and as such, many have become desensitized to their original impact.

This morning, however, musician and staunch pro-copyright activist David Lowery published an article which pours huge amounts of gas on the fire. The headline goes straight for the jugular, asking: Why is it Every Time We Turn Over a Pirate Rock White Nationalists, Nazi’s and Bigots Scurry Out?

Lowery’s opening gambit in his piece on The Trichordist is that one only has to scratch below the surface of the torrent and piracy world in order to find people aligned with the above-mentioned groups.

“Why is it every time we dig a little deeper into the pro-piracy and torrenting movement we find key figures associated with ‘white nationalists,’ Nazi memorabilia collectors, actual Nazis or other similar bigots? And why on earth do politicians, journalists and academics sing the praises of these people?” Lowery asks.

To prove his point, the Camper Van Beethoven musician digs up the fact that former Pirate Bay financier Carl Lundström had some fairly unsavory neo-fascist views. While this is not in doubt, Lowery is about ten years too late if he wants to tar The Pirate Bay with the extremist brush.

“It’s called guilt by association,” Pirate Bay co-founder Peter Sunde explained in 2007.

“One of our previous ISPs [owned by Lundström] (with clients like The Red Cross, Save the Children foundation etc) gave us cheap bandwidth since one of the guys in TPB worked there; and one of the owners [has a reputation] for his political opinions. That does NOT make us in any way associated to what political views anyone else might or might not have.”

After dealing with TPB but failing to include the above explanation, Lowery moves on to a more recent target, Megaupload founder Kim Dotcom. Dotcom owns an extremely rare signed copy of Hitler’s autobiographical manifesto, Mein Kampf (My Struggle) and once wore a German World War II helmet. It’s a mistake Prince Harry made in 2005 too.

“I’ve bought memorabilia from Churchill, from Stalin, from Hitler,” Dotcom said in response to the historical allegations. “Let me make absolutely clear, OK. I’m not buying into the Nazi ideology. I’m totally against what the Nazis did.”

With Dotcom dealt with, Lowery then turns his attention to the German Pirate Party’s Julia Reda. As a Member of the European Parliament, Reda has made it her mission to deal with overreaching copyright law, which has made her a bit of a target. That being said, would anyone really try to shoehorn her into the “White Nationalists, Nazi’s and Bigots” bracket?

They would.

In his piece, Lowery highlights comments made by Reda last year, when she complained about the copyright situation developing around the diary written by Anne Frank, which detailed the horrors of living in occupied countries during World War II.

Anne Frank died in 1945 which means that the book was elevated into the public domain in the Netherlands on January 1, 2016, 70 years after her death. A copy was made available at Wikisource, a digital library of free texts maintained by the Wikimedia Foundation, which also operates Wikipedia.

However, in early February that same year, Anne Frank’s diary became unavailable, since U.S. copyright law dictates that works are protected for 95 years from date of publication.

“Today, in an unfortunate example of the overreach of the United States’ current copyright law, the Wikimedia Foundation removed the Dutch-language text of The Diary of a Young Girl,” said Jacob Rogers, Legal Counsel for the Wikimedia Foundation.

“We took this action to comply with the United States’ Digital Millennium Copyright Act (DMCA), as we believe the diary is still under US copyright protection under the law as it is currently written,” he added.

Lowery ignores this background in its entirety. He actually ignores all of it in an effort to paint a picture of Reda engaging in some far-right agenda. Lowery even places emphasis on Reda’s nationality to force his point home.

“I don’t really know what to make of her except to say that this German politician really should find something other than the Anne Frank Diary and the Anne Frank Foundation to use as an example of a work that should be freely available in the public domain,” he writes.

“Think of all the copyrighted works out there for which she might reasonably argue a claim of public domain. She decided to pick the Anne Frank diary. Hmm.”

Lowery then accuses Reda of urging people on Twitter to pirate the book, in order to hurt the fight against anti-Semitism and somehow deprive Jewish people of an income.

“After all sales of the book are used by the Anne Frank Foundation to fight anti-semitism. It’s really quite a bad look for any MP, German or not. (Even if it is just the make-believe LARPing RPG EU Parliament),” Lowery writes.

“Or maybe that is the point? Defund the Anne Frank Foundation. Cause you know I read in the twittersphere that copyright producing media conglomerates are controlled by you-know-who.”

At this point, Lowery moves on to Fight For the Future, stating that their lack of racial diversity caused them to stumble into a racially charged copyright dispute involving the famous Martin Luther King speech.

The whole article can be read here but hopefully, most readers will recognize that America needs less division right now, not more hatred.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Are Cryptocurrency Miners The Future for Pirate Sites?

Post Syndicated from Ernesto original https://torrentfreak.com/are-cryptocurrency-miners-the-future-for-pirate-sites-170921/

Last weekend The Pirate Bay surprised friend and foe by adding a Javascript-based cryptocurrency miner to its website.

The miner utilizes CPU power from visitors to generate Monero coins for the site, providing an extra revenue source.

Initially, this caused the CPUs of visitors to max out due to a configuration error, but it was later adjusted to be less demanding. Still, there was plenty of discussion on the move, with greatly varying opinions.

Some criticized the site for “hijacking” their computer resources for personal profit, without prior warning. However, there are also people who are happy to give something back to TPB, especially if it can help the site to remain online.

Aside from the configuration error, there was another major mistake everyone agreed on. The Pirate Bay team should have alerted its visitors to this change beforehand, and not after the fact, as they did last weekend.

Despite the sensitivities, The Pirate Bay’s move has inspired others to follow suit. Pirate linking site Alluc.ee is one of the first. While they use the same mining service, their implementation is more elegant.

Alluc shows how many hashes are mined and the site allows users to increase or decrease the CPU load, or turn the miner off completely.

Alluc.ee miner

Putting all the controversy aside for a minute, letting visitors mine coins is a pretty ingenious idea. The Pirate Bay said it was testing the feature to see whether it could work as a replacement for ads, which might be much needed in the future.

In recent years many pirate sites have struggled to make a decent income. Not only are more people using ad-blockers now, but the ad quality is also dropping as copyright holders actively go after this revenue source, trying to dry up the funds of pirate sites. And with Chrome planning to add a default ad-blocker to its browser, the outlook is grim.

A cryptocurrency miner might alleviate this problem. That is, as long as ad-blockers don’t start to interfere with this revenue source as well.

Interestingly, this would also counter one of the main anti-piracy talking points. Increasingly, industry groups are using the “public safety” argument as a reason to go after pirate sites. They point to malicious advertisements as a great danger, hoping that this will further their calls for tougher legislation and enforcement.

If The Pirate Bay and other pirate sites can ditch the ads, they would be less susceptible to these and other anti-piracy pushes. Of course, copyright holders could still go after the miner revenues, but this might not be easy.

TorrentFreak spoke to Coinhive, the company that provides the mining service to The Pirate Bay, and they don’t seem eager to take action without a court order.

“We don’t track where users come from. We are just providing servers and a script to submit hashes for the Monero blockchain. We don’t see it as our responsibility to determine if a website is ‘valid’ and we don’t have the technical capabilities to do so,” a Coinhive representative says.

We also contacted several site owners and thus far the response has been mixed. Some like the idea and would consider adding a miner, if it doesn’t affect visitors too much. Others are more skeptical and don’t believe that the extra revenue is worth the trouble.

The Pirate Bay itself, meanwhile, has completed its test run and has removed the miner from the site. They will now analyze the results before deciding whether or not it’s “the future” for them.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Kodi ‘Trademark Troll’ Has Interesting Views on Co-Opting Other People’s Work

Post Syndicated from Andy original https://torrentfreak.com/kodi-trademark-troll-has-interesting-views-on-co-opting-other-peoples-work-170917/

The Kodi team, operating under the XBMC Foundation, announced last week that a third-party had registered the Kodi trademark in Canada and was using it for their own purposes.

That person was Geoff Gavora, who had previously been in communication with the Kodi team, expressing how important the software was to his sales.

“We had hoped, given the positive nature of his past emails, that perhaps he was doing this for the benefit of the Foundation. We learned, unfortunately, that this was not the case,” XBMC Foundation President Nathan Betzen said.

According to the Kodi team, Gavora began delisting Amazon ads placed by companies selling Kodi-enabled products, based on infringement of Gavora’s trademark rights.

“[O]nly Gavora’s hardware can be sold, unless those companies pay him a fee to stay on the store,” Betzen explained.

Predictably, Gavora’s move is being viewed as highly controversial, not least since he’s effectively claiming licensing rights in Canada over what should be a free and open source piece of software. TF obtained one of the notices Amazon sent to a seller of a Kodi-enabled device in Canada, following a complaint from Gavora.

Take down Kodi from Amazon, or pay Gavora

So who is Geoff Gavora and what makes him tick? Thanks to a 2016 interview with Ali Salman of the Rapid Growth Podcast, we have a lot of information from the horse’s mouth.

It all began in 2011, when Gavora began jailbreaking Apple TVs, loading them with XBMC, and selling them to friends.

“I did it as a joke, for beer money from my friends,” Gavora told Salman.

“I’d do it for $25 to $50 and word of mouth spread that I was doing this so we could load on this media center to watch content and online streams from it.”

Intro to the interview with Ali Salman

Soon, however, word of mouth caused the business to grow wings, Gavora claims.

“So they started telling people and I start telling people it’s $50, and then I got so busy so I start telling people it’s $75. I’m getting too busy with my work and with this. And it got to the point where I was making more jailbreaking these Apple TVs than I was at my career, and I wasn’t very happy at my career at that time.”

Jailbreaking was supposed to be a side thing to tide Gavora over until another job came along, but he had a problem – he didn’t come from a technical background. What Gavora did have was a background in marketing and a decent knowledge of how to succeed in customer service, so he majored on that front.

Gavora had come to learn that while people wanted his devices, they weren’t very good at operating XBMC (Kodi’s former name) which he’d loaded onto them. With this in mind, he began offering web support and phone support via a toll-free line.

“I started receiving calls from New York, Dallas, and then Australia, Hong Kong. Everyone around the world was calling me and saying ‘we hear there’s some kid in Calgary, some young child, who’s offering tech support for the Apple TV’,” Gavora said.

But with things apparently going well, a wrench was soon thrown into the works when Apple released the third variant of its Apple TV and Gavora was unable to jailbreak it. This prompted him to market his own Linux-based set-top device and his business, Raw-Media, grew from there.

While it seems likely that so-called ‘Raw Boxes’ were doing reasonably well with consumers, what was the secret of their success? Podcast host Salman asked Gavora for his ‘networking party 10-second pitch’, and the Canadian was happy to oblige.

“I get this all the time actually. I basically tell people that I sell a box that gives them free TV and movies,” he said.

This was met with laughter from the host, to which Gavora added, “That’s sort of the three-second pitch and everyone’s like ‘Oh, tell me more’.”

“Who doesn’t like free TV, come on?” Salman responded. “Yeah exactly,” Gavora said.

The image below, taken from a January 2016 YouTube unboxing video, shows one of the products sold by Gavora’s company.

Raw-Media Kodi Box packaging (note Kodi logo)

Bearing in mind the offer of free movies and TV, the tagline on the box, “Stop paying for things you don’t want to watch, watch more free tv!” initially looks quite provocative. That being said, both the device and Kodi are perfectly capable of playing plenty of legal content from free sources, so there’s no problem there.

What is surprising, however, is that the unboxing video shows the device being booted up, apparently already loaded with infamous third-party Kodi addons including PrimeWire, Genesis, Icefilms, and Navi-X.

The unboxing video showing the Kodi setup

Given that Gavora has registered the Kodi trademark in Canada and prints the official logo on his packaging, this runs counter to the official Kodi team’s aggressive stance towards boxes ready-configured with what they categorize as banned addons. Matters are compounded when one visits the product support site.

As seen in the image below, Raw-Media devices are delivered with a printed card in the packaging informing people where to get the after-sales services Gavora says he built his business upon. The cards advise people to visit No-Issue.ca, a site set up to offer text and video-based support to set-top box buyers.

No-Issue.ca (which is hosted on the same server as raw-media.ca and claimed officially as a sister site here) now redirects to No-Issue.is, as per a 2016 announcement. It has a fairly bland forum but the connected tutorial videos, found on No Issue’s YouTube channel, offer a lot more spice.

Registered under Gavora’s online nickname Gombeek (which is also used on the official Kodi forums), the channel is full of videos detailing how to install and use a wide range of addons.

The No-issue YouTube Channel tutorials

But while supplying tutorial videos is one thing, providing the actual software addons is another. Surprisingly, No-Issue does that too. Filed away under the URL http://solved.no-issue.is/ is a Kodi repository which distributes a wide range of addons, including many that specialize in infringing content, according to the Kodi team.

The No-Issue repository

A source familiar with Raw-Media’s devices informs TF that they’re no longer delivered with addons installed. However, tools hosted on No-Issue.is automate the installation process for the customer, with unlisted YouTube Videos (1,2) providing the instructions.

XBMC Foundation President Nathan Betzen says that situation isn’t ideal.

“If that really is his repo it is disappointing to see that Gavora is charging a fee or outright preventing the sale of boxes with Kodi installed that do not include infringing add-ons, while at the same time he is distributing boxes himself that do include the infringing add-ons like this,” Betzen told TF.

While the legality of this type of service is yet to be properly tested in Canada and may yet emerge as entirely permissible under local law, Gavora himself previously described his business as operating in a gray area.

“If I could go back in time four years, I would’ve been more aggressive in the beginning because there was a lot of uncertainty being in a gray market business about how far I could push it,” he said.

“I really shouldn’t say it’s a gray market because everything I do is completely above board, I just felt it was more gray market so I was a bit scared,” he added.

But, legality aside (which will be determined in due course through various cases 1,2), the situation is still problematic when it comes to the Kodi trademark.

The official Kodi team indicate they don’t want to be associated with any kind of questionable addon or even tutorials for the same. Nevertheless, several of the addons installed by No-Issue (including PrimeWire, cCloud TV, Genesis, Icefilms, MoviesHD, MuchMovies and Navi-X, to name a few), are present on the Kodi team’s official ban list.

The fact remains, however, that Gavora successfully registered the trademark in Canada (one month later it was transferred to a brand new company at the same address), and Kodi now have no control over the situation in the country, short of a settlement or some kind of legal action.

Kodi matters aside, though, we get more insight into Gavora’s attitudes towards intellectual property after learning that he studied gemology and jewelry at school. He’s a long-standing member of jewelry discussion forum Ganoskin.com (his profile links to Gavora.com, a domain Gavora owns, as per information supplied by Amazon).

Things get particularly topical in a 2006 thread titled “When your work gets ripped“. The original poster asked how people feel when their jewelry work gets copied and Gavora made his opinions known.

“I think that what most people forget to remember is that when a piece from Tiffany’s or Cartier is ripped off or copied they don’t usually just copy the work, they will stamp it with their name as well,” Gavora said.

“This is, in fact, fraud and they are deceiving clients into believing they are purchasing genuine Tiffany’s or Cartier pieces. The client is in fact more interested in purchasing from an artist than they are the piece. Laying claim to designs (unless a symbol or name is involved) is outrageous.”

Unless that ‘design’ is called Kodi, of course, then it’s possible to claim it as your own through an administrative process and begin demanding licensing fees from the public. That being said, Gavora does seem to flip back and forth a little, later suggesting that being copied is sometimes ok.

“If someone copies your design and produces it under their own name, I think one should be honored and revel in the fact that your design is successful and has caused others to imitate it and grow from it,” he wrote.

“I look forward to the day I see one of my original designs copied, that is the day I will know my design is a success.”

From their public statements, this opinion isn’t shared by the Kodi team in respect of their product. Despite the Kodi name, software and logo being all their own work, they now find themselves having to claw back rights in Canada, in order to keep the product free in the region. For now, however, that seems like a difficult task.

TorrentFreak wrote to Gavora and asked him why he felt the need to register the Kodi trademark, but we received no response. That means we didn’t get the chance to ask him why he’s taking down Amazon listings for other people’s devices, or about something else that came up in the podcast.

“My biggest weakness, I guess, is that I’m too ethical about how I do my business,” he said, referring to how he deals with customers.

Only time will tell how that philosophy will affect Gavora’s attitudes to trademarks and people’s desire not to be charged for using free, open source software.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Self-Driving Cars Should Be Open Source

Post Syndicated from Bozho original https://techblog.bozho.net/self-driving-cars-open-source/

Self-driving cars are (or will be) the pinnacle of consumer product automation – robot vacuum cleaners, smart fridges and TVs are just toys compared to self-driving cars, both in terms of technology and in terms of impact. We aren’t yet at level 5 self-driving cars, but they are around the corner.

But as software engineers we know how fragile software is. And self-driving cars are basically software, so we can see all the risks involved in putting our lives in the hands of anonymous (from our point of view) developers and unknown (to us) processes and quality standards. One may argue that this has been the case for every consumer product ever, but with software it’s different – software is way more complex than anything else.

So I have an outrageous proposal – self-driving cars should be open source. We have to be able to verify and trust the code that’s navigating our helpless bodies around the highways. Not only that, but we have to be able to verify if it is indeed that code that is currently running in our car, and not something else.

In fact, let me extend that – all cars should be open source. Before you say “but that will ruin the competitive advantage of manufacturers and will be deadly for business” – I don’t actually care how they trained their neural networks, or what their datasets are. That’s the actual secret sauce of the self-driving car and in my view it can remain proprietary and closed. What I’d like to see open-sourced is everything else. (Under what license – I’d be fine even with a source-available license that isn’t “real” open source, but that’s a separate discussion.)

Why? This story about remote carjacking using the entertainment system of a Jeep is a scary example. Attackers that reverse engineer the car software can remotely control everything in the car. Why did that happen? Well, I guess it’s complicated and we have to watch the DEFCON talk.

And also read the paper, but a paragraph on Wikipedia about the CAN bus used in most cars gives us a hint:

CAN is a low-level protocol and does not support any security features intrinsically. There is also no encryption in standard CAN implementations, which leaves these networks open to man-in-the-middle packet interception. In most implementations, applications are expected to deploy their own security mechanisms; e.g., to authenticate incoming commands or the presence of certain devices on the network. Failure to implement adequate security measures may result in various sorts of attacks if the opponent manages to insert messages on the bus. While passwords exist for some safety-critical functions, such as modifying firmware, programming keys, or controlling antilock brake actuators, these systems are not implemented universally and have a limited number of seed/key pairs.

I don’t know in what world it makes sense to even have a link between the entertainment system and the low-level network that operates the physical controls. As apparent from the talk, the two systems are supposed to be air-gapped, but in reality they aren’t.

Rookie mistakes abounded – an unauthenticated “execute” method, running as root, unsigned firmware, hard-coded passwords, etc. How do we know there aren’t tons of those in all the cars out there right now, and in the self-driving cars of the future (which will likely use the same legacy technologies as current cars)? Recently I heard a negative comment about the source code of one of the self-driving car “players”, and I’m pretty sure there are many of those rookie mistakes in it.
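
To make the “applications are expected to deploy their own security mechanisms” point from the quote above concrete, here is a minimal sketch of per-command authentication with a truncated HMAC and a replay counter. The frame layout and key handling are hypothetical illustrations of mine, not any vendor’s or standard’s actual protocol:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Sketch of HMAC-authenticated commands layered on top of an insecure bus.
// The "frame" layout (command id + counter + truncated MAC) is a hypothetical
// example, not an actual CAN or vendor protocol.
public class AuthenticatedCommand {
    private static final int MAC_LENGTH = 8; // truncated tag, as bus frames are tiny

    static byte[] sign(byte commandId, long counter, byte[] key) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        byte[] tag = mac.doFinal(ByteBuffer.allocate(9).put(commandId).putLong(counter).array());
        return ByteBuffer.allocate(1 + 8 + MAC_LENGTH)
                .put(commandId).putLong(counter).put(tag, 0, MAC_LENGTH).array();
    }

    static boolean verify(byte[] frame, long lastSeenCounter, byte[] key) throws Exception {
        ByteBuffer buf = ByteBuffer.wrap(frame);
        byte commandId = buf.get();
        long counter = buf.getLong();
        if (counter <= lastSeenCounter) return false; // reject replayed commands
        byte[] receivedTag = new byte[MAC_LENGTH];
        buf.get(receivedTag);
        byte[] expectedFrame = sign(commandId, counter, key);
        byte[] expectedTag = new byte[MAC_LENGTH];
        System.arraycopy(expectedFrame, 9, expectedTag, 0, MAC_LENGTH);
        return MessageDigest.isEqual(receivedTag, expectedTag); // constant-time compare
    }

    public static void main(String[] args) throws Exception {
        byte[] key = "not-a-real-key-use-a-provisioned-one".getBytes(StandardCharsets.UTF_8);
        byte[] frame = sign((byte) 0x10, 42L, key);
        System.out.println(verify(frame, 41L, key)); // true
        frame[0] = 0x20;                             // tampered command id
        System.out.println(verify(frame, 41L, key)); // false
    }
}
```

None of this is hard; the point is that nothing in the bus itself does it for you, so every ECU vendor has to remember to bolt it on – and apparently many don’t.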

Why is this even riskier for self-driving cars? I’m not an expert in car programming, but it seems like the attack surface is bigger. I might be completely off target here, but on a typical car you’d have to “just” properly isolate the CAN bus. With self-driving cars, the autonomous system that watches the surroundings and decides what to do next has to be connected to the CAN bus. With Tesla being able to send updates over the wire, the attack surface is even bigger (although that’s actually a good feature – being able to patch all cars immediately once a vulnerability is discovered).

Of course, one approach would be to introduce legislation that regulates car software. It might work, but it would rely on governments to do proper testing, which won’t always be the case.

The alternative is to open-source it and let all the white-hats find your issues, so that you can close them before the car hits the road. Not only that, but consumers like me will feel safer, and geeks would be able to verify whether the car is really running the software it claims to run by verifying the fingerprints.
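
“Verifying the fingerprints” could be as simple as hashing the installed image and comparing it against an independently published checksum – a minimal sketch, in which the file path and the published hash are made-up placeholders, and which assumes there is a trusted way to read the image that is actually running:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.util.HexFormat;

// Sketch: compare the hash of the firmware image actually installed against a
// published, independently obtained checksum. Path and checksum are placeholders;
// a real car would need a trusted mechanism to expose the running image.
public class FirmwareFingerprint {
    public static void main(String[] args) throws Exception {
        Path installedImage = Path.of("/firmware/current.img");   // hypothetical path
        String publishedSha256 =                                   // placeholder value
                "3a7bd3e2360a3d29eea436fcfb7e44c735d117c42d1c1835420b6b9942dd4f1b";

        MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
        byte[] digest = sha256.digest(Files.readAllBytes(installedImage));
        String actual = HexFormat.of().formatHex(digest);

        System.out.println(actual.equalsIgnoreCase(publishedSha256)
                ? "Image matches the published fingerprint"
                : "Mismatch - the car is not running what it claims to run");
    }
}
```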

Richard Stallman might be seen as a fanatic when he advocates against closed source software, but in cases like … cars, his concerns seem less extreme.

“But the Jeep vulnerability was fixed”, you may say. And that might be seen as being the way things are – vulnerabilities appear, they get fixed, life goes on. No person was injured because of the bug, right? Well, not yet. And “gaining control” is the extreme scenario – there are still pretty bad scenarios, like being able to track a car through its GPS, or cause panic by controlling the entertainment system. It might be over wifi, or over GPRS, or even by physically messing with the car by inserting a flash drive. Is open source immune to those issues? No, but it has proven to be more resilient.

One industry where the problem of proprietary software on a product the customer has bought has already surfaced is… tractors. It turns out farmers are hacking their tractors because of multiple issues and the inability of the vendor to resolve them in a timely manner. This is likely to happen to cars soon, when only authorized repair shops are allowed to touch anything on the car. And with unauthorized repair shops, the attack surface becomes even bigger.

In fact, I’d prefer open source not just for cars, but for all consumer products. The source code of a smart fridge or a security camera is trivial; it would rarely mean sacrificing competitive advantage. But refrigerators get hacked, security cameras are active parts of botnets, the “internet of shit” is getting ubiquitous. A huge amount of these issues are dumb, beginner mistakes. We have the right to know what shit we are running – in our fridges, DVRs and ultimately – cars.

Your fridge may soon be spying on you, your vacuum cleaner may threaten your pet and demand a “ransom”. The terrorists of the future may crash planes without being armed, crash vans into crowds without being in the van, and “blow up” home equipment without being in the particular home. And that’s not just a hypothetical.

Will open source magically solve the issue? No. But it will definitely make things better and safer, as it has done with operating systems and web servers.

The post Self-Driving Cars Should Be Open Source appeared first on Bozho's tech blog.

Five Must-Watch Software Engineering Talks

Post Syndicated from Bozho original https://techblog.bozho.net/five-must-watch-software-engineering-talks/

We’ve all watched dozens of talks online. And we probably don’t remember many of them. But some do stick in our heads and we eventually watch them again (and again) because we know they are good and we want to remember the things that were said there. So I decided to compile a small list of talks that I find very insightful, useful and that have, in a way, shaped my software engineering practice or expanded my understanding of the software world.

1. How To Design A Good API and Why it Matters by Joshua Bloch – this is a must-watch (well, obviously all are). And don’t skip it because “you are not writing APIs” – everyone is writing APIs. Maybe not used by hundreds of other developers, but used by at least several, and that’s a good enough reason. Having watched this talk I ended up buying and reading one of the few software books that I have actually read end-to-end – “Effective Java” (the talk uses Java as an example, but the principles aren’t limited to Java)

2. How to write clean, testable code by Miško Hevery. Maybe there are tons of talks about testing code, and maybe Uncle Bob has a more popular one, but I found this one particularly practical and to the point – that writing testable code is a skill, and that testable code is good code. (By the way, the speaker went on to write AngularJS.)

3. Back to basics: the mess we’ve made of our fundamental data types by Jon Skeet. The title says it all, and it’s nice to be reminded of how fragile even the basics of programming languages are.

4. The Danger of Software Patents by Richard Stallman. That goes a little bit away from writing software, but puts software in a legal context – how legislation loopholes affect code reuse and the business practices related to it. It’s a bit long, but I think worth it.

5. Does my ESB look big in this? by Martin Fowler and Jim Webber. It’s about bloated enterprise architecture and how to actually do enterprise architecture without complex and expensive middleware. (Unfortunately it’s not on YouTube, so no embedding).

Although this is not a “ranking”, I’d like to add a few honourable mentions: the famous “WAT” lightning talk, showing some quirks of Ruby and JavaScript; “The future of programming” by Bret Victor; and “You suck at Excel” by Joel Spolsky, which isn’t really about creating software, but it’s cool. And a tiny shameless plug with my “Common sense driven development” talk.

I hope the compilation is useful and enlightening. Enjoy.

The post Five Must-Watch Software Engineering Talks appeared first on Bozho's tech blog.

We Are Not Having a Productive Debate About Women in Tech

Post Syndicated from Bozho original https://techblog.bozho.net/not-productive-debate-women-tech/

Yes, it’s about the “anti-diversity memo”. But I won’t go into the particular details of the memo, the firing, who’s right and wrong, who’s liberal and who’s conservative. Actually, I don’t need to repeat this post, which states almost exactly what I think about the particular issue. Just in case, and before someone decides to label me as a “sexist white male” who knows nothing, I should clearly state that I acknowledge that biases against women are real, that I strongly support equal opportunity, and that I think there must be more women in technology. I also have to state that I think the author of “the memo” was well-meaning, had some well-argued, research-backed points, and should not be ostracized.

But I want to “rant” about the quality of the debate. On one side we have conservatives who are throwing themselves into the defense of the fired Googler, insisting that liberals are banning conservative points of view, that it is normal to have so few women in tech and that everything is actually okay, or even that women are inferior. On the other side we have triggered liberals who are ready to shout “discrimination” and “harassment” at anything that resembles an attempt to claim anything different from total and absolute equality, in many cases using a classical “strawman” argument (e.g. “he’s saying women should not work in tech, he’s obviously wrong”).

Everyone seems too eager to take sides and issue a verdict on who’s right and who’s wrong, to blame the other side for all related and unrelated woes, and, while doing that, to exhibit a huge amount of bias. If the debate is about that, we’d better shut it down as soon as possible, as it’s not going to lead anywhere – no matter how much conservatives want “a debate”, and no matter how much liberals want to advance equality. Oh, and by the way – this “conservatives” vs “liberals” framing is a false dichotomy. Most people hold a somewhat sensible stance in between. But let’s get to the actual issue:

Women are underrepresented in STEM (Science, technology, engineering, mathematics). That is a fact everyone agrees on and is blatantly obvious when you walk in any software company office.

Why is that the case? The whole debate revolved around biological and social differences, some of which are probably even true – that women value job flexibility more than being promoted or getting a higher salary, that they are more neurotic (on average), that they are less confident, that they are more empathic and so on. These differences have been studied and documented, and as much as I have my reservations about psychology studies (so much so that even meta-analyses are shown by meta-meta-analyses to be flawed) and social science in general, there seems to be a consensus there (by the way, it’s a shame that Gizmodo removed all the scientific references when they first published “the memo”). But that is not the issue. As it has been pointed out, male and female “inherent” traits are equally applicable when working with technology.

Why are we talking about “technology”, and not “mining and construction”, as many will point out? Let’s cut that argument once and for all – mining and construction are blue-collar jobs that have a high chance of being automated in the near future and are in decline. The problem we’re trying to solve is how to make the dominant profession of the future – information technology – one of equal opportunity. Yes, it’s a bold claim, but software is going to be everywhere and the industry will grow. This is why it’s so important to discuss it, not because we are developers and are somewhat affected by it.

So, there has been extended research on the matter, and the reasons are – surprise – complex and intertwined and there is no simple issue that, once resolved, will unlock the path of women to tech jobs.

What would diversity give us and why should we care? Let’s assume for a moment that we don’t care about equal opportunity and that we are right-leaning, conservative people. Well, imagine you have a growing business and you need to hire developers. What would you prefer – having fewer or more people to choose from? Fewer or more diverse skills (technical and social) on the job market? The answer is obvious. The more people there are on the job market – regardless of their gender, race, whatever – the better for businesses.

So I guess we’ve agreed on the two points so far – that women are underrepresented, and that it’s better for everyone if there are more people with technical skills on the job market, which includes more women.

The “final” question is – how?

And this question seems to be nowhere in the discussion. Instead, we are going in circles with irrelevant arguments, trying to show either that we’ve read more scientific papers than others, that we are more liberal than others, or that we are more pro free speech.

Back to “how” – in Bulgaria we have a social meme: “I don’t know what the right way is, but the way you are doing it is NOT the right way”. And much of the underlying sentiment of “the memo” is similar – that Google should stop doing some of the things it is doing about diversity, or do them differently (though it doesn’t tell us how, exactly). Hiring biases, internal programs, whatever – they seem to bother the author. But this is just talking about the surface of the problem. These programs are correcting something that remains hidden in “the memo”.

Google, on their diversity page, say that 20% of their tech employees are women. At the same time, in another diversity section, they claim that “18% of CS graduates are women”. So, I guess, job done – they’ve reached the maximum possible diversity. They’ve hired proportionally as many women in tech as there are women CS graduates. Anything more than that, even if it doesn’t mean they’ll hire worse developers, will leave the rest of the industry with fewer women. So, sure, 50/50 at Google would sound cool, but the industry average would still be bad.

And that’s the actual, underlying issue we should have already arrived at, and from which we should have started discussing the “how”. Girls do not see STEM as a thing for them. Our biases are projected onto young girls and culminate in a “this is not for girls” mantra. No matter how diverse our hiring policies are, if we don’t address the issue at a much earlier stage, we aren’t getting anywhere.

In schools and even kindergartens we need to have an inclusive environment where “this is not for girls” is frowned upon. We should not discourage girls from liking math, or make math sound uncool and “hard for girls” (in my biased world I actually know more women mathematicians than men). This comic seems to be on a different topic (gender-specific toys), but it’s actually not about toys – it’s about what is considered (stereo)typical for a girl to do. And most of these biases are unconscious, and come from all around us (school, TV, outdoor ads, people on the street, relatives, etc.), and it takes effort to confront them.

To do that, we need policy decisions. We need to lobby education departments/ministries to encourage girls more in the STEM direction (and don’t worry, they’ll be good at it). By the way, guess what – Google’s diversity program is not just about hiring more women; it actually includes education policies with things like “influencing perception about computer science”, “getting more girls to code” and scholarships.

Let’s discuss the education policies, the path to getting 40-50% of CS graduates to be female, and before that – more girls in schools with a technical focus, and ultimately – how to get society to stop perceiving technology and science as “not for girls”. Let each girl decide on her own. All the other debates are short-sighted and beside the point. Will biological differences matter then? They probably will – but not enough to justify a high gender imbalance.

I am no expert in education policies and I don’t know what will work and what won’t. There is research on the matter that we should look at, and maybe argue about it. Everything else is wasted keystrokes.

The post We Are Not Having a Productive Debate About Women in Tech appeared first on Bozho's tech blog.

Concerns About The Blockchain Technology

Post Syndicated from Bozho original https://techblog.bozho.net/concerns-blockchain-technology/

The so-called (and marketing-branded) “blockchain technology” is promised to revolutionize every industry. Anything, they say, will become decentralized, free from middle men or government control. Services will thrive on various installments of the blockchain, and smart contracts will automatically enforce any logic that is related to the particular domain.

I don’t mind having another technological leap (after the internet), and given that I’m technically familiar with the blockchain, I may even be part of it. But I’m not convinced it will happen, and I’m not convinced it’s going to be the next internet.

If we strip away the hype, the technology behind Bitcoin is indeed a technical masterpiece. It combines existing techniques (like hash chains and Merkle trees) with a very good proof-of-work-based consensus algorithm. And it creates a digital currency, which, on top of being worth billions now, is simply cool.
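
Since hash chains and Merkle trees get name-dropped constantly, here is what a Merkle root computation looks like in its simplest form – a stripped-down sketch of mine that omits the padding and domain-separation rules real implementations (including Bitcoin’s) add:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.ArrayList;
import java.util.List;

// Simplified Merkle root: hash the leaves, then repeatedly hash pairs together
// until one hash remains. Real implementations add padding rules and domain
// separation; this only shows the shape of the idea.
public class MerkleRoot {
    static byte[] sha256(byte[] input) throws Exception {
        return MessageDigest.getInstance("SHA-256").digest(input);
    }

    static byte[] concat(byte[] a, byte[] b) {
        byte[] out = new byte[a.length + b.length];
        System.arraycopy(a, 0, out, 0, a.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }

    static byte[] root(List<String> transactions) throws Exception {
        List<byte[]> level = new ArrayList<>();
        for (String tx : transactions) {
            level.add(sha256(tx.getBytes(StandardCharsets.UTF_8)));
        }
        while (level.size() > 1) {
            List<byte[]> next = new ArrayList<>();
            for (int i = 0; i < level.size(); i += 2) {
                byte[] left = level.get(i);
                byte[] right = (i + 1 < level.size()) ? level.get(i + 1) : left; // duplicate last odd leaf
                next.add(sha256(concat(left, right)));
            }
            level = next;
        }
        return level.get(0);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(root(List.of("tx1", "tx2", "tx3")).length); // 32-byte root
        // Changing any single transaction changes the root, so a block header only
        // needs to commit to this one hash to commit to every transaction in the block.
    }
}
```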

But will this technology be mass-adopted, and will mass adoption allow it to retain the technological benefits it has?

First, I’d like to nitpick a little bit – if anyone is speaking about “decentralized software” when referring to “the blockchain”, be suspicious. Bitcoin and other peer-to-peer overlay networks are in fact “distributed” (see the pictures here). “Decentralized” means having multiple providers, but doesn’t mean each user will be a full-featured node on the network. This nitpicking is actually part of another argument, but we’ll get to that.

If blockchain-based applications want to reach mass adoption, they have to be user-friendly. I know I’m being captain obvious here (and fortunately some of the people in the area have realized that), but with the current state of the technology, it’s impossible for end users to even get it, let alone use it.

My first serious concern is usability. To begin with, you need to download the whole blockchain on your machine. When I got my first bitcoin several years ago (when it was still 10 euro), the blockchain was kind of small and I didn’t notice that problem. Nowadays both the Bitcoin and Ethereum blockchains take ages to download. I still haven’t managed to download the ethereum one – after several bugs and reinstalls of the client, I’m still at 15%. And we are just at the beginning. A user just will not wait for days to download something in order to be able to start using a piece of technology.

I recently proposed that downloading snapshots of the blockchain via BitTorrent be included in the Ethereum protocol itself. I know that snapshots of the Bitcoin blockchain have been distributed that way, but it has been a manual process. If a client can quickly download the huge file up to a recent point, and then only download the latest blocks in the traditional way, starting up may be easier. Of course, the whole chain would have to be verified, but maybe that can be a background process that doesn’t stop you from using whatever is built on top of the particular blockchain. (I’m not sure whether that would be secure enough, or whether, say, potential Sybil attacks on the BitTorrent part would make it undesirable – it’s just an idea.)

But even if such an approach works and is adopted, that would still mean that for every service you’d have to download a separate blockchain. Of course, projects like Ethereum may seem like the “one stop shop” for cool blockchain-based applications, but fragmentation is already happening – there are alt-coins bundled with various services like file storage, DNS, etc. That will not be workable for end-users. And it’s certainly not an option for mobile, which is the dominant client now. If instead of downloading the entire chain, something like consistent hashing is used to distribute the content in small portions among clients, it might be workable. But how will trust work in that case, I don’t know. Maybe it’s possible, maybe not.

And yes, I know that you don’t necessarily have to install a wallet/client in order to make use of a given blockchain – you can just have a cloud-based wallet. Which is fairly convenient, but that gets me back to my nitpicking from a few paragraphs above and to my second concern – this effectively turns a distributed system into a decentralized one: a limited number of cloud providers hold most of the data (just as a limited number of miners hold most of the processing power). And then, even though the underlying technology allows for a distributed deployment, we’ll end up again with a simply decentralized or even de facto centralized system, if mergers and acquisitions lead us there (and they probably will). And in order to be able to access our wallets/accounts from multiple devices, we’d use a convenient cloud service where we’d log in with our username and password (because the private key is just too technical and hard for regular users). And that seems to defeat the whole idea.

Not only that, but there is an inevitable centralization of decisions (who decides on the size of the block, who has commit rights to the client repository) as well as a hidden centralization of power – how much GPU power do the Chinese mining “farms” control, and can they influence the network significantly? And will the average user ever know that, or care (just as they don’t care that Google is centralized)? I think that overall, distributed technologies will follow the power law, and the majority of data/processing power/decision power will be controlled by a minority of actors. And so our distributed utopia will not happen in the pure form we dream of.

My third concern is incentive. Distributed technologies that have been successful so far have had a pretty narrow set of incentives. The internet was promoted by large public institutions, including government agencies and big universities. BitTorrent was successful mainly because it allowed free movies and songs with two clicks of the mouse. And Bitcoin was successful because it offered financial benefits. I’m oversimplifying of course, but “government effort”, “free & easy” and “source of more money” seem to have been the successful incentives. On the other side of the fence there are dozens of failed distributed technologies. I’ve tried many of them – alternative search engines, alternative file storage, alternative ride-sharing, alternative social networks, even alternative “internets”. None have gained traction. Because they are not easier to use than their free competitors and you can’t make money out of them (and no government bothers promoting them).

Will blockchain-based services have sufficient incentives to drive customers? Will centralized competitors just easily crush the distributed alternatives by being cheaper, more-user friendly, having sales departments that can target more than hardcore geeks who have no problem syncing their blockchain via the command line? The utopian slogans seem very cool to idealists and futurists, but don’t sell. “Free from centralized control, full control over your data” – we’d have to go through a long process of cultural change before these things make sense to more than a handful of people.

Speaking of services, examples often include “the sharing economy”, where one stranger offers a service to another stranger. Blockchain technology seems like a good fit here indeed – the services are by nature distributed, so why should the technology be centralized? Here comes my fourth concern – identity. While for cryptocurrencies it’s actually beneficial to be anonymous, for most real-world services (i.e. the industries that ought to be revolutionized) this is not an option. You can’t just get in the car of publicKey=5389BC989A342…. “But there are already distributed reputation systems”, you may say. Yes, and they are based on technical, not real-world identities. That doesn’t build trust. I don’t trust that publicKey=5389BC989A342… is the same person that got the high reputation. There may be five people behind that private key. The private key may have been stolen (e.g. in a cloud-provider breach).

The value of companies like Uber and Airbnb is that they serve as trust brokers. They verify and vouch for their drivers and hosts (and passengers and guests). They verify their identity through government-issued documents, Skype calls, selfies, comparing pictures to documents, access to government databases, credit records, etc. Can a fully distributed service do that? No. You’d need a centralized provider to do it. And how would the blockchain make any difference then? Well, I may not be entirely correct here. I’ve actually been thinking quite a lot about decentralized identity – e.g. a way to predictably generate a private key based on, say, biometrics + password + government-issued documents, and use the corresponding public key as your identifier, which is then fed into reputation schemes and ultimately – real-world services. But we’re not there yet.
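
To illustrate the “predictably generate a private key” idea, a very rough sketch: derive a seed from the combined inputs with a slow KDF. The assumptions here are mine – that the biometric template and document number are stable, canonical byte strings (real biometrics are noisy and would need fuzzy extractors), and that PBKDF2 stands in for whatever KDF such a scheme would actually standardize on:

```java
import java.nio.charset.StandardCharsets;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Very rough sketch of "deterministic identity key" derivation. Assumptions:
// stable biometric template and document number (real biometrics need fuzzy
// extractors), and PBKDF2 as a placeholder for the actual KDF such a scheme
// would use.
public class DeterministicIdentityKey {
    static byte[] deriveSeed(char[] password, String biometricTemplate, String documentNumber)
            throws Exception {
        // The non-secret inputs act as salt; the password remains the secret component.
        byte[] salt = (biometricTemplate + "|" + documentNumber)
                .getBytes(StandardCharsets.UTF_8);
        PBEKeySpec spec = new PBEKeySpec(password, salt, 310_000, 256); // 256-bit seed
        SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
        return factory.generateSecret(spec).getEncoded();
    }

    public static void main(String[] args) throws Exception {
        byte[] seed = deriveSeed("correct horse battery staple".toCharArray(),
                "BIOMETRIC-TEMPLATE-BYTES", "ID-1234567");
        // The same inputs always yield the same seed, which could then feed an
        // elliptic-curve key pair generator; the public key becomes the identifier.
        System.out.println(seed.length); // 32 bytes
    }
}
```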

And that is part of my fifth concern – the technology itself. We are not there yet. There are bugs, there are thefts and leaks. There are hard-forks. There isn’t sufficient understanding of the technology (I confess I don’t fully grasp all the implementation details, and they are always the key). Often the technology is advertised as “just working”, but it isn’t. The other day I read an article (lost the link) that clarifies a common misconception about smart contracts – they cannot interact with the outside world – they can’t call APIs (e.g. stock market prices, bank APIs), they can’t push or fetch data from anywhere but the blockchain. That mandates the need, again, for a centralized service that pushes the relevant information before smart contracts can pick it up. I’m pretty sure that all cool-sounding applications are not possible without extensive research. And even if/when they are, writing distributed code is hard. Debugging a smart contract is hard. Yes, hard is cool, but that doesn’t drive economic value.

I have mostly been referring to public blockchains so far. Private blockchains may have their practical applications, but there’s one catch – they are not exactly the cool distributed technology that Bitcoin uses. They may be called “blockchains” because they… chain blocks, but they usually centralize trust. For example, the Hyperledger project uses PKI, with all its benefits and risks. In these cases, a centralized authority issues the identity “tokens”, and then nodes communicate and form a shared ledger. That’s a somewhat easier problem to solve, and the nodes would usually be on actual servers in real datacenters, and not on your uncle’s Windows XP.

That said, hash chaining has been around for quite a long time. I did research on the matter because of a side project of mine, and it seems that providing a tamper-proof/tamper-evident log or database on semi-trusted machines has been discussed in many computer science papers since the 90s. That alone is not “the magic blockchain” that will solve all of our problems, no matter what gossip protocols you sprinkle on top. I’m not saying that’s bad – on the contrary, any variation and combination of the building blocks of the blockchain (the hash chain, the consensus algorithm, proof-of-work (or stake), possibly smart contracts) has potential for making useful products.
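
That 90s-era building block – a tamper-evident log with no consensus and no mining – is simple enough to sketch on its own. This is only an illustration of the core check, not a complete audit-log design:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.ArrayList;
import java.util.List;

// Hash-chained append-only log: each entry's hash covers the previous entry's
// hash, so anyone holding the latest ("head") hash can detect tampering with any
// earlier entry. No consensus, no mining - just the tamper-evidence building block.
public class HashChainedLog {
    record Entry(String payload, String previousHash, String hash) {}

    private final List<Entry> entries = new ArrayList<>();
    private String head = "GENESIS";

    static String sha256Hex(String input) throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(input.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) hex.append(String.format("%02x", b));
        return hex.toString();
    }

    void append(String payload) throws Exception {
        String hash = sha256Hex(head + "|" + payload);
        entries.add(new Entry(payload, head, hash));
        head = hash;
    }

    // Recompute the chain and compare with a head hash obtained out of band.
    boolean verify(String trustedHead) throws Exception {
        String running = "GENESIS";
        for (Entry e : entries) {
            if (!e.previousHash().equals(running)) return false;
            running = sha256Hex(running + "|" + e.payload());
            if (!running.equals(e.hash())) return false;
        }
        return running.equals(trustedHead);
    }

    public static void main(String[] args) throws Exception {
        HashChainedLog log = new HashChainedLog();
        log.append("user=alice action=login");
        log.append("user=alice action=delete-record");
        System.out.println(log.verify(log.head)); // true while untampered
    }
}
```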

I know I sound like a naysayer here, but I hope I’ve pointed out particular issues, rather than aimlessly ranting at the hype (though that’s tempting as well). I’m confident that blockchain-like technologies will have their practical applications, and we will see some successful, widely adopted services and solutions based on them, just as pointed out in this detailed report. But I’m not convinced it will be revolutionizing.

I hope I’m proven wrong, though, because watching a revolutionizing technology closely and even being part of it would be quite cool.

The post Concerns About The Blockchain Technology appeared first on Bozho's tech blog.

Developers and Ethics

Post Syndicated from Bozho original https://techblog.bozho.net/developers-and-ethics/

"What are some areas you are particularly interested in?" – recruiters (head-hunters) tend to ask that question a lot. I don't have a good answer for it – I'll know it when I see it. But I do have a list of areas that I wouldn't like to work in. And one of them is gambling.

Several years ago I got a very lucrative offer from a gambling company, both well paid and technically challenging. But I rejected it, because I didn't want to contribute to abusing people's weaknesses for the sake of getting their money. And no, I'm not a raging Marxist, but gambling is bad. You may argue that it's a necessary vice and that people need it to suppress other internal struggles, but I'm not buying that argument.

I felt it was unethical to write code that does that, just as I feel it's unethical to profile users' behaviour and "read" their emails in order to target ads, or to write bots to disseminate fake news.

A few months ago I was part of the campaign HQ for a party in a parliamentary election. Cambridge Analytica had already become famous for supposedly "delivering Brexit and Trump's victory", so using voters' data to target messages at them sounded like the new cool thing. As head of IT & data, I rejected this approach, because it would have been unethical to bait unsuspecting users into taking dumb tests in order to provide us with Facebook tokens. Granted, we didn't have the money to hire Cambridge Analytica-like companies, but even if we had, would "outsourcing" the dubious practice have changed anything? If you pay someone to trick users into unknowingly giving up their personal data, it's as if you did it yourself.

This could be a very long post about technology and ethics. But it won't be, as this is a technical blog, not a philosophical one. It won't be about philosophy – for interesting takes on the matter you can listen to Damon Horowitz's TED talk or go through all of Michael Sandel's Justice lectures at Harvard. And it won't be about how companies should be ethical (e.g. by following the ethical design manifesto).

Instead, it will be a short post focusing on developers and their ethical choices.

I think we have the freedom to be ethical – there's so much demand on the job market that rejecting an offer, refusing to do something, or leaving a company for ethical reasons is a luxury we can afford without compromising our well-being. When asked to do something unethical, we can refuse (several years ago I was asked to take part in some shady interactions related to a potential future government contract, which I refused to do). When offered jobs that are slightly better paid but would have us build abusive technology, we can turn them down. When a new feature requires us to breach people's privacy, we can argue against it, and ultimately not do it.

But in order to start making these ethical choices, we have to start thinking about ethics – to put ourselves in context. We, developers, are building the world of tomorrow (it sounds grandiose, but we know it's way more mundane than that). We are the "tools" with which future products will be shaped. And yes, that's true even for the average back-office system of an insurance company (which allows premiums to be raised for pre-existing conditions), and for boring banking software (which allows mortgages way beyond the actual coverage the bank has), and so on.

Are these decisions ours to make? Isn't it legislators who should define what's allowed and what isn't? Aren't we just building whatever they tell us to build? Forgive me the far-fetched analogy, but Nazi Germany was an anti-humanity machine built on people who "just followed orders". Yes, if we refuse, someone else may come along and do it, but collective ethics gets built over time.

As Hannah Arendt put it, "The sad truth is that most evil is done by people who never make up their minds to be good or evil." We may think that as developers we don't have a say. But without us, no software can be built. So with our individual ethical stances, certain unethical software may never get built, or never succeed – and that's a stance worth considering, especially when it costs us next to nothing.

The post Developers and Ethics appeared first on Bozho's tech blog.