Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2019/06/lessons_learned_1.html
Really interesting first-hand experience from Maciej Cegłowski.
Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2019/06/lessons_learned_1.html
Really interesting first-hand experience from Maciej Cegłowski.
Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/08/good_primer_on_.html
Stuart Schechter published a good primer on the security issues surrounding two-factor authentication.
While it’s often an important security measure, it’s not a panacea. Stuart discusses the usability and security issues that you have to think about before deploying the system.
Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/06/perverse_vulner.html
Apple is rolling out an iOS security usability feature called Security code AutoFill. The basic idea is that the OS scans incoming SMS messages for security codes and suggests them in AutoFill, so that people can use them without having to memorize or type them.
Sounds like a really good idea, but Andreas Gutmann points out an application where this could become a vulnerability: when authenticating transactions:
Transaction authentication, as opposed to user authentication, is used to attest the correctness of the intention of an action rather than just the identity of a user. It is most widely known from online banking, where it is an essential tool to defend against sophisticated attacks. For example, an adversary can try to trick a victim into transferring money to a different account than the one intended. To achieve this the adversary might use social engineering techniques such as phishing and vishing and/or tools such as Man-in-the-Browser malware.
Transaction authentication is used to defend against these adversaries. Different methods exist but in the one of relevance here — which is among the most common methods currently used — the bank will summarise the salient information of any transaction request, augment this summary with a TAN tailored to that information, and send this data to the registered phone number via SMS. The user, or bank customer in this case, should verify the summary and, if this summary matches with his or her intentions, copy the TAN from the SMS message into the webpage.
This new iOS feature creates problems for the use of SMS in transaction authentication. Applied to 2FA, the user would no longer need to open and read the SMS from which the code has already been conveniently extracted and presented. Unless this feature can reliably distinguish between OTPs in 2FA and TANs in transaction authentication, we can expect that users will also have their TANs extracted and presented without context of the salient information, e.g. amount and destination of the transaction. Yet, precisely the verification of this salient information is essential for security. Examples of where this scenario could apply include a Man-in-the-Middle attack on the user accessing online banking from their mobile browser, or where a malicious website or app on the user’s phone accesses the bank’s legitimate online banking service.
This is an interesting interaction between two security systems. Security code AutoFill eliminates the need for the user to view the SMS or memorize the one-time code. Transaction authentication assumes the user read and approved the additional information in the SMS message before using the one-time code.
StaCoAn is a cross-platform tool which aids developers, bug bounty hunters and ethical hackers performing mobile app static analysis on the code of the application for both native Android and iOS applications.
This tool will look for interesting lines in the code which can contain:
This tool was created with a big focus on usability and graphical guidance in the user interface.
Post Syndicated from Bozho original https://techblog.bozho.net/user-authentication-best-practices-checklist/
User authentication is the functionality that every web application shared. We should have perfected that a long time ago, having implemented it so many times. And yet there are so many mistakes made all the time.
Part of the reason for that is that the list of things that can go wrong is long. You can store passwords incorrectly, you can have a vulnerably password reset functionality, you can expose your session to a CSRF attack, your session can be hijacked, etc. So I’ll try to compile a list of best practices regarding user authentication. OWASP top 10 is always something you should read, every year. But that might not be enough.
So, let’s start. I’ll try to be concise, but I’ll include as much of the related pitfalls as I can cover – e.g. what could go wrong with the user session after they login:
secure. Makes cookie theft harder.
userId:expiresTimestamp:hmac(userId+expiresTimestamp). That way you have expiring links (rather than one-time links). The HMAC relies on a secret key, so the links can’t be spoofed. It seems there’s no consensus, as the OWASP guide has a bit different approach
I’m sure I’m missing something. And you see it’s complicated. Sadly we’re still at the point where the most common functionality – authenticating users – is so tricky and cumbersome, that you almost always get at least some of it wrong.
GDPR is the new data protection regulation, as you probably already know. I’ve given a detailed practical advice for what it means for developers (and product owners). However, there’s one thing missing there – cookies. The elephant in the room.
Previously I’ve stated that cookies are subject to another piece of legislation – the ePrivacy directive, which is getting updated and its new version will be in force a few years from now. And while that’s technically correct, cookies seem to be affected by GDPR as well. In a way I’ve underestimated that effect.
When you do a Google search on “GDPR cookies”, you’ll pretty quickly realize that a) there’s not too much information and b) there’s not much technical understanding of the issue.
What appears to be the consensus is that GDPR does change the way cookies are handled. More specifically – tracking cookies. Here’s recital 30:
(30) Natural persons may be associated with online identifiers provided by their devices, applications, tools and protocols, such as internet protocol addresses, cookie identifiers or other identifiers such as radio frequency identification tags. This may leave traces which, in particular when combined with unique identifiers and other information received by the servers, may be used to create profiles of the natural persons and identify them.
How tracking cookies work – a 3rd party (usually an ad network) gives you a code snippet that you place on your website, for example to display ads. That code snippet, however, calls “home” (makes a request to the 3rd party domain). If the 3rd party has previously been used on your computer, it has created a cookie. In the example of Facebook, they have the cookie with your Facebook identifier because you’ve logged in to Facebook. So this cookie (with your identifier) is sent with the request. The request also contains all the details from the page. In effect, you are uniquely identified by an identifier (in the case of Facebook and Google – fully identified, rather than some random anonymous identifier as with other ad networks).
Your behaviour on the website is personal data. It gets associated with your identifier, which in turn is associated with your profile. And all of that is personal data. Who is responsible for collecting the website behaviour data, i.e. who is the “controller”? Is it Facebook (or any other 3rd party) that technically does the collection? No, it’s the website owner, as the behaviour data is obtained on their website, and they have put the tracking piece of code there. So they bear responsibility.
For the data collected by tracking cookies you have two options – “consent” and “legitimate interest”. Legitimate interest will be hard to prove – it is not something that a user reasonably expects, it is not necessary for you to provide the service. If your lawyers can get that option to fly, good for them, but I’m not convinced regulators will be happy with that.
The other option is “consent”. You have to ask your users explicitly – that means “with a checkbox” – to let you use tracking cookies. That has two serious implications – from technical and usability point of view.
These aspects pose a significant questions: is it worth it to have tracking cookies? Is developing new functionality worth it, is interrupting the user worth it, and is implementing new functionality just so that users never clicks a hidden checkbox worth it? Especially given that Firefox now blocks all tracking cookies and possibly other browsers will follow?
That by itself is an interesting topic – Firefox has basically implemented the most strict form of requirements of the upcoming ePrivacy directive update (that would turn it into an ePrivacy regulation). Other browsers will have to follow, even though Google may not be happy to block their own tracking cookies. I hope other browsers follow Firefox in tracking protection and the issue will be gone automatically.
To me it seems that it will be increasingly not worthy to have tracking cookies on your website. They add regulatory obligations for you and give you very little benefit (yes, you could track engagement from ads, but you can do that in other ways, arguably by less additional code than supporting the cookie consents). And yes, the cookie consent will be “outsourced” to browsers after the ePrivacy regulation is passed, but we can’t be sure at the moment whether there won’t be technical whack-a-mole between browsers and advertisers and whether you wouldn’t still need additional effort to have dynamic consent for tracking cookies. (For example there are reported issues that Firefox used to make Facebook login fail if tracking protection is enabled. Which could be a simple bug, or could become a strategy by big vendors in the future to force browsers into a less strict tracking protection).
Okay, we’ve decided it’s not worth it managing tracking cookies. But do you have a choice as a website owner? Can you stop your ad network from using them? (Remember – you are liable if users’ data is collected by visiting your website). And currently the answer is no – you can’t disable that. You can’t have “just the ads”. This is part of the “deal” – you get money for the ads you place, but you participate in a big “surveillance” network. Users have a way to opt out (e.g. Google AdWords gives them that option). You, as a website owner, don’t.
And sometimes you don’t want to serve ads, just track user behaviour and measure conversion. But even if you ask for consent for that and conditionally insert the plugin/snippet, do you actually know what data it sends? And what it’s used for? Because you have to know in order to inform your users. “Do you agree to use tracking cookies that Facebook has inserted in order to collect data about your behaviour on our website” doesn’t sound compelling.
So, what to do? The easiest thing is just not to use any 3rd party ad-related plugins. But that’s obviously not an option, as ad revenue is important, especially in the publishing industry. I don’t have a good answer, apart from “Regulators should pressure ad networks to provide opt-outs and clearly document their data usage”. They have to do that under GDPR, and while website owners are responsible for their users’ data, the ad networks that are in the role of processors in this case (as you delegate the data collection for your visitors to them) also have obligation to assist you in fulfilling your obligations. So ask Facebook – what should I do with your tracking cookies? And when the regulator comes after a privacy-aware customer files a complaint, you could prove that you’ve tried.
The ethical debate whether it’s wrong to collect data about peoples’ behaviour without their informed consent is an easy one. And that’s why I don’t put blame on the regulators – they are putting the ethical consensus in law. It gets more complicated if not allowing tracking means some internet services are no longer profitable and therefore can’t exist. Can we have the cake and eat it too?
Over on the Red Hat Developer Program blog, David Malcolm describes a number of usability improvements that he has made for the upcoming GCC 8 release. Malcolm has made a number of the C/C++ compiler error messages much more helpful, including adding hints for integrated development environments (IDEs) and other tools to suggest fixes for syntax and other kinds of errors. “[…] the code is fine, but, as is common with fragments of code seen on random websites, it’s missing #include directives. If you simply copy this into a new file and try to compile it as-is, it fails.
This can be frustrating when copying and pasting examples – off the top of your head, which header files are needed by the above? – so for gcc 8 I’ve added hints telling you which header files are missing (for the most common cases).” He has various examples showing what the new error messages and hints look like in the blog post.
Sodium is a new, easy-to-use software library for encryption, decryption, signatures, password hashing and more. It is a portable, cross-compilable, installable, packageable fork of NaCl, with a compatible API, and an extended API to improve usability even further.
Whether you want to review a Security, Compliance, & Identity track session you attended at AWS re:Invent 2017, or you want to experience a session for the first time, videos and slide decks from the Security, Compliance, & Identity track are now available.
Today, the following Security, Compliance, & Identity sessions, workshops, and chalk talks will be presented at AWS re:Invent 2017 in Las Vegas. All sessions are in the MGM Grand and all times are local. See the re:Invent Session Catalog for complete information about every session. You can also download the AWS re:Invent 2017 Mobile App for the latest updates and information.
If you are not attending re:Invent 2017, keep in mind that all videos of and slide decks from these sessions will be made available next week. We will publish a post on the Security Blog next week that links to all available videos and slide decks.
Tonight, 4:30–7:30 P.M.
Special event sponsored by AWS Chief Information Security Office Steve Schmidt: We Are Giants: Diversity in Tech.
Post Syndicated from Bozho original https://techblog.bozho.net/gdpr-practical-guide-developers/
You’ve probably heard about GDPR. The new European data protection regulation that applies practically to everyone. Especially if you are working in a big company, it’s most likely that there’s already a process for gettign your systems in compliance with the regulation.
The regulation is basically a law that must be followed in all European countries (but also applies to non-EU companies that have users in the EU). In this particular case, it applies to companies that are not registered in Europe, but are having European customers. So that’s most companies. I will not go into yet another “12 facts about GDPR” or “7 myths about GDPR” posts/whitepapers, as they are often aimed at managers or legal people. Instead, I’ll focus on what GDPR means for developers.
Why am I qualified to do that? A few reasons – I was advisor to the deputy prime minister of a EU country, and because of that I’ve been both exposed and myself wrote some legislation. I’m familiar with the “legalese” and how the regulatory framework operates in general. I’m also a privacy advocate and I’ve been writing about GDPR-related stuff in the past, i.e. “before it was cool” (protecting sensitive data, the right to be forgotten). And finally, I’m currently working on a project that (among other things) aims to help with covering some GDPR aspects.
I’ll try to be a bit more comprehensive this time and cover as many aspects of the regulation that concern developers as I can. And while developers will mostly be concerned about how the systems they are working on have to change, it’s not unlikely that a less informed manager storms in in late spring, realizing GDPR is going to be in force tomorrow, asking “what should we do to get our system/website compliant”.
The rights of the user/client (referred to as “data subject” in the regulation) that I think are relevant for developers are: the right to erasure (the right to be forgotten/deleted from the system), right to restriction of processing (you still keep the data, but mark it as “restricted” and don’t touch it without further consent by the user), the right to data portability (the ability to export one’s data), the right to rectification (the ability to get personal data fixed), the right to be informed (getting human-readable information, rather than long terms and conditions), the right of access (the user should be able to see all the data you have about them), the right to data portability (the user should be able to get a machine-readable dump of their data).
Additionally, the relevant basic principles are: data minimization (one should not collect more data than necessary), integrity and confidentiality (all security measures to protect data that you can think of + measures to guarantee that the data has not been inappropriately modified).
Even further, the regulation requires certain processes to be in place within an organization (of more than 250 employees or if a significant amount of data is processed), and those include keeping a record of all types of processing activities carried out, including transfers to processors (3rd parties), which includes cloud service providers. None of the other requirements of the regulation have an exception depending on the organization size, so “I’m small, GDPR does not concern me” is a myth.
It is important to know what “personal data” is. Basically, it’s every piece of data that can be used to uniquely identify a person or data that is about an already identified person. It’s data that the user has explicitly provided, but also data that you have collected about them from either 3rd parties or based on their activities on the site (what they’ve been looking at, what they’ve purchased, etc.)
Having said that, I’ll list a number of features that will have to be implemented and some hints on how to do that, followed by some do’s and don’t’s.
Now some “do’s”, which are mostly about the technical measures needed to protect personal data. They may be more “ops” than “dev”, but often the application also has to be extended to support them. I’ve listed most of what I could think of in a previous post.
Finally, some “don’t’s”.
Overall, the purpose of the regulation is to make you take conscious decisions when processing personal data. It imposes best practices in a legal way. If you follow the above advice and design your data model, storage, data flow , API calls with data protection in mind, then you shouldn’t worry about the huge fines that the regulation prescribes – they are for extreme cases, like Equifax for example. Regulators (data protection authorities) will most likely have some checklists into which you’d have to somehow fit, but if you follow best practices, that shouldn’t be an issue.
I think all of the above features can be implemented in a few weeks by a small team. Be suspicious when a big vendor offers you a generic plug-and-play “GDPR compliance” solution. GDPR is not just about the technical aspects listed above – it does have organizational/process implications. But also be suspicious if a consultant claims GDPR is complicated. It’s not – it relies on a few basic principles that are in fact best practices anyway. Just don’t ignore them.
Post Syndicated from Robert Graham original http://blog.erratasec.com/2017/11/your-holiday-cybersecurity-guide.html
Many of us are visiting parents/relatives this Thanksgiving/Christmas, and will have an opportunity to help our them with cybersecurity issues. I thought I’d write up a quick guide of the most important things.
They don’t need a separate password for every site. You don’t care about the majority of website whether you get hacked. Use a common password for all the meaningless sites. You only need unique passwords for important accounts, like email, Facebook, and Twitter.
Write them down, with pen and paper. Don’t put them in a MyPasswords.doc, because when a hacker breaks in, they’ll easily find that document and easily hack your accounts.
You might help them out with getting a password manager, or two-factor authentication (2FA). Good 2FA like YubiKey will stop a lot of phishing threats. But this is difficult technology to learn, and of course, you’ll be on the hook for support issues, such as when they lose the device. Thus, while 2FA is best, I’m only recommending pen-and-paper to store passwords. (AccessNow has a guide, though I think YubiKey/U2F keys for Facebook and GMail are the best).
Apple has made this especially easy with fingerprints (and now faceprints), so there’s little excuse not to lock the phone.
Note that Apple iPhones are the most secure. I give my mother my old iPhones so that they will have something secure.
My mom demonstrates a problem you’ll have with the older generation: she doesn’t reliably have her phone with her, and charged. She’s the opposite of my dad who religiously slaved to his phone. Even a small change to make her lock her phone means it’ll be even more likely she won’t have it with her when you need to call her.
The password should be written down on the same piece of paper as all the other passwords. This is importance. My parents just moved, Comcast installed a WiFi access point for them, and they promptly lost the piece of paper. When I wanted to debug some thing on their network today, they didn’t know the password, and couldn’t find the paper. Get that password written down in a place it won’t get lost!
If they have a really old home router, you should probably replace it, or at least update the firmware. A lot of old routers have hacks that allow hackers (like me masscaning the Internet) to easily break in.
Most of the online tricks that will confuse your older parents will come via advertising, such as popups claiming “You are infected with a virus, click here to clean it”. Installing an ad blocker in the browser, such as uBlock Origin, stops most all this nonsense.
For example, here’s a screenshot of going to the “Speedtest” website to test the speed of my connection (I took this on the plane on the way home for Thanksgiving). Ignore the error (plane’s firewall Speedtest) — but instead look at the advertising banner across the top of the page insisting you need to download a browser extension. This is tricking you into installing malware — the ad appears as if it’s a message from Speedtest, it’s not. Speedtest is just selling advertising and has no clue what the banner says. This sort of thing needs to be blocked — it fools even the technologically competent.
First, you really need to separate your work account from personal. The IT department is already getting misdirected emails with your spouse/lover that they don’t want to see. Any conflict with your work, such as getting fired, gives your private correspondence to their lawyers.
Second, you need a wholly separate account for financial stuff, like Amazon.com, your bank, PayPal, and so on. That prevents confusion with phishing attacks.
Consider this warning today:
Phishing warning! Fake emails are being sent out pretending to be from the US Postal Service, claiming that you requested your mail be held this week. Don’t click on the attachment OR the links.
— Wendy Nather (@wendynather) November 21, 2017
If you had split accounts, you could safely ignore this. The USPS would only know your financial email account, which gets no phishing attacks, because it’s not widely known. When your receive the phishing attack on your personal email, you ignore it, because you know the USPS doesn’t know your personal email account.
You should install the latest OS (Windows 10, macOS High Sierra), and also turn on automatic patching.
But remember it may not be worth the huge effort involved. I want my parents to be secure — but no so secure I have to deal with issues.
Contributed by Otavio Ferreira, Manager, Software Development, AWS Messaging
Like other developers around the world, you may be tackling increasingly complex business problems. A key success factor, in that case, is the ability to break down a large project scope into smaller, more manageable components. A service-oriented architecture guides you toward designing systems as a collection of loosely coupled, independently scaled, and highly reusable services. Microservices take this even further. To improve performance and scalability, they promote fine-grained interfaces and lightweight protocols.
However, the communication among isolated microservices can be challenging. Services are often deployed onto independent servers and don’t share any compute or storage resources. Also, you should avoid hard dependencies among microservices, to preserve maintainability and reusability.
If you apply the pub/sub design pattern, you can effortlessly decouple and independently scale out your microservices and serverless architectures. A pub/sub messaging service, such as Amazon SNS, promotes event-driven computing that statically decouples event publishers from subscribers, while dynamically allowing for the exchange of messages between them. An event-driven architecture also introduces the responsiveness needed to deal with complex problems, which are often unpredictable and asynchronous.
Given the context of microservices, event-driven computing is a model in which subscriber services automatically perform work in response to events triggered by publisher services. This paradigm can be applied to automate workflows while decoupling the services that collectively and independently work to fulfil these workflows. Amazon SNS is an event-driven computing hub, in the AWS Cloud, that has native integration with several AWS publisher and subscriber services.
Several AWS services have been integrated as SNS publishers and, therefore, can natively trigger event-driven computing for a variety of use cases. In this post, I specifically cover AWS compute, storage, database, and networking services, as depicted below.
In addition to SNS, event-driven computing is also addressed by Amazon CloudWatch Events, which delivers a near real-time stream of system events that describe changes in AWS resources. With CloudWatch Events, you can route each event type to one or more targets, including:
Many AWS services publish events to CloudWatch. As an example, you can get CloudWatch Events to capture events on your ETL (Extract, Transform, Load) jobs running on AWS Glue and push failed ones to an SQS queue, so that you can retry them later.
Amazon SNS is a pub/sub messaging service that can be used as an event-driven computing hub to AWS customers worldwide. By capturing events natively triggered by AWS services, such as EC2, S3 and RDS, you can automate and optimize all kinds of workflows, namely scaling, testing, encoding, profiling, broadcasting, discovery, failover, and much more. Business use cases presented in this post ranged from recruiting websites, to scientific research, geographic systems, social networks, retail websites, and news portals.
Start now by visiting Amazon SNS in the AWS Management Console, or by trying the AWS 10-Minute Tutorial, Send Fan-out Event Notifications with Amazon SNS and Amazon SQS.
Post Syndicated from Bozho original https://techblog.bozho.net/still-prefer-eclipse-intellij-idea/
Over the years I’ve observed an inevitable shift from Eclipse to IntelliJ IDEA. Last year they were almost equal in usage, and I have the feeling things are swaying even more towards IDEA.
IDEA is like the iPhone of IDEs – its users tell you that “you will feel how much better it is once you get used to it”, “are you STILL using Eclipse??”, “IDEA is so much better, I thought everyone has switched”, etc.
I’ve been using mostly Eclipse for the past 12 years, but in some cases I did use IDEA – when I was writing Scala, when I was writing Android, and most recently – when Eclipse failed to be ready for the Java 9 release, so after half a day of trying to get it working, I just switched to IDEA until Eclipse finally gets a working Java 9 version (with Maven and the rest of the stuff).
But I will get back to Eclipse again, soon. And I still prefer it. Not just because of all the key combinations I’ve internalized (you can reuse those in IDEA), but because there are still things I find worse in IDEA. Of course, IDEA has so much more cool features like code improvement suggestions and actually working plugins for everything. But at least some of the problems I see have to do with the more basic development workflow and experience. And you can’t compensate for those with sugarcoating. So here they are:
Apart from the first two, the rest are not major issues, I agree. But they add up. Ultimately, it’s a matter of personal choice whether you can turn a blind eye to these issues. But I’m getting back to Eclipse again. At some point I will propose improvements in the IntelliJ IDEA backlog and will check it again in a few years, I guess.
The recent addition of Xilinx FPGAs to AWS Cloud compute offerings is one way that AWS is enabling global growth in the areas of advanced analytics, deep learning and AI. The customized F1 servers use pooled accelerators, enabling interconnectivity of up to 8 FPGAs, each one including 64 GiB DDR4 ECC protected memory, with a dedicated PCIe x16 connection. That makes this a powerful engine with the capacity to process advanced analytical applications at scale, at a significantly faster rate. For example, AWS commercial partner Edico Genome is able to achieve an approximately 30X speedup in analyzing whole genome sequencing datasets using their DRAGEN platform powered with F1 instances.
While the availability of FPGA F1 compute on-demand provides clear accessibility and cost advantages, many mainstream users are still finding that the “threshold to entry” in developing or running FPGA-accelerated simulations is too high. Researchers at the UC Berkeley RISE Lab have developed “FireSim”, powered by Amazon FPGA F1 instances as an open-source resource, FireSim lowers that entry bar and makes it easier for everyone to leverage the power of an FPGA-accelerated compute environment. Whether you are part of a small start-up development team or working at a large datacenter scale, hardware-software co-design enables faster time-to-deployment, lower costs, and more predictable performance. We are excited to feature FireSim in this post from Sagar Karandikar and his colleagues at UC-Berkeley.
―Mia Champion, Sr. Data Scientist, AWS
As traditional hardware scaling nears its end, the data centers of tomorrow are trending towards heterogeneity, employing custom hardware accelerators and increasingly high-performance interconnects. Prototyping new hardware at scale has traditionally been either extremely expensive, or very slow. In this post, I introduce FireSim, a new hardware simulation platform under development in the computer architecture research group at UC Berkeley that enables fast, scalable hardware simulation using Amazon EC2 F1 instances.
FireSim benefits both hardware and software developers working on new rack-scale systems: software developers can use the simulated nodes with new hardware features as they would use a real machine, while hardware developers have full control over the hardware being simulated and can run real software stacks while hardware is still under development. In conjunction with this post, we’re releasing the first public demo of FireSim, which lets you deploy your own 8-node simulated cluster on an F1 Instance and run benchmarks against it. This demo simulates a pre-built “vanilla” cluster, but demonstrates FireSim’s high performance and usability.
FPGA-accelerated hardware simulation is by no means a new concept. However, previous attempts to use FPGAs for simulation have been fraught with usability, scalability, and cost issues. FireSim takes advantage of EC2 F1 and open-source hardware to address the traditional problems with FPGA-accelerated simulation:
Problem #1: FPGA-based simulations have traditionally been expensive, difficult to deploy, and difficult to reproduce.
FireSim uses public-cloud infrastructure like F1, which means no upfront cost to purchase and deploy FPGAs. Developers and researchers can distribute pre-built AMIs and AFIs, as in this public demo (more details later in this post), to make experiments easy to reproduce. FireSim also automates most of the work involved in deploying an FPGA simulation, essentially enabling one-click conversion from new RTL to deploying on an FPGA cluster.
Problem #2: FPGA-based simulations have traditionally been difficult (and expensive) to scale.
Because FireSim uses F1, users can scale out experiments by spinning up additional EC2 instances, rather than spending hundreds of thousands of dollars on large FPGA clusters.
Problem #3: Finding open hardware to simulate has traditionally been difficult. Finding open hardware that can run real software stacks is even harder.
FireSim simulates RocketChip, an open, silicon-proven, RISC-V-based processor platform, and adds peripherals like a NIC and disk device to build up a realistic system. Processors that implement RISC-V automatically support real operating systems (such as Linux) and even support applications like Apache and Memcached. We provide a custom Buildroot-based FireSim Linux distribution that runs on our simulated nodes and includes many popular developer tools.
Problem #4: Writing hardware in traditional HDLs is time-consuming.
Both FireSim and RocketChip use the Chisel HDL, which brings modern programming paradigms to hardware description languages. Chisel greatly simplifies the process of building large, highly parameterized hardware components.
FireSim drastically improves the process of co-designing hardware and software by acting as a push-button interface for collaboration between hardware developers and systems software developers. The following diagram describes the workflows that hardware and software developers use when working with FireSim.
The hardware developer’s view:
The software developer’s view:
FireSim demo v1.0
This first public demo of FireSim focuses on the aforementioned “software-developer’s view” of the custom hardware development cycle. The demo simulates a cluster of 1 to 8 RocketChip-based nodes, interconnected by a functional network simulation. The simulated nodes work just like “real” machines: they boot Linux, you can connect to them using SSH, and you can run real applications on top. The nodes can see each other (and the EC2 F1 instance on which they’re deployed) on the network and communicate with one another. While the demo currently simulates a pre-built “vanilla” cluster, the entire hardware configuration of these simulated nodes can be modified after FireSim is open-sourced.
In this post, I walk through bringing up a single-node FireSim simulation for experienced EC2 F1 users. For more detailed instructions for new users and instructions for running a larger 8-node simulation, see FireSim Demo v1.0 on Amazon EC2 F1. Both demos walk you through setting up an instance from a demo AMI/AFI and booting Linux on the simulated nodes. The full demo instructions also walk you through an example workload, running Memcached on the simulated nodes, with YCSB as a load generator to demonstrate network functionality.
In this release, we provide pre-built binaries for driving simulation from the host and a pre-built AFI that contains the FPGA infrastructure necessary to simulate a RocketChip-based node.
First, launch an instance using the free FireSim Demo v1.0 product available on the AWS Marketplace on an f1.2xlarge instance. After your instance has booted, log in using the user name centos. On the first login, you should see the message “FireSim network config completed.” This sets up the necessary tap interfaces and bridge on the EC2 instance to enable communicating with the simulated nodes.
The AMI contains a variety of tools to help you run simulations and build software for RISC-V systems, including the riscv64 toolchain, a Buildroot-based Linux distribution that runs on the simulated nodes, and the simulation driver program. For more details, see the AMI Contents section on the FireSim website.
First, you need to flash the FPGA with the FireSim AFI. To do so, run:
[[email protected]_ADDR ~]$ sudo fpga-load-local-image -S 0 -I agfi-00a74c2d615134b21
To start a simulation, run the following at the command line:
[[email protected]_ADDR ~]$ boot-firesim-singlenode
This automatically calls the simulation driver, telling it to load the Linux kernel image and root filesystem for the Linux distro. This produces output similar to the following:
Simulations Started. You can use the UART console of each simulated node by attaching to the following screens:
There is a screen on:
1 Socket in /var/run/screen/S-centos.
You could connect to the simulated UART console by connecting to this screen, but instead opt to use SSH to access the node instead.
First, ping the node to make sure it has come online. This is currently required because nodes may get stuck at Linux boot if the NIC does not receive any network traffic. For more information, see Troubleshooting/Errata. The node is always assigned the IP address 192.168.1.10:
[[email protected]_ADDR ~]$ ping 192.168.1.10
This should eventually produce the following output:
PING 192.168.1.10 (192.168.1.10) 56(84) bytes of data.
From 192.168.1.1 icmp_seq=1 Destination Host Unreachable
64 bytes from 192.168.1.10: icmp_seq=1 ttl=64 time=2017 ms
64 bytes from 192.168.1.10: icmp_seq=2 ttl=64 time=1018 ms
64 bytes from 192.168.1.10: icmp_seq=3 ttl=64 time=19.0 ms
At this point, you know that the simulated node is online. You can connect to it using SSH with the user name root and password firesim. It is also convenient to make sure that your TERM variable is set correctly. In this case, the simulation expects TERM=linux, so provide that:
The authenticity of host ‘192.168.1.10 (192.168.1.10)’ can’t be established.
ECDSA key fingerprint is 63:e9:66:d0:5c:06:2c:1d:5c:95:33:c8:36:92:30:49.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added ‘192.168.1.10’ (ECDSA) to the list of known hosts.
[email protected]’s password:
At this point, you’re connected to the simulated node. Run uname -a as an example. You should see the following output, indicating that you’re connected to a RISC-V system:
# uname -a
Linux buildroot 4.12.0-rc2 #1 Fri Aug 4 03:44:55 UTC 2017 riscv64 GNU/Linux
Now you can run programs on the simulated node, as you would with a real machine. For an example workload (running YCSB against Memcached on the simulated node) or to run a larger 8-node simulation, see the full FireSim Demo v1.0 on Amazon EC2 F1 demo instructions.
Finally, when you are finished, you can shut down the simulated node by running the following command from within the simulated node:
You can confirm that the simulation has ended by running screen -ls, which should now report that there are no detached screens.
At Berkeley, we’re planning to keep improving the FireSim platform to enable our own research in future data center architectures, like FireBox. The FireSim platform will eventually support more sophisticated processors, custom accelerators (such as Hwacha), network models, and peripherals, in addition to scaling to larger numbers of FPGAs. In the future, we’ll open source the entire platform, including Midas, the tool used to transform RTL into FPGA simulators, allowing users to modify any part of the hardware/software stack. Follow @firesimproject on Twitter to stay tuned to future FireSim updates.
FireSim is the joint work of many students and faculty at Berkeley: Sagar Karandikar, Donggyu Kim, Howard Mao, David Biancolin, Jack Koenig, Jonathan Bachrach, and Krste Asanović. This work is partially funded by AWS through the RISE Lab, by the Intel Science and Technology Center for Agile HW Design, and by ASPIRE Lab sponsors and affiliates Intel, Google, HPE, Huawei, NVIDIA, and SK hynix.
Post Syndicated from Craig Liebendorfer original https://aws.amazon.com/blogs/security/a-full-list-of-the-security-compliance-and-identity-sessions-workshops-and-chalk-talks-available-at-aws-reinvent-2017/
Now that you can reserve seating in AWS re:Invent 2017 breakout sessions, workshops, chalk talks, and other events, the time is right to review the list of introductory, advanced, and expert content being offered this year. To learn more about breakout content types and levels, see Breakout Content.
For those who aren’t cryptocurrency-savvy: Ethereum is a cryptocurrency project, based around the coin Ether. It has the support of many big banks, big hedge funds and some states (Russia, China etc). Among the cryptocurrencies, it is second only to Bitcoin – and might even overtake it with the time. (Especially if Bitcoin doesn’t finally move and fix some of its problems.)
Ethereum offers some abilities that few other cryptocurrencies do. The most important one is the support for “smart projects” – kind of electronic contracts that can easily be executed and enforced with little to no human participation. This post however is dedicated to another of its traits – the Proof of Stake.
To work and exist, every cryptocurrency depends on some proof. Most of them use Proof-of-Work scheme. In it, one has to put some work – eg. calculating checksums – behind its participation in the network and its decision, and receive newly generated coins for it. This however results in huge amount of work done only to prove that, well, you can do it and deserve to be in and receive some of the newly squeezed juice.
As of August 2017, Ethereum uses this scheme too. However, they plan to switch to a Proof-of-Stake algorithm named Casper. In it, you prove yourself not by doing work, but by proving to own Ether. As this requires practically no work, it is much more technically effective than the Proof-of-Work schemes.
Technically, Caspar is an amazing design. I congratulate the Ethereum team for it. However, economically its usage appears to have an important weakness. It is described below.
A polarized system
With Casper, the Ether generated by the Ethereum network and the decision power in it are distributed to these who already own Ether. As a consequence, most of both go to those who own most Ether. (There might be attempts to limit that, but these are easily defeatable. For example, limiting the amount distributed to an address can be circumvented by a Sybil attack.)
Such a distribution will create with the time a financial ecosystem where most money and vote are held by a small minority of the participants. The big majority will have little to no of both – it will summarily hold less money and vote than the minority of “haves”. Giving the speed with which the cryptocurrency systems evolve, it is realistic to expect this development in ten, maybe even in five or less years after introducing Casper.
The “middle class”
Economists love to repeat how important is to have a strong middle class. Why, and how that translates to the situation in a cryptocurrency-based financial system?
In systemic terms, “middle class” denotes in a financial system the set of entities that control each a noticeable but not very big amount of resources.
Game theory shows that in a financial system, entities with different clout usually have different interests. These interests usually reflect the amount of resources they control. Entities with little to no resources tend to have interests opposing to these with biggest resources – especially in systems where the total amount of resources changes slowly and the economics is close to a zero-sum game. (For example, in most cryptocurrency systems.) The “middle class” entities interests are in most aspects in the middle.
For an economics to work, there must be a balance of interests that creates incentive for all of its members to participate. In financial systems, where “haves” interests are mostly opposing to “have-nots” interests, creating such a balance depends on the presence and influence of a “middle class”. Its interests are usually the closest to a compromise that satisfies all, and its influence is the key to achieving that compromise within the system.
If the system state is not acceptable for all entities, these who do not accept it eventually leave. (Usually their participation is required for the system survival, so this brings the system down.) If these entities cannot leave the system, they ultimately reject its rules and try to change it by force. If that is impossible too, they usually resort to denying the system what makes them useful for it, thus decreasing its competitiveness to other systems.
The most reliable way to have acceptable compromise enforced in a system is to have in it a “middle class” that summarily controls more resources than any other segment of entities, preferably at least 51% of the system resources. (This assumes that the “middle class” is able and willing to protect their interests. If some of these entities are controlled into defending someone else’s interests – eg. botnets in computer networks, manipulated voters during elections, etc – these numbers apply to the non-controlled among them.)
A system that doesn’t have a non-controlled “middle class” that controls a decisive amount of resources, usually does not have an influential set of interests that are an acceptable compromise between the interests poles. For this reason, it can be called a polarized system.
The limitation on development
In a polarized system, the incentive for development is minimized. (Development is potentially disruptive, and the majority of the financial abilities and the decision power there has only to lose from a disruption. When factoring in the expected profits from development, the situation always becomes a zero-sum game.) The system becomes static (thus cementing the zero-sum game situation in it) and is under threat of being overtaken by a competing financial system. When that happens, it is usually destroyed together with all stakes in it.
Also, almost any initiative in such a financial system is bound to turn into a cartel, oligopoly or monopoly, due to the small number of participants with resources to start and support an initiative. That effectively destroys its markets, contributing to the weakness of the system and limiting further its ability to develop.
Another problem that stems from this is that the incentive during an interaction to violate the rules and to push the contragent into a loss is greater than the incentive to compete by giving a better offer. This in turn removes the incentive to increase productivity, which is a key incentive for development.)
Yet another problem of the concentration of most resources into few entities is the increased gain from attacking one of them and appropriating their resources, and thus the incentive to do it. Since good defensive capabilities are usually an excellent offense base, this pulls the “haves” into an “arms race”, redirecting more and more of their resources into defense. This also leaves the development outside the arms race increasingly resource-strapped. (The “arms race” itself generates development, but the race situation prevents that into trickling into “non-military” applications.)
These are only a part of the constraints on development in a polarized system. Listing all of them will make a long read.
Trickle-up and trickle-down
In theory, every economical system involves two processes: trickle-down and trickle-up. So, any concentration of resources on the top should be decreased by an automatically increased trickle-down. However, a better understanding how these processes work shows that this logic is faulty.
Any financial exchange in a system consists of two parts. One of them covers the actual production cost of whatever resource is being exchanged against the finances. The other part is the profit of the entity that obtains the finances. From the viewpoint of that entity, the first part vs. the resource given is zero-sum – its incentive to participate in this exchange is the second part, the profit. That second part is effectively the trickle in the system, as it is the only resource really gained.
The direction and the size of the trickle ultimately depends on the balance of many factors, some of them random, others constant. On the long run, it is the constant factors that determine the size and the direction of the trickle sum.
The most important constant factor is the benefit of scale (BOS). It dictates that the bigger entities are able to pull the balance to their side more strongly than the smaller ones. Some miss that chance, but others use it. It makes the trickle-up stronger than the trickle-down. In a system where the transaction outcome is close to a zero-sum game, this concentrates all resources at the top with a speed depending on the financial interactions volume per an unit of time.
(Actually the formula is a bit more complex. All dynamic entities – eg. living organisms, active companies etc – have an “existence maintenance” expense, which they cannot avoid. However, the amount of resources in a system above the summary existence maintenance follows the simple rule above. And these are the only resources that are available for investing in anything, eg. development.)
In the real-life systems the BOS power is limited. There are many different random factors that compete with and influence one another, some of them outweighing BOS. Also, in every moment some factors lose importance and / or cease to exist, while others appear and / or gain importance. The complexity of this system makes any attempt by an entity or entities pool to take control over it hard and slow. This gives the other entities time and ways to react and try to block the takeover attempt. Also, the real-life systems have many built-in constraints against scale-based takeovers – anti-trust laws, separation of the government powers, enforced financial trickle-down through taxes on the rich and benefits for the poor, etc. All these together manage to prevent most takeover attempts, or to limit them into only a segment of the system.
How a Proof-of-Stake based cryptocurrency fares at these?
A POS-based cryptocurrency financial system has no constraints against scale-based takeovers. It has only one kind of clout – the amount of resources controlled by an entity. This kind of clout is built in it, has all the importance in it and cannot lose that or disappear. It has no other types of resources, and has no slowing due to complexity. It is not segmented – who has these resources has it all. There are no built-in constraints against scale-based takeovers, or mechanisms to strengthen resource trickle-down. In short, it is the ideal ground for creating a polarized financial system.
So, it would be only logical to expect that a Proof-of-Stake based Ether financial system will suffer by the problems a polarized system presents. Despite all of its technical ingenuity, its longer-term financial usability is limited, and the participation in it may be dangerous to any entity smaller than eg. a big bank, a big hedge fund or a big authoritarian state.
All fixes for this problem I could think of by now would be easily beaten by simple attacks. I am not sure if it is possible to have a reliable solution to it at all.
Do smart contracts and secondary tokens change this?
Unhappily, no. Smart contracts are based on having Ether, and need Ether to exist and act. Thus, they are bound to the financial situation of the Ether financial system, and are influenced by it. The bigger is the scope of the smart contract, the bigger is its dependence on the Ether situation.
Due to this, smart contracts of meaningful size will find themselves hampered and maybe even endangered by a polarization in the financial system powered by POS-based Ethereum. It is technically possible to migrate these contracts to a competing underlying system, but it won’t be easy – probably even when the competing system is technically a clone of Ethereum, like Ethereum Classic. The migration cost might exceed the migration benefits at any given stage of the contract project development, even if the total migration benefits are far larger than this cost.
Eventually this problem might become public knowledge and most projects in need of a smart contract might start avoiding Ethereum. This will lead to decreased interest in participation in the Ethereum ecosystem, to a loss of market cap, and eventually maybe even to the demise of this technically great project.
There is a danger that the “haves” minority in a polarized system might start actively investing resources in creating other systems that suffer from the same problem (as they benefit from it), or in modifying existing systems in this direction. This might decrease the potential for development globally. As some of the backers of Ethereum are entities with enormous clout worldwide, that negative influence on the global system might be significant.
Post Syndicated from John Ohle original https://aws.amazon.com/blogs/big-data/run-common-data-science-packages-on-anaconda-and-oozie-with-amazon-emr/
In the world of data science, users must often sacrifice cluster set-up time to allow for complex usability scenarios. Amazon EMR allows data scientists to spin up complex cluster configurations easily, and to be up and running with complex queries in a matter of minutes.
Data scientists often use scheduling applications such as Oozie to run jobs overnight. However, Oozie can be difficult to configure when you are trying to use popular Python packages (such as “pandas,” “numpy,” and “statsmodels”), which are not included by default.
One such popular platform that contains these types of packages (and more) is Anaconda. This post focuses on setting up an Anaconda platform on EMR, with an intent to use its packages with Oozie. I describe how to run jobs using a popular open source scheduler like Oozie.
For this post, you walk through the following tasks:
Spin up an Amazon EMR cluster using the console or the AWS CLI. Use the latest release, and include Apache Hadoop, Apache Spark, Apache Hive, and Oozie.
To create a three-node cluster in the us-east-1 region, issue an AWS CLI command such as the following. This command must be typed as one line, as shown below. It is shown here separated for readability purposes only.
One-line version for reference:
SSH into your EMR master node instance and download the official Anaconda installer:
At the time of publication, Anaconda 4.4 is the most current version available. For the download link location for the latest Python 2.7 version (Python 3.6 may encounter issues), see https://www.continuum.io/downloads. Open the context (right-click) menu for the Python 2.7 download link, choose Copy Link Location, and use this value in the previous wget command.
This post used the Anaconda 4.4 installation. If you have a later version, it is shown in the downloaded filename: “anaconda2-<version number>-Linux-x86_64.sh”.
Run this downloaded script and follow the on-screen installer prompts.
For an installation directory, select somewhere with enough space on your cluster, such as “/mnt/anaconda/”.
The process should take approximately 1–2 minutes to install. When prompted if you “wish the installer to prepend the Anaconda2 install location”, select the default option of [no].
After you are done, export the PATH to include this new Anaconda installation:
Zip up the Anaconda installation:
The zip process may take 4–5 minutes to complete.
(Optional) Upload this anaconda.zip file to your S3 bucket for easier inclusion into future EMR clusters. This removes the need to repeat the previous steps for future EMR clusters.
Next, you configure Oozie to use Pyspark and the Anaconda platform.
Get the location of your Oozie sharelibupdate folder. Issue the following command and take note of the “sharelibDirNew” value:
For this post, this value is “hdfs://ip-192-168-4-200.us-east-1.compute.internal:8020/user/oozie/share/lib/lib_20170616133136”.
Pass in the required Pyspark files into Oozies sharelibupdate location. The following files are required for Oozie to be able to run Pyspark commands:
These are located on the EMR master instance in the location “/usr/lib/spark/python/lib/”, and must be put into the Oozie sharelib spark directory. This location is the value of the sharelibDirNew parameter value (shown above) with “/spark/” appended, that is, “hdfs://ip-192-168-4-200.us-east-1.compute.internal:8020/user/oozie/share/lib/lib_20170616133136/spark/”.
To do this, issue the following commands:
After you’re done, Oozie can use Pyspark in its processes.
Pass the anaconda.zip file into HDFS as follows:
(Optional) Verify that it was transferred successfully with the following command:
On your master node, execute the following command:
Set the PYSPARK_PYTHON environment variable on the executor nodes. Put the following configurations in your “spark-opts” values in your Oozie workflow.xml file:
This is referenced from the Oozie job in the following line in your workflow.xml file, also included as part of your “spark-opts”:
Your Oozie workflow.xml file should now look something like the following:
To test this out, you can use the following job.properties and myPysparkProgram.py file, along with the following steps:
Note: You can get your master node IP address (denoted as “ip-xxx-xxx-xxx-xxx” here) from the value for the sharelibDirNew parameter noted earlier.
Put the “myPysparkProgram.py” into the location mentioned between the “<jar>xxxxx</jar>” tags in your workflow.xml. In this example, the location is “hdfs:///user/oozie/apps/”. Use the following command to move the “myPysparkProgram.py” file to the correct location:
Put the above workflow.xml file into the “/user/oozie/apps/” location in hdfs:
Note: The job.properties file is run locally from the EMR master node.
Create a sample input.txt file with some data in it. For example:
Put this file into hdfs:
Execute the job in Oozie with the following command. This creates an Oozie job ID.
You can check the Oozie job state with the command:
The output will be:
The myPysparkProgram.py has successfully imported the numpy library from the Anaconda platform and has produced some output with it. If you tried to run this using standard Python, you’d encounter the following error:
Now when your Python job runs in Oozie, any imported packages that are implicitly imported by your Pyspark script are imported into your job within Oozie directly from the Anaconda platform. Simple!
If you have questions or suggestions, please leave a comment below.
John Ohle is an AWS BigData Cloud Support Engineer II for the BigData team in Dublin. He works to provide advice and solutions to our customers on their Big Data projects and workflows on AWS. In his spare time, he likes to play music, learn, develop tools and write documentation to further help others – both colleagues and customers alike.
Post Syndicated from Bozho original https://techblog.bozho.net/concerns-blockchain-technology/
The so-called (and marketing-branded) “blockchain technology” is promised to revolutionize every industry. Anything, they say, will become decentralized, free from middle men or government control. Services will thrive on various installments of the blockchain, and smart contracts will automatically enforce any logic that is related to the particular domain.
I don’t mind having another technological leap (after the internet), and given that I’m technically familiar with the blockchain, I may even be part of it. But I’m not convinced it will happen, and I’m not convinced it’s going to be the next internet.
If we strip the hype, the technology behind Bitcoin is indeed a technical masterpiece. It combines existing techniques (likes hash chains and merkle trees) with a very good proof-of-work based consensus algorithm. And it creates a digital currency, which ontop of being worth billions now, is simply cool.
But will this technology be mass-adopted, and will mass adoption allow it to retain the technological benefits it has?
First, I’d like to nitpick a little bit – if anyone is speaking about “decentralized software” when referring to “the blockchain”, be suspicious. Bitcon and other peer-to-peer overlay networks are in fact “distributed” (see the pictures here). “Decentralized” means having multiple providers, but doesn’t mean each user will be full-featured nodes on the network. This nitpicking is actually part of another argument, but we’ll get to that.
If blockchain-based applications want to reach mass adoption, they have to be user-friendly. I know I’m being captain obvious here (and fortunately some of the people in the area have realized that), but with the current state of the technology, it’s impossible for end users to even get it, let alone use it.
My first serious concern is usability. To begin with, you need to download the whole blockchain on your machine. When I got my first bitcoin several years ago (when it was still 10 euro), the blockchain was kind of small and I didn’t notice that problem. Nowadays both the Bitcoin and Ethereum blockchains take ages to download. I still haven’t managed to download the ethereum one – after several bugs and reinstalls of the client, I’m still at 15%. And we are just at the beginning. A user just will not wait for days to download something in order to be able to start using a piece of technology.
I recently proposed downloading snapshots of the blockchain via bittorrent to be included in the Ethereum protocol itself. I know that snapshots of the Bitcoin blockchain have been distributed that way, but it has been a manual process. If a client can quickly download the huge file up to a recent point, and then only donwload the latest ones in the the traditional way, starting up may be easier. Of course, the whole chain would have to be verified, but maybe that can be a background process that doesn’t stop you from using whatever is built ontop of the particular blockchain. (I’m not sure if that will be secure enough, and that, say potential Sybil attacks on the bittorrent part won’t make it undesirable, it’s just an idea).
But even if such an approach works and is adopted, that would still mean that for every service you’d have to download a separate blockchain. Of course, projects like Ethereum may seem like the “one stop shop” for cool blockchain-based applications, but fragmentation is already happening – there are alt-coins bundled with various services like file storage, DNS, etc. That will not be workable for end-users. And it’s certainly not an option for mobile, which is the dominant client now. If instead of downloading the entire chain, something like consistent hashing is used to distribute the content in small portions among clients, it might be workable. But how will trust work in that case, I don’t know. Maybe it’s possible, maybe not.
And yes, I know that you don’t necessarily have to install a wallet/client in order to make use of a given blockchain – you can just have a cloud-based wallet. Which is fairly convenient, but that gets me to my nitpicking from a few paragraphs above and to may second concern – this effectively turns a distributed system into a decentralized one – a limited number of cloud providers hold most of the data (just as a limited number of miners hold most of the processing power). And then, even though the underlying technology allows for a distributed deployment, we’ll end-up again with simply decentralized or even de-facto cenetralized, if mergers and acquisitions lead us there (and they probably will). And in order to be able to access our wallets/accounts from multiple devices, we’d use a convenient cloud service where we’d login with our username and password (because the private key is just too technical and hard for regular users). And that seems to defeat the whole idea.
Not only that, but there is an inevitable centralization of decisions (who decides on the size of the block, who has commit rights to the client repository) as well as a hidden centralization of power – how much GPU power does the Chinese mining “farms” control and can they influence the network significantly? And will the average user ever know that or care (as they don’t care that Google is centralized). I think that overall, distributed technologies will follow the power law, and the majority of data/processing power/decision power will be controller by a minority of actors. And so our distributed utopia will not happen in its purest form we dream of.
My third concern is incentive. Distributed technologies that have been successful so far have a pretty narrow set of incentives. The internet was promoted by large public institutions, including government agencies and big universitives. Bittorrent was successful mainly because it allowed free movies and songs with 2 clicks of the mouse. And Bitcoin was successful because it offered financial benefits. I’m oversimplifying of course, but “government effort”, “free & easy” and “source of more money” seem to have been the successful incentives. On the other side of the fence there are dozens of failed distributed technologies. I’ve tried many of them – alternative search engines, alternative file storage, alternative ride-sharings, alternative social networks, alternative “internets” even. None have gained traction. Because they are not easier to use than their free competitors and you can’t make money out of them (and no government bothers promoting them).
Will blockchain-based services have sufficient incentives to drive customers? Will centralized competitors just easily crush the distributed alternatives by being cheaper, more-user friendly, having sales departments that can target more than hardcore geeks who have no problem syncing their blockchain via the command line? The utopian slogans seem very cool to idealists and futurists, but don’t sell. “Free from centralized control, full control over your data” – we’d have to go through a long process of cultural change before these things make sense to more than a handful of people.
Speaking of services, often examples include “the sharing economy”, where one stranger offers a service to another stranger. Blockchain technology seems like a good fit here indeed – the services are by nature distributed, why should the technology be centralized? Here comes my fourth concern – identity. While for the cryptocurrencies it’s actually beneficial to be anonymous, for most of the real-world services (i.e. the industries that ought to be revolutionized) this is not an option. You can’t just go in the car of publicKey=5389BC989A342…. “But there are already distributed reputation systems”, you may say. Yes, and they are based on technical, not real-world identities. That doesn’t build trust. I don’t trust that publicKey=5389BC989A342… is the same person that got the high reputation. There may be five people behind that private key. The private key may have been stolen (e.g. in a cloud-provider breach).
The values of companies like Uber and AirBNB is that they serve as trust brokers. They verify and vouch for their drivers and hosts (and passengers and guests). They verify their identity through government-issued documents, skype calls, selfies, compare pictures to documents, get access to government databases, credit records, etc. Can a fully distributed service do that? No. You’d need a centralized provider to do it. And how would the blockchain make any difference then? Well, I may not be entirely correct here. I’ve actually been thinking quite a lot about decentralized identity. E.g. a way to predictably generate a private key based on, say biometrics+password+government-issued-documents, and use the corresponding public key as your identifier, which is then fed into reputation schemes and ultimately – real-world services. But we’re not there yet.
And that is part of my fifth concern – the technology itself. We are not there yet. There are bugs, there are thefts and leaks. There are hard-forks. There isn’t sufficient understanding of the technology (I confess I don’t fully grasp all the implementation details, and they are always the key). Often the technology is advertised as “just working”, but it isn’t. The other day I read an article (lost the link) that clarifies a common misconception about smart contracts – they cannot interact with the outside world – they can’t call APIs (e.g. stock market prices, bank APIs), they can’t push or fetch data from anywhere but the blockchain. That mandates the need, again, for a centralized service that pushes the relevant information before smart contracts can pick it up. I’m pretty sure that all cool-sounding applications are not possible without extensive research. And even if/when they are, writing distributed code is hard. Debugging a smart contract is hard. Yes, hard is cool, but that doesn’t drive economic value.
I have mostly been referring to public blockchains so far. Private blockchains may have their practical application, but there’s one catch – they are not exactly the cool distributed technology that the Bitcoin uses. They may be called “blockchains” because they…chain blocks, but they usually centralize trust. For example the Hyperledger project uses PKI, with all its benefits and risks. In these cases, a centralized authority issues the identity “tokens”, and then nodes communicate and form a shared ledger. That’s a bit easier problem to solve, and the nodes would usually be on actual servers in real datacenters, and not on your uncle’s Windows XP.
That said, hash chaining has been around for quite a long time. I did research on the matter because of a side-project of mine and it seems providing a tamper-proof/tamper-evident log/database on semi-trusted machines has been discussed in many computer science papers since the 90s. That alone is not “the magic blockchain” that will solve all of our problems, no matter what gossip protocols you sprinkle ontop. I’m not saying that’s bad, on the contrary – any variation and combinations of the building blocks of the blockchain (the hash chain, the consensus algorithm, the proof-of-work (or stake), possibly smart contracts), has potential for making useful products.
I know I sound like the a naysayer here, but I hope I’ve pointed out particular issues, rather than aimlessly ranting at the hype (though that’s tempting as well). I’m confident that blockchain-like technologies will have their practical applications, and we will see some successful, widely-adopted services and solutions based on that, just as pointed out in this detailed report. But I’m not convinced it will be revolutionizing.
I hope I’m proven wrong, though, because watching a revolutionizing technology closely and even being part of it would be quite cool.
Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/07/password_maskin.html
Slashdot asks if password masking — replacing password characters with asterisks as you type them — is on the way out. I don’t know if that’s true, but I would be happy to see it go. Shoulder surfing, the threat is defends against, is largely nonexistent. And it is becoming harder to type in passwords on small screens and annoying interfaces. The IoT will only exacerbate this problem, and when passwords are harder to type in, users choose weaker ones.
The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.