Cloud desktops aren’t as good as you’d think

Post Syndicated from original https://mjg59.dreamwidth.org/61535.html

Fast laptops are expensive, cheap laptops are slow. But even a fast laptop is slower than a decent workstation, and if your developers want a local build environment they’re probably going to want a decent workstation. They’ll want a fast (and expensive) laptop as well, though, because they’re not going to carry their workstation home with them and obviously you expect them to be able to work from home. And in two or three years they’ll probably want a new laptop and a new workstation, and that’s even more money. Not to mention the risks associated with them doing development work on their laptop and then drunkenly leaving it in a bar or having it stolen or the contents being copied off it while they’re passing through immigration at an airport. Surely there’s a better way?

This is the thinking that leads to “Let’s give developers a Chromebook and a VM running in the cloud”. And it’s an appealing option! You spend far less on the laptop, and the VM is probably cheaper than the workstation – you can shut it down when it’s idle, you can upgrade it to have more CPUs and RAM as necessary, and you get to impose all sorts of additional neat security policies because you have full control over the network. You can run a full desktop environment on the VM, stream it to a cheap laptop, and get the fast workstation experience on something that weighs about a kilogram. Your developers get the benefit of a fast machine wherever they are, and everyone’s happy.

But having worked at more than one company that’s tried this approach, my experience is that very few people end up happy. I’m going to give a few reasons here, but I can’t guarantee that they cover everything – and, to be clear, many (possibly most) of the reasons I’m going to describe aren’t impossible to fix, they’re simply not priorities. I’m also going to restrict this discussion to the case of “We run a full graphical environment on the VM, and stream that to the laptop” – an approach that only offers SSH access is much more manageable, but also significantly more restricted in certain ways. With those details mentioned, let’s begin.

The first thing to note is that the overall experience is heavily tied to the protocol you use for the remote display. Chrome Remote Desktop is extremely appealing from a simplicity perspective, but is also lacking some key features (e.g., letting you use multiple displays on the local system), so from a developer perspective it’s suboptimal. If you read the rest of this post and want to try this anyway, spend some time working with your users to find out what their requirements are and figure out which technology best suits them.

Second, let’s talk about GPUs. Trying to run a modern desktop environment without any GPU acceleration is going to be a miserable experience. Sure, throwing enough CPU at the problem will get you past the worst of this, but you’re still going to end up with users who need to do 3D visualisation, or who are doing VR development, or who expect WebGL to work without burning up every single one of the CPU cores you so graciously allocated to their VM. Cloud providers will happily give you GPU instances, but that’s going to cost more and you’re going to need to re-run your numbers to verify that this is still a financial win. “But most of my users don’t need that!” you might say, and we’ll get to that later on.
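To make “re-run your numbers” concrete, here’s a minimal sketch of the comparison in Python. Every price, the refresh cycle, and the hours-per-month figure are made-up placeholders rather than real quotes from any vendor or cloud provider, and the function names are hypothetical; the only point is that adding a GPU rate to the hourly VM cost can flip the result.

```python
def monthly_cost_local(laptop_price, workstation_price, refresh_years):
    """Hardware cost per developer per month, amortised over the refresh cycle."""
    return (laptop_price + workstation_price) / (refresh_years * 12)


def monthly_cost_cloud(thin_client_price, refresh_years, vm_hourly,
                       hours_per_month, gpu_hourly=0.0):
    """Cheap laptop amortised over the refresh cycle, plus VM time actually used."""
    hardware = thin_client_price / (refresh_years * 12)
    return hardware + (vm_hourly + gpu_hourly) * hours_per_month


# Placeholder inputs, NOT real prices: substitute your own numbers.
local = monthly_cost_local(laptop_price=2400, workstation_price=3600,
                           refresh_years=3)
cloud_no_gpu = monthly_cost_cloud(thin_client_price=600, refresh_years=3,
                                  vm_hourly=0.50, hours_per_month=160)
cloud_gpu = monthly_cost_cloud(thin_client_price=600, refresh_years=3,
                               vm_hourly=0.50, hours_per_month=160,
                               gpu_hourly=0.60)

print(f"local: {local:.2f}  cloud: {cloud_no_gpu:.2f}  cloud+GPU: {cloud_gpu:.2f}")
```

With these invented inputs the plain VM comes out cheaper than the laptop-plus-workstation setup and the GPU instance comes out more expensive. Your numbers will differ, and this ignores support staff, bandwidth and storage entirely.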

Next! Video quality! This seems like a trivial point, but if you’re giving your users a VM as their primary interface, then they’re going to do things like try to use YouTube inside it because there’s a conference presentation that’s relevant to their interests. The obvious solution here is “Do your video streaming in a browser on the local system, not on the VM” but from personal experience that’s a super awkward pain point! If I click on a link inside the VM it’s going to open a browser there, and now I have a browser in the VM and a local browser and which of them contains the tab I’m looking for WHO CAN SAY. So your users are going to watch stuff inside their VM, and re-compressing decompressed video is going to look like shit unless you’re throwing a huge amount of bandwidth at the problem. And this is ignoring the additional irritation of your browser being unreadable while you’re rapidly scrolling through pages, or terminal output from build processes being a muddy blur of artifacts, or the corner case of “I work for YouTube and I need to be able to examine 4K streams to determine whether changes have resulted in a degraded experience” which is a very real job and one that becomes impossible when you pass their lovingly crafted optimisations through whatever codec your remote desktop protocol has decided to pick based on some random guesses about the local network, and look everyone is going to have a bad time.

The browser experience. As mentioned before, you’ll have local browsers and remote browsers. Do they have the same security policy? Who knows! Are all the third party services you depend on going to be ok with the same user being logged in from two different IPs simultaneously because they lost track of which browser they had an open session in? Who knows! Are your users going to become frustrated? Who knows oh wait no I know the answer to this one, it’s “yes”.

Accessibility! More of your users than you expect rely on various accessibility interfaces, be those mechanisms for increasing contrast, screen magnifiers, text-to-speech, speech-to-text, alternative input mechanisms and so on. And you probably don’t know this, but most of these mechanisms involve having accessibility software be able to introspect the UI of applications in order to provide appropriate input or expose available options and the like. So, I’m running a local text-to-speech agent. How does it know what’s happening in the remote VM? It doesn’t because it’s just getting an a/v stream, so you need to run another accessibility stack inside the remote VM and the two of them are unaware of each other’s existence and this works just as badly as you’d think. Alternative input mechanism? Good fucking luck with that, you’re at best going to fall back to “Send synthesized keyboard inputs” and that is nowhere near as good as “Set the contents of this text box to this Unicode string” and yeah I used to work on accessibility software maybe you can tell. And how is the VM going to send data to a braille output device? Anyway, good luck with the lawsuits over arbitrarily making life harder for a bunch of members of a protected class.

One of the benefits here is supposed to be a security improvement, so let’s talk about WebAuthn. I’m a big fan of WebAuthn, given that it’s a multi-factor authentication mechanism that actually does a good job of protecting against phishing, but if my users are running stuff inside a VM, how do I use it? If you work at Google there’s a solution, but that does mean limiting yourself to Chrome Remote Desktop (there are extremely good reasons why this isn’t generally available). Microsoft have apparently just specced a mechanism for doing this over RDP, but otherwise you’re left doing stuff like forwarding USB over IP, and that means that your USB WebAuthn token no longer works locally. It also doesn’t work for any other type of WebAuthn token, such as a Bluetooth device, or an Apple Touch ID sensor, or any of the Windows Hello support. If you’re planning on moving to WebAuthn and also planning on moving to remote VM desktops, you’re going to have a bad time.

That’s the stuff that comes to mind immediately. And sure, maybe each of these issues is irrelevant to most of your users. But the actual question you need to ask is what percentage of your users will hit one or more of these, because if that’s more than an insignificant percentage you’ll still be staffing all the teams that dealt with hardware, handling local OS installs, worrying about lost or stolen devices, and the glorious future of just being able to stop worrying about this is going to be gone and the financial benefits you promised would appear are probably not going to work out in the same way.

A lot of this falls back to the usual story of corporate IT – understand the needs of your users and whether what you’re proposing actually meets them. Almost everything I’ve described here is a corner case, but if your company is larger than about 20 people there’s a high probability that at least one person is going to fall into at least one of these corner cases. You’re going to need to spend a lot of time understanding your user population to have a real understanding of what the actual costs are here, and I haven’t seen anyone do that work before trying to launch this and (inevitably) going back to just giving people actual computers.
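Here’s a rough sketch of why “corner case” stops meaning much once you have more than a handful of people. The per-issue rates below are invented for illustration (I have no real data on how common each need is), the function names are hypothetical, and the model assumes each issue hits each user independently, which is a simplification.

```python
from math import prod


def p_user_affected(issue_rates):
    """Chance that a single user hits at least one issue, assuming independence."""
    return 1 - prod(1 - rate for rate in issue_rates)


def p_any_user_affected(issue_rates, headcount):
    """Chance that at least one of `headcount` users hits at least one issue."""
    return 1 - (1 - p_user_affected(issue_rates)) ** headcount


# Hypothetical rates: pretend each issue affects 2% of users.
rates = {"gpu": 0.02, "video": 0.02, "accessibility": 0.02, "webauthn": 0.02}

per_user = p_user_affected(rates.values())
any_user = p_any_user_affected(rates.values(), headcount=20)

print(f"one user affected: {per_user:.1%}")
print(f"at least one of 20 affected: {any_user:.1%}")
```

Under those made-up 2% rates, roughly 8% of users hit something, and a 20-person company has about an 80% chance of containing at least one of them. The exact figures don’t matter; the shape of the curve does.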

There are alternatives! Modern IDEs tend to support SSHing out to remote hosts to perform builds there, so as long as you’re ok with source code being visible on laptops you can at least shift the “I need a workstation with a bunch of CPU” problem out to the cloud. The laptops are going to need to be more expensive because they’re also going to need to run more software locally, but it wouldn’t surprise me if this ends up being cheaper than the full-on cloud desktop experience in most cases.

Overall, the most important thing to take into account here is that your users almost certainly have more use cases than you expect, and this sort of change is going to have a direct impact on the workflow of every single one of your users. Make sure you know how much that’s going to be, and take that into consideration when suggesting it’ll save you money.
