Designing tasteful CLIs: a case study

Post Syndicated from esr original

Yesterday evening my apprentice, Ian Bruene, tossed a design question at me.

Ian is working on a utility he calls “igor” intended to script interactions with GitLab, a major public forge site. Like many such sites, it has a sort of remote-procedure-call interface that allows you, as an alternative to clicky-dancing on the visible Web interface, to pass it JSON datagrams and get back responses that do useful things like – for example – publishing a release tarball of a project where GitLab users can easily find it.

Igor is going to have (actually, already has) one mode that looks like a command interpreter for a little minilanguage, with each command being an action verb like “upload” or “release”. The idea is not so much for users to drive this manually as for them to be able to write scripts in the minilanguage which become part of a project’s canned release procedure. (This is why GUIs are irrelevant to this whole discussion; you can’t script a GUI.)

Ian, quite reasonably, also wants users to be able to run simple igor commands in a fire-and-forget mode by typing “igor” followed by command-line arguments. Now, classically, under Unix, you would expect a single-line “release” command to be designed to look something like this:

$ igor -r -n fooproject -t 1.2.3 foo-1.2.3.tgz

(To be clear, the dollar sign on the left is a shell prompt, put in to emphasize that this is something you type direct to a shell.)

In this invocation, the “-r” option says “I want to do a release”, the -n option says “This is the GitLab name of the project I’m shipping a release of”, the -t option specifies a release tag, and the following filename argument is the name of the tarball you want to publish.

It might not look exactly like this. Maybe there’d be yet another switch that lets you attach a release notes file. Maybe you’d have the utility deduce the project name from the directory it’s running in. But the basic style of this CLI (= Command Line Interface), with option flags like -r that act as command verbs and other flags that exist to attach their arguments to the request, is very familiar to any Unix user. This what most Unix system commands look like.

One of the design rules of the old-school style is that the first token on the line that is not a switch argument terminates recognition of switches. It, and all tokens after it, are treated as arguments to be passed to the program and are normally expected to be filenames (or, in the 21st century, filename-like things like URLs).

Another characteristic of this style is that the order of the switch clauses is not fixed. You could write

$ igor -t 1.2.3 -n fooproject -r foo-1.2.3.tgz

and it would mean the same thing. (Order of the following arguments, on the other hand, will usually be significant if there is more than one.)

For purposes of this post I’m going to call this style old-school UNIX CLI, because Ian’s puzzlement comes from a collision he’s having with a newer style of doing things. And, actually, with a third interface style, also ancient but still vigorous.

When those of us in Unix-land only had the old-school CLI style as a model it was difficult to realize that all of those switches, though compact and easy to type, imposed a relatively high cognitive load. They were, and still are, difficult to remember. But we couldn’t really notice this until we had something to contrast it with that met similar functional requirements with lower cognitive effort.

Though there may have been earlier precedents, the first well-known program to use something recognizably like what I will call new-school CLI was the CVS version control system. The distinguishing trope was this: Each CVS command begins with a subcommand verb, like “cvs update” or “cvs checkout”. If there are switches, they normally follow the subcommand rather than preceding it. And there are fewer switches.

Later version-control systems like Subversion and Mercurial picked up on the subcommand idea and used it to further reduce the number of arbitrary-looking switches users had to remember. In Subversion, especially, your normal workflow could consist of a sequence of svn add, svn update, svn status, and svn commit commands during which you’d never type anything that looked like an old-school Unixy switch at all. This was easy to remember, easy to document, and users liked it.

Users liked it because humans are used to remembering associations between actions and natural-language verbs; “release” is less of a memory load than “-r” even if it takes longer to type. Which illuminates one of the drivers of the old-school style; it was shaped back in the 1970s by 110-baud Teletypes on which terseness and only having to type few characters was a powerful virtue.

After Subversion and Mercurial Git came along, with its CLI written in a style that, though it uses leading subcommand verbs, is rather more switch-heavy. From the point of view of being comfortable for users (especially new users), this was a pretty serious regression from Subversion. But then the CLI of git wasn’t really a design at all, it was an accretion of features that there was little attempt to simplify or systematize. It’s fair to say that git has succeeded despite its rather spiky UI rather than because of it.

Git is, however a digression here; I’ve mainly described it to make clear that you can lose the comfort benefits of the new-school CLI if a lot of old-school-style switches crowd in around the action verbs.

Next we need to look at a third UI style, which I’m going to call “GDB style” because the best-known program that uses it today is the GNU symbolic debugger. It’s almost as ancient as old-school CLIs, going back to the early 1980s at least.

A program like GDB is almost never invoked as a one-liner at all; a command is something you type to its internal command prompt, not the shell. As with new-school CLIs like Subversuon’s, all commands begin with an action verb, but there are no switches. Each space-separated token after the verb on the command line is passed to the command handler as a positional argument.

Part of Igor’s interface is intended to be a GDB-style interpreter. In that, the release command should logically look something like this, with igor’s command prompt at the left margin.

igor> release fooproject 1.2.3 foo-1.2.3.tgz

Note that this is the same arguments in the same order as our old-school “igor -r” command, but now -r has been replaced by a command verb and the order of what follows it is fixed. If we were designing Igor to be Subversion-like, with a fire-and-forget interface and no internal command interpreter at all, it would correspond to a shell command line like this:

$ igor release fooproject 1.2.3 foo-1.2.3.tgz

This is where we get to the collision of design styles I referred to earlier. What was really confusing Ian, I think, is that part of his experience was pulling for old-school fire-and-forget with switches, part of his experience was pulling for new-school as filtered through git’s rather botched version of it, and then there is this internal GDB-like interpreter to reconcile with how the command line works.

My apprentice’s confusion was completely reasonable. There’s a real question here which the tradition he’s immersed in has no canned, best-practices answer for. Git and GDB evade it in equal and opposite ways – Git by not having any internal interpreter like GDB, GDB by not being designed to do anything in a fire-and-forget mode without going through its internal interpreter.

The question is: how do you design a tool that (a) has a GDB like internal interpreter for a command minilanguage, (b) also allows you to write useful fire-and-forget one-liners in the shell without diving into that interpreter, (c) has syntax for those one liners that looks like an old-school CLI, and (d) has only one syntax for each command?

And the answer is: you can’t actually satisfy all four of those constraints at once. One of them has to give. It’s trivially true that if you abandon (a) or (b) you evade the problem, the way Git and GDB do. The real problem is that an old-school CLI wants to have terse switch clauses with flexible order, a GDB-style minilanguage wants to have more verbose commands with positional arguments, and never these twain shall meet.

The only one-syntax-for-each-command choice you can make is to have the same command interpreter parse your command line and what the user types to the internal prompt.

I bit this bullet when I designed reposurgeon, which is why a fire-and-forget command to read a stream dump of a Subversion repository and build a live repository from it looks like this:

$ reposurgeon "read <project .svn" "prefer git" "rebuild ../overthere"

Each of those string arguments is just fed to reposurgeon’s internal interpreter; any attempt to look like an old-school CLI has been abandoned. This way, I can fire and forget multiple reposurgeon commands; for Igor, it might be more appropriate to pass all the tokens on the command line as a single command.

The other possible way Igor could go is to have a command language for the internal interpreter in which each line looks like a new-school shell command with a command verb followed by switch clusters:

$ release -t 1.2.3 -n fooproject foo-1.2.3.tgz

Which is fine except that now we’ve violated some of the implicit rules of the GDB style. Those aren’t simple positional arguments, and we’re back to the higher cognitive load of having to remember cryptic switches.

But maybe that’s what your prospective users would be comfortable with, because it fits their established habits! This seems to me unlikely but possible.

Design questions like these generally come down to having some sense of who your audience is. Who are they? What do they want? What will surprise them the least? What will fit into their existing workflows and tools best?

I could launch into a panegyric on the agile-programming practice of design-by-user-story at this point; I think this is one of the things agile most clearly gets right. Instead, I’ll leave the reader with a recommendation to read up on that idea and learn how to do it right. Your users will be grateful.