Questioning The Original Analysis On The Bionic Debate

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2011/03/18/bionic-debate.html

I was hoping to avoid having to comment further on this problematic
story. I figured a comment as a brief identi.ca statement was enough
when it was just a story on the Register. But, it’s now hit a major
tech news outlet, and I feel that, given that I’m typically
the first person everyone in the Free Software world comes to ask if
something is a GPL violation, I’m going to get asked about this soon, so
I might as well preempt the questions with a blog post, so I can answer
any questions about it with this URL.

In short, the question is: Does Bionic (the Android/Linux default C
library developed by Google) violate the GPL by importing
“scrubbed” headers from Linux? For those of you seeking
the TL;DR version: You can
stop now if you expect me to answer this question; I’m not going to. I’m
just going to show that the apparent original analysis material that started
this brouhaha is a speculative hypothesis which would require much more
research to amount to anything of note.

Indeed, the kind of work needed to answer these questions typically
requires the painstaking work of a talented developer working very
closely with legal counsel. I’ve done analysis like this before for
other projects. The only one I can easily talk about publicly is the
ath5k situation. (If you want to hear more on that, you can listen to
an old oggcast where I discussed this with Karen Sandler or read papers
that were written on the subject back where I used to work.)

Anyway, most of what’s been written about this subject of the Linux
headers in Bionic has been poorly drafted speculation. I
suppose some will say this blog post is no better, since I am not
answering any questions, but my primary goal here is to draw attention
to the fact that absolutely no one, as near as I can tell, has done the
incredibly time-consuming work to figure out anything approaching a
definitive answer! Furthermore, the original article that launched this
debate (Naughton’s paper, The Bionic Library: Did Google Work Around the
GPL?) is merely a position paper for a research project yet to be done.

Naughton’s full paper gives some examples that would make a good
starting point for a complete analysis. It’s disturbing, however, that
his paper is presented as if it’s a complete analysis. At best, his
paper is a position statement of a hypothesis that then needs the actual
experiment to figure things out. That rigorous research (as I keep
reiterating) is still undone.

To his credit, Naughton does admit that only the kind of analysis I’m
talking about would yield a definitive answer. You have to get almost
all the way through his paper to get to:

Determining copyrightability is thus a fact-specific, case-by-case
exercise. … Certainly, sorting out what is and isn’t subject to
GPLv2 in Bionic would require at least a file-by-file, and most likely
line-by-line, analysis of Bionic — a daunting task[.]

Of course, in that statement, Naughton makes the mistake of subtly
including an assumption in the hypothesis: he fails to acknowledge clearly
that it’s entirely possible the set of GPLv2-covered work found in Bionic
could be the empty set; he hasn’t shown it’s not the empty set (even
notwithstanding his very cursory analysis of a few files).

Yet, even though Naughton admits full analysis (that he hasn’t done) is
necessary, he nevertheless later makes sweeping conclusions:

The 750 Linux kernel header files … define a complex overarching
structure, an application programming interface, that is thoughtfully and
cleverly designed, and almost assuredly protected by copyright.

Again, this is a hypothesis that would have to be tested and proved with
evidence generated by the careful line-by-line analysis Naughton himself
admits is necessary. Yet, he doesn’t acknowledge that fact in his
conclusions, leaving his readers (and IMO he’s expecting to dupe lots of
readers unsophisticated on these issues) with the impression he’s shown
something he hasn’t. For example, one of my first questions would be
whether or not Bionic uses only parts of Linux headers that are required
by specification to write POSIX programs, a question that Naughton doesn’t
even consider.

Finally, Naughton moves from the merely shoddy analysis to completely
alarmist speculation with:

But if Google is right, if it has succeeded in removing all copyrightable
material from the Linux kernel headers, then it has unlocked the Linux
kernel from the restrictions of GPLv2. Google can now use the
“clean” Bionic headers to create a non-GPL’d fork of the Linux
kernel, one that can be extended under proprietary license terms. Even if
Google does not do this itself, it has enabled others to do so. It also
has provided a useful roadmap for those who might want to do the same
thing with other GPLv2-licensed [sic] programs, such as databases.

If it turns out that Google has succeeded in making sure that the GPLv2
does not apply to Bionic, then Google’s success is substantially
narrower than Naughton suggests. The success would be merely the
extraction of the
non-copyrightable facts that any C library needs to know about Linux to
make a binary run when Linux happens to be the kernel underneath. Now, it
should be duly noted that there already exist two libraries under the LGPL
that have already implemented that (namely, glibc, and uClibc — the
latter of which Naughton’s cursory research apparently didn’t even turn
up). As it stands, anyone who wants to write user-space applications on a
Linux-based system already can; there are multiple C library choices
available under the weak copyleft license, LGPL. What Google, for its
part, believes it has succeeded at is making a permissively licensed third
alternative, which is an outcome that would be no surprise to those of us
who have seen something like it done twice before.

In short, everyone opining here seems to be conflating a lot of issues.
There are many ways to interface with Linux. Many people, including me,
believe quite strongly that there is no way to make a subprogram in
kernel space (such as a device driver) without the terms of the GPLv2
applying to it. But writing a device driver is a specialized task
that’s very different from what most Linux users do. Most developers
who “use Linux” — by which they typically mean write a
user space program that runs on a GNU/Linux operating system — have
(at most) weak copyleft (LGPL) terms to follow due to glibc or uClibc.
I admit that I sometimes feel chagrin that proprietary applications can
be written for GNU/Linux (and other Linux-based) systems, but that was a
strategic decision that RMS made (correctly) at the start of the GNU
project, and one that the Linux project, for its part, has also always
sought.

I’m quite sure no one, including hard-core copyleft advocates
like me, expects or seeks the GPLv2 terms to apply to programs
that interface with Linux solely as user-space programs that
run on an operating system that uses Linux as its kernel. Thus, I’d
guess that even if it turned out that Google made some mistakes
in this regard for Bionic, we’d all work together to rectify those
mistakes so that the outcome everyone intended could occur.

Moreover, to compare the specifics of this situation to other types of
so-called “copyleft circumvention techniques” is just
link-baiting that borders on trolling. Google wasn’t seeking to
circumvent the GPL at all; they were seeking to write and/or adapt a
permissively licensed library that replaced an LGPL’d one. I’m of
course against that task on principle (I think Google should have just
used glibc and/or uClibc and required LGPL-compliance by applications).
But, to deny that it’s possible to rewrite a C library for Linux under a
license that isn’t GPLv2 would also imply immediately the (incorrect)
conclusion that uClibc and glibc are covered by the GPLv2, and we are
all quite sure they aren’t; even Naughton himself admits that (regarding
glibc).

Google may have erred; no one actually knows for sure at this time.
But the task they sought to do has been done before and everyone
intended it to be permitted. The worst mistake of which we might ultimately accuse
Google is inadvertently taking a copyright-infringing short-cut. If
someone actually does all the research to prove that Google did so, I’d
easily offer a 1,000-to-1 bet to anyone that such a copyright
infringement could be cleared up easily, that Bionic would still work as
a permissively licensed C library for Linux, and the implications of the
whole thing wouldn’t go beyond: “It’s possible to write your own C
library for Linux that isn’t covered by the GPLv2” — a fact
which we’ve all known for a decade and a half anyway.

Update (2011-03-20):
Many people, including slashdot, have been linking to this comment
by RMS on LKML about .h files. It’s important to look carefully at
what RMS is saying. Specifically, RMS says that sometimes #include’ing a
.h file creates a copyright derivative work, and sometimes it doesn’t; it
depends on the details. Then, RMS goes on to discuss some rules of thumb
that can help determine the outcome of the question. The details are what
matter; and those are, as I explain in the main post above, what require
careful analysis done jointly and in close collaboration between a
developer and a lawyer. There is no general rule of thumb that always
immediately leads one to the right answer on this question.
