All posts by Bradley M. Kuhn

When your apt-mirror is always downloading

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2008/01/24/apt-mirror-2.html

When I started building our apt-mirror, I ran into a problem: the
machine was being throttled by ubuntu.com’s servers, even though I had
already completed much of the download (which took weeks for multiple
distributions). I really wanted to roll out the solution quickly,
particularly because the service from the remote servers was worse than
ever due to the throttling that the mirroring created. But, with the
mirror incomplete, I couldn’t very well just publish half-finished
repositories as they were.

The solution was to simply let Apache redirect users on to the real
servers if the mirror doesn’t have the file. The first order of
business for that is to rewrite and redirect URLs when files aren’t
found. This is a straightforward Apache configuration:

           RewriteEngine on
           RewriteLogLevel 0
           RewriteCond %{REQUEST_FILENAME} !^/cgi/
           RewriteCond /var/spool/apt-mirror/mirror/archive.ubuntu.com%{REQUEST_FILENAME} !-F
           RewriteCond /var/spool/apt-mirror/mirror/archive.ubuntu.com%{REQUEST_FILENAME} !-d
           RewriteCond %{REQUEST_URI} !(Packages|Sources)\.bz2$
           RewriteCond %{REQUEST_URI} !/index\.[^/]*$ [NC]
           RewriteRule ^(http://%{HTTP_HOST})?/(.*) http://91.189.88.45/$2 [P]
         

Note a few things there:

  • I have to hard-code an IP number because, as I mentioned in
    the last post on this subject, I’ve faked out DNS
    for archive.ubuntu.com and other sites I’m mirroring. (Note:
    this has the unfortunate side-effect that I can’t easily take advantage
    of round-robin DNS on the other side.)

  • I avoid taking Packages.bz2 from the other site, because
    apt-mirror actually doesn’t mirror the bz2 files (although I’ve
    submitted a patch to it so it will eventually).

  • I make sure that index files get built by my Apache and not
    redirected.

  • I am using Apache proxying, which gives me Yet Another type of
    cache temporarily while I’m still downloading the other packages. (I
    should actually work out a way to have these caches used by apt-mirror
    itself in case a user has already requested a new package while waiting
    for apt-mirror to get it.) One way to persist that proxy cache on disk
    is sketched just after this list.
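
Enabling mod_cache and mod_disk_cache for this sort of setup might look
roughly like the snippet below on a Debian/Ubuntu Apache 2.2. This is
only a sketch: the cache directory and tuning knobs are illustrative,
not necessarily what my configuration uses.

        # on Debian/Ubuntu: a2enmod cache, a2enmod disk_cache, then reload Apache
        <IfModule mod_disk_cache.c>
            CacheRoot /var/cache/apache2/mod_disk_cache
            CacheEnable disk /
            CacheDirLevels 2
            CacheDirLength 1
        </IfModule>

With that in place, whatever the [P] rule proxies should also land in
the local disk cache, subject to the usual cacheability rules.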

Once I do a rewrite like this for each of the hosts I’m replacing with
a mirror, I’m almost done. The problem is that if for any reason my
site needs to give a 403 to the clients, I would actually like to
double-check to be sure that the URL doesn’t happen to work at the place
I’m mirroring from.

My hope was that I could write a RewriteRule based on what the
HTTP return code would be when the request completed. This seemed
really hard to do, and perhaps impossible. The quickest
solution I found was to write a CGI script to do the redirect. So, in
the Apache config I have:

        ErrorDocument 403 /cgi/redirect-forbidden.cgi
        

And, the CGI script looks like this:

        #!/usr/bin/perl
        
        use strict;
        use CGI qw(:standard);
        
        # Apache's ErrorDocument handling exposes the originally
        # requested URL as REDIRECT_SCRIPT_URI.
        my $val = $ENV{REDIRECT_SCRIPT_URI};
        
        # Strip our local hostname, remembering which mirror host was asked for.
        $val =~ s%^http://(\S+)\.sflc\.info(/.*)$%$2%;
        if ($1 eq "ubuntu-security") {
           $val = "http://91.189.88.37$val";
        } else {
           $val = "http://91.189.88.45$val";
        }
        
        print redirect($val);
        

With these changes, the user will be redirected to the original when
the files aren’t available on the mirror, and as the mirror becomes
more complete, they’ll get more files from the mirror.

I still have problems if for any reason the user gets a Packages or
Sources file from the original site before the mirror is synchronized,
but this rarely happens since apt-mirror is pretty careful. The only
time it might happen is if the user did an apt-get update when
not connected to our VPN and only a short time later did one while
connected.

apt-mirror and Other Caching for Debian/Ubuntu Repositories

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2008/01/16/apt-mirror-1.html

Working for a small non-profit, everyone has to wear lots of hats, and
one that I have to wear from time to time (since no one else here can)
is “sysadmin”. One of the perennial rules of system
administration is: you can never give users enough bandwidth. The
problem is, they eventually learn how fast your connection to the
outside is, and then complain any time a download doesn’t run at that
speed. Of course, if you have a T1 or better, it’s usually the other
side that’s the problem. So, I look to use our extra bandwidth during
off hours to cache large pools of data that are often downloaded. With
an organization full of Ubuntu machines, the Ubuntu repositories are an
important target for caching.

apt-mirror is a
program that mirrors large Debian-based repositories, including the
Ubuntu ones. There
are already
tutorials
available on how to set it up. What I’m writing about here is a way to
“force” users to use that repository.

The obvious way, of course, is to make
everyone’s /etc/apt/sources.list point at the mirrored
repository. This often isn’t a good option. Aside from the servers, the user
base here is all laptops, which means that they will often be on
networks that may actually be closer to another package repository, and
perhaps I want to avoid interfering with that. (Although, given that I
can usually give almost any IP number in the world better service than the
30kb/sec that ubuntu.com’s servers seem to quickly throttle down to, that
probably doesn’t matter so much.)
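
For concreteness, that “obvious way” would mean sources.list lines
roughly like the ones below; the hostname apt-mirror.example.org and
the gutsy suites are purely illustrative, not our real setup.

        deb http://apt-mirror.example.org/ubuntu gutsy main restricted universe multiverse
        deb http://apt-mirror.example.org/ubuntu gutsy-security main restricted universe multiverse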

The bigger problem is that I don’t want to be married to the idea that
the apt-mirror is part of our essential 24/7 infrastructure. I don’t
want an angry late-night call from a user because they can’t install a
package, and I want the complete freedom to discontinue the server at
any time, if I find it to be unreliable. I can’t do this easily if
sources.list files on traveling machines are hard-coded with
the apt-mirror server’s name or address, especially when I don’t know
when exactly they’ll connect back to our VPN.

The easier solution is to fake out the DNS lookups via the DNS server
used by the VPN and the internal network. This way, users only get the
mirror when they are connected to the VPN or in the office; otherwise,
they get the normal Ubuntu servers. I had actually forgotten you could
fake out DNS on a per-host basis, but asking my friend Paul reminded me
quickly. In /etc/bind/named.conf.local (on Debian/Ubuntu), I
just add:

        zone "archive.ubuntu.com"      {
                type master;
                file "/etc/bind/db.archive.ubuntu-fake";
        };
        

And in /etc/bind/db.archive.ubuntu-fake:

        $TTL    604800
        @ IN SOA archive.ubuntu.com.  root.vpn. (
               2008011001  ; serial number
               10800 3600 604800 3600)
             IN NS my-dns-server.vpn.

        ;
        ;  Begin name records
        ;
        archive.ubuntu.com.  IN A            MY.EXTERNAL.FACING.IP
        

And there I have it; I just do one of those for each address I want to
replace (e.g., security.ubuntu.com). Now, when client machines
look up archive.ubuntu.com (et al.), they’ll
get MY.EXTERNAL.FACING.IP, but only
when my-dns-server.vpn is first in their resolv.conf.
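
A quick way to check that the fake zone is doing its job is to compare
answers from the internal server and from some outside resolver (a
sketch; 4.2.2.2 is just an arbitrary public resolver):

        # via the internal/VPN DNS server: should return MY.EXTERNAL.FACING.IP
        dig +short archive.ubuntu.com @my-dns-server.vpn

        # via an outside resolver: should return the real Ubuntu addresses
        dig +short archive.ubuntu.com @4.2.2.2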

Next time, I’ll talk about some other ideas on how I make the
apt-mirror even better.

Postfix Trick to Force Secondary MX to Deliver Locally

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2008/01/09/postfix-secondary-mx-local-deliver.html

Suppose you have a domain name, example.org, that has a primary MX host
(mail.example.org) that does most of the delivery. However,
one of the users, who works at example.com, actually gets delivery of
<user@example.org> at work (from the primary MX for example.com,
mail.example.com). Of course, a simple .forward
or /etc/aliases entry would work, but this would pointlessly
push email back and forth between the two mail servers — in some
cases, up to three pointless passes before the final destination!
That’s particularly an issue in today’s SPAM-laden world. Here’s how to
solve this waste of bandwidth using Postfix.

This tutorial assumes you have some reasonable background
knowledge of Postfix MTA administration. If you don’t, this might go a
bit fast for you.

To begin, first note that this setup assumes that you have something
like this with regard to your MX setup:

        $ host -t mx example.org
        example.org mail is handled by 10 mail.example.org.
        example.org mail is handled by 20 mail.example.com.
        $ host -t mx example.com
        example.com mail is handled by 10 mail.example.com.
        

Our first task is to avoid
example.org SPAM
backscatter
on mail.example.com. To do that, we make a file with
all the valid accounts for example.org and put it
in mail.example.com:/etc/postfix/relay_recipients. (For more
information, read
the Postfix
docs

or various
tutorials
about this.) After that, we
have something like this in mail.example.com:/etc/postfix/main.cf:

        relay_domains = example.org
        relay_recipient_maps = hash:/etc/postfix/relay_recipients
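
The relay_recipients file itself is just a Postfix lookup table keyed
on the valid addresses; the right-hand value is ignored for this
purpose. A purely illustrative sketch (the addresses are made up):

        # /etc/postfix/relay_recipients
        user@example.org         OK
        alias@example.org        OK
        postmaster@example.org   OK

Remember to run postmap /etc/postfix/relay_recipients after editing it
so that the hash file Postfix actually reads gets rebuilt.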
        

And this in /etc/postfix/transport:

        example.org     smtp:[mail.example.org]
        

This will give proper delivery for our friend <user@example.org>
(assuming mail.example.org is forwarding that address properly to
<user@example.com>), but mail will be pushed back and forth
unnecessarily when mail.example.com gets a message for
<user@example.org>. What we actually want is to wise up
mail.example.com so it “knows” that mail for
<user@example.org> is ultimately going to be delivered locally on
that server.

To do this, we add <user@example.org> to
the virtual_alias_maps, with an entry like:

        user@example.org      user
        

so that the key user@example.org resolves to the local
username user. Fortunately, Postfix is smart enough to look at
the virtual table first before performing a relay.
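
For completeness — and this is only a sketch of the conventional
wiring, since your map may live elsewhere — the table has to be
declared in main.cf:

        virtual_alias_maps = hash:/etc/postfix/virtual

followed by a postmap /etc/postfix/virtual and a postfix reload so the
change takes effect.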

Now, what about an alias like <alias@example.org> that
actually forwards to <user@example.org>? That will have the same
pointless forwarding from server to server unless we address it
specifically. To do so, we use the transport file. Of course, we
should already have that catch-all entry there to do the relaying:

        example.org     smtp:[mail.example.org]
        

But, we can also add email-address-specific entries for certain
addresses in the example.org domain. Fortunately, email address matches
in the transport table take precedence over whole-domain match entries
(see the transport(5) man page for details). Therefore, we simply add
entries to that transport file like this for each of user’s
aliases:

        alias@example.org    local:user
        

(Note: that assumes you have a delivery method in master.cf
called local. Use whatever transport you typically use to
force local delivery.)
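
Two more bits of conventional wiring, again only as a sketch (the
stock Postfix master.cf normally already ships with a local service,
and your transport file may live elsewhere):

        # main.cf: point Postfix at the transport table
        # (run "postmap /etc/postfix/transport" after every edit)
        transport_maps = hash:/etc/postfix/transport

        # master.cf: the stock local delivery service that local:user relies on
        local     unix  -       n       n       -       -       local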

And there you have it! If you have (albeit rare) friendly and
appreciative users, user will thank you for the slightly
quicker mail delivery, and you’ll be glad that you aren’t pointlessly
shipping SPAM back and forth between the MXes.

Apache 2.0 -> 2.2 LDAP Changes on Ubuntu

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2008/01/01/apache-2-2-ldap.html

I thought the following might be of use to those of you who are still
using Apache 2.0 with LDAP and wish to upgrade to 2.2. I found this
basic information around online, but I had to search pretty hard for it.
Perhaps presenting this in a more straightforward way might help the
next searcher to find an answer more quickly. It’s probably only of
interest if you are using LDAP as your authentication system with an
older Apache (e.g., 2.0) and have upgraded to 2.2 on an Ubuntu or Debian
system (such as upgrading from dapper to gutsy).

When running dapper on my intranet web server with Apache
2.0.55-4ubuntu2.2, I had something like this:

             <Directory /var/www/intranet>
                   Order allow,deny
                   Allow from 192.168.1.0/24 
        
                   Satisfy All
                   AuthLDAPEnabled on
                   AuthType Basic
                   AuthName "Example.Org Intranet"
                   AuthLDAPAuthoritative on
                   AuthLDAPBindDN uid=apache,ou=roles,dc=example,dc=org
                   AuthLDAPBindPassword APACHE_BIND_ACCT_PW
                   AuthLDAPURL ldap://127.0.0.1/ou=staff,ou=people,dc=example,dc=org?cn
                   AuthLDAPGroupAttributeIsDN off
                   AuthLDAPGroupAttribute memberUid
        
                   require valid-user
            </Directory>
        

I upgraded that server to gutsy (via dapper → edgy → feisty
→ gutsy in succession, just because it’s safer), and it now has
Apache 2.2.4-3build1. The method for doing LDAP authentication is a bit
more straightforward now, but it does require this change:

            <Directory /var/www/intranet>
                Order allow,deny
                Allow from 192.168.1.0/24 
        
                AuthType Basic
                AuthName "Example.Org Intranet"
                AuthBasicProvider ldap
                AuthzLDAPAuthoritative on
                AuthLDAPBindDN uid=apache,ou=roles,dc=example,dc=org
                AuthLDAPBindPassword APACHE_BIND_ACCT_PW
                AuthLDAPURL ldap://127.0.0.1/ou=staff,ou=people,dc=example,dc=org
        
                require valid-user
                Satisfy all
            </Directory>
        

However, this wasn’t enough. When I set this up, I got rather strange
error messages such as:

        [error] [client MYIP] GROUP: USERNAME not in required group(s).
        

I found somewhere online (I’ve now lost the link!) that you couldn’t
have standard pam auth competing with the LDAP authentication. This
seemed strange to me, since I’d told it I want the authentication
provided by LDAP, but anyway, doing the following on the system:

        a2dismod auth_pam
        a2dismod auth_sys_group
        

solved the problem. I decided to move on rather than dig deeper into the
true reasons. Sometimes, administration life is actually better with a
mystery about.

stet and AGPLv3

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2007/11/21/stet-and-agplv3.html

Many people don’t realize that the GPLv3 process actually began long
before the November 2005 announcement. For me and a few others, the GPLv3
process started much earlier. Also, in my view, it didn’t actually end
until this week, when the FSF released the AGPLv3. Today, I’m particularly
proud that stet was the first software released under the terms of
that license.

The GPLv3 process focused on the idea of community, and a community is
built from bringing together many individual experiences. I am grateful
for all my personal experiences throughout this process. Indeed, I
would guess that other GPL fans like myself remember, as I do, the first
time they heard the phrase “GPLv3”. For me, it was a bit
early — on Tuesday 8 January 2002 in a conference room at MIT. On
that day, Richard Stallman, Eben Moglen and I sat down to have an
all-day meeting that included discussions regarding updating GPL. A key
issue that we sought to address was (in those days) called the
“Application Service Provider (ASP) problem” — now
called “Software as a Service (SaaS)”.

A few days later, on the telephone with Moglen2 one morning, as I stood in my
kitchen making oatmeal, we discussed this problem. I pointed out the
oft-forgotten section 2(c) of the GPL [version 2]. I argued that contrary
to popular belief, it does have restrictions on some minor
modifications. Namely, you have to maintain those print statements for
copyright and warranty disclaimer information. It’s reasonable, in other
words, to restrict some minor modifications to defend freedom.

We also talked about that old Computer Science problem of having a
program print its own source code. I proposed that maybe we needed a
section 2(d) requiring that, if a program prints its own source to
the user, you can’t remove that feature, and the feature must
always print the complete and corresponding source.
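
(For anyone who hasn’t met that old chestnut: a program that prints
its own source is usually called a quine. A minimal illustrative one
in Perl — nothing to do with the license text itself — is the single
line below, saved without a trailing newline; running it prints exactly
that line back out.)

        $s='$s=%s;printf$s,chr(39).$s.chr(39);';printf$s,chr(39).$s.chr(39);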

Within two months, Affero
GPLv1 was published
— an authorized fork of the GPL to test
the idea. From then until AGPLv3, that “Affero clause”
has had many changes, iterations and improvements, and I’m grateful
for all the excellent feedback, input and improvements that have gone
into it. The result, the Affero GPLv3 (AGPLv3) released on Monday,
is an excellent step
forward for software freedom licensing. While the community process
indicated that the preference was for the Affero clause to be part of
a separate license, I’m nevertheless elated that the clause continues
to live on and be part of the licensing infrastructure defending
software freedom.

Other than coining the Affero clause, my other notable personal
contribution to the GPLv3 was management of a software development
project to create the online public commenting system. To do the
programming, we contracted with Orion Montoya, who has extensive
experience doing semantic markup of source texts from an academic
perspective. Orion gave me my first introduction to the whole
“Web 2.0” thing, and I was amazed how useful the result was;
it helped the leaders of the process easily grok the public response.
For example, the intensity highlighting — which shows the hot
spots in the text that received the most comments — gives a very
quick picture of sections that are really of concern to the public. In
reviewing the drafts today, I was reminded that the big red area in
section 1 about “encryption and authorization codes” is
substantially changed and less intensely highlighted by draft 4. That
quick look
gives a clear picture of how the community process operated to get a
better license for everyone.

Orion, a Classics scholar as an undergrad, named the
software stet for its original Latin definition: “let it
stand as it is”. It was his hope that stet (the software) would
help along the GPLv3 process so that our whole community, after filing
comments on each successive draft, could look at the final draft and
simply say: Stet!

Stet has a special place in software history, I believe, even if it’s
just a purely geeky one. It is the first software system in history to
be meta-licensed. Namely, it was software whose output was its own
license. It’s with that exciting hacker concept that I put up today
a Trac instance
for stet, licensed under the terms of the AGPLv3 [which is now on
Gitorious]1.

Stet is by no means ready for drop-in production. Like most software
projects, we didn’t estimate perfectly how much work would be needed.
We got lazy about organization early on, which means it still requires a
by-hand install, and new texts must be carefully marked up by hand.
We’ve moved on to other projects, but hopefully SFLC will host the Trac
instance indefinitely so that other developers can make it better.
That’s what copylefted FOSS is all about — even when it’s
SaaS.


1Actually, it’s
under AGPLv3 plus an exception to allow for combining with the
GPLv2-only Request Tracker, with which parts of stet combine.

2Update
2016-01-06:
After writing this blog post, I found
evidence in my email archives from early 2002, wherein Henry Poole (who
originally suggested the need for Affero GPL to FSF), began cc’ing me anew
on an existing thread. In that thread, Poole quoted text from Moglen
proposing the original AGPLv1 idea to Poole. Moglen’s quoted text in
Poole’s email proposed the idea as if it were solely Moglen’s own. Based
on the timeline of the emails I have, Moglen seems to have written to Poole
within 36-48 hours of my original formulation of the idea.

While I do not accuse Moglen of plagiarism, I believe he does at least
misremember my idea as his own, which is particularly surprising, as Moglen
(at that time, in 2002) seemed unfamiliar with the Computer Science concept
of a quine; I had to explain that concept as part of my presentation of my
idea. Furthermore, Moglen and I discussed this matter in a personal
conversation in 2007 (around the time I made this blog post originally) and
Moglen said to me: “you certainly should take credit for the Affero
GPL”. I thus thought the matter was fully settled back in
2007, and so Moglen’s post-2007 claims of credit that write me out of
Affero GPL’s history are simply baffling. To clear up the confusion his
ongoing claims create, I added this footnote to communicate unequivocally
that my memory of that phone call is solid, because it was the first time I
ever came up with a particularly interesting licensing idea, so the memory
became extremely precious to me immediately. I am therefore completely
sure I was the first to propose the original idea of mandating preservation
of a quine-like feature in AGPLv1§2(d) (as a fork/expansion of
GPLv2§2(c)) on the telephone to Moglen, as described above. Moglen
has never produced evidence to dispute my recollection, and even agreed
with the events as I told them back in 2007.

Nevertheless, unlike Moglen, I do admit that creation of the final text of
AGPLv1 was a collaborative process, which included contributions from
Moglen, Poole, RMS, and a lawyer (whose name I don’t recall) whom Poole
hired. AGPLv3§13’s drafting was similarly collaborative, and included
input from Richard Fontana, David Turner, and Brett Smith, too.

Finally, I note my surprise at this outcome. In my primary community
— the Free Software community — people are generally extremely
good at giving proper credit. Unlike the Free Software community, legal
communities apparently are cutthroat on the credit issue, so I’ve
learned.

More Xen Tricks

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2007/08/24/more-xen.html

In my previous post about Xen, I talked about how easy Xen is to
configure and
set up, particularly on Ubuntu and Debian. I’m still grateful that
Xen remains easy; however, I’ve lately had a few Xen-related
challenges that needed attention. In particular, I’ve needed to
create some surprisingly messy solutions when using vif-route to
route multiple IP numbers on the same network through the dom0 to a
domU.

I tend to use vif-route rather than vif-bridge, as I like the control
it gives me in the dom0. The dom0 becomes a very traditional
packet-forwarding firewall that can decide whether or not to forward
packets to each domU host. However, I recently found some deep
weirdness in IP routing when I use this approach while needing
multiple Ethernet interfaces on the domU. Here’s an example:

Multiple IP numbers for Apache

Suppose the domU host, called webserv, hosts a number of
websites, each with a different IP number, so that I have Apache
doing something like1:

        Listen 192.168.0.200:80
        Listen 192.168.0.201:80
        Listen 192.168.0.202:80
        ...
        NameVirtualHost 192.168.0.200:80
        <VirtualHost 192.168.0.200:80>
        ...
        NameVirtualHost 192.168.0.201:80
        <VirtualHost 192.168.0.201:80>
        ...
        NameVirtualHost 192.168.0.202:80
        <VirtualHost 192.168.0.202:80>
        ...
        

The Xen Configuration for the Interfaces

Since I’m serving all three of those sites from webserv, I
need all those IP numbers to be real, live IP numbers on the local
machine as far as the webserv is concerned. So, in
dom0:/etc/xen/webserv.cfg I list something like:

        vif  = [ 'mac=de:ad:be:ef:00:00, ip=192.168.0.200',
                 'mac=de:ad:be:ef:00:01, ip=192.168.0.201',
                 'mac=de:ad:be:ef:00:02, ip=192.168.0.202' ]
        

… And then make webserv:/etc/iftab look like:

        eth0 mac de:ad:be:ef:00:00 arp 1
        eth1 mac de:ad:be:ef:00:01 arp 1
        eth2 mac de:ad:be:ef:00:02 arp 1
        

… And make webserv:/etc/network/interfaces (this is
probably Ubuntu/Debian-specific, BTW) look like:

        auto lo
        iface lo inet loopback
        auto eth0
        iface eth0 inet static
         address 192.168.0.200
         netmask 255.255.255.0
        auto eth1
        iface eth1 inet static
         address 192.168.0.201
         netmask 255.255.255.0
        auto eth2
        iface eth2 inet static
         address 192.168.0.202
         netmask 255.255.255.0
        

Packet Forwarding from the Dom0

But, this doesn’t get me the whole way there. My next step is to make
sure that the dom0 is routing the packets properly to
webserv. Since my dom0 is heavily locked down, all
packets are dropped by default, so I have to let through explicitly
anything I’d like webserv to be able to process. So, I
add some code to my firewall script on the dom0 that looks like:2

        webIpAddresses="192.168.0.200 192.168.0.201 192.168.0.202"
        UNPRIVPORTS="1024:65535"
        
        for dport in 80 443;
        do
          for sport in $UNPRIVPORTS 80 443 8080;
          do
            for ip in $webIpAddresses;
            do
              /sbin/iptables -A FORWARD -i eth0 -p tcp -d $ip \
                --syn -m state --state NEW \
                --sport $sport --dport $dport -j ACCEPT
        
              /sbin/iptables -A FORWARD -i eth0 -p tcp -d $ip \
                --sport $sport --dport $dport \
                -m state --state ESTABLISHED,RELATED -j ACCEPT
        
              /sbin/iptables -A FORWARD -o eth0 -s $ip \
                -p tcp --dport $sport --sport $dport \
                -m state --state NEW,ESTABLISHED,RELATED -j ACCEPT
            done  
          done
        done
        

Phew! So at this point, I thought I was done. The packets should find
their way forwarded through the dom0 to the Apache instance running on
the domU, webserv. While that much was true, I then had
the additional problem that packets got lost in a bit of a black hole
on webserv. When I discovered the black hole, I quickly
realized why. It was somewhat atypical, from webserv‘s
point of view, to have three “real” and different Ethernet
devices with three different IP numbers, which all talk to the exact
same network. There was more intelligent routing
needed.3

Routing in the domU

While most non-sysadmins still use the route command to
set up local IP routes on a GNU/Linux host, iproute2
(available via the ip command) has been a standard part
of GNU/Linux distributions and supported by Linux for nearly ten
years. To properly support the situation of multiple (from
webserv‘s point of view, at least) physical interfaces on
the same network, some special iproute2 code is needed.
Specifically, I set up separate route tables for each device. I first
encoded their names in /etc/iproute2/rt_tables (the
numbers 16-18 are arbitrary, BTW):

        16      eth0-200
        17      eth1-201
        18      eth2-202
        

And here are the ip commands that I thought would work
(but didn’t, as you’ll see next):

        /sbin/ip route del default via 192.168.0.1
        
        for table in eth0-200 eth1-201 eth2-202;
        do
           iface=`echo $table | perl -pe 's/^(\S+)\-.*$/$1/;'`
           ipEnding=`echo $table | perl -pe 's/^.*\-(\S+)$/$1/;'`
           ip=192.168.0.$ipEnding
           /sbin/ip route add 192.168.0.0/24 dev $iface table $table
        
           /sbin/ip route add default via 192.168.0.1 table $table
           /sbin/ip rule add from $ip table $table
           /sbin/ip rule add to 0.0.0.0 dev $iface table $table
        done
        
        /sbin/ip route add default via 192.168.0.1 
        

The idea is that each table will use rules to force all traffic coming
in on the given IP number and/or interface to always go back out on
the same, and vice versa. The key is these two lines:

           /sbin/ip rule add from $ip table $table
           /sbin/ip rule add to 0.0.0.0 dev $iface table $table
        

The first rule says that when traffic is coming from the given IP number,
$ip, the routing rules in table $table should
be used. The second says that traffic to anywhere, when bound for
interface $iface, should use table
$table.

The tables themselves are set up to always make sure the local network
traffic goes through the proper associated interface, and that the
network router (in this case, 192.168.0.1) is always
used for foreign networks, but that it is reached via the correct
interface.

This is all well and good, but it doesn’t work. Certain instructions
fail with the message, RTNETLINK answers: Network is
unreachable
, because the 192.168.0.0 network cannot be found
while the instructions are running. Perhaps there is an
elegant solution; I couldn’t find one. Instead, I temporarily set
up “dummy” global routes in the main route table and
deleted them once the table-specific ones were created. Here’s the
new bash script that does that (the added lines are marked below with
a “# added” comment):

        /sbin/ip route del default via 192.168.0.1
        for table in eth0-200 eth1-201 eth2-202;
        do
           iface=`echo $table | perl -pe 's/^(\S+)\-.*$/$1/;'`
           ipEnding=`echo $table | perl -pe 's/^.*\-(\S+)$/$1/;'`
           ip=192.168.0.$ipEnding
           /sbin/ip route add 192.168.0.0/24 dev $iface table $table
        
           /sbin/ip route add 192.168.0.0/24 dev $iface src $ip   # added
        
           /sbin/ip route add default via 192.168.0.1 table $table
           /sbin/ip rule add from $ip table $table
        
           /sbin/ip rule add to 0.0.0.0 dev $iface table $table
        
           /sbin/ip route del 192.168.0.0/24 dev $iface src $ip   # added
        done
        /sbin/ip route add 192.168.0.0/24 dev eth0 src 192.168.0.200   # added
        /sbin/ip route add default via 192.168.0.1
        /sbin/ip route del 192.168.0.0/24 dev eth0 src 192.168.0.200   # added
        

I am pretty sure I’m missing something here — there must be a
better way to do this, but the above actually works, even if it’s
ugly.

Alas, Only Three

There was one additional confusion I put myself through while
implementing the solution. I was actually trying to route four
separate IP addresses into webserv, but discovered that
I got found this error message (found via dmesg on the
domU):
netfront can't alloc rx grant refs. A quick google
around showed me
that the
XenFaq, which says that Xen 3 cannot handled more than three network
interfaces per domU
. Seems strangely arbitrary to me; I’d love
to hear why cuts it off at three. I can imagine limits at one and
two, but it seems that once you can do three, n should be
possible (perhaps still with linear slowdown or some such). I’ll
have to ask the Xen developers (or UTSL) some day to find out what
makes it possible to have three work but not four.


1Yes, I know I
could rely on client-provided Host: headers and do this with full
name-based virtual hosting, but I don’t
like to do that for good reason (as outlined in the Apache docs).

2Note that the
above firewall code must run on dom0, which has one real
Ethernet device (its eth0) that is connected properly to
the wide 192.168.0.0/24 network, and should have some IP
number of its own there — say 192.168.0.100. And,
don’t forget that dom0 is configured for vif-route, not
vif-bridge. Finally, for brevity, I’ve left out some of the
firewall code that FORWARDs through key stuff like DNS. If you are
interested in it, email me or look it up in a firewall book.

3I was actually a
bit surprised at this, because I often have multiple IP numbers
serviced from the same computer and physical Ethernet interface.
However, in those cases, I use virtual interfaces
(eth0:0, eth0:1, etc.). On a normal system,
Linux does the work of properly routing the IP numbers when you attach
multiple IP numbers virtually to the same physical interface.
However, in Xen domUs, the physical interfaces are locked by Xen to
only permit specific IP numbers to come through, and while you can set
up all the virtual interfaces you want in the domU, it will only get
packets destined for the IP number specified in the vif
section of the configuration file. That’s why I added my three
different “actual” interfaces in the domU.

Virtually Reluctant

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2007/06/12/virtually-reluctant.html

Way back when User
Mode Linux (UML)
was the “only way” the Free Software
world did anything like virtualization, I was already skeptical.
Those of us who lived through the coming of age of Internet security
— with a remote root exploit for every day of the week —
became obsessed with the chroot and its ultimate limitations. Each
possible upgrade to a better, more robust virtual environment was met
with suspicion on the security front. I joined the many who doubted
that you could truly secure a machine that offered disjoint services
provisioned on the same physical machine. I’ve recently revisited
this position. I won’t say that Xen has completely changed my mind,
but I am open-minded enough again to experiment.

For more than a decade, I have used chroots as a mechanism to segment a
service that needed to run on a given box. In the old days
of ancient BINDs and sendmails, this was often the best we could do
when living with a program we didn’t fully trust to be clean of
remotely exploitable bugs.

I suppose those days gave us all a rather strange sense of computer
security. I constantly have the sense that two services running on
the same box always endanger each other in some fundamental way. It
therefore took me a while before I was comfortable with the resurgence
of virtualization.

However, what ultimately drew me in was the simple fact that modern
hardware is just too darn fast. It’s tough to get a machine these
days that isn’t ridiculously overpowered for most tasks you put in
front of it. CPUs sit idle; RAM sits empty. We should make more
efficient use of the hardware we have.

Even with that reality, I might have given up if it wasn’t so easy. I
found a good link about Debian on Xen, a useful entry in the Xen
Wiki, and some good network and LVM examples. I also quickly learned
how to use RAID/LVM together for disk redundancy inside Xen
instances. I even got bonded ethernet working with some help to add
additional network redundancy.

So, one Saturday morning, I headed into the office, and left that
afternoon with two virtual servers running. It helped that Xen 3.0 is
packaged properly for recent Ubuntu versions, and a few obvious
apt-get installs get you what you need on edgy and
feisty. In fact, I only struggled (and only just a bit) with the
network, but quickly discovered two important facts:

  • VIF network routing in my opinion is a bit easier to configure and
    more stable than VIF bridging, even if routing is a bit
    slower.
  • sysctl -w net.ipv4.conf.DEVICE.proxy_arp=1 is needed to
    make the network routing down into the instances work
    properly (see the sketch just after this list for making that
    setting persist across reboots).
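
Here is a minimal sketch of making that second setting permanent;
eth0 is just an example device name, and /etc/sysctl.conf is the usual
Debian/Ubuntu spot. It can be applied without a reboot via sysctl -p.

        # /etc/sysctl.conf -- answer ARP on behalf of the routed domU addresses
        net.ipv4.conf.eth0.proxy_arp = 1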

I’m not completely comfortable yet with the security of virtualization.
Of course, locking down the Dom0 is absolutely essential, because
there lie the keys to your virtual kingdom. I lock it down with
iptables so that only SSH from a few trusted hosts comes
in, and even services as fundamental as DNS can only be had from a few
trusted places. But, I still find myself imagining ways people can
bust through the instance kernels and find their way to the
hypervisor.

I’d really love to see a strong line-by-line code audit of the
hypervisor and related utilities to be sure we’ve got something we can
trust. However, in the meantime, I certainly have been sold on the
value of this approach, and am glad it’s so easy to set up.

Tools for Investigating Copyright Infringement

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2007/05/08/infringement.html

Nearly all software developers know that software is covered by
copyright. Many know that copyright covers the expression of an idea
fixed in a medium (such as a series of bytes), and that the copyright
rules govern the copying, modifying and distributing of the work.
However, only a very few have considered the questions that arise when
trying to determine if one work infringes the copyright of
another.

Indeed, in the world of software freedom, copyright is seen as a system
we have little choice but to tolerate. Many Free Software developers
dislike the copyright system we have, so it is little surprise that
developers want to spend minimal time thinking about it.
Nevertheless, the copyright system is the foremost legal framework
that governs software1, and we have to
live within it for the moment.

My fellow developers have asked me for years what constitutes copyright
infringement. In turn, for years, I have asked the lawyers I worked
with to give me guidelines to pass on to the Free Software development
community. I’ve discovered that it’s difficult to adequately describe
the nature of copyright infringement to software developers. While it
is easy to give pathological examples of obvious infringement (such as
taking someone’s work, removing their copyright notices and
distributing it as your own), it quickly becomes difficult to give
definitive answers in many real-world cases as to whether some particular
activity constitutes infringement.

In fact, in nearly every GPL enforcement case that I’ve worked on in
my career, the fact that infringement had occurred was never in
dispute. The typical GPL violator started with a work under GPL, made
some modifications to a small portion of the codebase, and then
distributed the whole work in binary form only. It is virtually
impossible to act in that way and still not infringe the original
copyright.

Usually, the cases of “hazy” copyright infringement come up
the other way around: when a Free Software program is accused of
infringing the copyright of some proprietary work. The most famous
accusation of this nature came from Darl McBride and his colleagues at
SCO, who claimed that something called “Linux” infringed
his company’s rights. We now know that there was no copyright
infringement (BTW, whether McBride meant to accuse the GNU/Linux
operating system or the kernel named Linux, we’ll never actually
know). However, the SCO situation educated the Free Software
community that we must strive to answer quickly and definitively when
such accusations arise. The burden of proof is usually on the
accuser, but being able to make a preemptive response to even the hint
of an allegation is always advantageous when fighting FUD in the court
of public opinion.

Finally, issues of “would-be” infringement detection come
up for companies during due diligence work. Ideally, there should be
an easy way for companies to confirm which parts of their systems are
derivatives of Free Software systems, which would make compliance with
licenses easy. A few proprietary software companies provide this
service; however there should be readily available Free Software tools
(just as there should be for all tasks one might want to perform with a
computer).

It is not so easy to create such tools. Copyright infringement is not
trivially defined; in fact, most non-trivial situations require a
significant amount of both technical and legal judgement. Software
tools cannot make a legal conclusion regarding copyright infringement.
Rather, successful tools will guide an expert’s analysis of a
situation. Such systems will immediately identify the rarely-found
obvious indications of infringement, bring to the forefront facts that
need an exercise of judgement, and leave everything else in the
background.

In this multi-part series of blog entries, I will discuss the state of
the art in these Free Software systems for infringement analysis and
what plans our community should make for the creation of Free systems
that address this problem.


1 Copyright is the legal
system that non-lawyers usually identify most readily as governing
software, but the patent system (unfortunately) also governs software
in many countries, and many non-Free Software licenses (and a few of
the stranger Free Software ones) also operate under contract law as
well as copyright law. Trade secrets are often involved with software
as well. Nevertheless, in the Software Freedom world, copyright is
the legal system of primary attention on a daily basis.

Walnut Hills, AP Computer Science, 1998-1999

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2007/05/05/walnut-hills-1998.html

I taught AP Computer Science at Walnut Hills High School in Cincinnati,
OH during the 1998-1999 school year.

I taught this course because:

  • They were desperate for a teacher. The rather incompetent
    teacher who was scheduled to teach the course quit (actually,
    frighteningly enough, she got a higher paying and higher ranking job
    in a nearby school system) a few weeks before the school year was to
    start.
  • The environment was GNU/Linux
    using GCC‘s C++
    compiler. I went to the job interview because a mother of someone in
    the class begged me to go, but I was going to walk out as soon as I
    saw I’d have to teach on Microsoft (which I assumed it would be). My
    jaw literally dropped when I saw:
  • The
    students had built their own lab, which even got covered in the
    Cincinnati Post
    . I was quite amazed that some of
    the most brilliant high school students I’ve ever seen were assembled
    there in one classroom.

It became quite clear to me that I owed it to these students to teach
the course. They’d discovered Free Software before the boom, and
built their own lab despite the designated CS teacher obviously
knowing a hell of a lot less about the field than they did. There
wasn’t a person qualified and available, in my view, in all of
Cincinnati to teach the class. High school teacher wages are
traditionally pathetic. So, I joined the teacher’s union and took
the job.

Doing this work delayed my thesis and graduation from the Master’s
program at University of Cincinnati for yet another year, but it was
worth doing. Even almost a decade later, it ranks in my mind on the
top ten list of great things I’ve done in my life, even despite all
the exciting Free Software work I’ve been involved with in my
positions at the FSF and the Software Freedom Conservancy.

I am exceedingly proud of what my students have accomplished. It’s
clear to me that somehow we assembled an incredibly special group of
Computer Science students; many of them have gone on to make
interesting contributions. I know they didn’t always like that I
brought my Free Software politics into the classroom, but I think we
had a good year, and their excellent results on that AP exam showed
it. Here are a few of my students from that year who have a public
online life:

If you were my student at Walnut Hills and would like a link here, let
me know and I’ll add one.

Remember the Verbosity (A Brief Note)

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2007/04/17/linux-verbose-build.html

I don’t remember when it happened, but sometime in the past four years,
the Makefiles for the kernel named Linux changed. I can’t remember
exactly, but I do recall sometime “recently” that the
kernel build output stopped looking like what I remember from 1991,
and started looking like this:


CC arch/i386/kernel/semaphore.o
CC arch/i386/kernel/signal.o

This is a heck of a lot easier to read, but there was something cool
about having make display the whole gcc
command lines, like this:


gcc -m32 -Wp,-MD,arch/i386/kernel/.semaphore.o.d -nostdinc -isystem /usr/lib/gcc/i486-linux-gnu/4.0.3/include -D__KERNEL__ -Iinclude -include include/linux/autoconf.h -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -ffreestanding -Os -fomit-frame-pointer -pipe -msoft-float -mpreferred-stack-boundary=2 -march=i686 -mtune=pentium4 -Iinclude/asm-i386/mach-default -Wdeclaration-after-statement -Wno-pointer-sign -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(semaphore)" -D"KBUILD_MODNAME=KBUILD_STR(semaphore)" -c -o arch/i386/kernel/semaphore.o arch/i386/kernel/semaphore.c
gcc -m32 -Wp,-MD,arch/i386/kernel/.signal.o.d -nostdinc -isystem /usr/lib/gcc/i486-linux-gnu/4.0.3/include -D__KERNEL__ -Iinclude -include include/linux/autoconf.h -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -ffreestanding -Os -fomit-frame-pointer -pipe -msoft-float -mpreferred-stack-boundary=2 -march=i686 -mtune=pentium4 -Iinclude/asm-i386/mach-default -Wdeclaration-after-statement -Wno-pointer-sign -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(signal)" -D"KBUILD_MODNAME=KBUILD_STR(signal)" -c -o arch/i386/kernel/signal.o arch/i386/kernel/signal.c

I never gave it much thought, since the new form was easier to read. I
figured that those folks who still eat kernel code for breakfast knew
about this change well ahead of time. Of course, they were the only
ones who needed to see the verbose output of the gcc
command lines. I could live with seeing the simpler CC
lines for my purposes, until today.

I was compiling kernel code and for the first time since this change in
the Makefiles, I was using a non-default gcc to build
Linux. I wanted to double-check that I’d given the right options to
make throughout the process. I therefore found myself
looking for a way to see the full output again (and for the first
time). It was easy enough to figure out: giving the variable setting
V=1 to make gives you the verbose version.
For you Debian folks like me, we’re using make-kpkg, so
the line we need looks like: MAKEFLAGS="V=1" make-kpkg
kernel_image.

It’s nice sometimes to pretend I’m compiling 0.99pl12 again and not
2.6.20.7. 🙂 No matter which options you give make, it is
still a whole lot easier to bootstrap Linux these days.

User-Empowered Security via encfs

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2007/04/10/encfs.html

One of my biggest worries in using a laptop is that data
can suddenly become available to anyone in the world if a laptop is
lost or stolen. I was reminded of this during the mainstream
media coverage1 of this issue last year.

There’s the old security through obscurity perception of running
GNU/Linux systems. Proponents of this theory argue that most thieves
(or impromptu thieves, who find a lost laptop but decide not to return
it to its owner) aren’t likely to know how to use a GNU/Linux system,
and will probably wipe the drive before selling it or using it.
However, with the popularity of Free Software rising, this old standby
(which never should have been a standby anyway, of course) doesn’t
even give an illusion of security anymore.

I have been known as a computer security paranoid in my time, and I
keep a rather strict regimen of protocols for my own personal
computer security. But, I don’t like to inflict new onerous security
procedures on the otherwise unwilling. Generally, people will find
methods around security procedures when they aren’t fully convinced
they are necessary, and you’re often left with a situation just as bad
or worse than when you started implementing your new procedures.

My solution for the lost/stolen laptop security problem was therefore
two-fold: (a) education among the userbase about how common it is to
have a laptop lost or stolen, and (b) providing a simple user-space
mechanism for encrypting sensitive data on the laptop. Since (a) is
somewhat obvious, I’ll talk about (b) in detail.

I was fortunate that, in parallel, my friend Paul and one of my
coworkers discovered how easy it is to use encfs and
told me about it. encfs uses the Filesystem in
Userspace (FUSE) to store encrypted data right in a user’s own home
directory. And, it is trivially easy to set up! I used Paul’s tutorial
myself, but there are many published all over the Internet.
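
To give a sense of how little is involved, here is a sketch; the
directory names are arbitrary, and encfs’s interactive first-run
prompts are elided:

        # create (on first use) and mount the encrypted store ~/.crypt-raw at ~/crypt
        encfs ~/.crypt-raw ~/crypt

        # ... work with files under ~/crypt as usual ...

        # unmount when done
        fusermount -u ~/crypt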

My favorite part of this solution is that rather than an onerous
mandated procedure, encfs turns security into user
empowerment. My colleague James wrote up a tutorial for our internal
Wiki, and I’ve simply encouraged users to take a look and consider
encrypting their confidential data. Even though not everyone has
taken it up yet, many already have. When a new security measure
requires substantial change in behavior of the user, the measure works
best when users are given an opportunity to adopt it at their own
pace. FUSE deserves a lot of credit in this regard, since it lets
users switch their filesystem to encryption in pieces (unlike other
cryptographic filesystems that require some planning ahead). For my
part, I’ve been slowly moving parts of my filesystem into an encrypted
area as I move aside old habits gradually.

I should note that this solution isn’t completely without cost. First,
there is no metadata encryption, but I am really not worried about
interlopers finding out how big our nameless files and directories are
and who created them (anyway, with an SVN checkout, the interesting
metadata is in .svn, so it’s encrypted in this case).
Second, we’ve found that I/O intensive file operations take
approximately twice as long (both under ext3 and XFS) when using
encfs. I haven’t moved my email archives to my encrypted
area yet because of the latter drawback. However, for all my other
sensitive data (confidential text documents, IRC chat logs, financial
records, ~/.mozilla, etc.), I don’t really notice the
slow-down using a 1.6 GHz CPU with ample free RAM. YMMV.


1
BTW, I’m skeptical about the FBI’s claim in that
old Washington Post article which states

“review of the equipment by computer forensic teams has
determined that the data base remains intact and has not been
accessed since it was stolen”. I am mostly clueless about
computer forensics; however, barring any sort of physical seal on
the laptop or hard drive casing, could a forensics expert tell if
someone had pulled out the drive, put it in another computer, did a
dd if=/dev/hdb of=/dev/hda, and then put it back as it
was found?

CP Technologies CP-UH-135 USB 2.0 Hub

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2005/05/10/cp-tech-usb-hub.html

I needed to pick a small, inexpensive, 2.0-compliant USB hub for myself,
and one for any of the users at my job who asked for one. I found
one, the “CP Technologies Hi-Speed USB 2.0 Hub”, which is part
number CP-UH-135. This worked great with GNU/Linux without any
trouble (using Linux 2.6.10 as distributed by Ubuntu), at least at first.


[Image: the CP-UH-135 USB hub, with the annoying blue LED coming right at you]

I used this hub without too much trouble for a number of months. Then,
one day, I plugged in a very standard PS-2 to USB converter (a
cable that takes a standard PS-2 mouse and PS-2 keyboard and makes
them show up as USB devices). The hub began to heat up and the
smell of burning electronics came from it. After a few weeks, the
hub began to generate serious USB errors from the kernel named
Linux, and I finally gave up on it. I don’t recommend this hub!

Finally, it has one additional annoying drawback for me: the blue LED
power light on the side of the thing is incredibly distracting. I put a
small piece of black tape over it to block it, but it only helped a
little. Such a powerful power light on a small device like that is
highly annoying. I know geeks are really into these sorts of crazy
blue LEDs, but for my part, I always feel like I am about to be
assimilated by a funky post-modern Borg.

I am curious if there are any USB hubs out there that are more reliable
and don’t have annoying lights. I haven’t used USB hubs in the past
so I don’t know if a power LED is common. If you find one, I’d
encourage you to buy that one instead of this one. Almost anywhere
you put the thing on a desk, the LED catches your eye.

IBM xSeries EZ Swap Hard Drive Trays

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2005/05/04/ibm-xseries.html

A few days ago, I acquired a number of IBM xSeries servers — namely x206
and x226 systems — for my work at The Software Freedom Law
Center. We bought bare-metal, with just CPU and memory, with
plans to install drives ourselves.

I did that for a few reasons. First, serial ATA (S-ATA or SATA)
support under Linux has just become ready for prime time, and
despite being a SCSI die-hard for most of my life, I’ve given in and admitted
that ATA’s price/performance ratio can’t really be beat, especially
if you don’t need hot swap or hardware RAID.

When I got the machines, which each came with one 80 GB S-ATA drive, I
found them well constructed, including a very easy mounting system
for hard drives. Drives have a blue plastic tray that looks like
this (follow link of image for higher resolution shot).


[Image: the IBM xSeries EZ Swap hard drive tray]

These so-called “EZ Swap” trays are not for hot-swap; the big IBM swap
trays with the lever are for that. This is just to mount and unmount
drives quickly. I was impressed, and was sad that, since IBM’s goal
is to resell you hard drives, they don’t make it easy to buy these
things outright. If you look on IBM’s
parts and upgrade site for the x206, you’ll find that they offer
to sell 26K-7344, which is listed as a “SATA tray”, and a 73P-8007,
which is listed as a “Tray, SATA simple swap”. However, there is no
photo, and that part number does not match the part number on the item
itself. On the machines I got, the tray is numbered 73P-9591 (or
rather, P73P9591, but I think the “P” in the front is superfluous and
stands for “Part”).

I spoke to IBM tech support (at +1-800-426-7378), who told me the
replacement part number he had for that tray I had was 73P-8007.
Indeed, if you look at third-party sites, such as Spare Parts
Warehouse, you find that number
and a price of US$28 or so. Spare Parts Warehouse doesn’t even sell
the 26K-7344.

It seemed strange to me that two things both described as a SATA tray
could be that different. And the difference in price was
substantial. It costs about US$28 for the 73P-8007 and around US$7
for the 26K-7344.

So, I called IBM spare parts division at +1-800-388-7080, and ordered
one of each. They arrived by DHL this morning. Lo and behold, they
are the very same item. I cannot tell the difference
between them upon close study. The only cosmetic difference is that
they are labeled with different part numbers. The cheaper one is
labeled 26K-7343 (one number less than what I ordered) and the other
is labeled 73P-9591 (the same number that my original SATA drives
came with).

So, if you need an EZ Swap tray from IBM for the xSeries server, I
suggest you order the 26K-7344. If you do so, and find any difference
from the 73P-8007, please do let me know. Update: on 2005-06-22, a
reader told me they now charge US$12 for the 26K-7344 tray. Further
Update:
The prices seem to keep rising! Another reader reported to me
on 2005-08-08 that the 26K-7344 is now US$84 (!) and the 73P-8007 is now
only US$15. So, it costs twice as much as it did a few months
ago to get these units, and the cheaper unit appears to be the 73P-8007.
It’ll be fun to watch and see if the prices change big again in the months
to come.

When you call IBM’s spare parts division, they may give you some
trouble about ordering the part. When you call +1-800-388-7080,
they are expecting you to be an out-of-warranty customer, and make
it difficult for you to order. It depends on who you get, but you
can place an order with a credit card even without an “IBM
Out-of-Warranty Customer Number”. If you have a customer number you
got with your original IBM equipment order, that’s your warranty
customer number and is in a different database than the one used by
the IBM Spare Parts Division.

You can just tell them that you want to make a new order with a credit
card. After some trouble, they’ll do that.

The GNU GPL and the American Dream

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2001/02/21/american-dream.html

[ This essay
was originally
published on gnu.org
. ]

When I was in grade school, right here in the United States of America,
I was taught that our country was the “land of opportunity”. My teachers
told me that my country was special, because anyone with a good idea and a
drive to do good work could make a living, and be successful too. They
called it the “American Dream”.

What was the cornerstone to the “American Dream”? It was
equality — everyone had the same chance in our society to choose
their own way. I could have any career I wanted, and if I worked hard, I
would be successful.

It turned out that I had some talent for working with computers —
in particular, computer software. Indoctrinated with the “American
Dream”, I learned as much as I could about computer software. I
wanted my chance at success.

I quickly discovered, though, that in many cases, not all the players in
the field of computer software were equal. By the time I entered the
field, large companies like Microsoft tended to control much of the
technology. And, that technology was available to me under licensing
agreements that forbid me to study and learn from it. I was completely
prohibited from viewing the program source code of the software.

I found out, too, that those with lots of money could negotiate
different licenses. If they paid enough, they could get permission to
study and learn from the source code. Typically, such licenses cost many
thousands of dollars, and being young and relatively poor, I was out of
luck.

After spending my early years in the software business a bit
downtrodden by my inability to learn more, I eventually discovered another
body of software that did allow me to study and learn. This software was
released under a license called the GNU General Public License (GNU
GPL). Instead of restricting my freedom to study and learn from it, this
license was specifically designed to allow me to learn. The license
ensured that no matter what happened to the public versions of the
software, I’d always be able to study its source code.

I quickly built my career around this software. I got lots of work
configuring, installing, administering, and teaching about that
software. Thanks to the GNU GPL, I always knew that I could stay
competitive in my business, because I would always be able to learn easily
about new innovations as soon as they were made. This gave me a unique
ability to innovate myself. I could innovate quickly, and impress my
employers. I was even able to start my own consulting business. My own
business! The pinnacle of the American Dream!

Thus, I was quite surprised last week
when Jim Allchin, a vice president at Microsoft, hinted that the
GNU GPL contradicted the American Way.

The GNU GPL is specifically designed to make sure that all
technological innovators, programmers, and software users are given equal
footing. Each high school student, independent contractor, small business,
and large corporation is given an equal chance to innovate. We all start
the race from the same point. Those people with deep understanding of the
software and an ability to make it work well for others are most likely to
succeed, and they do succeed.

That is exactly what the American Way is about, at least the way I
learned it in grade school. I hope that we won’t let Microsoft and
others change the definition.

Finished Thesis

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2001/01/22/masters-complete.html

My thesis is nearly complete. I defend tomorrow, and as usual, I let
the deadline run up until the end. I just finished my slides for the
defense, and practiced once. I have some time in the schedule tomorrow to
practice at least once, although I have to find some empty room up at the
University to do it in.

I’ll be glad to be done. It’s been annoying to spend three or four
weeks here sitting around writing about perljvm, and not hacking on it. I
have a Cosource deadline coming up this week, so now’s as good a time as
any to release the first version of the Kawa-based perljvm.

I am really excited about how Kawa works, and how easy it is to massage
perl’s IR into Kawa’s IR. I got more excited about it as I wrote my thesis
defense talk. I really think great things can happen with Kawa in the
future.

Larry Wall is here, and we’ve had two dinners for the Cincinnati
GNU/Linux Users’ Group (who paid Larry’s way to come here). I was there,
and Larry was asking some hard-ish questions about Kawa. Not hard exactly,
just things I didn’t know. I began to realize how much I have focused on
the Kawa API, and I haven’t really been digging in the internals. I told
him I’d try to have some answers about it for my defense, and I will
likely reread Bothner’s papers on the subject tomorrow to get familiar
with how he deals with various issues.

It’s odd having Larry on my thesis committee. I otherwise wouldn’t be
nervous in the least, but I am quite worried with him on the
committee.

Anyway, so I defend tomorrow, then it’s into perljvm hacking again
right away on Tuesday to make the Cosource deadline, and then I have to
finish preparing my Perl tutorial for LinuxExpo Paris.

Finished Thesis Document

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2001/01/18/thesis-document.html

Tonight, I finished the actual document of my Master’s thesis. I had to
vet it by reading it out loud, about three times. I have a real hard time
finding subtle grammar errors. I believe that when I read, I parse them
out in my head. Reading out loud usually helps, but it wasn’t working so
well this time. (The first draft had many errors, even though I read it
out loud.)

This time, I went through it twice, reading it out loud while bouncing
the mouse along each word. This seemed to help a lot, as I was catching
errors left and right. I hope I got them all.

I sent the final document off to the committee. I haven’t heard from
Larry Wall, who’s an external member of my committee, at all. I haven’t
heard from him since we set up the plane tickets months ago. I am sure he’s
insanely busy, and that’s likely why. No big deal, I suppose, I am just
overly nervous.

I really need to get to the actual hacking on perljvm. I have lost
three weeks working on the thesis document, which is really only
describing things, not hacking. I’ll be glad, I’m sure, to have the
Master’s thesis done, but perljvm needs some hacking done on it,
especially considering that I have a Cosource deadline to meet soon.