All posts by Bradley M. Kuhn

The GPL is a Tool to Encourage Freedom, Not an End in Itself

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2008/04/10/gpl-not-end-in-itself.html

I was amazed to be involved in yet another discussion recently
regarding the old debate about the scope of the GPL under copyright law.
The debate itself isn’t amazing — these debates have happened
somewhere every six months, almost on cue, since around 1994 or so.
What amazed me this time is that some people in the debate believed that
the GPL proponents intend to sneakily pursue an increased scope for
copyright law. Those who think that have completely misunderstood the
fundamental idea behind the GPL.

I’m disturbed by the notion that some believe the goal of the GPL is to
expand copyrightability and the inclusiveness of derivative works. It
seems that so many forget (or maybe they never even knew) that copyleft
was invented to hack copyright — to turn its typical applications
to software inside out. The state of affairs that software is
controlled by draconian copyright rules is a lamentable reality;
copyleft is merely a tool that defuses the proprietary copyright
weaponry.

But, if it were possible to really consider reduction in copyright
control over software, then I don’t know of a single GPL proponent who
wouldn’t want to bilaterally reduce copyright’s scope for software. For
example, I’ve often proposed, since around 2001, that perhaps copyright
for software should only last three years, non-renewable, and that it
require all who wish to distribute non-public-domain software to
register the source with the Copyright Office. At the end of the three
years, the Copyright Office would automatically publish that now
public-domain source to the world.

If my hypothetical system were the actual (and only) legal regime for
software, and were equally applied to all software — from the
fully Free to the most proprietary — I’d have no sadness at all
that opportunities for GPL enforcement ended after three years, and that
all GPL’d software fell into the public domain on that tight schedule,
because proprietary software and FLOSS would have the same treatment.
Meanwhile, great benefit would be gained for the freedom of all software
users. In short, the GPL is not an end in itself, and I wouldn’t want to
ignore the actual goal — more freedom for software users —
merely to strengthen one tool in that battle.

In one of my favorite films, Kevin Smith’s Dogma, Chris
Rock’s character, Rufus, argues that it’s better to have ideas than
beliefs, because ideas can change when the situation does, but beliefs
become ingrained and are harder to shake. I’m not a belief-less person,
but I certainly hold the GPL and the notion of copyleft firmly in the
“idea” camp, not the “belief” one. It’s
unfortunate that the entrenched interests outside of software are (more
or less) inadvertently strengthening software copyright, too. Thus, in
the meantime, we must hold steadfast to the GPL going as far as is
legally permitted under this ridiculously expansive copyright system we
have. But, should a real policy dialogue open on the reduction of software
copyright’s scope, GPL proponents will be the first in line to encourage
such bilateral reduction.

When your apt-mirror is always downloading

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2008/01/24/apt-mirror-2.html

When I started building our apt-mirror, I ran into a problem: the
machine was throttled against ubuntu.com’s servers, but I had completed
much of the download (which took weeks to get multiple distributions).
I really wanted to roll out the solution quickly, particularly because
the service from the remote servers was worse than ever due to the
throttling that the mirroring created. But, with the mirror incomplete,
I couldn’t easily make the partial repositories available.

The solution was to simply let Apache redirect users on to the real
servers if the mirror doesn’t have the file. The first order of
business for that is to rewrite and redirect URLs when files aren’t
found. This is a straightforward Apache configuration:

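           RewriteEngine on
           # RewriteLogLevel is an Apache 2.2 directive (2.4 uses LogLevel).
           RewriteLogLevel 0
           # Leave the local CGI scripts alone.
           RewriteCond %{REQUEST_FILENAME} !^/cgi/
           # Only act when the mirror has neither a file nor a directory here.
           RewriteCond /var/spool/apt-mirror/mirror/archive.ubuntu.com%{REQUEST_FILENAME} !-F
           RewriteCond /var/spool/apt-mirror/mirror/archive.ubuntu.com%{REQUEST_FILENAME} !-d
           # Never fetch the bz2 indexes or index.* pages from upstream.
           RewriteCond %{REQUEST_URI} !(Packages|Sources)\.bz2$
           RewriteCond %{REQUEST_URI} !/index\.[^/]*$ [NC]
           # Proxy everything else to the hard-coded upstream address.
           RewriteRule ^(http://%{HTTP_HOST})?/(.*) http://91.189.88.45/$2 [P]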
         

Note a few things there:

  • I have to hard-code an IP number, because as I mentioned in the last
    post on this subject, I’ve faked out DNS for archive.ubuntu.com and
    other sites I’m mirroring. (Note: this has the unfortunate side-effect
    that I can’t easily take advantage of round-robin DNS on the other
    side.)

  • I avoid taking Packages.bz2 from the other site, because
    apt-mirror actually doesn’t mirror the bz2 files (although I’ve
    submitted a patch to it so it will eventually).

  • I make sure that index files get built by my Apache and not
    redirected.

  • I am using Apache proxying, which gives me Yet Another type of
    cache temporarily while I’m still downloading the other packages. (I
    should actually work out a way to have these caches used by apt-mirror
    itself in case a user has already requested a new package while waiting
    for apt-mirror to get it.)
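
The rewrite conditions above can be read as one small decision: serve locally, or proxy to upstream. Here is a rough Python sketch of that decision, purely as a reading aid and not what Apache actually runs. The function name `upstream_url` and its parameters are my own inventions for illustration, and the plain file-system checks only approximate Apache's `-F` test, which goes through a subrequest. It also uses one request path where the config distinguishes REQUEST_FILENAME from REQUEST_URI.

```python
import os
import re

# Values copied from the Apache configuration above.
MIRROR_ROOT = "/var/spool/apt-mirror/mirror/archive.ubuntu.com"
UPSTREAM = "http://91.189.88.45"

_BZ2_INDEX = re.compile(r"(Packages|Sources)\.bz2$")
_INDEX_PAGE = re.compile(r"/index\.[^/]*$", re.IGNORECASE)  # mirrors [NC]


def upstream_url(path, mirror_root=MIRROR_ROOT, upstream=UPSTREAM):
    """Return the upstream URL to proxy to, or None to serve locally.

    Hypothetical helper: approximates the RewriteCond chain, one
    condition per check, in the same order.
    """
    if path.startswith("/cgi/"):
        return None  # local CGI scripts are never proxied
    local = mirror_root + path
    if os.path.isfile(local) or os.path.isdir(local):
        return None  # the mirror already has it
    if _BZ2_INDEX.search(path) or _INDEX_PAGE.search(path):
        return None  # bz2 indexes and index pages stay local
    return upstream + path
```

Every condition has to pass before the request is proxied, which is why each early return corresponds to one negated RewriteCond.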

Once I do a rewrite like this for each of the hosts I’m replacing with
a mirror, I’m almost done. The problem is that if for any reason my
site needs to give a 403 to the clients, I would actually like to
double-check to be sure that the URL doesn’t happen to work at the place
I’m mirroring from.

My hope was that I could write a RewriteRule based on what the
HTTP return code would be when the request completed. This seemed
really hard to do, and perhaps impossible. The quickest
solution I found was to write a CGI script to do the redirect. So, in
the Apache config I have:

        ErrorDocument 403 /cgi/redirect-forbidden.cgi
        

And, the CGI script looks like this:

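        #!/usr/bin/perl

        use strict;
        use warnings;
        use CGI qw(:standard);

        # Full URL of the request that triggered the 403 error document.
        my $val = $ENV{REDIRECT_SCRIPT_URI} || '';

        # Strip our scheme and host name, keeping only the path; $host
        # remembers which mirror name the client asked for.
        my $host = '';
        if ($val =~ s%^http://(\S+)\.sflc\.info(/.*)$%$2%) {
           $host = $1;
        }

        # Send the client to the matching upstream server by IP address.
        if ($host eq "ubuntu-security") {
           $val = "http://91.189.88.37$val";
        } else {
           $val = "http://91.189.88.45$val";
        }

        print redirect($val);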
        

With these changes, the user will be redirected to the original when
the files aren’t available on the mirror, and as the mirror gets more
complete, they’ll get more files from the mirror.
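
For readers who don't speak Perl, the mapping that the CGI script performs can be sketched in Python roughly as follows. This is only an illustration under a few assumptions: the name `redirect_target` and the lookup table are hypothetical, and, unlike the script, the sketch returns None when the URL is not one of the mirror's host names instead of redirecting anyway.

```python
import re

# Upstream addresses copied from the CGI script; any mirror host name
# not listed here falls back to the default address.
UPSTREAM_BY_HOST = {"ubuntu-security": "http://91.189.88.37"}
DEFAULT_UPSTREAM = "http://91.189.88.45"

# Same pattern as the script, with the dots escaped.
_MIRROR_URL = re.compile(r"^http://(\S+)\.sflc\.info(/.*)$")


def redirect_target(script_uri):
    """Map a failed mirror URL to the upstream URL to redirect to.

    Hypothetical helper; returns None if the URL does not match the
    mirror's naming scheme (an assumption of this sketch).
    """
    match = _MIRROR_URL.match(script_uri)
    if match is None:
        return None
    host, path = match.group(1), match.group(2)
    return UPSTREAM_BY_HOST.get(host, DEFAULT_UPSTREAM) + path
```

Keeping the host names in a table would make it easy to add further mirrored sites without growing the if/else chain.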

I still have problems if for any reason the user gets a Packages or
Sources file from the original site before the mirror is synchronized,
but this rarely happens since apt-mirror is pretty careful. The only
time it might happen is if the user did an apt-get update when
not connected to our VPN and only a short time later did one while
connected.

When your apt-mirror is always downloading

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2008/01/24/apt-mirror-2.html

When I started building our apt-mirror, I ran into a problem: the
machine was throttled against ubuntu.com’s servers, but I had completed
much of the download (which took weeks to get multiple distributions).
I really wanted to roll out the solution quickly, particularly because
the service from the remote servers was worse than ever due to the
throttling that the mirroring created. But, with the mirror incomplete,
I couldn’t so easily make available incomplete repositories.

The solution was to simply let apache redirect users on to the real
servers if the mirror doesn’t have the file. The first order of
business for that is to rewrite and redirect URLs when files aren’t
found. This is a straightforward Apache configuration:

           RewriteEngine on
           RewriteLogLevel 0
           RewriteCond %{REQUEST_FILENAME} !^/cgi/
           RewriteCond /var/spool/apt-mirror/mirror/archive.ubuntu.com%{REQUEST_FILENAME} !-F
           RewriteCond /var/spool/apt-mirror/mirror/archive.ubuntu.com%{REQUEST_FILENAME} !-d
           RewriteCond %{REQUEST_URI} !(Packages|Sources)\.bz2$
           RewriteCond %{REQUEST_URI} !/index\.[^/]*$ [NC]
           RewriteRule ^(http://%{HTTP_HOST})?/(.*) http://91.189.88.45/$2 [P]
         

Note a few things there:

  • I have to hard-code an IP number, because as I mentioned in
    the last
    post on this subject
    , I’ve faked out DNS
    for archive.ubuntu.com and other sites I’m mirroring. (Note:
    this has the unfortunate side-effect that I can’t easily take advantage
    of round-robin DNS on the other side.)

  • I avoid taking Packages.bz2 from the other site, because
    apt-mirror actually doesn’t mirror the bz2 files (although I’ve
    submitted a patch to it so it will eventually).

  • I make sure that index files get built by my Apache and not
    redirected.

  • I am using Apache proxying, which gives me Yet Another type of
    cache temporarily while I’m still downloading the other packages. (I
    should actually work out a way to have these caches used by apt-mirror
    itself in case a user has already requested a new package while waiting
    for apt-mirror to get it.)

Once I do a rewrite like this for each of the hosts I’m replacing with
a mirror, I’m almost done. The problem is that if for any reason my
site needs to give a 403 to the clients, I would actually like to
double-check to be sure that the URL doesn’t happen to work at the place
I’m mirroring from.

My hope was that I could write a RewriteRule based on what the
HTTP return code would be when the request completed. This was
really hard to do, it seemed, and perhaps undoable. The quickest
solution I found was to write a CGI script to do the redirect. So, in
the Apache config I have:

        ErrorDocument 403 /cgi/redirect-forbidden.cgi
        

And, the CGI script looks like this:

        #!/usr/bin/perl
        
        use strict;
        use CGI qw(:standard);
        
        my $val = $ENV{REDIRECT_SCRIPT_URI};
        
        $val =~ s%^http://(\S+).sflc.info(/.*)$%$2%;
        if ($1 eq "ubuntu-security") {
           $val = "http://91.189.88.37$val";
        } else {
           $val = "http://91.189.88.45$val";
        }
        
        print redirect($val);
        

With these changes, the user will be redirected to the original when
the files aren’t available on the mirror, and as the mirror gets more
accurate, they’ll get more files from the mirror.

I still have problems if for any reason the user gets a Packages or
Sources file from the original site before the mirror is synchronized,
but this rarely happens since apt-mirror is pretty careful. The only
time it might happen is if the user did an apt-get update when
not connected to our VPN and only a short time later did one while
connected.

When your apt-mirror is always downloading

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2008/01/24/apt-mirror-2.html

When I started building our apt-mirror, I ran into a problem: the
machine was throttled against ubuntu.com’s servers, but I had completed
much of the download (which took weeks to get multiple distributions).
I really wanted to roll out the solution quickly, particularly because
the service from the remote servers was worse than ever due to the
throttling that the mirroring created. But, with the mirror incomplete,
I couldn’t so easily make available incomplete repositories.

The solution was to simply let apache redirect users on to the real
servers if the mirror doesn’t have the file. The first order of
business for that is to rewrite and redirect URLs when files aren’t
found. This is a straightforward Apache configuration:

           RewriteEngine on
           RewriteLogLevel 0
           RewriteCond %{REQUEST_FILENAME} !^/cgi/
           RewriteCond /var/spool/apt-mirror/mirror/archive.ubuntu.com%{REQUEST_FILENAME} !-F
           RewriteCond /var/spool/apt-mirror/mirror/archive.ubuntu.com%{REQUEST_FILENAME} !-d
           RewriteCond %{REQUEST_URI} !(Packages|Sources)\.bz2$
           RewriteCond %{REQUEST_URI} !/index\.[^/]*$ [NC]
           RewriteRule ^(http://%{HTTP_HOST})?/(.*) http://91.189.88.45/$2 [P]
         

Note a few things there:

  • I have to hard-code an IP number, because as I mentioned in
    the last
    post on this subject
    , I’ve faked out DNS
    for archive.ubuntu.com and other sites I’m mirroring. (Note:
    this has the unfortunate side-effect that I can’t easily take advantage
    of round-robin DNS on the other side.)

  • I avoid taking Packages.bz2 from the other site, because
    apt-mirror actually doesn’t mirror the bz2 files (although I’ve
    submitted a patch to it so it will eventually).

  • I make sure that index files get built by my Apache and not
    redirected.

  • I am using Apache proxying, which gives me Yet Another type of
    cache temporarily while I’m still downloading the other packages. (I
    should actually work out a way to have these caches used by apt-mirror
    itself in case a user has already requested a new package while waiting
    for apt-mirror to get it.)

Once I do a rewrite like this for each of the hosts I’m replacing with
a mirror, I’m almost done. The problem is that if for any reason my
site needs to give a 403 to the clients, I would actually like to
double-check to be sure that the URL doesn’t happen to work at the place
I’m mirroring from.

My hope was that I could write a RewriteRule based on what the
HTTP return code would be when the request completed. This was
really hard to do, it seemed, and perhaps undoable. The quickest
solution I found was to write a CGI script to do the redirect. So, in
the Apache config I have:

        ErrorDocument 403 /cgi/redirect-forbidden.cgi
        

And, the CGI script looks like this:

        #!/usr/bin/perl
        
        use strict;
        use CGI qw(:standard);
        
        my $val = $ENV{REDIRECT_SCRIPT_URI};
        
        $val =~ s%^http://(\S+).sflc.info(/.*)$%$2%;
        if ($1 eq "ubuntu-security") {
           $val = "http://91.189.88.37$val";
        } else {
           $val = "http://91.189.88.45$val";
        }
        
        print redirect($val);
        

With these changes, the user will be redirected to the original when
the files aren’t available on the mirror, and as the mirror gets more
accurate, they’ll get more files from the mirror.

I still have problems if for any reason the user gets a Packages or
Sources file from the original site before the mirror is synchronized,
but this rarely happens since apt-mirror is pretty careful. The only
time it might happen is if the user did an apt-get update when
not connected to our VPN and only a short time later did one while
connected.

When your apt-mirror is always downloading

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2008/01/24/apt-mirror-2.html

When I started building our apt-mirror, I ran into a problem: the
machine was throttled against ubuntu.com’s servers, but I had completed
much of the download (which took weeks to get multiple distributions).
I really wanted to roll out the solution quickly, particularly because
the service from the remote servers was worse than ever due to the
throttling that the mirroring created. But, with the mirror incomplete,
I couldn’t so easily make available incomplete repositories.

The solution was to simply let apache redirect users on to the real
servers if the mirror doesn’t have the file. The first order of
business for that is to rewrite and redirect URLs when files aren’t
found. This is a straightforward Apache configuration:

           RewriteEngine on
           RewriteLogLevel 0
           RewriteCond %{REQUEST_FILENAME} !^/cgi/
           RewriteCond /var/spool/apt-mirror/mirror/archive.ubuntu.com%{REQUEST_FILENAME} !-F
           RewriteCond /var/spool/apt-mirror/mirror/archive.ubuntu.com%{REQUEST_FILENAME} !-d
           RewriteCond %{REQUEST_URI} !(Packages|Sources)\.bz2$
           RewriteCond %{REQUEST_URI} !/index\.[^/]*$ [NC]
           RewriteRule ^(http://%{HTTP_HOST})?/(.*) http://91.189.88.45/$2 [P]
         

Note a few things there:

  • I have to hard-code an IP number, because as I mentioned in
    the last
    post on this subject
    , I’ve faked out DNS
    for archive.ubuntu.com and other sites I’m mirroring. (Note:
    this has the unfortunate side-effect that I can’t easily take advantage
    of round-robin DNS on the other side.)

  • I avoid taking Packages.bz2 from the other site, because
    apt-mirror actually doesn’t mirror the bz2 files (although I’ve
    submitted a patch to it so it will eventually).

  • I make sure that index files get built by my Apache and not
    redirected.

  • I am using Apache proxying, which gives me Yet Another type of
    cache temporarily while I’m still downloading the other packages. (I
    should actually work out a way to have these caches used by apt-mirror
    itself in case a user has already requested a new package while waiting
    for apt-mirror to get it.)

Once I do a rewrite like this for each of the hosts I’m replacing with
a mirror, I’m almost done. The problem is that if for any reason my
site needs to give a 403 to the clients, I would actually like to
double-check to be sure that the URL doesn’t happen to work at the place
I’m mirroring from.

My hope was that I could write a RewriteRule based on what the
HTTP return code would be when the request completed. This was
really hard to do, it seemed, and perhaps undoable. The quickest
solution I found was to write a CGI script to do the redirect. So, in
the Apache config I have:

        ErrorDocument 403 /cgi/redirect-forbidden.cgi
        

And, the CGI script looks like this:

        #!/usr/bin/perl
        
        use strict;
        use CGI qw(:standard);
        
        my $val = $ENV{REDIRECT_SCRIPT_URI};
        
        $val =~ s%^http://(\S+).sflc.info(/.*)$%$2%;
        if ($1 eq "ubuntu-security") {
           $val = "http://91.189.88.37$val";
        } else {
           $val = "http://91.189.88.45$val";
        }
        
        print redirect($val);
        

With these changes, the user will be redirected to the original when
the files aren’t available on the mirror, and as the mirror gets more
accurate, they’ll get more files from the mirror.

I still have problems if for any reason the user gets a Packages or
Sources file from the original site before the mirror is synchronized,
but this rarely happens since apt-mirror is pretty careful. The only
time it might happen is if the user did an apt-get update when
not connected to our VPN and only a short time later did one while
connected.

When your apt-mirror is always downloading

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2008/01/24/apt-mirror-2.html

When I started building our apt-mirror, I ran into a problem: the
machine was throttled against ubuntu.com’s servers, but I had completed
much of the download (which took weeks to get multiple distributions).
I really wanted to roll out the solution quickly, particularly because
the service from the remote servers was worse than ever due to the
throttling that the mirroring created. But, with the mirror incomplete,
I couldn’t so easily make available incomplete repositories.

The solution was to simply let apache redirect users on to the real
servers if the mirror doesn’t have the file. The first order of
business for that is to rewrite and redirect URLs when files aren’t
found. This is a straightforward Apache configuration:

           RewriteEngine on
           RewriteLogLevel 0
           RewriteCond %{REQUEST_FILENAME} !^/cgi/
           RewriteCond /var/spool/apt-mirror/mirror/archive.ubuntu.com%{REQUEST_FILENAME} !-F
           RewriteCond /var/spool/apt-mirror/mirror/archive.ubuntu.com%{REQUEST_FILENAME} !-d
           RewriteCond %{REQUEST_URI} !(Packages|Sources)\.bz2$
           RewriteCond %{REQUEST_URI} !/index\.[^/]*$ [NC]
           RewriteRule ^(http://%{HTTP_HOST})?/(.*) http://91.189.88.45/$2 [P]
         

Note a few things there:

  • I have to hard-code an IP number, because as I mentioned in
    the last
    post on this subject
    , I’ve faked out DNS
    for archive.ubuntu.com and other sites I’m mirroring. (Note:
    this has the unfortunate side-effect that I can’t easily take advantage
    of round-robin DNS on the other side.)

  • I avoid taking Packages.bz2 from the other site, because
    apt-mirror actually doesn’t mirror the bz2 files (although I’ve
    submitted a patch to it so it will eventually).

  • I make sure that index files get built by my Apache and not
    redirected.

  • I am using Apache proxying, which gives me Yet Another type of
    cache temporarily while I’m still downloading the other packages. (I
    should actually work out a way to have these caches used by apt-mirror
    itself in case a user has already requested a new package while waiting
    for apt-mirror to get it.)

Once I do a rewrite like this for each of the hosts I’m replacing with
a mirror, I’m almost done. The problem is that if for any reason my
site needs to give a 403 to the clients, I would actually like to
double-check to be sure that the URL doesn’t happen to work at the place
I’m mirroring from.

My hope was that I could write a RewriteRule based on what the
HTTP return code would be when the request completed. This was
really hard to do, it seemed, and perhaps undoable. The quickest
solution I found was to write a CGI script to do the redirect. So, in
the Apache config I have:

        ErrorDocument 403 /cgi/redirect-forbidden.cgi
        

And, the CGI script looks like this:

        #!/usr/bin/perl
        
        use strict;
        use CGI qw(:standard);
        
        my $val = $ENV{REDIRECT_SCRIPT_URI};
        
        $val =~ s%^http://(\S+).sflc.info(/.*)$%$2%;
        if ($1 eq "ubuntu-security") {
           $val = "http://91.189.88.37$val";
        } else {
           $val = "http://91.189.88.45$val";
        }
        
        print redirect($val);
        

With these changes, the user will be redirected to the original when
the files aren’t available on the mirror, and as the mirror gets more
accurate, they’ll get more files from the mirror.

I still have problems if for any reason the user gets a Packages or
Sources file from the original site before the mirror is synchronized,
but this rarely happens since apt-mirror is pretty careful. The only
time it might happen is if the user did an apt-get update when
not connected to our VPN and only a short time later did one while
connected.

When your apt-mirror is always downloading

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2008/01/24/apt-mirror-2.html

When I started building our apt-mirror, I ran into a problem: the
machine was throttled against ubuntu.com’s servers, but I had completed
much of the download (which took weeks to get multiple distributions).
I really wanted to roll out the solution quickly, particularly because
the service from the remote servers was worse than ever due to the
throttling that the mirroring created. But, with the mirror incomplete,
I couldn’t so easily make available incomplete repositories.

The solution was to simply let apache redirect users on to the real
servers if the mirror doesn’t have the file. The first order of
business for that is to rewrite and redirect URLs when files aren’t
found. This is a straightforward Apache configuration:

           RewriteEngine on
           RewriteLogLevel 0
           RewriteCond %{REQUEST_FILENAME} !^/cgi/
           RewriteCond /var/spool/apt-mirror/mirror/archive.ubuntu.com%{REQUEST_FILENAME} !-F
           RewriteCond /var/spool/apt-mirror/mirror/archive.ubuntu.com%{REQUEST_FILENAME} !-d
           RewriteCond %{REQUEST_URI} !(Packages|Sources)\.bz2$
           RewriteCond %{REQUEST_URI} !/index\.[^/]*$ [NC]
           RewriteRule ^(http://%{HTTP_HOST})?/(.*) http://91.189.88.45/$2 [P]
         

Note a few things there:

  • I have to hard-code an IP number, because as I mentioned in
    the last
    post on this subject
    , I’ve faked out DNS
    for archive.ubuntu.com and other sites I’m mirroring. (Note:
    this has the unfortunate side-effect that I can’t easily take advantage
    of round-robin DNS on the other side.)

  • I avoid taking Packages.bz2 from the other site, because
    apt-mirror actually doesn’t mirror the bz2 files (although I’ve
    submitted a patch to it so it will eventually).

  • I make sure that index files get built by my Apache and not
    redirected.

  • I am using Apache proxying, which gives me Yet Another type of
    cache temporarily while I’m still downloading the other packages. (I
    should actually work out a way to have these caches used by apt-mirror
    itself in case a user has already requested a new package while waiting
    for apt-mirror to get it.)

Once I do a rewrite like this for each of the hosts I’m replacing with
a mirror, I’m almost done. The problem is that if for any reason my
site needs to give a 403 to the clients, I would actually like to
double-check to be sure that the URL doesn’t happen to work at the place
I’m mirroring from.

My hope was that I could write a RewriteRule based on what the
HTTP return code would be when the request completed. That seemed
really hard to do, and perhaps impossible. The quickest
solution I found was to write a CGI script to do the redirect. So, in
the Apache config I have:

        ErrorDocument 403 /cgi/redirect-forbidden.cgi
        

And, the CGI script looks like this:

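        #!/usr/bin/perl
        
        use strict;
        use warnings;
        use CGI qw(:standard);
        
        # URL of the request that triggered the 403.
        my $val = $ENV{REDIRECT_SCRIPT_URI} || '';
        
        # Capture the mirror's host prefix and the path. The dots are escaped
        # so that they match only literal dots.
        if ($val =~ m%^http://(\S+)\.sflc\.info(/.*)$%) {
           my ($host, $path) = ($1, $2);
           my $ip = $host eq "ubuntu-security" ? "91.189.88.37" : "91.189.88.45";
           print redirect("http://$ip$path");
        } else {
           # Not a URL for one of the mirrored hosts: keep the 403.
           print header(-status => '403 Forbidden');
        }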
        

With these changes, the user will be redirected to the original when
the files aren’t available on the mirror, and as the mirror gets more
complete, they’ll get more files from the mirror.
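
If more hosts end up behind the mirror, the host-to-address choice in the CGI script could be kept in a lookup table instead of an if/else. The following is a sketch under that assumption, not the script I run: the function name upstream_for and the example host ubuntu.sflc.info are hypothetical, while the two addresses and the ubuntu-security prefix are the ones used above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Mirror host prefix => real server address. Only the ubuntu-security entry
# comes from the CGI script; further entries would be a hypothetical extension.
my %UPSTREAM         = ('ubuntu-security' => '91.189.88.37');
my $DEFAULT_UPSTREAM = '91.189.88.45';

# Returns the real server's URL for a mirror URL, or nothing when the URL
# does not belong to one of the mirrored hosts.
sub upstream_for {
    my ($url) = @_;
    my ($host, $path) = $url =~ m{^http://([^/\s]+)\.sflc\.info(/.*)$}
        or return;
    my $ip = exists $UPSTREAM{$host} ? $UPSTREAM{$host} : $DEFAULT_UPSTREAM;
    return "http://$ip$path";
}

# Illustrative call with a made-up path.
print upstream_for('http://ubuntu-security.sflc.info/ubuntu/dists/'), "\n";
```

Keeping the mapping in one hash means that adding another mirrored host is a one-line change, and an unrecognized URL is rejected instead of being redirected somewhere wrong.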

I still have problems if for any reason the user gets a Packages or
Sources file from the original site before the mirror is synchronized,
but this rarely happens since apt-mirror is pretty careful. The only
time it might happen is if the user did an apt-get update when
not connected to our VPN and only a short time later did one while
connected.

When your apt-mirror is always downloading

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2008/01/24/apt-mirror-2.html

When I started building our apt-mirror, I ran into a problem: the
machine was throttled against ubuntu.com’s servers, but I had completed
much of the download (which took weeks to get multiple distributions).
I really wanted to roll out the solution quickly, particularly because
the service from the remote servers was worse than ever due to the
throttling that the mirroring created. But, with the mirror incomplete,
I couldn’t so easily make available incomplete repositories.

The solution was to simply let apache redirect users on to the real
servers if the mirror doesn’t have the file. The first order of
business for that is to rewrite and redirect URLs when files aren’t
found. This is a straightforward Apache configuration:

           RewriteEngine on
           RewriteLogLevel 0
           RewriteCond %{REQUEST_FILENAME} !^/cgi/
           RewriteCond /var/spool/apt-mirror/mirror/archive.ubuntu.com%{REQUEST_FILENAME} !-F
           RewriteCond /var/spool/apt-mirror/mirror/archive.ubuntu.com%{REQUEST_FILENAME} !-d
           RewriteCond %{REQUEST_URI} !(Packages|Sources)\.bz2$
           RewriteCond %{REQUEST_URI} !/index\.[^/]*$ [NC]
           RewriteRule ^(http://%{HTTP_HOST})?/(.*) http://91.189.88.45/$2 [P]
         

Note a few things there:

  • I have to hard-code an IP number, because as I mentioned in
    the last
    post on this subject
    , I’ve faked out DNS
    for archive.ubuntu.com and other sites I’m mirroring. (Note:
    this has the unfortunate side-effect that I can’t easily take advantage
    of round-robin DNS on the other side.)

  • I avoid taking Packages.bz2 from the other site, because
    apt-mirror actually doesn’t mirror the bz2 files (although I’ve
    submitted a patch to it so it will eventually).

  • I make sure that index files get built by my Apache and not
    redirected.

  • I am using Apache proxying, which gives me Yet Another type of
    cache temporarily while I’m still downloading the other packages. (I
    should actually work out a way to have these caches used by apt-mirror
    itself in case a user has already requested a new package while waiting
    for apt-mirror to get it.)

Once I do a rewrite like this for each of the hosts I’m replacing with
a mirror, I’m almost done. The problem is that if for any reason my
site needs to give a 403 to the clients, I would actually like to
double-check to be sure that the URL doesn’t happen to work at the place
I’m mirroring from.

My hope was that I could write a RewriteRule based on what the
HTTP return code would be when the request completed. This was
really hard to do, it seemed, and perhaps undoable. The quickest
solution I found was to write a CGI script to do the redirect. So, in
the Apache config I have:

        ErrorDocument 403 /cgi/redirect-forbidden.cgi
        

And, the CGI script looks like this:

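        #!/usr/bin/perl
        
        use strict;
        use warnings;
        use CGI qw(:standard);
        
        # URL of the request that triggered the 403, as passed to the ErrorDocument.
        my $val = $ENV{REDIRECT_SCRIPT_URI};
        
        # Reduce the URL to its path and remember the mirror's host prefix.
        # If the URL is not one of ours, keep answering 403.
        my $host;
        if (defined $val and $val =~ s%^http://(\S+)\.sflc\.info(/.*)$%$2%) {
           $host = $1;
        } else {
           print header(-status => "403 Forbidden");
           exit;
        }
        
        # Security updates come from a different upstream address.
        if ($host eq "ubuntu-security") {
           $val = "http://91.189.88.37$val";
        } else {
           $val = "http://91.189.88.45$val";
        }
        
        print redirect($val);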
        

With these changes, the user will be redirected to the original site when
the files aren’t available on the mirror, and as the mirror gets more
complete, they’ll get more files from the mirror.
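
The mapping the CGI script performs can also be written as a small pure function, which makes it easy to check by hand. The following is a sketch in Python rather than the script I actually run: the name upstream_for is hypothetical, and returning None for a URL that is not on one of the mirror hosts is a choice made for this sketch.

```python
import re

# Upstream addresses hard-coded in the CGI script above.
SECURITY_UPSTREAM = "http://91.189.88.37"
ARCHIVE_UPSTREAM = "http://91.189.88.45"

# Same pattern as the Perl substitution, with the dots escaped.
MIRROR_URL = re.compile(r"^http://(\S+)\.sflc\.info(/.*)$")


def upstream_for(script_uri):
    """Map a URL on the mirror to the matching URL on the original site.

    Hypothetical helper for illustration; returns None when the URL
    does not belong to one of the mirror hosts.
    """
    match = MIRROR_URL.match(script_uri)
    if match is None:
        return None
    host, path = match.groups()
    # Only the security archive lives at a different upstream address.
    if host == "ubuntu-security":
        return SECURITY_UPSTREAM + path
    return ARCHIVE_UPSTREAM + path
```

For example, a request for a path on ubuntu-security.sflc.info maps to the same path at 91.189.88.37, and a request on any other sflc.info host maps to the same path at 91.189.88.45.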

I still have problems if for any reason the user gets a Packages or
Sources file from the original site before the mirror is synchronized,
but this rarely happens since apt-mirror is pretty careful. The only
time it might happen is if the user did an apt-get update when
not connected to our VPN and only a short time later did one while
connected.
