On the Brokenness of File Locking

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/projects/locking.html

It’s amazing how far Linux has come without providing proper file locking
that works and is usable from userspace. Here’s a little overview of why file
locking is still in a very sad state:

To begin with, there’s a plethora of APIs, and all of them are awful:

POSIX file locking as available with fcntl(F_SETLK): the POSIX locking API is
the most portable one and in theory works across NFS. It can do byte-range
locking. So much for the good side. On the bad side there’s a lot more,
however: locks are bound to processes, not file descriptors. That means that
this logic cannot be used in threaded environments unless combined with a
process-local mutex. This is hard to get right, especially in libraries that
do not know the environment they are run in, i.e. whether they are used in
threaded environments or not. The worst part, however, is that POSIX locks are
automatically released if a process calls close() on any (!) of its open file
descriptors for that file. That means that when one part of a program locks a
file and another part by coincidence accesses it too for a short time, the
first part’s lock will be broken and it won’t be notified about that. Modern
software tends to load big frameworks (such as Gtk+ or Qt) into memory as well
as arbitrary modules via mechanisms such as NSS, PAM, gvfs, GTK_MODULES,
Apache modules or GStreamer modules, where one module can seldom control what
another module in the same process does or accesses. The effect of this is
that POSIX locks are unusable in any non-trivial program where it cannot be
ensured that a locked file is never accessed by any other part of the process
at the same time. Example: a user-management daemon wants to write
/etc/passwd and locks the file for that. At the same time, in another thread
(or from a stack frame further down), something calls getpwuid(), which
internally accesses /etc/passwd and causes the lock to be released, without
the first thread (or stack frame) knowing it. Furthermore, should two threads
use the locking fcntl()s on the same file, they will interfere with each
other’s locks and reset each other’s locking ranges and flags. On top of that,
locking cannot be used on any file that is publicly accessible (i.e. has the R
bit set for group/others, i.e. more access bits set than 0600), because that
would otherwise effectively give arbitrary users a way to indefinitely block
execution of any process (regardless of the UID it is running under) that
wants to access and lock the file. This is generally not an acceptable
security risk. Finally, while POSIX file locks are supposedly NFS-safe, they
are not always: there are still many NFS implementations around where locking
is not properly implemented, and NFS tends to be used in heterogeneous
networks. The biggest problem is that there is no way to properly detect
whether file locking works on a specific NFS mount (or any mount) at all.
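
For reference, here is a minimal sketch of how a byte-range write lock is
typically taken with this API (the path is just a placeholder):

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void) {
            /* Placeholder path, for illustration only */
            int fd = open("/tmp/example.dat", O_RDWR | O_CREAT, 0600);
            if (fd < 0) {
                    perror("open");
                    return 1;
            }

            struct flock fl;
            memset(&fl, 0, sizeof(fl));
            fl.l_type = F_WRLCK;     /* exclusive write lock */
            fl.l_whence = SEEK_SET;
            fl.l_start = 0;
            fl.l_len = 0;            /* 0 means "to the end of the file" */

            /* F_SETLKW blocks until the lock is acquired;
             * F_SETLK would fail immediately instead. */
            if (fcntl(fd, F_SETLKW, &fl) < 0) {
                    perror("fcntl(F_SETLKW)");
                    close(fd);
                    return 1;
            }

            /* ... modify the file here ... */

            /* Caveat from the text: if any other code in this process opens
             * the same file and close()s its descriptor, this lock silently
             * vanishes, with no notification whatsoever. */

            fl.l_type = F_UNLCK;
            fcntl(fd, F_SETLK, &fl);
            close(fd);
            return 0;
    }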

The other API for POSIX file locks: lockf() is another API for the same
mechanism and suffers from the same problems. One wonders why there are two
APIs for the same messed-up interface.

BSD locking based on flock(). The semantics of this kind of locking are much
nicer than those of POSIX locking: locks are bound to file descriptors, not
processes. This kind of locking can hence be used safely between threads and
can even be inherited across fork() and exec(). Locks are only automatically
broken on the close() call for the one file descriptor they were created with
(or the last duplicate of it). On the other hand, this kind of locking does
not offer byte-range locking, suffers from the same security problems as POSIX
locking, and works in even fewer cases on NFS than POSIX locking (i.e. on BSD
and Linux < 2.6.12, flock() on NFS was a NOP returning success). And since BSD
locking is not as portable as POSIX locking, this is sometimes an unsafe
choice. Some OSes even find it funny to make flock() and fcntl(F_SETLK)
control the same locks. Linux treats them independently, except for the cases
where it doesn’t: on Linux NFS they are now transparently converted to POSIX
locks, too. What a chaos!
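
For comparison, a similarly minimal sketch using flock(), again on a
placeholder path:

    #include <sys/file.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
            /* Placeholder path, for illustration only */
            int fd = open("/tmp/example.dat", O_RDWR | O_CREAT, 0600);
            if (fd < 0) {
                    perror("open");
                    return 1;
            }

            /* LOCK_EX takes an exclusive lock and blocks until acquired.
             * Add LOCK_NB to fail immediately with EWOULDBLOCK instead. */
            if (flock(fd, LOCK_EX) < 0) {
                    perror("flock");
                    close(fd);
                    return 1;
            }

            /* ... modify the file here ... */

            /* The lock is bound to this file descriptor; other descriptors
             * for the same file in this process are unaffected, unlike with
             * POSIX locks. */
            flock(fd, LOCK_UN);
            close(fd);
            return 0;
    }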

Mandatory locking is available too. It’s based on the POSIX locking API but
not portable in itself. It’s dangerous business and should generally be avoided
in cleanly written software.

Traditional lock-file-based locking. This is how things were traditionally
done, based around known atomicity guarantees of certain basic file system
operations. It’s a cumbersome thing, and it requires polling the file system
to get notified when a lock is released. Also, on NFS with Linux < 2.6.5 it
doesn’t work properly, since O_EXCL isn’t atomic there. And of course the
client cannot really know what the server is running, so again this
brokenness is not detectable.
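
For illustration, the traditional scheme looks roughly like this, relying on
the atomicity of O_CREAT|O_EXCL (the lock file path is just a placeholder):

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Try to take the lock by creating the lock file atomically. */
    static int take_lock(const char *path) {
            for (;;) {
                    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
                    if (fd >= 0)
                            return fd;        /* we own the lock now */
                    if (errno != EEXIST) {
                            perror("open");
                            return -1;
                    }
                    /* Somebody else holds the lock: there is no notification
                     * mechanism, so all we can do is poll and retry. */
                    usleep(100 * 1000);
            }
    }

    int main(void) {
            const char *lock_path = "/tmp/example.lock";   /* placeholder */
            int fd = take_lock(lock_path);
            if (fd < 0)
                    return 1;

            /* ... critical section ... */

            close(fd);
            unlink(lock_path);   /* release the lock */
            return 0;
    }

(Real-world variants usually also write the owner’s PID into the lock file so
that stale locks left behind by crashed processes can be detected; this sketch
omits that.)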

The Disappointing Summary

File locking on Linux is just broken. The broken semantics of POSIX locking
show that the designers of this API apparently never tried to actually use it
in real software. It smells a lot like an interface that kernel people thought
made sense, but which in reality doesn’t once you try to use it from
userspace.

Here’s a list of places where you shouldn’t use file locking due to the
problems shown above: If you want to lock a file in $HOME, forget about it as
$HOME might be NFS and locks generally are not reliable there. The same applies
to every other file system that might be shared across the network. If the file
you want to lock is accessible to more than your own user (i.e. an access mode
> 0700), forget about locking, it would allow others to block your
application indefinitely. If your program is non-trivial or threaded or uses a
framework such as Gtk+ or Qt or any of the module-based APIs such as NSS, PAM,
… forget about POSIX locking. If you care about portability, don’t use
file locking.

Or, to turn this around: the only case where it is kind of safe to use file
locking is in trivial applications where portability is not key, using BSD
locking, on a file system you can rely on being local, and on files
inaccessible to others. Of course, that doesn’t leave much except private
files in /tmp for trivial user applications.

Or in one sentence: in its current state Linux file locking is unusable.

And that is a shame.

Update: Check out the follow-up story on this topic.

Addendum on the Brokenness of File Locking

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/projects/locking2.html

I forgot to mention another central problem in my blog story about file
locking on Linux:

Different machines have access to different features of the same file
system. Here’s an example: let’s say you have two machines in your home LAN.
You want them to share their $HOME directory, so that you (or your family) can
use either machine and have access to all your (or their) data. So you export
/home on one machine via NFS and mount it from the other machine.

So far so good. But what happens to file locking now? Programs on the first
machine see a fully-featured ext3 or ext4 file system, where all kinds of
locking work (even though the API might suck, as mentioned in the earlier blog
story). But what about the other machine? If you set up lockd properly, then
POSIX locking will work on both. If you didn’t, one machine can use POSIX
locking properly and the other cannot. And it gets even worse: as mentioned,
recent NFS implementations on Linux transparently convert client-side BSD
locking into POSIX locking on the server side. Now, if two instances of the
same application use BSD locking, one on the client side and one on the server
side, they will end up with two orthogonal locks, and although both sides
think they have properly acquired a lock (and they actually did), they will
overwrite each other’s data, because those two locks are independent. (And one
wonders why the NFS developers implemented this brokenness nonetheless…)

This basically means that locking cannot be used unless it is verified that
everyone accessing a file system can make use of the same file system feature
set. If you use file locking on a file system you should do so only if you are
sufficiently sure that nobody using a broken or weird NFS implementation might
want to access and lock those files as well. And practically that is
impossible. Even if fpathconf() were improved so that it could inform
the caller whether it can successfully apply a file lock to a file, this would
still not give any hint if the same is true for everybody else accessing the
file. But that is essential when speaking of advisory (i.e. cooperative) file
locking.

And no, this isn’t easy to fix. So again, the recommendation: forget about
file locking on Linux, it’s nothing more than a useless toy.

Also read Jeremy Allison’s (Samba) take on POSIX file locking. It’s an
interesting read.

On IDs

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/projects/ids.html

When programming software that cooperates with software running on behalf of
other users, other sessions or other computers, it is often necessary to work
with unique identifiers. These can be bound to various hardware and software
objects as well as lifetimes. Often, when people look for such an ID to use,
they pick the wrong one because the semantics and lifetime of the IDs are not
clear. Here’s a little (non-comprehensive) list of IDs accessible on Linux and
how you should or should not use them.

Hardware IDs

/sys/class/dmi/id/product_uuid: The main board product UUID, as
set by the board manufacturer and encoded in the BIOS DMI information. It may
be used to identify a mainboard and only the mainboard. It changes when the
user replaces the main board. Also, often enough BIOS manufacturers write bogus
serials into it. In addition, it is x86-specific. Access for unprivileged users
is forbidden. Hence it is of little general use.

CPUID/EAX=3 CPU serial number: A CPU UUID, as set by the
CPU manufacturer and encoded on the CPU chip. It may be used to identify a CPU
and only a CPU. It changes when the user replaces the CPU. Also, most modern
CPUs don’t implement this feature anymore, and older computers tend to disable
this option by default, controllable via a BIOS Setup option. In addition, it
is x86-specific. Hence this too is of little general use.

/sys/class/net/*/address: One or more network MAC addresses, as set by the
network adapter manufacturer and encoded in the network card’s EEPROM. It
changes when the user replaces the network card. Since network cards are
optional and there may be more than one, the availability of this ID is not
guaranteed, and you might have more than one to choose from. On virtual
machines the MAC addresses tend to be random. This, too, is hence of little
general use.

/sys/bus/usb/devices/*/serial: Serial numbers of various USB devices, as
encoded in the USB device EEPROM. Most devices don’t have a serial number set,
and if they do, it is often bogus. If the user replaces his USB hardware or
plugs it into another machine, these IDs may change or appear on other
machines. Hence this, too, is of little use.

There are various other hardware IDs available, many of which you may discover
via the ID_SERIAL udev property of various devices, such as hard disks and
similar. They all have in common that they are bound to specific (replaceable)
hardware, are not universally available, are often filled with bogus data, and
are random in virtualized environments. Or in other words: don’t use them and
don’t rely on them for identification unless you really know what you are
doing; in general they do not guarantee what you might hope they guarantee.

Software IDs

/proc/sys/kernel/random/boot_id: A random ID that is regenerated
on each boot. As such it can be used to identify the local machine’s current
boot. It’s universally available on any recent Linux kernel. It’s a good and
safe choice if you need to identify a specific boot on a specific booted
kernel.

gethostname(), /proc/sys/kernel/hostname: A non-random ID
configured by the administrator to identify a machine in the network. Often
this is not set at all or is set to some default value such as
localhost and not even unique in the local network. In addition it
might change during runtime, for example because it changes based on updated
DHCP information. As such it is almost entirely useless for anything but
presentation to the user. It has very weak semantics and relies on correct
configuration by the administrator. Don’t use this to identify machines in a
distributed environment. It won’t work unless centrally administered, which
makes it useless in a globalized, mobile world. It has no place in
automatically generated filenames that shall be bound to specific hosts. Just
don’t use it, please. It’s really not what many people think it is.
gethostname() is standardized in POSIX and hence portable to other
Unixes.

IP Addresses returned by SIOCGIFCONF or the respective Netlink APIs: These
tend to be dynamically assigned and often enough only valid on local networks
or even only the local links (i.e. 192.168.x.x style addresses, or even
169.254.x.x/IPv4LL). Unfortunately they hence have little use outside of
networking.

gethostid(): Returns a supposedly unique 32-bit identifier for the current
machine. The semantics of this ID are not clear. On most machines it simply
returns a value based on a local IPv4 address; on others it is
administrator-controlled via the /etc/hostid file. Since the semantics are
unclear and the value is most often just derived from the IP address, it is
almost always the wrong choice to use. On top of that, 32 bits are not
particularly many. On the other hand, this is standardized in POSIX and hence
portable to other Unixes. It’s probably best to ignore this value, and if
people don’t want to ignore it, they should probably symlink /etc/hostid to
/var/lib/dbus/machine-id or something similar.

/var/lib/dbus/machine-id: An ID identifying a specific Linux/Unix
installation. It does not change if hardware is replaced, and it remains
reliable in virtualized environments. This value has clear semantics and is
considered part of the D-Bus API. It is supposedly globally unique and
portable to all systems that have D-Bus. On Linux it is universally available,
given that almost all non-embedded machines and even a fair share of the
embedded ones ship D-Bus now. This is the recommended way to identify a
machine, possibly with a fallback to the host name to cover systems that still
lack D-Bus. If your application links against libdbus, you may access this ID
with dbus_get_local_machine_id(); if not, you can read it directly from the
file system.
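
If you don’t want to link against libdbus, reading the ID directly is only a
few lines; a rough sketch:

    #include <stdio.h>
    #include <string.h>

    /* Read the D-Bus machine ID (32 hex characters) into the given buffer. */
    static int read_machine_id(char *buf, size_t len) {
            FILE *f = fopen("/var/lib/dbus/machine-id", "re");
            if (!f)
                    return -1;
            if (!fgets(buf, (int) len, f)) {
                    fclose(f);
                    return -1;
            }
            fclose(f);
            buf[strcspn(buf, "\n")] = 0;   /* strip trailing newline */
            return 0;
    }

    int main(void) {
            char id[64];
            if (read_machine_id(id, sizeof(id)) == 0)
                    printf("machine id: %s\n", id);
            return 0;
    }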

/proc/self/sessionid: An ID identifying a specific Linux login
session. This ID is maintained by the kernel and part of the auditing logic. It
is uniquely assigned to each login session during a specific system boot,
shared by each process of a session, even across su/sudo and cannot be changed
by userspace. Unfortunately some distributions have so far failed to set things
up properly for this to work (Hey, you, Ubuntu!), and this ID is always
(uint32_t) -1 for them. But there’s hope they get this fixed
eventually. Nonetheless it is a good choice for a unique session identifier on
the local machine and for the current boot. To make this ID globally unique it
is best combined with /proc/sys/kernel/random/boot_id.
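
For instance, a rough sketch of deriving a globally unique session identifier
by combining the two (the exact concatenation format here is just an arbitrary
choice):

    #include <stdio.h>
    #include <string.h>

    /* Read a single line from a /proc file, stripping the trailing newline. */
    static int read_line(const char *path, char *buf, size_t len) {
            FILE *f = fopen(path, "re");
            if (!f)
                    return -1;
            if (!fgets(buf, (int) len, f)) {
                    fclose(f);
                    return -1;
            }
            fclose(f);
            buf[strcspn(buf, "\n")] = 0;
            return 0;
    }

    int main(void) {
            char boot_id[64], session_id[64], global_id[160];

            if (read_line("/proc/sys/kernel/random/boot_id",
                          boot_id, sizeof(boot_id)) < 0 ||
                read_line("/proc/self/sessionid",
                          session_id, sizeof(session_id)) < 0)
                    return 1;

            /* Session IDs are only unique per boot, so qualify the session
             * ID with the boot ID. */
            snprintf(global_id, sizeof(global_id), "%s-%s",
                     boot_id, session_id);
            printf("globally unique session id: %s\n", global_id);
            return 0;
    }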

getuid(): An ID identifying a specific Unix/Linux user. This ID is
usually automatically assigned when a user is created. It is not unique across
machines and may be reassigned to a different user if the original user was
deleted. As such it should be used only locally and with the limited validity
in time in mind. To make this ID globally unique it is not sufficient to
combine it with /var/lib/dbus/machine-id, because the same ID might be
used for a different user that is created later with the same UID. Nonetheless
this combination is often good enough. It is available on all POSIX systems.

ID_FS_UUID: an ID that identifies a specific file system in the
udev tree. It is not always clear how these serials are generated but this
tends to be available on almost all modern disk file systems. It is not
available for NFS mounts or virtual file systems. Nonetheless this is often a
good way to identify a file system, and in the case of the root directory even
an installation. However, due to the weakly defined generation semantics, the
D-Bus machine ID is generally preferable.

Generating IDs

Linux offers a kernel interface to generate UUIDs on demand, by reading from
/proc/sys/kernel/random/uuid. This is a very simple interface to
generate UUIDs. That said, the logic behind UUIDs is unnecessarily complex and
often it is a better choice to simply read 16 bytes or so from
/dev/urandom.
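
Both variants are trivial to use; a quick sketch of each:

    #include <stdio.h>
    #include <string.h>

    int main(void) {
            /* Variant 1: let the kernel format a UUID for us. */
            char uuid[64];
            FILE *f = fopen("/proc/sys/kernel/random/uuid", "re");
            if (f && fgets(uuid, sizeof(uuid), f)) {
                    uuid[strcspn(uuid, "\n")] = 0;
                    printf("uuid: %s\n", uuid);
            }
            if (f)
                    fclose(f);

            /* Variant 2: just take 16 random bytes from /dev/urandom. */
            unsigned char id[16];
            FILE *r = fopen("/dev/urandom", "re");
            if (r && fread(id, 1, sizeof(id), r) == sizeof(id)) {
                    printf("random id: ");
                    for (size_t i = 0; i < sizeof(id); i++)
                            printf("%02x", id[i]);
                    printf("\n");
            }
            if (r)
                    fclose(r);
            return 0;
    }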

Summary

And the gist of it all: Use /var/lib/dbus/machine-id! Use
/proc/self/sessionid! Use /proc/sys/kernel/random/boot_id!
Use getuid()! Use /dev/urandom! And forget about the
rest, in particular the host name, or the hardware IDs such as DMI. And keep in
mind that you may combine the aforementioned IDs in various ways to get
different semantics and validity constraints.

New Ground on Terminology Debate?

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2010/06/23/open-source.html

(These days,) I generally try to avoid the well-known terminology debates in
our community. But, if you hang around this FLOSS world of ours long enough,
you just can’t avoid occasionally getting into them. I found myself in one
this afternoon that spanned three identica threads. I had some new thoughts
that I’ve shared today (and even previously) on my identi.ca microblog. I
thought it might be useful to write them up in one place rather than scattered
across a series of microblog statements.

I gained my first new insight into the terminology issues when I had dinner
with Larry Wall in early 2001 after my Master’s thesis defense. It was the
first time I talked with him about these issues of terminology, and he said
that it sounded like a good place to apply what he called the “golden rule of
network protocols”: Always be conservative in what you emit and liberal in
what you accept. I’ve recently noted again that it’s a good rule to follow
regarding terminology.

More recently, I’ve realized that the FLOSS community suffers here,
likely due to our high concentration of software developers and
engineers. Precision in communication is a necessary component of the
lives of developers, engineers, computer scientists, or anyone in a
highly technical field. In our originating fields, lack of precise and
well-understood terminology can cause bridges to collapse or the wrong
software to get installed and crash mission critical systems.
Calling x by the name y sometimes causes mass confusion
and failure. Indeed, earlier this week, I watched a PBS special, The Pluto
Files, where Neil deGrasse Tyson discussed the intense debate about the
planetary status of Pluto. I was actually somewhat relieved that a subtle point
regarding a categorical naming is just as contentious in another area
outside my chosen field. Watching the “what constitutes a
planet” debate showed me that FLOSS hackers are no different than
most other scientists in this regard. We all take quite a bit of pride
in our careful (sometimes pedantic) care in terminology and word choice;
I know I do, anyway.

However, on the advocacy side of software freedom (the part that isn’t
technical), our biggest confusion sometimes stems from an assumption that
other people’s word choice is necessarily as precise as ours. Consider the
phrase “open source”, for example. When I say “open source”, I am referring
quite exactly to a business-focused, apolitical and (frankly) amoral[0]
interest in, adoption of, and contribution to FLOSS. Those who coined the term
“open source” were right about at least one thing: it’s a term that fits well
with for-profit interests who might otherwise see software freedom as too
political.

However, many non-business users and developers that I talk to quite
clearly express that they are into this stuff precisely because there
are principles behind it: namely, that FLOSS seeks to make a better
world by giving important rights to users and programmers. Often, they
are using the phrase “open source” as they express this. I
of course take the opportunity to say: it’s because those principles
are so important that I talk about software freedom. Yet, it’s
clear they already meant software freedom as a concept, and
just had some sloppy word choice.

Fact is, most of us are just plain sloppy with language. Precision
isn’t everyone’s forte, and as a software freedom advocate (not a
language usage advocate), I see my job as making sure people have the
concepts right even if they use words that don’t make much sense. There
are times when the word choices really do confuse the concepts, and
there are other times when they don’t. Sometimes, it’s tough to
identify which of the two is occurring. I try to figure it out in each
given situation, and if I’m in doubt, I just simplify to the golden rule
of network protocols.

Furthermore, I try to have faith in our community’s intelligence.
Regardless of how people get drawn into FLOSS: be it from the moral
software freedom arguments or the technical-advantage-only open source
ones, I don’t think people stop listening immediately upon their arrival
in our community. I know this even from my own adoption of software
freedom: I came for the Free as in Price, but I stayed for the Free as
in Freedom. It’s only because I couldn’t afford a SCO Unix license in
1992 that I installed GNU/Linux. But, I learned within just a year why
the software freedom was what mattered most.

Surely, others have a similar introduction to the community: either
drawn in by zero-cost availability or the technical benefits first, but
still very interested to learn about software freedom. My goal is to
reach those who have arrived in the community. I therefore try to speak
almost constantly about software freedom, why it’s a moral issue, and
why I work every day to help either reduce the amount of proprietary
software, or increase the amount of Free Software in the world. My hope
is that newer community members will hear my arguments, see my actions,
and be convinced that a moral and ethical commitment to software freedom
is the long lasting principle worth undertaking. In essence, I seek to
lead by example as much as possible.

Old arguments are a bit too comfortable. We already know how to have
them on autopilot. I admit myself that I enjoy having an old argument
with a new person: my extensive practice often yields an oratorical
advantage. But, that crude drive is too much about winning the argument
and not enough about delivering the message of software freedom.
Occasionally, a terminology discussion is part of delivering that message, but
my terminology-debate toolbox has “use with care” written on it.

[0] Note that here, too, I took extreme care with my word choice. I mean
specifically amorality: merely the absence of any moral code in particular. I
do not, by any stretch, mean immoral.

Where Are The Bytes?

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2010/06/11/develop-in-public.html

A few years ago, I was considering starting a Free Software project. I
never did start that one, but I learned something valuable in the
process. When I thought about starting this project, I did what I
usually do: ask someone who knows more about the topic than I do. So I
phoned my friend Loïc Dachary, who
has started many Free Software projects, and asked him for advice.

Before I could even describe the idea, Loïc said: you don’t have a
URL? I was taken aback; I said: but I haven’t started yet.
He said: of course you have, you’re talking to me about it, so
you’ve started already. The most important thing you can tell
me, he said, is Where are the bytes?

Loïc explained further: Most projects don’t succeed. The hardest
part about a software freedom project is carrying it far enough so it
can survive even if its founders quit. Therefore, under Loïc’s
theory, the most important task at the project’s start is to generate
those bytes, in hopes those bytes find their way to a group of
developers who will help keep the project alive.

But, what does he mean by “bytes”? He means, quite simply,
that you have to core dump your thinking, your code, your plans, your
ideas, just about everything on a public URL that everyone can take a
look at. Push bytes. Push them out every time you generate a few.
It’s the only chance your software freedom project has.

The first goal of a software freedom project is to gain developers. No
project can have long-term success without a diverse developer base.
The problem is, the initial development work and project planning too
often ends up trapped in the head of a few developers. It’s human
nature: How can I spend my time telling everyone about what I’m
doing? If I do that, when will I actually do anything?
Successful software freedom project leaders resist this human urge and
do the seemingly counterintuitive thing: they dump their bytes on the
public, even if it slows them down a bit.

This process is even more essential in the network age. If someone
wants to find a program that does a job, the first tool is a search
engine: to find out if someone else has done it yet. Your project’s
future depends completely that every such search performed helps
developers find your bytes.

In early 2001, I asked Larry Wall, of all the projects he’d worked on, which
was the hardest. His answer was quick: when I was developing the first version
of perl5, Larry said, I felt like I had to code completely alone and just make
it work by myself. Of course, Larry’s a very talented guy who can make that
happen: generate something by himself that everyone wanted to use. While I
haven’t asked him what he’d do in today’s world if he were charged with a
similar task, I can guess, especially given how public the Perl6 process has
been, that he’d instead use the new network tools, such as DVCS, to push his
bytes early and often and seek to get more developers involved early.[0]

Admittedly, most developers’ first urge is to hide
everything. We’ll release it when it’s ready, is often heard, or
— even worse — Our core team works so well together;
it’ll just slow us down to make things public now. Truth is, this
is a dangerous mixture of fear and narcissism — the very same
drives that lead proprietary software developers to keep things
proprietary.

Software freedom developers have the opportunity to actually get past
the simple reality of software development: all code sucks, and usually
isn’t complete. Yet, it’s still essential that the community see what’s
going on at every step, from the empty codebase and beyond. When a
project is seen as active, that draws in developers and gives the
project hope of success.

When I was in college, one of the teams in a software engineering class
crashed and burned; their project failed hopelessly. This happened
despite one of the team members spending about half the semester up long
nights, coding by himself, ignoring the other team members. In their
final evaluation, the professor pointed out: Being a software
developer isn’t like being a fighter pilot. The student, missing
the point, quipped: Yeah, I know, at least a fighter pilot has a
wingman. Truth is, one person, or two people, or even a small team,
aren’t going to make a software freedom project succeed. It’s only
going to succeed when a large community bolsters it and prevents any
single point of failure.

Nevertheless, most software freedom projects are going to fail. But,
there is no shame in pushing out a bunch of bytes, encouraging people to
take a look, and giving up later if it just doesn’t make it. All of
science works this way, and there’s no reason computer science should be
any different. Keeping your project private assures its failure; the
only benefit is that you can hide that you even tried. As my graduate
advisor told me when I was worried my thesis wasn’t a success: a
negative result can be just as compelling as a positive one. What’s
important is to make sure all results are published and available for
public scrutiny.

When I started discussing this idea a few weeks ago, some argued that early
GNU programs (the founding software of our community) were developed in
private initially. This much is true, but just because GNU developers
once operated that way doesn’t mean it was the right way. We have the
tools now to easily do development in public, so we should. In my view,
today, it’s not really in the spirit of software freedom until the
project, including its design discussions, plans, and prototypes are all
developed in public. Code (regardless of its license) merely dumped
over the wall on intervals deserves to be forked by a community
committed to public development.

Update (2010-06-12): I completely forgot to mention The Risks of Distributed
Version Control by Ben Collins-Sussman, which
is five years old now but still useful. Ben is making a similar
point to mine, and pointing out how some uses of DVCS can cause the
effects that I’m encouraging developers to avoid. I think DVCS is
like any tool: it can be used wrongly. The usage Ben warns about
should be avoided, and DVCS, when used correctly, assists
in the public software development process.

[0] Note that pushing code
out to the public in the mid-1990s was substantially more arduous (from a
technological perspective) than it is today. Those of you who don’t
remember shar archives may not realize that. 🙂

Change of Plans

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/projects/linuxtag2k10.html

The upcoming week I’ll do two talks at LinuxTag 2010 at the Berlin Fair
Grounds. One of them was only added to the schedule today: a talk about
systemd. Systemd has never been presented in a public talk before, so make
sure to attend this historic moment… ;-). Read about what has been written
about systemd so far, so that you can ask the sharpest questions during my
presentation.

My second talk covers stuff a little less reported on in the press, but still
very interesting: Surround Sound in Gnome.

See you at LinuxTag!

The Walls (Стените)

Post Syndicated from RealEnder original http://alex.stanev.org/blog/?p=244

I want to tell you a story.
There is a group of people who know life. They know the mechanics. They know how things get done: call this one, buy that one a drink, slip something to a third. Usually they hold a position on which one larger decision or another about other people’s lives depends. Since they all know “how things work”, every element in the picture builds a wall around its services. It deliberately leaves a back door, so that the process in which it is a critical point can still be completed. And for each such service there is a corresponding reward.
But what happens when an element operating by the scheme above tries to obtain the services due to him (as a citizen of the People’s Republic of Bulgaria?) from the other fortresses? Obviously, he has to look for the back doors. And not only does he have to, they are all he knows. It does not occur to him that someone might actually do the job they are paid for without wanting something extra. So the small fortune (in whatever form it exists) melts away much faster than it was accumulated, and at some point you end up in the red. Especially if few things depend on you. Then you only owe the “Conspiracy”, because you rarely get the chance to give to it.
Some people say I am too young and don’t remember many things that I actually do remember. The above is the system of the “Second Directorate” (Второ направление), and also of goods intended for the foreign-trade market (inside and outside COMECON) being sold “under the counter” on the domestic one. To “friends”. Who will return the favor.
I know how hard it is to define, impose and execute particular business processes in organizations where the Second Directorate is the First Power. Things do not get done by banging on the table, nor by careful attempts at civilizing and educating. In fact, if in the attempt you do not end up caught in its noose, that can already be counted as a moderate success :)
The truth, at least for me, is that everyone who works these schemes should suffer, and suffer from the imperfections of the very scheme they have landed in. Any capable manager who has recognized the problem can create such imperfections: lowering responsibilities, delegating them to others, changing the process, and who knows what else.
Finally, there remain the others to mention: those who have no inside man in the fortress. Depending on the beast inside, they either pass through quickly and without complaint, before the portcullis comes crashing down on their heads, or they simply wait for the drawbridge to be raised. Many of them dream of becoming part of the system. Others just curse, smoke, drink and do other immoral and illegal things (not necessarily in that order).
And what do you do?

O’Reilly Book Deal – Get Security and Other Ebooks Cheap Today

Post Syndicated from David original http://feedproxy.google.com/~r/DevilsAdvocateSecurity/~3/BvqjB0WoUq4/oreilly-book-deal-get-security-and.html

O’Reilly has a coupon available for today only that makes any one ebook in their store $10. If you’re like me and like to have an electronic edition handy, this is a great deal for books that are updated and searchable. Their security books can be found here. You’ll want to use coupon code “FAVFA”.

Mango Lassi is Back

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/projects/mango-lassi-is-back.html

Mango Lassi's Icon

Sven Herzberg has recently
been doing a lot of work on Mango Lassi, a
project deserving love but which I as its original author haven’t touched
in 3 years.

His work is already bearing fruit:

Mango Lassi

Distribution packagers, please go and package his version, Mango Lassi is an
awesome, wonderful tool that needs distributor love.

If you want to use Mango Lassi without waiting for the distribution packagers to catch up, Sven has built some packages for you in the OpenSUSE Build Service.

Sven, KUTGW!

Check Facebook Privacy Settings with ReclaimPrivacyRights.org’s Scanner Bookmarklet

Post Syndicated from David original http://feedproxy.google.com/~r/DevilsAdvocateSecurity/~3/KxCCUb5bO8E/check-facebook-privacy-settings-with.html

ReclaimPrivacyRights.org provides a simple bookmarklet that works by loading it when you visit your Privacy settings page on Facebook. Simple, neat, and it appears to be an easy way to get a basic checkup. Better yet, the source code is available for review.

Facebook Friend Suggestions – Not a Virus!

Post Syndicated from David original http://feedproxy.google.com/~r/DevilsAdvocateSecurity/~3/MFuDDtcWxqw/facebook-friend-suggestions-not-virus.html

Facebook status updates are quickly being populated with warnings that the “suggest a friend” notes appearing in users’ inboxes are virus-driven. They’re not. In fact, Facebook has released a notice, posted by AllFacebook.com, stating: “This is neither a bug nor a virus, and the ‘Virus Alert’ status update is incorrect. Friend suggestions are now mutual and will appear for both users involved. That is, if I suggest that one person become friends with another, both the person I suggested and the person to whom I sent the suggestion will receive the notification.” The fact that the Facebook populace quickly communicates about a potential issue is good; the fact that false information is spreading quickly is not as good. But I’d rather my users avoid a fake virus than not avoid a real one.
