systemd for Developers I

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/projects/socket-activation.html

systemd
not only brings improvements for administrators and users, it also
brings a (small) number of new APIs with it. In this blog story (which might
become the first of a series) I hope to shed some light on one of the
most important new APIs in systemd:

Socket Activation

In the original blog
story about systemd
I tried to explain why socket activation is a
wonderful technology to spawn services. Let’s reiterate the background
here a bit.

The basic idea of socket activation is not new. The inetd
superserver was a standard component of most Linux and Unix systems
since time began: instead of spawning all local Internet services
already at boot, the superserver would listen on behalf of the
services and whenever a connection would come in an instance of the
respective service would be spawned. This allowed relatively weak
machines with few resources to offer a big variety of services at the
same time. However it quickly got a reputation for being somewhat
slow: since daemons would be spawned for each incoming connection a
lot of time was spent on forking and initialization of the services
— once for each connection, instead of once for them all.

Spawning one instance per connection was how inetd was primarily
used, even though inetd actually understood another mode: on the first
incoming connection it would notice this via poll() (or
select()) and spawn a single instance for all future
connections. (This was controllable with the
wait/nowait options.) That way the first connection
would be slow to set up, but subsequent ones would be as fast as with
a standalone service. In this mode inetd would work in a true
on-demand mode: a service would be made available lazily when it was
required.

inetd’s focus was clearly on AF_INET (i.e. Internet) sockets. As
time progressed and Linux/Unix left the server niche and became
increasingly relevant on desktops, mobile and embedded environments
inetd was somehow lost in the troubles of time. Its reputation for
being slow, and the fact that Linux’ focus shifted away from only
Internet servers made a Linux machine running inetd (or one of its newer
implementations, like xinetd) the exception, not the rule.

When Apple engineers worked on optimizing the MacOS boot time they
found a new way to make use of the idea of socket activation: they
shifted the focus away from AF_INET sockets towards AF_UNIX
sockets. And they noticed that on-demand socket activation was only
part of the story: much more powerful is socket activation when used
for all local services including those which need to be started
anyway on boot. They implemented these ideas in launchd, a central building
block of modern MacOS X systems, and probably the main reason why
MacOS is so fast booting up.

But, before we continue, let’s have a closer look at what the benefits
of socket activation for non-on-demand, non-Internet services are in
detail. Consider the four services Syslog, D-Bus, Avahi and the
Bluetooth daemon. D-Bus logs to Syslog, hence on traditional Linux
systems it would get started after Syslog. Similarly, Avahi requires
Syslog and D-Bus, hence would get started after both. Finally
Bluetooth is similar to Avahi and also requires Syslog and D-Bus but
does not interface at all with Avahi. Since in a traditional
SysV-based system only one service can be in the process of getting
started at a time, the following serialization of startup would take
place: Syslog → D-Bus → Avahi → Bluetooth. (Of course, Avahi and
Bluetooth could be started in the opposite order too, but we have to
pick one here, so let’s simply go alphabetically.) To illustrate
this, here’s a plot showing the order of startup beginning with system
startup (at the top).

Parallelization plot

Certain distributions tried to improve this strictly serialized
start-up: since Avahi and Bluetooth are independent from each other,
they can be started simultaneously. The parallelization is increased,
the overall startup time slightly smaller. (This is visualized in the
middle part of the plot.)

Socket activation makes it possible to start all four services
completely simultaneously, without any kind of ordering. Since the
creation of the listening sockets is moved outside of the daemons
themselves we can start them all at the same time, and they are able
to connect to each other’s sockets right-away. I.e. in a single step
the /dev/log and /run/dbus/system_bus_socket sockets
are created, and in the next step all four services are spawned
simultaneously. When D-Bus then wants to log to syslog, it just writes
its messages to /dev/log. As long as the socket buffer does
not run full it can go on immediately with what else it wants to do
for initialization. As soon as the syslog service catches up it will
process the queued messages. And if the socket buffer runs full then
the client logging will temporarily block until the socket is writable
again, and continue the moment it can write its log messages. That
means the scheduling of our services is entirely done by the kernel:
from the userspace perspective all services are run at the same time,
and when one service cannot keep up the others needing it will
temporarily block on their request but go on as soon as these
requests are dispatched. All of this is completely automatic and
invisible to userspace. Socket activation hence allows us to
drastically parallelize start-up, enabling simultaneous start-up of
services which previously were thought to strictly require
serialization. Most Linux services use sockets as communication
channel. Socket activation allows starting of clients and servers of
these channels at the same time.

But it’s not just about parallelization. It offers a number of
other benefits:

  • We no longer need to configure dependencies explicitly. Since the
    sockets are initialized before all services they are simply available,
    and no userspace ordering of service start-up needs to take place
    anymore. Socket activation hence drastically simplifies configuration
    and development of services.
  • If a service dies its listening socket stays around, not losing a
    single message. After a restart of the crashed service it can continue
    right where it left off.
  • If a service is upgraded we can restart the service while keeping
    around its sockets, thus ensuring the service is continuously
    responsive. Not a single connection is lost during the upgrade.
  • We can even replace a service during runtime in a way that is
    invisible to the client. For example, all systems running systemd
    start up with a tiny syslog daemon at boot which passes all log
    messages written to /dev/log on to the kernel message
    buffer. That way we provide reliable userspace logging starting from
    the first instant of boot-up. Then, when the actual rsyslog daemon is
    ready to start we terminate the mini daemon and replace it with the
    real daemon. And all that while keeping around the original logging
    socket and sharing it between the two daemons and not losing a single
    message. Since rsyslog flushes the kernel log buffer to disk after
    start-up all log messages from the kernel, from early-boot and from
    runtime end up on disk.

For another explanation of this idea consult the original blog
story about systemd.

Socket activation has been available in systemd since its
inception. On Fedora 15 a number of services have been modified to
implement socket activation, including Avahi, D-Bus and rsyslog (to continue with the example above).

systemd’s socket activation is quite comprehensive. Not only are classic
sockets supported but related technologies as well:

  • AF_UNIX sockets, in the flavours SOCK_DGRAM, SOCK_STREAM and SOCK_SEQPACKET; both in the filesystem and in the abstract namespace
  • AF_INET sockets, i.e. TCP/IP and UDP/IP; both IPv4 and IPv6
  • Unix named pipes/FIFOs in the filesystem
  • AF_NETLINK sockets, to subscribe to certain kernel features. This
    is currently used by udev, but could be useful for other
    netlink-related services too, such as audit.
  • Certain special files like /proc/kmsg or device nodes like /dev/input/*.
  • POSIX Message Queues
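
To give a rough idea of how these map to socket unit directives, here is a
hedged sketch of a hypothetical .socket unit exercising several of them at
once (the unit name, paths and ports are made up for this example; a real
unit would normally use only one or two of these):

# example.socket (sketch)
[Socket]
# AF_UNIX stream socket in the filesystem
ListenStream=/run/example.sk
# TCP socket on port 6666
ListenStream=6666
# UDP socket on port 6667
ListenDatagram=6667
# Named pipe (FIFO) in the filesystem
ListenFIFO=/run/example.fifo
# AF_NETLINK socket for kernel uevents, multicast group 1
ListenNetlink=kobject-uevent 1
# Special file
ListenSpecial=/proc/kmsg
# POSIX message queue
ListenMessageQueue=/example-queue

[Install]
WantedBy=sockets.target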

A service capable of socket activation must be able to receive its
preinitialized sockets from systemd, instead of creating them
internally. For most services this requires (minimal)
patching. However, since systemd actually provides inetd compatibility
a service working with inetd will also work with systemd — which is
quite useful for services like sshd for example.
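
As a sketch of that inetd compatibility mode (the unit names are
hypothetical here; distributions ship their own variants of these files),
per-connection activation of sshd could look roughly like this, with
Accept=yes telling systemd to spawn one service instance per incoming
connection and StandardInput=socket handing the connection over on
stdin/stdout just like inetd would:

# sshd.socket (sketch)
[Socket]
ListenStream=22
Accept=yes

[Install]
WantedBy=sockets.target

# sshd@.service (sketch)
[Unit]
Description=SSH per-connection server

[Service]
# -i puts sshd into inetd mode: it serves a single connection on stdin/stdout
ExecStart=-/usr/sbin/sshd -i
StandardInput=socket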

So much for the background of socket activation; let’s now have a
look at how to patch a service to make it socket-activatable. Let’s start
with a theoretical service foobard. (In a later blog post we’ll focus on a
real-life example.)

Our little (theoretic) service includes code like the following for
creating sockets (most services include code like this in one way or
another):

/* Source Code Example #1: ORIGINAL, NOT SOCKET-ACTIVATABLE SERVICE */
...
union {
        struct sockaddr sa;
        struct sockaddr_un un;
} sa;
int fd;

fd = socket(AF_UNIX, SOCK_STREAM, 0);
if (fd < 0) {
        fprintf(stderr, "socket(): %m\n");
        exit(1);
}

memset(&sa, 0, sizeof(sa));
sa.un.sun_family = AF_UNIX;
strncpy(sa.un.sun_path, "/run/foobar.sk", sizeof(sa.un.sun_path));

if (bind(fd, &sa.sa, sizeof(sa)) < 0) {
        fprintf(stderr, "bind(): %m\n");
        exit(1);
}

if (listen(fd, SOMAXCONN) < 0) {
        fprintf(stderr, "listen(): %m\n");
        exit(1);
}
...

A socket activatable service may use the following code instead:

/* Source Code Example #2: UPDATED, SOCKET-ACTIVATABLE SERVICE */
...
#include "sd-daemon.h"
...
int fd;

if (sd_listen_fds(0) != 1) {
        fprintf(stderr, "No or too many file descriptors received.\n");
        exit(1);
}

fd = SD_LISTEN_FDS_START + 0;
...

systemd might pass you more than one socket (based on
configuration, see below). In this example we are interested in one
only. sd_listen_fds()
returns how many file descriptors are passed. We simply compare that
with 1, and fail if we got more or less. The file descriptors systemd
passes to us are inherited one after the other beginning with fd
#3. (SD_LISTEN_FDS_START is a macro defined to 3). Our code hence just
takes possession of fd #3.
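
If you want to be defensive about it, the passed descriptor can also be
verified before use. Here’s a minimal sketch, assuming the same
/run/foobar.sk path as in the examples above; sd_is_socket_unix() comes
from the same sd-daemon.h:

/* Check that fd #3 really is a listening AF_UNIX stream socket
 * bound to the path we expect; the call returns > 0 on a match. */
int fd = SD_LISTEN_FDS_START + 0;

if (sd_is_socket_unix(fd, SOCK_STREAM, 1, "/run/foobar.sk", 0) <= 0) {
        fprintf(stderr, "Passed socket is not the one we expected.\n");
        exit(1);
}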

As you can see this code is actually much shorter than the
original. This of course comes at the price that our little service
with this change will no longer work in a non-socket-activation
environment. With minimal changes we can adapt our example to work nicely
both with and without socket activation:

/* Source Code Example #3: UPDATED, SOCKET-ACTIVATABLE SERVICE WITH COMPATIBILITY */
...
#include "sd-daemon.h"
...
int fd, n;

n = sd_listen_fds(0);
if (n > 1) {
        fprintf(stderr, "Too many file descriptors received.\n");
        exit(1);
} else if (n == 1)
        fd = SD_LISTEN_FDS_START + 0;
else {
        union {
                struct sockaddr sa;
                struct sockaddr_un un;
        } sa;

        fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd < 0) {
                fprintf(stderr, "socket(): %m\n");
                exit(1);
        }

        memset(&sa, 0, sizeof(sa));
        sa.un.sun_family = AF_UNIX;
        strncpy(sa.un.sun_path, "/run/foobar.sk", sizeof(sa.un.sun_path));

        if (bind(fd, &sa.sa, sizeof(sa)) < 0) {
                fprintf(stderr, "bind(): %m\n");
                exit(1);
        }

        if (listen(fd, SOMAXCONN) < 0) {
                fprintf(stderr, "listen(): %m\n");
                exit(1);
        }
}
...

With this simple change our service can now make use of socket
activation but still works unmodified in classic environments. Now,
let’s see how we can enable this service in systemd. For this we have
to write two systemd unit files: one describing the socket, the other
describing the service. First, here’s foobar.socket:

[Socket]
ListenStream=/run/foobar.sk

[Install]
WantedBy=sockets.target

And here’s the matching service file foobar.service:

[Service]
ExecStart=/usr/bin/foobard

If we place these two files in /etc/systemd/system we can
enable and start them:

# systemctl enable foobar.socket
# systemctl start foobar.socket

Now our little socket is listening, but our service is not running
yet. If we now connect to /run/foobar.sk the service will be
automatically spawned, for on-demand service start-up. With a
modification of foobar.service we can start our service
already at startup, thus using socket activation only for
parallelization purposes, not for on-demand auto-spawning anymore:

[Service]
ExecStart=/usr/bin/foobard

[Install]
WantedBy=multi-user.target

And now let’s enable this too:

# systemctl enable foobar.service
# systemctl start foobar.service

Now our little daemon will be started at boot or on-demand,
whatever comes first. It can be started fully in parallel with its
clients, and when it dies it will be automatically restarted when it
is used the next time.
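
A quick way to observe the on-demand part from a shell (assuming a netcat
build with AF_UNIX support; socat would do as well) is to check the unit
state, poke the socket once, and check again — the service should have been
spawned by the second check:

# systemctl status foobar.service
# nc -U /run/foobar.sk
# systemctl status foobar.service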

A single .socket file can include multiple ListenXXX stanzas, which
is useful for services that listen on more than one socket. In this
case all configured sockets will be passed to the service in the exact
order they are configured in the socket unit file. Also,
you may configure various socket settings in the .socket
files.
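
For example, if a hypothetical socket unit carried both a
ListenStream=/run/foobar.sk and a ListenStream=6666 line, the service could
tell the two resulting descriptors apart with sd_is_socket() and
sd_is_socket_inet() (both from sd-daemon.h). A rough sketch:

#include "sd-daemon.h"

int n, i, unix_fd = -1, inet_fd = -1;

n = sd_listen_fds(0);
for (i = 0; i < n; i++) {
        int fd = SD_LISTEN_FDS_START + i;

        /* Listening AF_UNIX stream socket? */
        if (sd_is_socket(fd, AF_UNIX, SOCK_STREAM, 1) > 0)
                unix_fd = fd;
        /* Listening TCP socket (any address family) on port 6666? */
        else if (sd_is_socket_inet(fd, AF_UNSPEC, SOCK_STREAM, 1, 6666) > 0)
                inet_fd = fd;
}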

In real life it’s a good idea to include description strings in
these unit files; to keep things simple we’ll leave this out of our
example. Speaking of real life: our next installment will cover an
actual real-life example. We’ll add socket activation to the CUPS
printing server.

The sd_listen_fds() function call is defined in sd-daemon.h
and sd-daemon.c. These
two files are currently drop-in .c sources which projects should
simply copy into their source tree. Eventually we plan to turn this
into a proper shared library, however using the drop-in files allows
you to compile your project in a way that is compatible with socket
activation even without any compile time dependencies on
systemd. sd-daemon.c is liberally licensed, should compile
fine on the most exotic Unixes and the algorithms are trivial enough
to be reimplemented with very little code if the license should
nonetheless be a problem for your project. sd-daemon.c
contains a couple of other API functions besides
sd_listen_fds() that are useful when implementing socket
activation in a project. For example, there’s sd_is_socket()
which can be used to distinguish and identify particular sockets when
a service gets passed more than one.
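
Since the algorithm really is trivial, here’s a hedged, stripped-down
sketch of what a reimplementation could look like; the function name is
made up, and it skips the error handling and FD_CLOEXEC housekeeping the
real sd_listen_fds() does. The protocol itself is just two environment
variables:

#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns the number of descriptors passed by the service manager,
 * starting at fd #3, or 0 if we were not socket activated. */
static int my_listen_fds(void) {
        const char *pid_str = getenv("LISTEN_PID");
        const char *fds_str = getenv("LISTEN_FDS");

        if (!pid_str || !fds_str)
                return 0;

        /* The variables are only meant for us if LISTEN_PID matches our PID. */
        if ((pid_t) atol(pid_str) != getpid())
                return 0;

        return atoi(fds_str);
}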

Let me point out that the interfaces used here are in no way bound
directly to systemd. They are generic enough to be implemented in
other systems as well. We deliberately designed them to be as simple and
minimal as possible to make it easy for others to adopt similar
schemes.

Stay tuned for the next installment. As mentioned, it will cover a
real-life example of turning an existing daemon into a
socket-activatable one: the CUPS printing service. However, I hope
this blog story might already be enough to get you started if you plan
to convert an existing service into a socket activatable one. We
invite everybody to convert upstream projects to this scheme. If you
have any questions join us on #systemd on freenode.

systemd for Administrators, Part VII

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/projects/blame-game.html

Here’s yet another installment of my ongoing series on systemd for Administrators:

The Blame Game

Fedora 15[1] is the first Fedora release to sport systemd. Our
primary goal for F15 was to get everything integrated and working
well. One focus for Fedora 16 will be to further polish and speed up
what we have in the distribution now. To prepare for this cycle we
have implemented a few tools (which are already available in F15),
which can help us pinpoint where exactly the biggest problems in our
boot-up remain. With this blog story I hope to shed some light on how
to figure out what to blame for your slow boot-up, and what to do
about it. We want to allow you to put the blame where the blame
belongs: on the system component responsible.

The first utility is a very simple one: systemd will automatically
write a log message with the time it needed to syslog/kmsg when it
finished booting up.

systemd[1]: Startup finished in 2s 65ms 924us (kernel) + 2s 828ms 195us (initrd) + 11s 900ms 471us (userspace) = 16s 794ms 590us.

And here’s how you read this: 2s have been spent for kernel
initialization, until the time where the initial RAM disk (initrd,
i.e. dracut) was started. A bit less than 3s have then been spent in
the initrd. Finally, a bit less than 12s have been spent after the
actual system init daemon (systemd) has been invoked by the initrd to
bring up userspace. Summing this up the time that passed since the
boot loader jumped into the kernel code until systemd was finished
doing everything it needed to do at boot was a bit less than 17s. This
number is nice and simple to understand — and also easy to
misunderstand: it does not include the time that is spent initializing
your GNOME session, as that is outside of the scope of the init
system. Also, in many cases this is just where systemd finished doing
everything it needed to do. Very likely some daemons are still busy
doing whatever they need to do to finish startup when this time
has elapsed. Hence: while the time logged here is a good indication of
the general boot speed, it is not the time the user might feel
the boot actually takes.

Also, it is a pretty superficial value: it gives no insight into which
system component systemd was waiting for all the time. To break this
up, we introduced the tool systemd-analyze blame:

$ systemd-analyze blame
6207ms udev-settle.service
5228ms cryptsetup@….service
735ms NetworkManager.service
642ms avahi-daemon.service
600ms abrtd.service
517ms rtkit-daemon.service
478ms fedora-storage-init.service
396ms dbus.service
390ms rpcidmapd.service
346ms systemd-tmpfiles-setup.service
322ms fedora-sysinit-unhack.service
316ms cups.service
310ms console-kit-log-system-start.service
309ms libvirtd.service
303ms rpcbind.service
298ms ksmtuned.service
288ms lvm2-monitor.service
281ms rpcgssd.service
277ms sshd.service
276ms livesys.service
267ms iscsid.service
236ms mdmonitor.service
234ms nfslock.service
223ms ksm.service
218ms mcelog.service

This tool lists which systemd unit needed how much time to finish
initialization at boot, the worst offenders listed first. What we can
see here is that on this boot two services required more than 1s of
boot time: udev-settle.service and
cryptsetup@….service. This
tool’s output is easily misunderstood as well: it does not shed any
light on why the services in question actually need this much time, it
just determines that they did. Also note that the times listed here
might be spent “in parallel”, i.e. two services might be initializing
at the same time and thus the time spent to initialize them both is
much less than the sum of both individual times combined.

Let’s have a closer look at the worst offender on this boot: a
service by the name of udev-settle.service. So why does it
take that much time to initialize, and what can we do about it? This
service actually does very little: it just waits for the device
probing being done by udev to finish and then exits. Device probing
can be slow. In this instance for example, the reason for the device
probing to take more than 6s is the 3G modem built into the machine,
which when not having an inserted SIM card takes this long to respond
to software probe requests. The software probing is part of the logic
that makes ModemManager work and enables NetworkManager to offer easy
3G setup. An obvious reflex might now be to blame ModemManager for
having such a slow prober. But that’s actually ill-directed: hardware
probing quite frequently is this slow, and in the case of ModemManager
it’s a simple fact that the 3G hardware takes this long. It is an
essential requirement for a proper hardware probing solution that
individual probers can take this much time to finish probing. The
actual culprit is something else: the fact that we actually wait for
the probing, in other words: that udev-settle.service is part
of our boot process.

So, why is udev-settle.service part of our boot process?
Well, it actually doesn’t need to be. It is pulled in by the storage
setup logic of Fedora: to be precise, by the LVM, RAID and Multipath
setup script. These storage services have not been implemented in the
way hardware detection and probing work today: they expect to be
initialized at a point in time where “all devices have been probed”,
so that they can simply iterate through the list of available disks
and do their work on it. However, on modern machinery this is not how
things actually work: hardware can come and hardware can go all the
time, during boot and during runtime. For some technologies it is not
even possible to know when the device enumeration is complete
(example: USB, or iSCSI), thus waiting for all storage devices to show
up and be probed must necessarily include a fixed delay when it is
assumed that all devices that can show up have shown up, and got
probed. In this case all this shows very negatively in the boot time: the
storage scripts force us to delay bootup until all potential devices
have shown up and all devices that did got probed — and all that even
though we don’t actually need most devices for anything. In particular
since this machine actually does not make use of LVM, RAID or
Multipath![2]

Knowing what we know now we can go and disable
udev-settle.service for the next boots: since neither LVM,
RAID nor Multipath is used we can mask the services in question and
thus speed up our boot a little:

# ln -s /dev/null /etc/systemd/system/udev-settle.service
# ln -s /dev/null /etc/systemd/system/fedora-wait-storage.service
# ln -s /dev/null /etc/systemd/system/fedora-storage-init.service
# systemctl daemon-reload

After restarting we can measure that the boot is now about 1s
faster. Why just 1s? Well, the second worst offender is cryptsetup
here: the machine in question has an encrypted
/home directory. For testing purposes I have stored the
passphrase in a file on disk, so that the boot-up is not delayed
because I as the user am a slow typer. The cryptsetup tool
unfortunately still takes more than 5s to set up the encrypted
partition. Being lazy instead of trying to fix
cryptsetup[3] we’ll just tape over it here [4]:
systemd will normally wait for all file systems not marked with the
noauto option in /etc/fstab to show up, to be fscked and to
be mounted before proceeding bootup and starting the usual system
services. In the case of /home (unlike for example
/var) we know that it is needed only very late (i.e. when the
user actually logs in). An easy fix is hence to make the mount point
available already during boot, but not actually wait until cryptsetup,
fsck and mount finished running for it. You ask how we can make a
mount point available before actually mounting the file system behind
it? Well, systemd possesses magic powers, in form of the
comment=systemd.automount mount option in
/etc/fstab. If you specify it, systemd will create an
automount point at /home and when at the time of the first
access to the file system it still isn’t backed by a proper file
system systemd will wait for the device, fsck and mount it.
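
As a sketch of what that change could look like (device name and filesystem
type are made up for this example; on the machine in question the decrypted
/home would actually be some /dev/mapper/luks-… device), the relevant
/etc/fstab line simply grows the extra mount option:

/dev/mapper/home   /home   ext4   defaults,comment=systemd.automount   0  2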

And here’s the result with this change to /etc/fstab
made:

systemd[1]: Startup finished in 2s 47ms 112us (kernel) + 2s 663ms 942us (initrd) + 5s 540ms 522us (userspace) = 10s 251ms 576us.

Nice! With a few fixes we took almost 7s off our boot-time. And
these two changes are only fixes for the two most superficial
problems. With a bit of love and detail work there’s a lot of
additional room for improvements. In fact, on a different machine, a
more than two year old X300 laptop (which even back then wasn’t the
fastest machine on earth), and with a bit of decrufting, we have boot times
of around 4s (total) now, with a reasonably complete GNOME system. And there’s
still a lot of room in it.

systemd-analyze blame is a nice and simple tool for
tracking down slow services. However, it suffers from a big problem: it
does not visualize how the parallel execution of the services actually
diminishes the price one pays for slow starting services. For that we
have prepared systemd-analyze plot for you. Use it like
this:

$ systemd-analyze plot > plot.svg
$ eog plot.svg

It creates pretty graphs, showing the time services spent to start
up in relation to the other services. It currently doesn’t visualize
explicitly which services wait for which ones, but with a bit of guess
work this is easily seen nonetheless.

To see the effect of our two little optimizations here are two
graphs generated with systemd-analyze plot, the first before
and the other after our change:

Before After

(For the sake of completeness, here are the two complete outputs of
systemd-analyze blame for these two boots: before and after.)

The well-informed reader probably wonders how this relates to Michael Meeks’
bootchart. This plot and bootchart do show similar graphs, that is
true. Bootchart is by far the more powerful tool. It plots in all
detail what is happening during the boot, how much CPU and IO is
used. systemd-analyze plot shows more high-level data: which
service took how much time to initialize, and what needed to wait for
it. If you use them both together you’ll have a wonderful toolset to
figure out why your boot is not as fast as it could be.

Now, before you take these tools and start filing bugs against
the worst boot-up time offenders on your system: think twice. These
tools give you raw data, don’t misread it. As my optimization example
above hopefully shows, the blame for the slow bootup was not actually
with udev-settle.service, and not with the ModemManager
prober run by it either. It is with the subsystem that pulled this
service in in the first place. And that’s where the problem needs to
be fixed. So, file the bugs at the right places. Put the blame where
the blame belongs.

As mentioned, these three utilities are available on your Fedora 15
system out-of-the-box.

And here’s what to take home from this little blog story:

  • systemd-analyze is a wonderful tool and systemd comes
    with profiling built in.
  • Don’t misread the data these tools generate!
  • With two simple changes you might be able to speed up your system
    by 7s!
  • Fix your software if it can’t handle dynamic hardware
    properly!
  • The Fedora default of installing the OS on an enterprise-level
    storage managing system might be something to rethink.

And that’s all for now. Thank you for your interest.

Footnotes

[1] Also known as the greatest Free Software OS release
ever.

[2] The right fix here is to improve the services in
question to actively listen to hotplug events via libudev or similar
and act on the devices showing up as they show up, so that we can
continue with the bootup the instant everything we really need to go
on has shown up. To get a quick bootup we should wait for what we
actually need to proceed, not for everything. Also note that the
storage services are not the only services which do not cope well with
modern dynamic hardware, and assume that the device list is static and
stays unchanged. For example, in this case the reason the initrd is
as slow as it is stems mostly from the fact that Plymouth
expects to be executed when all video devices have shown up and have
been probed. For an unknown reason (at least unknown to me) loading
the video kernel modules for my Intel graphics cards takes multiple
seconds, and hence the entire boot is delayed unnecessarily. (Here too
I’d not put the blame on the probing but on the fact that we
wait for it to complete before going on.)

[3] Well, to be precise, I actually did try to get this
fixed. Most of the delay of cryptsetup stems from the — in my eyes —
unnecessarily high default values for --iter-time in
cryptsetup. I tried to convince our cryptsetup maintainers that 100ms
as a default here is not really less secure than 1s, but well, I
failed.

[4] Of course, it’s usually not our style to just tape over
problems instead of fixing them, but this is such a nice occasion to
show off yet another cool systemd feature…

A Brief Tutorial on a Shared Git Repository

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2011/01/23/git-shared-repository-tutorial.html

A while ago, I set up Git for a group privately sharing the same
central repository. Specifically, this is a tutorial for those who would
want to have a Git setup that is a little bit like an SVN repository: a
central repository that has all the branches that matter published there
in one repository. I found this file today floating in a directory of
“things I should publish at some point”, so I decided just to
put it up, as every time I came across this file, it reminded me I should
put this up and it’s really morally wrong (IMO) to keep generally useful
technical information private, even when it’s only laziness that’s causing
it.

Before you read this, note that most developers don’t use Git this way,
particularly with the advent of shared hosting facilities like
Gitorious, as systems like Gitorious solve the
weirdness of problems that this tutorial addresses. When I originally
wrote this (more than a year ago), the only well-known project that I
found using a system like this was Samba; I haven’t seen a lot of other
projects that do this. Indeed, this process is not really what Git is
designed to do, but sometimes groups that are used to SVN expect there to be
a “canonical repository” that has all the contents of the
shared work under one proverbial roof, and set up a “one true Git
repository” for the project from which everyone clones.

Thus, this tutorial is primarily targeted at a user mostly familiar
with an SVN workflow who has ssh access to
host.example.org, which hosts a Git repository (usually writable
by multiple people) living in the directory
/git/REPOSITORY.git/.

Ultimately, the stuff that I’ve documented herein is basically to fill
in the gaps that I found when reading the following tutorials:

  • The Git Crash Course for SVN Users. (NOTE: some things in that
    tutorial, where the author says various commands are equivalent to
    various svn commands, are misleading. It’d be better said: if you do
    foo in Git, it will feel like you were using svn and did bar.)
  • Linux Developers’ Git Manual
  • The Official Git Tutorial.
  • A Collection site for Git Documentation (which includes some links
    already here).
  • The Samba guys use Git very similarly to the method discussed in
    this tutorial.
  • markpasc points out there is no svn cp equivalent in git.
  • reinh’s blog post is the first thing I ever read that explained
    git push particularly well.

So, here’s my tutorial, FWIW. (I apologize that I make the mortal sin
of tutorial writing: I drift wildly between second-person-singular,
first-person-plural, and passive-voice third-person. If someone sends
me a patch to the HTML file that fixes this, I’ll fix it. 🙂)

Initial Setup

Before you start using git, you should run these commands to let it
know who you are so your info appears correctly in commit logs:

$ git config --global user.email [email protected]
$ git config --global user.name "Your Real Name"

Examining Your First Clone

To get started, first we clone the repository:

$ git clone ssh://host.example.org/git/REPOSITORY.git/

Now, note that Git almost always operates in terms of
branches. Unlike Subversion, Git’s branches are first-class citizens and
most operations in Git operate around a branch. The default branch is
often called “master”, although I tend to avoid using the
master branch for much, mainly because everyone who uses git has a
different perception of what the master branch should embody. Therefore,
giving all your branches more descriptive names is helpful. But, when you
first import something into git, (for example, from existing Subversion
trees), everything from Subversion’s trunk is thrown on the master
branch.

So, we take a look at the result of that clone command. We have a new
directory, called REPOSITORY, that contains a “working
checkout” of the repository, and under that there is one special
directory, REPOSITORY/.git/, which is a full copy of the repository. Note
that this is not like Subversion, where what you have on your local
machine is merely one view of the repository. With Git, you have a full
copy of everything. However, an interesting thing has been done on your
copy with the branches. You can take a look with these commands:

$ git branch
* master
$ git branch -r
origin/HEAD
origin/master

The first list shows the branches that are personal and local
to you. (By default, git branch uses the -l option,
which shows you only “local” branches; -r means
“remote” branches. You can also use -a to see all of
them.) Unless you take action to publish your local branches in some way,
they will be your private area to work in and live only on your
computer. (And be aware: they are not backed up unless you back them up!)
The remote ones, which all start with “origin/”, track the
progress on the shared repository.

(Note the term “origin” is a standard way of referring to
“the repository from whence you cloned”, and
origin/BRANCH refers to “BRANCH as it looks in the
repository from whence you cloned”. However, there is nothing
magical about the name “origin”. It’s set up to DTRT in your
WORKING-DIRECTORY/.git/config file, and the clone command set it
all up for you, which is why you have them now.)
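
If you are curious what that shorthand looks like on disk, the
relevant part of WORKING-DIRECTORY/.git/config typically resembles
something like this (shown purely as an illustration; your URL will
differ):

[remote "origin"]
        url = ssh://host.example.org/git/REPOSITORY.git/
        fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
        remote = origin
        merge = refs/heads/master

The branch section here is, incidentally, exactly the kind of mapping
that the git config commands later in this tutorial create by hand for
a newly published branch.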

Get to Work

The canonical way to “get moving” with a new task in Git is
to somehow create a branch for it. Branches are designed to be cheap and
quick to create so that users will not be shy about creating a new one.
Naming conventions are your own, but generally I like to call a
branch USERNAME/TASK when I’m still not sure exactly what I’ll be
doing with it (i.e., who I will publish it to, etc.) You can always merge
it back into another branch, or copy it to another branch (perhaps using a
more formal name) later.

Where do you Start Your Branch From?

Once a repository exists, each branch in the repository comes from
somewhere — it has a parent. These relationships help Git know how
to easily merge branches together. So, the most typical procedure of
starting a new branch of your own is to begin with an existing branch.
The git checkout command is the easiest to use to start this:

git checkout -b USERNAME/feature origin/master

In this example, we’ve created our own local branch, called
USERNAME/feature, and it’s started from the current state
of origin/master. When you are getting started, you will
probably want to base your new branches off of ones that
exist on the origin. This isn’t a rule, it’s just less confusing
for a newbie if all your branches have a parent revision that lives on the
server.

Now, it’s important to note here that no branch stands still. It’s
best to think about a branch as a “moving pointer” to a linked
list of some set of revisions in the repository.

Every revision stored in git, local or remote, has a SHA1 which is
computed based on the revisions before it plus the new patch the
revision just applied.

Meanwhile, the only two substantive differences between one of these
SHA1 identifiers and an actual branch are that (a) Git keeps changing what
identifier the branch refers to as new commits come in (aka it moves the
branch’s HEAD), and (b) Git keeps track of the history of identifiers the
branch previously referred to.

So, above, when we asked git checkout to create a new branch called
USERNAME/feature based on origin/master, the two
important things to realize are that (a) your new branch has its HEAD
pointing at the same head that is currently the HEAD of
origin/master, and (b) you got a new list to start adding
revisions in the new branch.

We didn’t have to use a branch for that. We could have simply started
our branch from any old SHA1 of any revision. We happened to want to
declare a relationship with the master branch on the server in
this case, but we could have easily picked any SHA1 from our git log and
used that one.
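
For example, you could pick any revision out of git log and branch
from it directly (the abbreviated SHA1s and commit messages below are
just placeholders):

$ git log --oneline
9b1c0c9 Most recent commit on this branch
ded2fb3 Some older commit we want to start from
...
$ git checkout -b USERNAME/experiment ded2fb3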

Do Not Fear the checkout

Every time you run a git checkout SOMETHING command, your
entire working directory changes. This normally scares Subversion users;
it certainly scared me the first time I used git checkout
SOMETHING. But, the only reason it is scary is because svn
switch, which is the roughly analogous command in the Subversion
world, so often doesn’t do something sane with your working copy. By
contrast, switching branches and changing your whole working directory is
a common occurrence with git.

Note, however, that you cannot do git checkout with
uncommitted changes in your directory (which, BTW, also makes it safer
than svn switch). However, don’t be too Subversion-user-like and
therefore afraid to commit things. Remember, with Git (and unlike with
Subversion), committing and publishing are two different operations. You
can commit to your heart’s content on local branches and merge or push
into public branches later. (There are even commands to squash many
commits into one before putting it on a public branch, in case you don’t
want people to see all the intermediate goofiness you might have done.
This is why, BTW, many Git users commit as often as an SVN user would save
in their editors.)

However, if you must switch checkouts but really do fear making
commits, there is a tool for you: look into git stash.
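
A minimal sketch of that workflow, in case you have never used it
(the branch names are just examples):

$ git stash                      # squirrel away the uncommitted changes
$ git checkout master            # the switch is now allowed
  ... do whatever you needed to do on master ...
$ git checkout USERNAME/feature
$ git stash pop                  # bring the uncommitted changes back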

Share with the Group

Once you’ve been doing some work, you’ll end up with some useful work
finished on a USERNAME/feature branch. As noted before, this is
your own private branch. You probably want to use the shared repository
to make your work available to others.

When using a shared Git repository, there are two ways to share your
branches with your colleagues. The first procedure is when you simply
want to publish directly on an existing branch. The second is when you
wish to create your own branch.

Publishing to Existing Branch

You may choose to merge your work directly into a known branch on the
remote repository. That’s a viable option, certainly, but often you want
to make it available on a separate branch for others to examine, even
before you merge it into something like the master branch.
We discuss the slightly more complicated new branch publication next, but
for the moment, we can consider the quicker process of publishing to an
existing branch.

Let’s consider when we have work on USERNAME/feature and we
would like to make it available on the master branch. Make sure
your USERNAME/feature branch is clean (i.e., all your changes are
committed).

The first thing you should verify is that you have what I call a
“local tracking branch” (this is my own term that I made up, I
think; you won’t likely see it in other documentation) that is tied
directly with the same name to the origin. This is not completely
necessary, but is much more convenient to keep track of what you are
doing. To check, do a:

$ git branch -a
* USERNAME/feature
master
origin/master

In the list, you should see both master and
origin/master. If you don’t have that, you should create it
with:

$ git checkout -b master origin/master

So, either way, you want to be on the master branch. To get
there if it already exists, you can run:

$ git checkout master

And you should be able to verify that you are now on master with:

$ git branch
* master

Now, we’re ready to merge in our changes:

$ git merge USERNAME/feature
Updating ded2fb3..9b1c0c9
Fast forward
FILE …
N files changed, X insertions(+), Y deletions(-)

If you don’t get any message about conflicts, everything is fine. Your
changes from USERNAME/feature are now on master. Next,
we publish it to the shared repository:

$ git push
Counting objects: N, done.
Compressing objects: 100% (A/A), done.
Writing objects: 100% (A/A), XXX bytes, done.
Total G (delta T), reused 0 (delta 0)
refs/heads/master: IDENTIFIER_X -> IDENTIFIER_Y
To ssh://host.example.org/git/REPOSITORY.git
X..Y master -> master

Your changes can now be seen by others when they git pull (See
below for details).

Publishing to a New Branch

Suppose that, instead of immediately putting the feature on
the master branch, you wanted to simply mirror your personal
feature branch to the rest of your colleagues so they can try it out
before it officially becomes part of master. To do that, you
first need to tell Git that we want to make a new branch on the shared
repository. In this case, you do have to use the git push
command as well. (It is a catch-all command for any operation you want
to perform on the remote repository without actually logging into the
server where the shared Git repository is hosted. Thus, not
surprisingly, nearly any git push command you can think of will
require you to be net.connected.)

So, first let’s create a local branch that has the actual name we want
to use publicly. To do this, we’ll just use the checkout command, because
it’s the most convenient and quick way to create a local branch from an
already existing local branch:

$ git branch -l
* USERNAME/feature
master

$ git checkout -b proposed-feature USERNAME/feature
Switched to a new branch "proposed-feature"
$ git branch -l
* proposed-feature
USERNAME/feature
master

Now, again, we’ve only created this branch locally. We need an
equivalent branch on the server, too. This is where git push comes in:

$ git push origin proposed-feature:refs/heads/proposed-feature

Let’s break that command down. The first argument for push is always
“the place you are pushing to”. That can be any sort of git
URL, including ssh://, http://, or git://. However, remember that the
original clone operation set up this shorthand “origin” to
refer to the place from whence we cloned. We’ll use that shorthand here
so we don’t have to type out that big long URL.

The second argument is a colon-separated item. The left hand side is
the local branch we’re pushing from on our local repository, and
the right hand side is the branch we are pushing to on the remote
repository.

(BTW, I have no idea why refs/heads/ is necessary. It seems
you should be able to say proposed-feature:proposed-feature and git would
figure out what you mean. But, in the setups I’ve worked with, it doesn’t
usually work if you don’t put in refs/heads/.)

That operation will take a bit to run, but when it is done we see
something like:

Counting objects: 35, done.
Compressing objects: 100% (31/31), done.
Writing objects: 100% (33/33), 9.44 MiB | 262 KiB/s, done.
Total 33 (delta 1), reused 27 (delta 0)
refs/heads/proposed-feature: 0000000000000000000000000000000000000000
-> CURRENT_HEAD_SHA1_SUM
To ssh://host.example.org/git/REPOSITORY.git/
* [new branch] proposed-feature -> proposed-feature

In older Git clients, you may not see that last line, and you won’t get
the origin/proposed-feature branch until you do a subsequent pull. I
believe newer git clients do the pull automatically for you.

Reconfiguring Your Client to see the New Remote Branch

Annoyingly, as the creator of the branch, we have some extra config
work to do to officially tell our repository copy that these two branches
should be linked. Git didn’t know from our single git push
command that our repository’s relationship with that remote branch was
going to be a long-term thing. To marry our local
proposed-feature branch to origin/proposed-feature, we
must use the commands:

$ git config branch.proposed-feature.remote origin
$ git config branch.proposed-feature.merge refs/heads/proposed-feature

We can see that this branch now exists because we find:

$ git branch -a
* proposed-feature
USERNAME/feature
master
origin/HEAD
origin/proposed-feature
origin/master

After this is done, the remote repository has a
proposed-feature branch and, locally, we have a
proposed-feature branch that is a “local tracking
branch” of origin/proposed-feature. Note that
our USERNAME/feature, where all this stuff started from, is
still around too, but can be deleted with:

git branch -d USERNAME/feature

Finding It Elsewhere

Meanwhile, someone else who has separately cloned the repository before
we did this won’t see these changes automatically, but a simple git
pull command can get it:

$ git pull
remote: Generating pack…
remote: Done counting 35 objects.
remote: Result has 33 objects.
remote: Deltifying 33 objects…
remote: 100% (33/33) done
remote: Total 33 (delta 1), reused 27 (delta 0)
Unpacking objects: 100% (33/33), done.
From ssh://host.example.org/git/REPOSITORY.git
* [new branch] proposed-feature -> origin/proposed-feature
Already up-to-date.
$ git branch -a
* master
origin/HEAD
origin/proposed-feature
origin/master

However, their checkout directory won’t be updated to show the changes
until they make a local “mirror” branch of the new remote
branch. Usually, this would be done with:

$ git checkout -b proposed-feature origin/proposed-feature

Then they’ll have a working copy with all the data and a local branch
to work on.

BTW, if you want to try this yourself just to see how it works, you can
always make another clone in some other directory just to play with, by
doing something like:

$ git clone ssh://host.example.org/git/SOME-REPOSITORY.git/
extra-clone-for-git-didactic-purposes

Now on this secondary checkout (which makes you just like the user who
is not the creator of the new branch), work can be pushed and pulled on
that branch easily. Namely, anything you merge into or commit on your
local proposed-feature branch will automatically be pushed to
origin/proposed-feature on the server when you git push. And,
anything that shows up from other users on the origin/proposed-feature
branch will show up when you do a git pull. These two branches were paired
together from the start.

Irrational Rebased Fears

When using a shared repository like this, it’s generally the case that
git rebase screws something up. When Git is used in the
“normal way”, rebase is one of the amazing things about Git.
The rebase idea is: you unwind the entire work you’ve done on one of your
local branches, bringing in changes that other people have made in the
meantime, and then reapply your changes on top of them.
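
In command form, that idea looks something like this; the branch names
follow the example below, and this is only a sketch of the general
pattern, not a recommendation to run it against a shared branch:

$ git checkout bkuhn/new-feature
$ git fetch origin
$ git rebase origin/starting-branch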

It works out great when you use Git the way the Linux Project does.
However, if you use a single, shared repository in a work group, rebase
can be dangerous.

Generally speaking, though, with a shared repository, you can
use git merge and won’t need rebasing. My usual work flow is
that I get started on a feature with:

$ git checkout -b bkuhn/new-feature starting-branch

I work work work away on it. Then, when it’s ready, I send around to a
mailing list a patch that I generate with:

$ git diff $(git merge-base starting-branch bkuhn/new-feature) bkuhn/new-feature

Note that the thing in the $() returns a single identifier for a
version, namely, the version of the fork point between starting-branch and
bkuhn/new-feature. Therefore, the diff output is just the stuff I’ve
actually changed. This generates all the differences between the place
where I forked and my current work.
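
As an aside, newer versions of git also understand a three-dot diff
syntax that computes the merge base for you, so the same patch can be
produced with the shorter:

$ git diff starting-branch...bkuhn/new-feature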

Once I have discussed and decided with my co-developers that we like
what I’ve done, I do this:

$ git checkout starting-branch
$ git merge bkuhn/new-feature

If all went well, this should automatically commit my feature into
starting-branch. Usually, there is also an origin/starting-branch, which
I’ve probably set up for automatic push/pull with my local
starting-branch, so I can then make the change officially by running:

$ git push

The fact that I avoid rebase is probably merely FUD, and if I learned
more, I could use it safely in cases with a shared repository. But I
have no advice on how to make it work. In particular, this Git FAQ
entry shows quite clearly that my work sequence ceases to work
all that well when you do a rebase — namely, doing a git
push becomes more complicated.

I am sure a rebase would easily become very necessary if I lived on
bkuhn/new-feature for a long time and there had been tons of changes
underneath me, but I generally try not to dive too deep into a fork,
although many people love DVCS because they can do just that. YMMV,
etc.

Handy binaries for Thecus NAS boxes

Post Syndicated from Laurie Denness original https://laur.ie/blog/2010/11/handy-binaries-for-thecus-nas-boxes/

I recently took delivery of the rather splendid Thecus N5500, which I love; it’s the perfect mix between “it just works” and “oh, let’s stick SSH on there and poke around”. With 5 hot-swap disk shelves and 2TB hard drives, you’ve got a serious amount of storage.

For your money you get a very nice little piece of hardware in a pretty nice shell (it strikes me as a touch tacky in places but then again it’s hardly going on show) with software that gets the job done. NFS, AFP, Samba, iSCSI, iTunes DAAP support, and plenty of modules to tickle your fancy (Logitech Squeezecenter, for instance).

But who am I kidding, I’m a sysadmin. 10 minutes after powering the thing on I was dying to log in using SSH so I could watch /proc/mdstat to see the RAID build. Luckily, the modules from the Thecus N5200 work fine, which means you’re a couple of clicks away from a root terminal.

  1. Grab the SSH and SYSUSER N5200 modules, and unzip them (a mistake I made… how embarrassing).
  2. Upload them using the web interface, and enable them.
  3. SSH to the NAS box using the user “sys” and the password “sys”.
  4. Enjoy your shell, and remember to run `passwd sys` to change the password to something else.

Now, you’ve got yourself a pretty handy, albeit BusyBox-ridden, install of Linux. The whole point of this post is to pimp a few statically compiled binaries that might come in useful to you; they did to me anyway.

(You may wish to install the UTILITIES module, which gives you a proper version of top and ps, amongst other things, available here.)

You can simply untar and drop the binaries into the /raid/data/modules/bin folder so that they’re in your path and stored on your disks rather than the flash units, which are rather limited in space. By the way, these modules should also work fine on the Thecus N5200 NAS boxes.

The binaries are available here: http://denness.net/thecus/binaries/
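
For instance, installing one of them boils down to something like the
following (the tarball name here is only an example, so check the
directory listing above for the real file names, and copy the file
over with scp instead if your BusyBox build lacks wget):

# cd /raid/data/modules/bin
# wget http://denness.net/thecus/binaries/rsync.tar.gz
# tar xzf rsync.tar.gz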

The list includes (all the latest versions as of the date of this blog post):

  • ethtool, handy for network interface prodding
  • iftop, a very useful “GUI” app that shows incoming/outgoing network bandwidth (let’s face it, this is fun on a NAS. NOTE: you may need to execute this one using `TERM=vt100; iftop`)
  • iostat, for hard core disk stats porn. Run it with `iostat -mx 1` and watch the megabytes fly
  • rsync, particularly handy if you want to synchronise/backup data from one place to another, so especially handy on a NAS.
  • vim, just in case you were planning on writing a lot of code on the Thecus 🙂
  • GNU screen, a nice place to store your terminals and detach and come back later. (NOTE: you may need to execute this one using `TERM=vt100; screen`)
  • The command line version of PHP, in case you were planning on writing any scripts in PHP to run on the Thecus.

Any suggestions/comments, let me know.

systemd for Administrators, Part II

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/projects/systemd-for-admins-2.html

Here’s the second installment of my ongoing series about systemd for administrators.

Which Service Owns Which Processes?

On most Linux systems the number of processes that are running by
default is substantial. Knowing which process does what and where it
belongs becomes increasingly difficult. Some services even maintain
a couple of worker processes which clutter the “ps” output with
many additional processes that are often not easy to recognize. This is
further complicated if daemons spawn arbitrary 3rd-party processes, as
Apache does with CGI processes, or cron does with user jobs.

A slight remedy for this is often the process inheritance tree, as
shown by “ps xaf”. However this is usually not reliable, as processes
whose parents die get reparented to PID 1, and hence all information
about inheritance gets lost. If a process “double forks” it hence loses
its relationships to the processes that started it. (This actually is
supposed to be a feature and is relied on for the traditional Unix
daemonizing logic.) Furthermore processes can freely change their names
with PR_SET_NAME or by patching argv[0], thus making
it harder to recognize them. In fact they can play hide-and-seek with
the administrator pretty nicely this way.

In systemd we place every process that is spawned in a control
group named after its service. Control groups (or cgroups)
at their most basic are simply groups of processes that can be
arranged in a hierarchy and labelled individually. When processes
spawn other processes these children are automatically made members of
the parent’s cgroup. Leaving a cgroup is not possible for unprivileged
processes. Thus, cgroups can be used as an effective way to label
processes after the service they belong to and be sure that the
service cannot escape from the label, regardless of how often it forks or
renames itself. Furthermore this can be used to safely kill a service
and all processes it created, again with no chance of escaping.
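
A quick way to see this labelling for yourself, independent of any
tooling, is to look at a process’s cgroup membership directly in
/proc. For example (the PID is borrowed from the listing below; yours
will differ, and the exact set of lines depends on which hierarchies
are mounted), among the output you should see something like:

$ cat /proc/1362/cgroup
1:name=systemd:/systemd-1/sshd.service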

In today’s installment I want to introduce you to two commands you
may use to relate systemd services and processes. The first one is
the well-known ps command, which has been updated to show
cgroup information along with the other process details. And this is how it
looks:

$ ps xawf -eo pid,user,cgroup,args
PID USER CGROUP COMMAND
2 root - [kthreadd]
3 root - _ [ksoftirqd/0]
[…]
4281 root - _ [flush-8:0]
1 root name=systemd:/systemd-1 /sbin/init
455 root name=systemd:/systemd-1/sysinit.service /sbin/udevd -d
28188 root name=systemd:/systemd-1/sysinit.service _ /sbin/udevd -d
28191 root name=systemd:/systemd-1/sysinit.service _ /sbin/udevd -d
1096 dbus name=systemd:/systemd-1/dbus.service /bin/dbus-daemon --system --address=systemd: --nofork --systemd-activation
1131 root name=systemd:/systemd-1/auditd.service auditd
1133 root name=systemd:/systemd-1/auditd.service _ /sbin/audispd
1135 root name=systemd:/systemd-1/auditd.service _ /usr/sbin/sedispatch
1171 root name=systemd:/systemd-1/NetworkManager.service /usr/sbin/NetworkManager --no-daemon
4028 root name=systemd:/systemd-1/NetworkManager.service _ /sbin/dhclient -d -4 -sf /usr/libexec/nm-dhcp-client.action -pf /var/run/dhclient-wlan0.pid -lf /var/lib/dhclient/dhclient-7d32a784-ede9-4cf6-9ee3-60edc0bce5ff-wlan0.lease –
1175 avahi name=systemd:/systemd-1/avahi-daemon.service avahi-daemon: running [epsilon.local]
1194 avahi name=systemd:/systemd-1/avahi-daemon.service _ avahi-daemon: chroot helper
1193 root name=systemd:/systemd-1/rsyslog.service /sbin/rsyslogd -c 4
1195 root name=systemd:/systemd-1/cups.service cupsd -C /etc/cups/cupsd.conf
1207 root name=systemd:/systemd-1/mdmonitor.service mdadm --monitor --scan -f --pid-file=/var/run/mdadm/mdadm.pid
1210 root name=systemd:/systemd-1/irqbalance.service irqbalance
1216 root name=systemd:/systemd-1/dbus.service /usr/sbin/modem-manager
1219 root name=systemd:/systemd-1/dbus.service /usr/libexec/polkit-1/polkitd
1242 root name=systemd:/systemd-1/dbus.service /usr/sbin/wpa_supplicant -c /etc/wpa_supplicant/wpa_supplicant.conf -B -u -f /var/log/wpa_supplicant.log -P /var/run/wpa_supplicant.pid
1249 68 name=systemd:/systemd-1/haldaemon.service hald
1250 root name=systemd:/systemd-1/haldaemon.service _ hald-runner
1273 root name=systemd:/systemd-1/haldaemon.service _ hald-addon-input: Listening on /dev/input/event3 /dev/input/event9 /dev/input/event1 /dev/input/event7 /dev/input/event2 /dev/input/event0 /dev/input/event8
1275 root name=systemd:/systemd-1/haldaemon.service _ /usr/libexec/hald-addon-rfkill-killswitch
1284 root name=systemd:/systemd-1/haldaemon.service _ /usr/libexec/hald-addon-leds
1285 root name=systemd:/systemd-1/haldaemon.service _ /usr/libexec/hald-addon-generic-backlight
1287 68 name=systemd:/systemd-1/haldaemon.service _ /usr/libexec/hald-addon-acpi
1317 root name=systemd:/systemd-1/abrtd.service /usr/sbin/abrtd -d -s
1332 root name=systemd:/systemd-1/[email protected]/tty2 /sbin/mingetty tty2
1339 root name=systemd:/systemd-1/[email protected]/tty3 /sbin/mingetty tty3
1342 root name=systemd:/systemd-1/[email protected]/tty5 /sbin/mingetty tty5
1343 root name=systemd:/systemd-1/[email protected]/tty4 /sbin/mingetty tty4
1344 root name=systemd:/systemd-1/crond.service crond
1346 root name=systemd:/systemd-1/[email protected]/tty6 /sbin/mingetty tty6
1362 root name=systemd:/systemd-1/sshd.service /usr/sbin/sshd
1376 root name=systemd:/systemd-1/prefdm.service /usr/sbin/gdm-binary -nodaemon
1391 root name=systemd:/systemd-1/prefdm.service _ /usr/libexec/gdm-simple-slave –display-id /org/gnome/DisplayManager/Display1 –force-active-vt
1394 root name=systemd:/systemd-1/prefdm.service _ /usr/bin/Xorg :0 -nr -verbose -auth /var/run/gdm/auth-for-gdm-f2KUOh/database -nolisten tcp vt1
1495 root name=systemd:/user/lennart/1 _ pam: gdm-password
1521 lennart name=systemd:/user/lennart/1 _ gnome-session
1621 lennart name=systemd:/user/lennart/1 _ metacity
1635 lennart name=systemd:/user/lennart/1 _ gnome-panel
1638 lennart name=systemd:/user/lennart/1 _ nautilus
1640 lennart name=systemd:/user/lennart/1 _ /usr/libexec/polkit-gnome-authentication-agent-1
1641 lennart name=systemd:/user/lennart/1 _ /usr/bin/seapplet
1644 lennart name=systemd:/user/lennart/1 _ gnome-volume-control-applet
1646 lennart name=systemd:/user/lennart/1 _ /usr/sbin/restorecond -u
1652 lennart name=systemd:/user/lennart/1 _ /usr/bin/devilspie
1662 lennart name=systemd:/user/lennart/1 _ nm-applet –sm-disable
1664 lennart name=systemd:/user/lennart/1 _ gnome-power-manager
1665 lennart name=systemd:/user/lennart/1 _ /usr/libexec/gdu-notification-daemon
1670 lennart name=systemd:/user/lennart/1 _ /usr/libexec/evolution/2.32/evolution-alarm-notify
1672 lennart name=systemd:/user/lennart/1 _ /usr/bin/python /usr/share/system-config-printer/applet.py
1674 lennart name=systemd:/user/lennart/1 _ /usr/lib64/deja-dup/deja-dup-monitor
1675 lennart name=systemd:/user/lennart/1 _ abrt-applet
1677 lennart name=systemd:/user/lennart/1 _ bluetooth-applet
1678 lennart name=systemd:/user/lennart/1 _ gpk-update-icon
1408 root name=systemd:/systemd-1/console-kit-daemon.service /usr/sbin/console-kit-daemon –no-daemon
1419 gdm name=systemd:/systemd-1/prefdm.service /usr/bin/dbus-launch –exit-with-session
1453 root name=systemd:/systemd-1/dbus.service /usr/libexec/upowerd
1473 rtkit name=systemd:/systemd-1/rtkit-daemon.service /usr/libexec/rtkit-daemon
1496 root name=systemd:/systemd-1/accounts-daemon.service /usr/libexec/accounts-daemon
1499 root name=systemd:/systemd-1/systemd-logger.service /lib/systemd/systemd-logger
1511 lennart name=systemd:/systemd-1/prefdm.service /usr/bin/gnome-keyring-daemon –daemonize –login
1534 lennart name=systemd:/user/lennart/1 dbus-launch –sh-syntax –exit-with-session
1535 lennart name=systemd:/user/lennart/1 /bin/dbus-daemon –fork –print-pid 5 –print-address 7 –session
1603 lennart name=systemd:/user/lennart/1 /usr/libexec/gconfd-2
1612 lennart name=systemd:/user/lennart/1 /usr/libexec/gnome-settings-daemon
1615 lennart name=systemd:/user/lennart/1 /usr/libexec/gvfsd
1626 lennart name=systemd:/user/lennart/1 /usr/libexec//gvfs-fuse-daemon /home/lennart/.gvfs
1634 lennart name=systemd:/user/lennart/1 /usr/bin/pulseaudio –start –log-target=syslog
1649 lennart name=systemd:/user/lennart/1 _ /usr/libexec/pulse/gconf-helper
1645 lennart name=systemd:/user/lennart/1 /usr/libexec/bonobo-activation-server –ac-activate –ior-output-fd=24
1668 lennart name=systemd:/user/lennart/1 /usr/libexec/im-settings-daemon
1701 lennart name=systemd:/user/lennart/1 /usr/libexec/gvfs-gdu-volume-monitor
1707 lennart name=systemd:/user/lennart/1 /usr/bin/gnote –panel-applet –oaf-activate-iid=OAFIID:GnoteApplet_Factory –oaf-ior-fd=22
1725 lennart name=systemd:/user/lennart/1 /usr/libexec/clock-applet
1727 lennart name=systemd:/user/lennart/1 /usr/libexec/wnck-applet
1729 lennart name=systemd:/user/lennart/1 /usr/libexec/notification-area-applet
1733 root name=systemd:/systemd-1/dbus.service /usr/libexec/udisks-daemon
1747 root name=systemd:/systemd-1/dbus.service _ udisks-daemon: polling /dev/sr0
1759 lennart name=systemd:/user/lennart/1 gnome-screensaver
1780 lennart name=systemd:/user/lennart/1 /usr/libexec/gvfsd-trash –spawner :1.9 /org/gtk/gvfs/exec_spaw/0
1864 lennart name=systemd:/user/lennart/1 /usr/libexec/gvfs-afc-volume-monitor
1874 lennart name=systemd:/user/lennart/1 /usr/libexec/gconf-im-settings-daemon
1903 lennart name=systemd:/user/lennart/1 /usr/libexec/gvfsd-burn –spawner :1.9 /org/gtk/gvfs/exec_spaw/1
1909 lennart name=systemd:/user/lennart/1 gnome-terminal
1913 lennart name=systemd:/user/lennart/1 _ gnome-pty-helper
1914 lennart name=systemd:/user/lennart/1 _ bash
29231 lennart name=systemd:/user/lennart/1 | _ ssh tango
2221 lennart name=systemd:/user/lennart/1 _ bash
4193 lennart name=systemd:/user/lennart/1 | _ ssh tango
2461 lennart name=systemd:/user/lennart/1 _ bash
29219 lennart name=systemd:/user/lennart/1 | _ emacs systemd-for-admins-1.txt
15113 lennart name=systemd:/user/lennart/1 _ bash
27251 lennart name=systemd:/user/lennart/1 _ empathy
29504 lennart name=systemd:/user/lennart/1 _ ps xawf -eo pid,user,cgroup,args
1968 lennart name=systemd:/user/lennart/1 ssh-agent
1994 lennart name=systemd:/user/lennart/1 gpg-agent –daemon –write-env-file
18679 lennart name=systemd:/user/lennart/1 /bin/sh /usr/lib64/firefox-3.6/run-mozilla.sh /usr/lib64/firefox-3.6/firefox
18741 lennart name=systemd:/user/lennart/1 _ /usr/lib64/firefox-3.6/firefox
28900 lennart name=systemd:/user/lennart/1 _ /usr/lib64/nspluginwrapper/npviewer.bin –plugin /usr/lib64/mozilla/plugins/libflashplayer.so –connection /org/wrapper/NSPlugins/libflashplayer.so/18741-6
4016 root name=systemd:/systemd-1/sysinit.service /usr/sbin/bluetoothd –udev
4094 smmsp name=systemd:/systemd-1/sendmail.service sendmail: Queue [email protected]:00:00 for /var/spool/clientmqueue
4096 root name=systemd:/systemd-1/sendmail.service sendmail: accepting connections
4112 ntp name=systemd:/systemd-1/ntpd.service /usr/sbin/ntpd -n -u ntp:ntp -g
27262 lennart name=systemd:/user/lennart/1 /usr/libexec/mission-control-5
27265 lennart name=systemd:/user/lennart/1 /usr/libexec/telepathy-haze
27268 lennart name=systemd:/user/lennart/1 /usr/libexec/telepathy-logger
27270 lennart name=systemd:/user/lennart/1 /usr/libexec/dconf-service
27280 lennart name=systemd:/user/lennart/1 /usr/libexec/notification-daemon
27284 lennart name=systemd:/user/lennart/1 /usr/libexec/telepathy-gabble
27285 lennart name=systemd:/user/lennart/1 /usr/libexec/telepathy-salut
27297 lennart name=systemd:/user/lennart/1 /usr/libexec/geoclue-yahoo

(Note that this output is shortened, I have removed most of the
kernel threads here, since they are not relevant in the context of
this blog story)

In the third column you see the cgroup systemd assigned to each
process. You’ll find that the udev processes are in the
name=systemd:/systemd-1/sysinit.service cgroup, which is
where systemd places all processes started by the
sysinit.service service, which covers early boot.

My personal recommendation is to set the shell alias psc
to the ps command line shown above:

alias psc=’ps xawf -eo pid,user,cgroup,args’

With this service information of processes is just four keypresses
away!

A different way to present the same information is the
systemd-cgls tool we ship with systemd. It shows the cgroup
hierarchy in a pretty tree. Its output looks like this:

$ systemd-cgls
+ 2 [kthreadd]
[…]
+ 4281 [flush-8:0]
+ user
| lennart
| 1
| + 1495 pam: gdm-password
| + 1521 gnome-session
| + 1534 dbus-launch –sh-syntax –exit-with-session
| + 1535 /bin/dbus-daemon –fork –print-pid 5 –print-address 7 –session
| + 1603 /usr/libexec/gconfd-2
| + 1612 /usr/libexec/gnome-settings-daemon
| + 1615 /ushr/libexec/gvfsd
| + 1621 metacity
| + 1626 /usr/libexec//gvfs-fuse-daemon /home/lennart/.gvfs
| + 1634 /usr/bin/pulseaudio –start –log-target=syslog
| + 1635 gnome-panel
| + 1638 nautilus
| + 1640 /usr/libexec/polkit-gnome-authentication-agent-1
| + 1641 /usr/bin/seapplet
| + 1644 gnome-volume-control-applet
| + 1645 /usr/libexec/bonobo-activation-server –ac-activate –ior-output-fd=24
| + 1646 /usr/sbin/restorecond -u
| + 1649 /usr/libexec/pulse/gconf-helper
| + 1652 /usr/bin/devilspie
| + 1662 nm-applet –sm-disable
| + 1664 gnome-power-manager
| + 1665 /usr/libexec/gdu-notification-daemon
| + 1668 /usr/libexec/im-settings-daemon
| + 1670 /usr/libexec/evolution/2.32/evolution-alarm-notify
| + 1672 /usr/bin/python /usr/share/system-config-printer/applet.py
| + 1674 /usr/lib64/deja-dup/deja-dup-monitor
| + 1675 abrt-applet
| + 1677 bluetooth-applet
| + 1678 gpk-update-icon
| + 1701 /usr/libexec/gvfs-gdu-volume-monitor
| + 1707 /usr/bin/gnote –panel-applet –oaf-activate-iid=OAFIID:GnoteApplet_Factory –oaf-ior-fd=22
| + 1725 /usr/libexec/clock-applet
| + 1727 /usr/libexec/wnck-applet
| + 1729 /usr/libexec/notification-area-applet
| + 1759 gnome-screensaver
| + 1780 /usr/libexec/gvfsd-trash –spawner :1.9 /org/gtk/gvfs/exec_spaw/0
| + 1864 /usr/libexec/gvfs-afc-volume-monitor
| + 1874 /usr/libexec/gconf-im-settings-daemon
| + 1882 /usr/libexec/gvfs-gphoto2-volume-monitor
| + 1903 /usr/libexec/gvfsd-burn –spawner :1.9 /org/gtk/gvfs/exec_spaw/1
| + 1909 gnome-terminal
| + 1913 gnome-pty-helper
| + 1914 bash
| + 1968 ssh-agent
| + 1994 gpg-agent –daemon –write-env-file
| + 2221 bash
| + 2461 bash
| + 4193 ssh tango
| + 15113 bash
| + 18679 /bin/sh /usr/lib64/firefox-3.6/run-mozilla.sh /usr/lib64/firefox-3.6/firefox
| + 18741 /usr/lib64/firefox-3.6/firefox
| + 27251 empathy
| + 27262 /usr/libexec/mission-control-5
| + 27265 /usr/libexec/telepathy-haze
| + 27268 /usr/libexec/telepathy-logger
| + 27270 /usr/libexec/dconf-service
| + 27280 /usr/libexec/notification-daemon
| + 27284 /usr/libexec/telepathy-gabble
| + 27285 /usr/libexec/telepathy-salut
| + 27297 /usr/libexec/geoclue-yahoo
| + 28900 /usr/lib64/nspluginwrapper/npviewer.bin –plugin /usr/lib64/mozilla/plugins/libflashplayer.so –connection /org/wrapper/NSPlugins/libflashplayer.so/18741-6
| + 29219 emacs systemd-for-admins-1.txt
| + 29231 ssh tango
| 29519 systemd-cgls
systemd-1
+ 1 /sbin/init
+ ntpd.service
| 4112 /usr/sbin/ntpd -n -u ntp:ntp -g
+ systemd-logger.service
| 1499 /lib/systemd/systemd-logger
+ accounts-daemon.service
| 1496 /usr/libexec/accounts-daemon
+ rtkit-daemon.service
| 1473 /usr/libexec/rtkit-daemon
+ console-kit-daemon.service
| 1408 /usr/sbin/console-kit-daemon –no-daemon
+ prefdm.service
| + 1376 /usr/sbin/gdm-binary -nodaemon
| + 1391 /usr/libexec/gdm-simple-slave –display-id /org/gnome/DisplayManager/Display1 –force-active-vt
| + 1394 /usr/bin/Xorg :0 -nr -verbose -auth /var/run/gdm/auth-for-gdm-f2KUOh/database -nolisten tcp vt1
| + 1419 /usr/bin/dbus-launch –exit-with-session
| 1511 /usr/bin/gnome-keyring-daemon –daemonize –login
+ [email protected]
| + tty6
| | 1346 /sbin/mingetty tty6
| + tty4
| | 1343 /sbin/mingetty tty4
| + tty5
| | 1342 /sbin/mingetty tty5
| + tty3
| | 1339 /sbin/mingetty tty3
| tty2
| 1332 /sbin/mingetty tty2
+ abrtd.service
| 1317 /usr/sbin/abrtd -d -s
+ crond.service
| 1344 crond
+ sshd.service
| 1362 /usr/sbin/sshd
+ sendmail.service
| + 4094 sendmail: Queue [email protected]:00:00 for /var/spool/clientmqueue
| 4096 sendmail: accepting connections
+ haldaemon.service
| + 1249 hald
| + 1250 hald-runner
| + 1273 hald-addon-input: Listening on /dev/input/event3 /dev/input/event9 /dev/input/event1 /dev/input/event7 /dev/input/event2 /dev/input/event0 /dev/input/event8
| + 1275 /usr/libexec/hald-addon-rfkill-killswitch
| + 1284 /usr/libexec/hald-addon-leds
| + 1285 /usr/libexec/hald-addon-generic-backlight
| 1287 /usr/libexec/hald-addon-acpi
+ irqbalance.service
| 1210 irqbalance
+ avahi-daemon.service
| + 1175 avahi-daemon: running [epsilon.local]
+ NetworkManager.service
| + 1171 /usr/sbin/NetworkManager –no-daemon
| 4028 /sbin/dhclient -d -4 -sf /usr/libexec/nm-dhcp-client.action -pf /var/run/dhclient-wlan0.pid -lf /var/lib/dhclient/dhclient-7d32a784-ede9-4cf6-9ee3-60edc0bce5ff-wlan0.lease -cf /var/run/nm-dhclient-wlan0.conf wlan0
+ rsyslog.service
| 1193 /sbin/rsyslogd -c 4
+ mdmonitor.service
| 1207 mdadm –monitor –scan -f –pid-file=/var/run/mdadm/mdadm.pid
+ cups.service
| 1195 cupsd -C /etc/cups/cupsd.conf
+ auditd.service
| + 1131 auditd
| + 1133 /sbin/audispd
| 1135 /usr/sbin/sedispatch
+ dbus.service
| + 1096 /bin/dbus-daemon –system –address=systemd: –nofork –systemd-activation
| + 1216 /usr/sbin/modem-manager
| + 1219 /usr/libexec/polkit-1/polkitd
| + 1242 /usr/sbin/wpa_supplicant -c /etc/wpa_supplicant/wpa_supplicant.conf -B -u -f /var/log/wpa_supplicant.log -P /var/run/wpa_supplicant.pid
| + 1453 /usr/libexec/upowerd
| + 1733 /usr/libexec/udisks-daemon
| + 1747 udisks-daemon: polling /dev/sr0
| 29509 /usr/libexec/packagekitd
+ dev-mqueue.mount
+ dev-hugepages.mount
sysinit.service
+ 455 /sbin/udevd -d
+ 4016 /usr/sbin/bluetoothd –udev
+ 28188 /sbin/udevd -d
28191 /sbin/udevd -d

(This too is shortened, the same way)

As you can see, this command shows the processes by their cgroup
and hence service, as systemd labels the cgroups after the
services. For example, you can easily see that the auditing service
auditd.service spawns three individual processes,
auditd, audisp and sedispatch.

If you look closely you will notice that a number of processes have
been assigned to the cgroup /user/1. At this point let’s
simply leave it at that systemd not only maintains services in cgroups,
but user session processes as well. In a later installment we’ll discuss in
more detail what this about.

So much for now, come back soon for the next installment!

In today’s installment I want to introduce you to two commands you
may use to relate systemd services and processes. The first one is
the well-known ps command, which has been updated to show
cgroup information along with the other process details. And this is how it
looks:

$ ps xawf -eo pid,user,cgroup,args
  PID USER     CGROUP                              COMMAND
    2 root     -                                   [kthreadd]
    3 root     -                                    \_ [ksoftirqd/0]
[...]
 4281 root     -                                    \_ [flush-8:0]
    1 root     name=systemd:/systemd-1             /sbin/init
  455 root     name=systemd:/systemd-1/sysinit.service /sbin/udevd -d
28188 root     name=systemd:/systemd-1/sysinit.service  \_ /sbin/udevd -d
28191 root     name=systemd:/systemd-1/sysinit.service  \_ /sbin/udevd -d
 1096 dbus     name=systemd:/systemd-1/dbus.service /bin/dbus-daemon --system --address=systemd: --nofork --systemd-activation
 1131 root     name=systemd:/systemd-1/auditd.service auditd
 1133 root     name=systemd:/systemd-1/auditd.service  \_ /sbin/audispd
 1135 root     name=systemd:/systemd-1/auditd.service      \_ /usr/sbin/sedispatch
 1171 root     name=systemd:/systemd-1/NetworkManager.service /usr/sbin/NetworkManager --no-daemon
 4028 root     name=systemd:/systemd-1/NetworkManager.service  \_ /sbin/dhclient -d -4 -sf /usr/libexec/nm-dhcp-client.action -pf /var/run/dhclient-wlan0.pid -lf /var/lib/dhclient/dhclient-7d32a784-ede9-4cf6-9ee3-60edc0bce5ff-wlan0.lease -
 1175 avahi    name=systemd:/systemd-1/avahi-daemon.service avahi-daemon: running [epsilon.local]
 1194 avahi    name=systemd:/systemd-1/avahi-daemon.service  \_ avahi-daemon: chroot helper
 1193 root     name=systemd:/systemd-1/rsyslog.service /sbin/rsyslogd -c 4
 1195 root     name=systemd:/systemd-1/cups.service cupsd -C /etc/cups/cupsd.conf
 1207 root     name=systemd:/systemd-1/mdmonitor.service mdadm --monitor --scan -f --pid-file=/var/run/mdadm/mdadm.pid
 1210 root     name=systemd:/systemd-1/irqbalance.service irqbalance
 1216 root     name=systemd:/systemd-1/dbus.service /usr/sbin/modem-manager
 1219 root     name=systemd:/systemd-1/dbus.service /usr/libexec/polkit-1/polkitd
 1242 root     name=systemd:/systemd-1/dbus.service /usr/sbin/wpa_supplicant -c /etc/wpa_supplicant/wpa_supplicant.conf -B -u -f /var/log/wpa_supplicant.log -P /var/run/wpa_supplicant.pid
 1249 68       name=systemd:/systemd-1/haldaemon.service hald
 1250 root     name=systemd:/systemd-1/haldaemon.service  \_ hald-runner
 1273 root     name=systemd:/systemd-1/haldaemon.service      \_ hald-addon-input: Listening on /dev/input/event3 /dev/input/event9 /dev/input/event1 /dev/input/event7 /dev/input/event2 /dev/input/event0 /dev/input/event8
 1275 root     name=systemd:/systemd-1/haldaemon.service      \_ /usr/libexec/hald-addon-rfkill-killswitch
 1284 root     name=systemd:/systemd-1/haldaemon.service      \_ /usr/libexec/hald-addon-leds
 1285 root     name=systemd:/systemd-1/haldaemon.service      \_ /usr/libexec/hald-addon-generic-backlight
 1287 68       name=systemd:/systemd-1/haldaemon.service      \_ /usr/libexec/hald-addon-acpi
 1317 root     name=systemd:/systemd-1/abrtd.service /usr/sbin/abrtd -d -s
 1332 root     name=systemd:/systemd-1/getty@.service/tty2 /sbin/mingetty tty2
 1339 root     name=systemd:/systemd-1/getty@.service/tty3 /sbin/mingetty tty3
 1342 root     name=systemd:/systemd-1/getty@.service/tty5 /sbin/mingetty tty5
 1343 root     name=systemd:/systemd-1/getty@.service/tty4 /sbin/mingetty tty4
 1344 root     name=systemd:/systemd-1/crond.service crond
 1346 root     name=systemd:/systemd-1/getty@.service/tty6 /sbin/mingetty tty6
 1362 root     name=systemd:/systemd-1/sshd.service /usr/sbin/sshd
 1376 root     name=systemd:/systemd-1/prefdm.service /usr/sbin/gdm-binary -nodaemon
 1391 root     name=systemd:/systemd-1/prefdm.service  \_ /usr/libexec/gdm-simple-slave --display-id /org/gnome/DisplayManager/Display1 --force-active-vt
 1394 root     name=systemd:/systemd-1/prefdm.service      \_ /usr/bin/Xorg :0 -nr -verbose -auth /var/run/gdm/auth-for-gdm-f2KUOh/database -nolisten tcp vt1
 1495 root     name=systemd:/user/lennart/1             \_ pam: gdm-password
 1521 lennart  name=systemd:/user/lennart/1                 \_ gnome-session
 1621 lennart  name=systemd:/user/lennart/1                     \_ metacity
 1635 lennart  name=systemd:/user/lennart/1                     \_ gnome-panel
 1638 lennart  name=systemd:/user/lennart/1                     \_ nautilus
 1640 lennart  name=systemd:/user/lennart/1                     \_ /usr/libexec/polkit-gnome-authentication-agent-1
 1641 lennart  name=systemd:/user/lennart/1                     \_ /usr/bin/seapplet
 1644 lennart  name=systemd:/user/lennart/1                     \_ gnome-volume-control-applet
 1646 lennart  name=systemd:/user/lennart/1                     \_ /usr/sbin/restorecond -u
 1652 lennart  name=systemd:/user/lennart/1                     \_ /usr/bin/devilspie
 1662 lennart  name=systemd:/user/lennart/1                     \_ nm-applet --sm-disable
 1664 lennart  name=systemd:/user/lennart/1                     \_ gnome-power-manager
 1665 lennart  name=systemd:/user/lennart/1                     \_ /usr/libexec/gdu-notification-daemon
 1670 lennart  name=systemd:/user/lennart/1                     \_ /usr/libexec/evolution/2.32/evolution-alarm-notify
 1672 lennart  name=systemd:/user/lennart/1                     \_ /usr/bin/python /usr/share/system-config-printer/applet.py
 1674 lennart  name=systemd:/user/lennart/1                     \_ /usr/lib64/deja-dup/deja-dup-monitor
 1675 lennart  name=systemd:/user/lennart/1                     \_ abrt-applet
 1677 lennart  name=systemd:/user/lennart/1                     \_ bluetooth-applet
 1678 lennart  name=systemd:/user/lennart/1                     \_ gpk-update-icon
 1408 root     name=systemd:/systemd-1/console-kit-daemon.service /usr/sbin/console-kit-daemon --no-daemon
 1419 gdm      name=systemd:/systemd-1/prefdm.service /usr/bin/dbus-launch --exit-with-session
 1453 root     name=systemd:/systemd-1/dbus.service /usr/libexec/upowerd
 1473 rtkit    name=systemd:/systemd-1/rtkit-daemon.service /usr/libexec/rtkit-daemon
 1496 root     name=systemd:/systemd-1/accounts-daemon.service /usr/libexec/accounts-daemon
 1499 root     name=systemd:/systemd-1/systemd-logger.service /lib/systemd/systemd-logger
 1511 lennart  name=systemd:/systemd-1/prefdm.service /usr/bin/gnome-keyring-daemon --daemonize --login
 1534 lennart  name=systemd:/user/lennart/1        dbus-launch --sh-syntax --exit-with-session
 1535 lennart  name=systemd:/user/lennart/1        /bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session
 1603 lennart  name=systemd:/user/lennart/1        /usr/libexec/gconfd-2
 1612 lennart  name=systemd:/user/lennart/1        /usr/libexec/gnome-settings-daemon
 1615 lennart  name=systemd:/user/lennart/1        /usr/libexec/gvfsd
 1626 lennart  name=systemd:/user/lennart/1        /usr/libexec//gvfs-fuse-daemon /home/lennart/.gvfs
 1634 lennart  name=systemd:/user/lennart/1        /usr/bin/pulseaudio --start --log-target=syslog
 1649 lennart  name=systemd:/user/lennart/1         \_ /usr/libexec/pulse/gconf-helper
 1645 lennart  name=systemd:/user/lennart/1        /usr/libexec/bonobo-activation-server --ac-activate --ior-output-fd=24
 1668 lennart  name=systemd:/user/lennart/1        /usr/libexec/im-settings-daemon
 1701 lennart  name=systemd:/user/lennart/1        /usr/libexec/gvfs-gdu-volume-monitor
 1707 lennart  name=systemd:/user/lennart/1        /usr/bin/gnote --panel-applet --oaf-activate-iid=OAFIID:GnoteApplet_Factory --oaf-ior-fd=22
 1725 lennart  name=systemd:/user/lennart/1        /usr/libexec/clock-applet
 1727 lennart  name=systemd:/user/lennart/1        /usr/libexec/wnck-applet
 1729 lennart  name=systemd:/user/lennart/1        /usr/libexec/notification-area-applet
 1733 root     name=systemd:/systemd-1/dbus.service /usr/libexec/udisks-daemon
 1747 root     name=systemd:/systemd-1/dbus.service  \_ udisks-daemon: polling /dev/sr0
 1759 lennart  name=systemd:/user/lennart/1        gnome-screensaver
 1780 lennart  name=systemd:/user/lennart/1        /usr/libexec/gvfsd-trash --spawner :1.9 /org/gtk/gvfs/exec_spaw/0
 1864 lennart  name=systemd:/user/lennart/1        /usr/libexec/gvfs-afc-volume-monitor
 1874 lennart  name=systemd:/user/lennart/1        /usr/libexec/gconf-im-settings-daemon
 1903 lennart  name=systemd:/user/lennart/1        /usr/libexec/gvfsd-burn --spawner :1.9 /org/gtk/gvfs/exec_spaw/1
 1909 lennart  name=systemd:/user/lennart/1        gnome-terminal
 1913 lennart  name=systemd:/user/lennart/1         \_ gnome-pty-helper
 1914 lennart  name=systemd:/user/lennart/1         \_ bash
29231 lennart  name=systemd:/user/lennart/1         |   \_ ssh tango
 2221 lennart  name=systemd:/user/lennart/1         \_ bash
 4193 lennart  name=systemd:/user/lennart/1         |   \_ ssh tango
 2461 lennart  name=systemd:/user/lennart/1         \_ bash
29219 lennart  name=systemd:/user/lennart/1         |   \_ emacs systemd-for-admins-1.txt
15113 lennart  name=systemd:/user/lennart/1         \_ bash
27251 lennart  name=systemd:/user/lennart/1             \_ empathy
29504 lennart  name=systemd:/user/lennart/1             \_ ps xawf -eo pid,user,cgroup,args
 1968 lennart  name=systemd:/user/lennart/1        ssh-agent
 1994 lennart  name=systemd:/user/lennart/1        gpg-agent --daemon --write-env-file
18679 lennart  name=systemd:/user/lennart/1        /bin/sh /usr/lib64/firefox-3.6/run-mozilla.sh /usr/lib64/firefox-3.6/firefox
18741 lennart  name=systemd:/user/lennart/1         \_ /usr/lib64/firefox-3.6/firefox
28900 lennart  name=systemd:/user/lennart/1             \_ /usr/lib64/nspluginwrapper/npviewer.bin --plugin /usr/lib64/mozilla/plugins/libflashplayer.so --connection /org/wrapper/NSPlugins/libflashplayer.so/18741-6
 4016 root     name=systemd:/systemd-1/sysinit.service /usr/sbin/bluetoothd --udev
 4094 smmsp    name=systemd:/systemd-1/sendmail.service sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
 4096 root     name=systemd:/systemd-1/sendmail.service sendmail: accepting connections
 4112 ntp      name=systemd:/systemd-1/ntpd.service /usr/sbin/ntpd -n -u ntp:ntp -g
27262 lennart  name=systemd:/user/lennart/1        /usr/libexec/mission-control-5
27265 lennart  name=systemd:/user/lennart/1        /usr/libexec/telepathy-haze
27268 lennart  name=systemd:/user/lennart/1        /usr/libexec/telepathy-logger
27270 lennart  name=systemd:/user/lennart/1        /usr/libexec/dconf-service
27280 lennart  name=systemd:/user/lennart/1        /usr/libexec/notification-daemon
27284 lennart  name=systemd:/user/lennart/1        /usr/libexec/telepathy-gabble
27285 lennart  name=systemd:/user/lennart/1        /usr/libexec/telepathy-salut
27297 lennart  name=systemd:/user/lennart/1        /usr/libexec/geoclue-yahoo

(Note that this output is shortened; I have removed most of the
kernel threads here, since they are not relevant in the context of
this blog story.)

In the third column you see the cgroup systemd assigned to each
process. You’ll find that the udev processes are in the
name=systemd:/systemd-1/sysinit.service cgroup, which is
where systemd places all processes started by the
sysinit.service service, which covers early boot.
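
If you are only interested in a single process rather than the full
list, the same cgroup column can be requested for just that PID. A
small sketch, using PID 455 (one of the udevd processes from the
listing above) as an example; the output shown here is abridged from
that listing:

$ ps -o pid,user,cgroup,args -p 455
  PID USER     CGROUP                                  COMMAND
  455 root     name=systemd:/systemd-1/sysinit.service /sbin/udevd -d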

My personal recommendation is to set the shell alias psc
to the ps command line shown above:

alias psc='ps xawf -eo pid,user,cgroup,args'

With this, service information of processes is just four keypresses
away!
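
If you want to narrow this down to a single service, piping the alias
through grep works nicely. A quick sketch (assuming the psc alias from
above is defined, and using sshd.service from the listing as the
example service name):

$ psc | grep sshd.service
 1362 root     name=systemd:/systemd-1/sshd.service /usr/sbin/sshd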

A different way to present the same information is the
systemd-cgls tool we ship with systemd. It shows the cgroup
hierarchy in a pretty tree. Its output looks like this:

$ systemd-cgls
+    2 [kthreadd]
[...]
+ 4281 [flush-8:0]
+ user
| \ lennart
|   \ 1
|     +  1495 pam: gdm-password
|     +  1521 gnome-session
|     +  1534 dbus-launch --sh-syntax --exit-with-session
|     +  1535 /bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session
|     +  1603 /usr/libexec/gconfd-2
|     +  1612 /usr/libexec/gnome-settings-daemon
|     +  1615 /usr/libexec/gvfsd
|     +  1621 metacity
|     +  1626 /usr/libexec//gvfs-fuse-daemon /home/lennart/.gvfs
|     +  1634 /usr/bin/pulseaudio --start --log-target=syslog
|     +  1635 gnome-panel
|     +  1638 nautilus
|     +  1640 /usr/libexec/polkit-gnome-authentication-agent-1
|     +  1641 /usr/bin/seapplet
|     +  1644 gnome-volume-control-applet
|     +  1645 /usr/libexec/bonobo-activation-server --ac-activate --ior-output-fd=24
|     +  1646 /usr/sbin/restorecond -u
|     +  1649 /usr/libexec/pulse/gconf-helper
|     +  1652 /usr/bin/devilspie
|     +  1662 nm-applet --sm-disable
|     +  1664 gnome-power-manager
|     +  1665 /usr/libexec/gdu-notification-daemon
|     +  1668 /usr/libexec/im-settings-daemon
|     +  1670 /usr/libexec/evolution/2.32/evolution-alarm-notify
|     +  1672 /usr/bin/python /usr/share/system-config-printer/applet.py
|     +  1674 /usr/lib64/deja-dup/deja-dup-monitor
|     +  1675 abrt-applet
|     +  1677 bluetooth-applet
|     +  1678 gpk-update-icon
|     +  1701 /usr/libexec/gvfs-gdu-volume-monitor
|     +  1707 /usr/bin/gnote --panel-applet --oaf-activate-iid=OAFIID:GnoteApplet_Factory --oaf-ior-fd=22
|     +  1725 /usr/libexec/clock-applet
|     +  1727 /usr/libexec/wnck-applet
|     +  1729 /usr/libexec/notification-area-applet
|     +  1759 gnome-screensaver
|     +  1780 /usr/libexec/gvfsd-trash --spawner :1.9 /org/gtk/gvfs/exec_spaw/0
|     +  1864 /usr/libexec/gvfs-afc-volume-monitor
|     +  1874 /usr/libexec/gconf-im-settings-daemon
|     +  1882 /usr/libexec/gvfs-gphoto2-volume-monitor
|     +  1903 /usr/libexec/gvfsd-burn --spawner :1.9 /org/gtk/gvfs/exec_spaw/1
|     +  1909 gnome-terminal
|     +  1913 gnome-pty-helper
|     +  1914 bash
|     +  1968 ssh-agent
|     +  1994 gpg-agent --daemon --write-env-file
|     +  2221 bash
|     +  2461 bash
|     +  4193 ssh tango
|     + 15113 bash
|     + 18679 /bin/sh /usr/lib64/firefox-3.6/run-mozilla.sh /usr/lib64/firefox-3.6/firefox
|     + 18741 /usr/lib64/firefox-3.6/firefox
|     + 27251 empathy
|     + 27262 /usr/libexec/mission-control-5
|     + 27265 /usr/libexec/telepathy-haze
|     + 27268 /usr/libexec/telepathy-logger
|     + 27270 /usr/libexec/dconf-service
|     + 27280 /usr/libexec/notification-daemon
|     + 27284 /usr/libexec/telepathy-gabble
|     + 27285 /usr/libexec/telepathy-salut
|     + 27297 /usr/libexec/geoclue-yahoo
|     + 28900 /usr/lib64/nspluginwrapper/npviewer.bin --plugin /usr/lib64/mozilla/plugins/libflashplayer.so --connection /org/wrapper/NSPlugins/libflashplayer.so/18741-6
|     + 29219 emacs systemd-for-admins-1.txt
|     + 29231 ssh tango
|     \ 29519 systemd-cgls
\ systemd-1
  + 1 /sbin/init
  + ntpd.service
  | \ 4112 /usr/sbin/ntpd -n -u ntp:ntp -g
  + systemd-logger.service
  | \ 1499 /lib/systemd/systemd-logger
  + accounts-daemon.service
  | \ 1496 /usr/libexec/accounts-daemon
  + rtkit-daemon.service
  | \ 1473 /usr/libexec/rtkit-daemon
  + console-kit-daemon.service
  | \ 1408 /usr/sbin/console-kit-daemon --no-daemon
  + prefdm.service
  | + 1376 /usr/sbin/gdm-binary -nodaemon
  | + 1391 /usr/libexec/gdm-simple-slave --display-id /org/gnome/DisplayManager/Display1 --force-active-vt
  | + 1394 /usr/bin/Xorg :0 -nr -verbose -auth /var/run/gdm/auth-for-gdm-f2KUOh/database -nolisten tcp vt1
  | + 1419 /usr/bin/dbus-launch --exit-with-session
  | \ 1511 /usr/bin/gnome-keyring-daemon --daemonize --login
  + getty@.service
  | + tty6
  | | \ 1346 /sbin/mingetty tty6
  | + tty4
  | | \ 1343 /sbin/mingetty tty4
  | + tty5
  | | \ 1342 /sbin/mingetty tty5
  | + tty3
  | | \ 1339 /sbin/mingetty tty3
  | \ tty2
  |   \ 1332 /sbin/mingetty tty2
  + abrtd.service
  | \ 1317 /usr/sbin/abrtd -d -s
  + crond.service
  | \ 1344 crond
  + sshd.service
  | \ 1362 /usr/sbin/sshd
  + sendmail.service
  | + 4094 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
  | \ 4096 sendmail: accepting connections
  + haldaemon.service
  | + 1249 hald
  | + 1250 hald-runner
  | + 1273 hald-addon-input: Listening on /dev/input/event3 /dev/input/event9 /dev/input/event1 /dev/input/event7 /dev/input/event2 /dev/input/event0 /dev/input/event8
  | + 1275 /usr/libexec/hald-addon-rfkill-killswitch
  | + 1284 /usr/libexec/hald-addon-leds
  | + 1285 /usr/libexec/hald-addon-generic-backlight
  | \ 1287 /usr/libexec/hald-addon-acpi
  + irqbalance.service
  | \ 1210 irqbalance
  + avahi-daemon.service
  | + 1175 avahi-daemon: running [epsilon.local]
  + NetworkManager.service
  | + 1171 /usr/sbin/NetworkManager --no-daemon
  | \ 4028 /sbin/dhclient -d -4 -sf /usr/libexec/nm-dhcp-client.action -pf /var/run/dhclient-wlan0.pid -lf /var/lib/dhclient/dhclient-7d32a784-ede9-4cf6-9ee3-60edc0bce5ff-wlan0.lease -cf /var/run/nm-dhclient-wlan0.conf wlan0
  + rsyslog.service
  | \ 1193 /sbin/rsyslogd -c 4
  + mdmonitor.service
  | \ 1207 mdadm --monitor --scan -f --pid-file=/var/run/mdadm/mdadm.pid
  + cups.service
  | \ 1195 cupsd -C /etc/cups/cupsd.conf
  + auditd.service
  | + 1131 auditd
  | + 1133 /sbin/audispd
  | \ 1135 /usr/sbin/sedispatch
  + dbus.service
  | +  1096 /bin/dbus-daemon --system --address=systemd: --nofork --systemd-activation
  | +  1216 /usr/sbin/modem-manager
  | +  1219 /usr/libexec/polkit-1/polkitd
  | +  1242 /usr/sbin/wpa_supplicant -c /etc/wpa_supplicant/wpa_supplicant.conf -B -u -f /var/log/wpa_supplicant.log -P /var/run/wpa_supplicant.pid
  | +  1453 /usr/libexec/upowerd
  | +  1733 /usr/libexec/udisks-daemon
  | +  1747 udisks-daemon: polling /dev/sr0
  | \ 29509 /usr/libexec/packagekitd
  + dev-mqueue.mount
  + dev-hugepages.mount
  \ sysinit.service
    +   455 /sbin/udevd -d
    +  4016 /usr/sbin/bluetoothd --udev
    + 28188 /sbin/udevd -d
    \ 28191 /sbin/udevd -d

(This too is shortened, in the same way.)

As you can see, this command shows the processes by their cgroup
and hence service, as systemd labels the cgroups after the
services. For example, you can easily see that the auditing service
auditd.service spawns three individual processes:
auditd, audispd and sedispatch.
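
Incidentally, you do not need any special tool to check the label of a
single process: the kernel exposes the very same cgroup membership
under /proc. A quick sketch, using the sshd PID from the listing above
(the numeric hierarchy prefix may look different on your system, and
you may see additional lines for other cgroup hierarchies):

$ cat /proc/1362/cgroup
1:name=systemd:/systemd-1/sshd.service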

If you look closely you will notice that a number of processes have
been assigned to the cgroup /user/lennart/1. At this point let’s
simply note that systemd not only maintains services in cgroups,
but user session processes as well. In a later installment we’ll discuss in
more detail what this is about.

So much for now, come back soon for the next installment!

systemd for Administrators, Part 1

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/projects/systemd-for-admins-1.html

As many of you know, systemd is the new
Fedora init system, starting with F14, and it is also on its way to being adopted in
a number of other distributions (for example, OpenSUSE). For administrators,
systemd provides a variety of new features and changes and enhances the
administrative process substantially. This blog story is the first part of a
series of articles I plan to post roughly every week for the coming months. In
every post I will try to explain one new feature of systemd. Many of these features
are small and simple, so these stories should be interesting to a broader audience.
However, from time to time we’ll dive a little bit deeper into the great new
features systemd provides you with.

Verifying Bootup

Traditionally, when booting up a Linux system, you see a lot of
little messages passing by on your screen. As we work on speeding up
and parallelizing the boot process these messages are becoming visible
for a shorter and shorter time only, and less and less readable —
if they are shown at all, given we use graphical boot splash
technology like Plymouth these days. Nonetheless the information of
the boot screens was and still is very relevant, because it shows you,
for each service that is being started as part of bootup, whether it
managed to start up successfully or failed (with those green or red
[ OK ] or [ FAILED ] indicators). To improve the
situation for machines that boot up fast and parallelized and to make
this information more nicely available during runtime, we added a
feature to systemd that tracks and remembers for each service whether
it started up successfully, whether it exited with a non-zero exit
code, whether it timed out, or whether it terminated abnormally (by
segfaulting or similar), both during start-up and runtime. By simply
typing systemctl in your shell you can query the state of all
services, both systemd native and SysV/LSB services:

[[email protected]] ~# systemctl
UNIT                                          LOAD   ACTIVE       SUB          JOB             DESCRIPTION
dev-hugepages.automount                       loaded active       running                      Huge Pages File System Automount Point
dev-mqueue.automount                          loaded active       running                      POSIX Message Queue File System Automount Point
proc-sys-fs-binfmt_misc.automount             loaded active       waiting                      Arbitrary Executable File Formats File System Automount Point
sys-kernel-debug.automount                    loaded active       waiting                      Debug File System Automount Point
sys-kernel-security.automount                 loaded active       waiting                      Security File System Automount Point
sys-devices-pc...0000:02:00.0-net-eth0.device loaded active       plugged                      82573L Gigabit Ethernet Controller
[...]
sys-devices-virtual-tty-tty9.device           loaded active       plugged                      /sys/devices/virtual/tty/tty9
-.mount                                       loaded active       mounted                      /
boot.mount                                    loaded active       mounted                      /boot
dev-hugepages.mount                           loaded active       mounted                      Huge Pages File System
dev-mqueue.mount                              loaded active       mounted                      POSIX Message Queue File System
home.mount                                    loaded active       mounted                      /home
proc-sys-fs-binfmt_misc.mount                 loaded active       mounted                      Arbitrary Executable File Formats File System
abrtd.service                                 loaded active       running                      ABRT Automated Bug Reporting Tool
accounts-daemon.service                       loaded active       running                      Accounts Service
acpid.service                                 loaded active       running                      ACPI Event Daemon
atd.service                                   loaded active       running                      Execution Queue Daemon
auditd.service                                loaded active       running                      Security Auditing Service
avahi-daemon.service                          loaded active       running                      Avahi mDNS/DNS-SD Stack
bluetooth.service                             loaded active       running                      Bluetooth Manager
console-kit-daemon.service                    loaded active       running                      Console Manager
cpuspeed.service                              loaded active       exited                       LSB: processor frequency scaling support
crond.service                                 loaded active       running                      Command Scheduler
cups.service                                  loaded active       running                      CUPS Printing Service
dbus.service                                  loaded active       running                      D-Bus System Message Bus
getty@tty2.service                            loaded active       running                      Getty on tty2
getty@tty3.service                            loaded active       running                      Getty on tty3
getty@tty4.service                            loaded active       running                      Getty on tty4
getty@tty5.service                            loaded active       running                      Getty on tty5
getty@tty6.service                            loaded active       running                      Getty on tty6
haldaemon.service                             loaded active       running                      Hardware Manager
hdapsd@sda.service                            loaded active       running                      sda shock protection daemon
irqbalance.service                            loaded active       running                      LSB: start and stop irqbalance daemon
iscsi.service                                 loaded active       exited                       LSB: Starts and stops login and scanning of iSCSI devices.
iscsid.service                                loaded active       exited                       LSB: Starts and stops login iSCSI daemon.
livesys-late.service                          loaded active       exited                       LSB: Late init script for live image.
livesys.service                               loaded active       exited                       LSB: Init script for live image.
lvm2-monitor.service                          loaded active       exited                       LSB: Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling
mdmonitor.service                             loaded active       running                      LSB: Start and stop the MD software RAID monitor
modem-manager.service                         loaded active       running                      Modem Manager
netfs.service                                 loaded active       exited                       LSB: Mount and unmount network filesystems.
NetworkManager.service                        loaded active       running                      Network Manager
ntpd.service                                  loaded maintenance  maintenance                  Network Time Service
polkitd.service                               loaded active       running                      Policy Manager
prefdm.service                                loaded active       running                      Display Manager
rc-local.service                              loaded active       exited                       /etc/rc.local Compatibility
rpcbind.service                               loaded active       running                      RPC Portmapper Service
rsyslog.service                               loaded active       running                      System Logging Service
rtkit-daemon.service                          loaded active       running                      RealtimeKit Scheduling Policy Service
sendmail.service                              loaded active       running                      LSB: start and stop sendmail
[email protected]:22-172.31.0.4:36368.service  loaded active       running                      SSH Per-Connection Server
sysinit.service                               loaded active       running                      System Initialization
systemd-logger.service                        loaded active       running                      systemd Logging Daemon
udev-post.service                             loaded active       exited                       LSB: Moves the generated persistent udev rules to /etc/udev/rules.d
udisks.service                                loaded active       running                      Disk Manager
upowerd.service                               loaded active       running                      Power Manager
wpa_supplicant.service                        loaded active       running                      Wi-Fi Security Service
avahi-daemon.socket                           loaded active       listening                    Avahi mDNS/DNS-SD Stack Activation Socket
cups.socket                                   loaded active       listening                    CUPS Printing Service Sockets
dbus.socket                                   loaded active       running                      dbus.socket
rpcbind.socket                                loaded active       listening                    RPC Portmapper Socket
sshd.socket                                   loaded active       listening                    sshd.socket
systemd-initctl.socket                        loaded active       listening                    systemd /dev/initctl Compatibility Socket
systemd-logger.socket                         loaded active       running                      systemd Logging Socket
systemd-shutdownd.socket                      loaded active       listening                    systemd Delayed Shutdown Socket
dev-disk-by\x1...x1db22a\x1d870f1adf2732.swap loaded active       active                       /dev/disk/by-uuid/fd626ef7-34a4-4958-b22a-870f1adf2732
basic.target                                  loaded active       active                       Basic System
bluetooth.target                              loaded active       active                       Bluetooth
dbus.target                                   loaded active       active                       D-Bus
getty.target                                  loaded active       active                       Login Prompts
graphical.target                              loaded active       active                       Graphical Interface
local-fs.target                               loaded active       active                       Local File Systems
multi-user.target                             loaded active       active                       Multi-User
network.target                                loaded active       active                       Network
remote-fs.target                              loaded active       active                       Remote File Systems
sockets.target                                loaded active       active                       Sockets
swap.target                                   loaded active       active                       Swap
sysinit.target                                loaded active       active                       System Initialization

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.
JOB    = Pending job for the unit.

221 units listed. Pass --all to see inactive units, too.
[[email protected]] ~#

(I have shortened the output above a little, and removed a few lines not relevant for this blog post.)

Look at the ACTIVE column, which shows you the high-level state of
a service (or in fact of any kind of unit systemd maintains, which can
be more than just services, but we’ll have a look at this in a later
blog posting): whether it is active (i.e. running),
inactive (i.e. not running) or in any other state. If you look
closely you’ll see one item in the list that is marked maintenance
and highlighted in red. This informs you about a service that failed
to run or otherwise encountered a problem. In this case this is
ntpd. Now, let’s find out what actually
happened to ntpd, with the systemctl status
command:

[[email protected]] ~# systemctl status ntpd.service
ntpd.service - Network Time Service
	  Loaded: loaded (/etc/systemd/system/ntpd.service)
	  Active: maintenance
	    Main: 953 (code=exited, status=255)
	  CGroup: name=systemd:/systemd-1/ntpd.service
[[email protected]] ~#

This shows us that NTP terminated during runtime (when it ran as
PID 953), and tells us exactly the error condition: the process exited
with an exit status of 255.

In a later systemd version, we plan to hook this up to ABRT, as soon as
this enhancement request is fixed. Then, if systemctl status
shows you information about a service that crashed, it will
direct you right away to the appropriate crash dump in ABRT.

Summary: use systemctl and systemctl
status
as modern, more complete replacements for the traditional
boot-up status messages of SysV services. systemctl status
not only captures the error condition in more detail but also shows
runtime errors in addition to start-up errors.

That’s it for this week, make sure to come back next week, for the
next posting about systemd for administrators!

Rethinking PID 1

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/projects/systemd.html

If you are well connected or good at reading between the lines
you might already know what this blog post is about. But even then
you may find this story interesting. So grab a cup of coffee,
sit down, and read what’s coming.

This blog story is long, so even though I can only recommend
reading the long story, here’s the one sentence summary: we are
experimenting with a new init system and it is fun.

Here’s the code. And here’s the story:

Process Identifier 1

On every Unix system there is one process with the special
process identifier 1. It is started by the kernel before all other
processes and is the parent process for all those other processes
that have nobody else to be child of. Due to that it can do a lot
of stuff that other processes cannot do. And it is also
responsible for some things that other processes are not
responsible for, such as bringing up and maintaining userspace
during boot.

Historically on Linux the software acting as PID 1 was the
venerable sysvinit package, though it had been showing its age for
quite a while. Many replacements have been suggested, but only one of
them really took off: Upstart, which has by now found
its way into all major distributions.

As mentioned, the central responsibility of an init system is
to bring up userspace. And a good init system does that
fast. Unfortunately, the traditional SysV init system was not
particularly fast.

For a fast and efficient boot-up two things are crucial:

  • To start less.
  • And to start more in parallel.

What does that mean? Starting less means starting fewer
services or deferring the starting of services until they are
actually needed. There are some services where we know that they
will be required sooner or later (syslog, D-Bus system bus, etc.),
but for many others this isn’t the case. For example, bluetoothd
does not need to be running unless a bluetooth dongle is actually
plugged in or an application wants to talk to its D-Bus
interfaces. Same for a printing system: unless the machine
is physically connected to a printer, or an application wants to
print something, there is no need to run a printing daemon such as
CUPS. Avahi: if the machine is not connected to a
network, there is no need to run Avahi, unless some application wants
to use its APIs. And even SSH: as long as nobody wants to contact
your machine there is no need to run it, provided it is then
started on the first connection. (And admit it, on most machines
where sshd might be listening somebody connects to it only every
other month or so.)

Starting more in parallel means that if we have
to run something, we should not serialize its start-up (as sysvinit
does), but run it all at the same time, so that the available
CPU and disk IO bandwidth is maxed out, and hence
the overall start-up time minimized.

Hardware and Software Change Dynamically

Modern systems (especially general purpose OS) are highly
dynamic in their configuration and use: they are mobile, different
applications are started and stopped, different hardware added and
removed again. An init system that is responsible for maintaining
services needs to listen to hardware and software
changes. It needs to dynamically start (and sometimes stop)
services as they are needed to run a program or enable some
hardware.

Most current systems that try to parallelize boot-up still
synchronize the start-up of the various daemons involved: since
Avahi needs D-Bus, D-Bus is started first, and only when D-Bus
signals that it is ready, Avahi is started too. Similar for other
services: libvirtd and X11 need HAL (well, I am considering the
Fedora 13 services here, ignore that HAL is obsolete), hence HAL
is started first, before libvirtd and X11 are started. And
libvirtd also needs Avahi, so it waits for Avahi too. And all of
them require syslog, so they all wait until Syslog is fully
started up and initialized. And so on.

Parallelizing Socket Services

This kind of start-up synchronization results in the
serialization of a significant part of the boot process. Wouldn’t
it be great if we could get rid of the synchronization and
serialization cost? Well, we can, actually. For that, we need to
understand what exactly the daemons require from each other, and
why their start-up is delayed. For traditional Unix daemons,
there’s one answer to it: they wait until the socket the other
daemon offers its services on is ready for connections. Usually
that is an AF_UNIX socket in the file-system, but it could be
AF_INET[6], too. For example, clients of D-Bus wait until
/var/run/dbus/system_bus_socket can be connected to,
clients of syslog wait for /dev/log, clients of CUPS wait
for /var/run/cups/cups.sock and NFS mounts wait for
/var/run/rpcbind.sock and the portmapper IP port, and so
on. And think about it, this is actually the only thing they wait
for!

Now, if that’s all they are waiting for, if we manage to make
those sockets available for connection earlier and only actually
wait for that instead of the full daemon start-up, then we can
speed up the entire boot and start more processes in parallel. So,
how can we do that? Actually quite easily in Unix-like systems: we
can create the listening sockets before we actually start
the daemon, and then just pass the socket during exec()
to it. That way, we can create all sockets for all
daemons in one step in the init system, and then in a second step
run all daemons at once. If a service needs another, and it is not
fully started up, that’s completely OK: what will happen is that
the connection is queued in the providing service and the client
will potentially block on that single request. But only that one
client will block and only on that one request. Also, dependencies
between services will no longer necessarily have to be configured
to allow proper parallelized start-up: if we start all sockets at
once and a service needs another it can be sure that it can
connect to its socket.
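
To make this more concrete, here is a minimal C sketch of the idea — not systemd’s actual code, and the socket path, the target file descriptor number and the daemon binary name are made up purely for illustration: the supervising process creates and binds the listening AF_UNIX socket, and only then exec()s the daemon, which inherits the already-listening file descriptor.

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/un.h>

int main(void) {
        struct sockaddr_un sa;
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd < 0) { perror("socket"); return 1; }

        memset(&sa, 0, sizeof(sa));
        sa.sun_family = AF_UNIX;
        strncpy(sa.sun_path, "/tmp/example.socket", sizeof(sa.sun_path) - 1);
        unlink(sa.sun_path);

        if (bind(fd, (struct sockaddr*) &sa, sizeof(sa)) < 0 ||
            listen(fd, SOMAXCONN) < 0) { perror("bind/listen"); return 1; }

        /* Make sure the fd survives exec() (close-on-exec is off by default
         * here, but let's be explicit about it). */
        fcntl(fd, F_SETFD, 0);

        if (fork() == 0) {
                /* Child: move the socket to a well-known fd and exec the
                 * daemon. Clients that connect in the meantime simply sit in
                 * the listen backlog until the daemon calls accept(). */
                dup2(fd, 3);
                execl("/usr/sbin/example-daemon", "example-daemon", (char*) NULL);
                _exit(1);
        }

        /* The parent -- i.e. the init system -- would go on spawning further
         * services here, without waiting for the daemon to initialize. */
        return 0;
}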

Because this is at the core of what is following, let me say
this again, with different words and by example: if you start
syslog and various syslog clients at the same time, what will
happen in the scheme pointed out above is that the messages of the
clients will be added to the /dev/log socket buffer. As
long as that buffer doesn’t run full, the clients will not have to
wait in any way and can immediately proceed with their start-up. As
soon as syslog itself finishes start-up, it will dequeue all
messages and process them. Another example: we start D-Bus and
several clients at the same time. If a synchronous bus
request is sent and hence a reply expected, what will happen is
that the client will have to block, however only that one client
and only until D-Bus managed to catch up and process it.

Basically, the kernel socket buffers help us to maximize
parallelization, and the ordering and synchronization is done by
the kernel, without any further management from userspace! And if
all the sockets are available before the daemons actually start-up,
dependency management also becomes redundant (or at least
secondary): if a daemon needs another daemon, it will just connect
to it. If the other daemon is already started, this will
immediately succeed. If it isn’t started but in the process of
being started, the first daemon will not even have to wait for it,
unless it issues a synchronous request. And even if the other
daemon is not running at all, it can be auto-spawned. From the
first daemon’s perspective there is no difference, hence dependency
management becomes mostly unnecessary or at least secondary, and
all of this in optimal parallelization and optionally with
on-demand loading. On top of this, this is also more robust, because
the sockets stay available regardless whether the actual daemons
might temporarily become unavailable (maybe due to crashing). In
fact, you can easily write a daemon with this that can run, and
exit (or crash), and run again and exit again (and so on), and all
of that without the clients noticing or losing any request.

It’s a good time for a pause, go and refill your coffee mug,
and be assured, there is more interesting stuff following.

But first, let’s clear a few things up: is this kind of logic
new? No, it certainly is not. The most prominent system that works
like this is Apple’s launchd system: on MacOS the listening of the
sockets is pulled out of all daemons and done by launchd. The
services themselves hence can all start up in parallel and
dependencies need not be configured for them. And that is
actually a really ingenious design, and the primary reason why
MacOS manages to provide the fantastic boot-up times it
provides. I can highly recommend this
video
where the launchd folks explain what they are
doing. Unfortunately this idea never really caught on outside of the Apple
camp.

The idea is actually even older than launchd. Prior to launchd
the venerable inetd worked much like this: sockets were
centrally created in a daemon that would start the actual service
daemons passing the socket file descriptors during
exec(). However the focus of inetd certainly
wasn’t local services, but Internet services (although later
reimplementations supported AF_UNIX sockets, too). It also wasn’t a
tool to parallelize boot-up or even useful for getting implicit
dependencies right.

For TCP sockets inetd was primarily used in a way that
for every incoming connection a new daemon instance was
spawned. That meant that for each connection a new
process was spawned and initialized, which is not a
recipe for high-performance servers. However, right from the
beginning inetd also supported another mode, where a
single daemon was spawned on the first connection, and that single
instance would then go on and also accept the follow-up connections
(that’s what the wait/nowait option in
inetd.conf was for, a particularly badly documented
option, unfortunately.) Per-connection daemon starts probably gave
inetd its bad reputation for being slow. But that’s not entirely
fair.

Parallelizing Bus Services

Modern daemons on Linux tend to provide services via D-Bus
instead of plain AF_UNIX sockets. Now, the question is, for those
services, can we apply the same parallelizing boot logic as for
traditional socket services? Yes, we can, D-Bus already has all
the right hooks for it: using bus activation a service can be
started the first time it is accessed. Bus activation also gives
us the minimal per-request synchronisation we need for starting up
the providers and the consumers of D-Bus services at the same
time: if we want to start Avahi at the same time as CUPS (side
note: CUPS uses Avahi to browse for mDNS/DNS-SD printers), then we
can simply run them at the same time, and if CUPS is quicker than
Avahi via the bus activation logic we can get D-Bus to queue the
request until Avahi manages to establish its service name.
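
As a small illustration of the client side (a sketch using the plain libdbus C API, not anything from systemd itself): the client simply issues a method call to a well-known bus name; if the owner of that name is not yet running but a D-Bus activation file exists for it — an assumption in this example — the bus starts it and queues the call until the name appears. Avahi’s GetVersionString method is used here merely as an arbitrary example.

/* Build with something like: gcc demo.c $(pkg-config --cflags --libs dbus-1) */
#include <stdio.h>
#include <dbus/dbus.h>

int main(void) {
        DBusError err;
        dbus_error_init(&err);

        DBusConnection *bus = dbus_bus_get(DBUS_BUS_SYSTEM, &err);
        if (!bus) { fprintf(stderr, "%s\n", err.message); return 1; }

        DBusMessage *m = dbus_message_new_method_call(
                        "org.freedesktop.Avahi", "/",
                        "org.freedesktop.Avahi.Server", "GetVersionString");

        /* This blocks until the -- possibly just now activated -- service
         * has taken its name and sent a reply. */
        DBusMessage *reply = dbus_connection_send_with_reply_and_block(bus, m, -1, &err);
        if (!reply) { fprintf(stderr, "%s\n", err.message); return 1; }

        const char *version = NULL;
        if (dbus_message_get_args(reply, &err, DBUS_TYPE_STRING, &version,
                                  DBUS_TYPE_INVALID))
                printf("Got: %s\n", version);

        dbus_message_unref(reply);
        dbus_message_unref(m);
        return 0;
}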

So, in summary: the socket-based service activation and the
bus-based service activation together enable us to start
all daemons in parallel, without any further
synchronization. Activation also allows us to do lazy-loading of
services: if a service is rarely used, we can just load it the
first time somebody accesses the socket or bus name, instead of
starting it during boot.

And if that’s not great, then I don’t know what is
great!

Parallelizing File System Jobs

If you look at
the serialization graphs of the boot process
of current
distributions, there are more synchronisation points than just
daemon start-ups: most prominently there are file-system related
jobs: mounting, fscking, quota. Right now, on boot-up a lot of
time is spent idling to wait until all devices that are listed in
/etc/fstab show up in the device tree and are then
fsck’ed, mounted, and quota checked (if enabled). Only after all of
that has fully finished do we go on and boot the actual services.

Can we improve this? It turns out we can. Harald Hoyer came up
with the idea of using the venerable autofs system for this:

Just like a connect() call shows that a service is
interested in another service, an open() (or a similar
call) shows that a service is interested in a specific file or
file-system. So, in order to improve how much we can parallelize
we can make those apps wait only if a file-system they are looking
for is not yet mounted and readily available: we set up an autofs
mount point, and then, when our file-system has finished fsck and quota
checking during normal boot-up, we replace it with the real mount. While the
file-system is not ready yet, the access will be queued by the
kernel and the accessing process will block, but only that one
daemon and only that one access. And this way we can begin
starting our daemons even before all file systems have been fully
made available — without them missing any files, and maximizing
parallelization.

Parallelizing file system jobs and service jobs does
not make sense for /, after all that’s where the service
binaries are usually stored. However, for file-systems such as
/home, that usually are bigger, even encrypted, possibly
remote and seldom accessed by the usual boot-up daemons, this
can improve boot time considerably. It is probably not necessary
to mention this, but virtual file systems, such as
procfs or sysfs, should never be mounted via autofs.

I wouldn’t be surprised if some readers might find integrating
autofs in an init system a bit fragile and even weird, and maybe
more on the “crackish” side of things. However, having played
around with this extensively I can tell you that this actually
feels quite right. Using autofs here simply means that we can
create a mount point without having to provide the backing file
system right-away. In effect it hence only delays accesses. If an
application tries to access an autofs file-system and we take very
long to replace it with the real file-system, it will hang in an
interruptible sleep, meaning that you can safely cancel it, for
example via C-c. Also note that at any point, if the mount point
should not be mountable in the end (maybe because fsck failed), we
can just tell autofs to return a clean error code (like
ENOENT). So, I guess what I want to say is that even though
integrating autofs into an init system might appear adventurous at
first, our experimental code has shown that this idea works
surprisingly well in practice — if it is done for the right
reasons and the right way.

Also note that these should be direct autofs
mounts, meaning that from an application perspective there’s
little effective difference between a classic mount point and one
based on autofs.

Keeping the First User PID Small

Another thing we can learn from the MacOS boot-up logic is
that shell scripts are evil. Shell is fast and shell is slow. It
is fast to hack, but slow in execution. The classic sysvinit boot
logic is modelled around shell scripts. Whether it is
/bin/bash or any other shell (that was written to make
shell scripts faster), in the end the approach is doomed to be
slow. On my system the scripts in /etc/init.d call
grep at least 77 times. awk is called 92
times, cut 23 and sed 74. Every time those
commands (and others) are called, a process is spawned, the
libraries searched, some start-up stuff like i18n and so on set up
and more. And then after seldom doing more than a trivial string
operation the process is terminated again. Of course, that has to
be incredibly slow. No other language but shell would do something like
that. On top of that, shell scripts are also very fragile, and
change their behaviour drastically based on environment variables
and suchlike, stuff that is hard to oversee and control.

So, let’s get rid of shell scripts in the boot process! Before
we can do that we need to figure out what they are currently
actually used for: well, the big picture is that most of the time,
what they do is actually quite boring. Most of the scripting is
spent on trivial setup and tear-down of services, and should be
rewritten in C, either in separate executables, or moved into the
daemons themselves, or simply be done in the init system.

It is not likely that we can get rid of shell scripts during
system boot-up entirely anytime soon. Rewriting them in C takes
time, in a few cases does not really make sense, and sometimes
shell scripts are just too handy to do without. But we can
certainly make them less prominent.

A good metric for measuring shell script infestation of the
boot process is the PID number of the first process you can start
after the system is fully booted up. Boot up, log in, open a
terminal, and type echo $$. Try that on your Linux
system, and then compare the result with MacOS! (Hint, it’s
something like this: Linux PID 1823; MacOS PID 154, measured on
test systems we own.)

Keeping Track of Processes

A central part of a system that starts up and maintains
services should be process babysitting: it should watch
services. Restart them if they shut down. If they crash it should
collect information about them, and keep it around for the
administrator, and cross-link that information with what is
available from crash dump systems such as abrt, and in logging
systems like syslog or the audit system.

It should also be capable of shutting down a service
completely. That might sound easy, but is harder than you
think. Traditionally on Unix a process that does double-forking
can escape the supervision of its parent, and the old parent will
not learn about the relation of the new process to the one it
actually started. An example: currently, a misbehaving CGI script
that has double-forked is not terminated when you shut down
Apache. Furthermore, you will not even be able to figure out its
relation to Apache, unless you know it by name and purpose.
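
For illustration, here is the classic double-fork escape in a deliberately simplified C sketch: the intermediate child exits immediately, the grandchild gets reparented to PID 1, and the original parent no longer has any handle on it.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
        pid_t pid = fork();
        if (pid == 0) {
                /* First child: fork once more and exit right away. */
                if (fork() == 0) {
                        /* Grandchild: now a child of init, invisible to the
                         * process that started all of this. */
                        sleep(60);
                        _exit(0);
                }
                _exit(0);
        }

        waitpid(pid, NULL, 0); /* Reaps only the first child... */
        printf("parent: my child is gone, but the grandchild lives on\n");
        return 0;
}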

So, how can we keep track of processes, so that they cannot
escape the babysitter, and that we can control them as one unit
even if they fork a gazillion times?

Different people came up with different solutions for this. I
am not going into much detail here, but let’s at least say that
approaches based on ptrace or the netlink connector (a kernel
interface which allows you to get a netlink message each time any
process on the system fork()s or exit()s) that some people have
investigated and implemented, have been criticised as ugly and not
very scalable.

So what can we do about this? Well, for quite a while now the
kernel has known Control
Groups
(aka “cgroups”). Basically they allow the creation of a
hierarchy of groups of processes. The hierarchy is directly
exposed in a virtual file-system, and hence easily accessible. The
group names are basically directory names in that file-system. If
a process belonging to a specific cgroup fork()s, its child will
become a member of the same group. Unless it is privileged and has
access to the cgroup file system it cannot escape its
group. Originally, cgroups have been introduced into the kernel
for the purpose of containers: certain kernel subsystems can
enforce limits on resources of certain groups, such as limiting
CPU or memory usage. Traditional resource limits (as implemented
by setrlimit()) are (mostly) per-process. cgroups on the
other hand let you enforce limits on entire groups of
processes. cgroups are also useful to enforce limits outside of
the immediate container use case. You can use it for example to
limit the total amount of memory or CPU Apache and all its
children may use. Then, a misbehaving CGI script can no longer
escape your setrlimit() resource control by simply
forking away.

In addition to container and resource limit enforcement cgroups
are very useful to keep track of daemons: cgroup membership is
securely inherited by child processes; they cannot escape. There’s
a notification system available so that a supervisor process can
be notified when a cgroup runs empty. You can find the cgroups of
a process by reading /proc/$PID/cgroup. cgroups hence
make a very good choice to keep track of processes for babysitting
purposes.
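
As a quick sketch of that last point, this is all it takes for a process to look up its own cgroup membership:

#include <stdio.h>

int main(void) {
        char line[1024];
        FILE *f = fopen("/proc/self/cgroup", "re");
        if (!f) { perror("fopen"); return 1; }

        /* Each line has the form "hierarchy-id:controller-list:/group/path". */
        while (fgets(line, sizeof(line), f))
                fputs(line, stdout);

        fclose(f);
        return 0;
}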

Controlling the Process Execution Environment

A good babysitter should not only oversee and control when a
daemon starts, ends or crashes, but also set up a good, minimal,
and secure working environment for it.

That means setting obvious process parameters such as the
setrlimit() resource limits, user/group IDs or the
environment block, but does not end there. The Linux kernel gives
users and administrators a lot of control over processes (some of
it is rarely used, currently). For each process you can set CPU
and IO scheduler controls, the capability bounding set, CPU
affinity or of course cgroup environments with additional limits,
and more.

As an example, ioprio_set() with
IOPRIO_CLASS_IDLE is a great way to minimize the effect
of locate‘s updatedb on system interactivity.
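
A minimal sketch of that trick follows. glibc offers no wrapper for ioprio_set(), so the constants are defined by hand here with the values from the kernel headers, and the raw syscall is used; treat it as an illustration rather than a polished implementation.

#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#define IOPRIO_WHO_PROCESS 1
#define IOPRIO_CLASS_IDLE  3
#define IOPRIO_CLASS_SHIFT 13
#define IOPRIO_PRIO_VALUE(class, data) (((class) << IOPRIO_CLASS_SHIFT) | (data))

int main(void) {
        /* A "who" of 0 means the calling process. */
        if (syscall(SYS_ioprio_set, IOPRIO_WHO_PROCESS, 0,
                    IOPRIO_PRIO_VALUE(IOPRIO_CLASS_IDLE, 0)) < 0) {
                perror("ioprio_set");
                return 1;
        }

        /* ... now do the bulk IO work, e.g. scanning the whole file system ... */
        return 0;
}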

On top of that certain high-level controls can be very useful,
such as setting up read-only file system overlays based on
read-only bind mounts. That way one can run certain daemons so
that all (or some) file systems appear read-only to them, so that
EROFS is returned on every write request. As such this can be used
to lock down what daemons can do, in a fashion similar to a poor
man’s SELinux policy system (but this certainly doesn’t replace
SELinux, don’t get any bad ideas, please).
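
As a rough sketch of the read-only overlay idea (requires root; in a real setup one would do this inside the service’s own mount namespace so the rest of the system is unaffected, which is an assumption this example does not show): bind a directory onto itself, then remount the bind mount read-only.

#include <stdio.h>
#include <sys/mount.h>

int main(void) {
        const char *dir = "/etc";

        /* Step 1: bind the directory onto itself. */
        if (mount(dir, dir, NULL, MS_BIND, NULL) < 0) {
                perror("mount(MS_BIND)");
                return 1;
        }

        /* Step 2: remount the bind mount read-only. Writes below this
         * directory now fail with EROFS. */
        if (mount(dir, dir, NULL, MS_BIND | MS_REMOUNT | MS_RDONLY, NULL) < 0) {
                perror("mount(MS_REMOUNT)");
                return 1;
        }

        return 0;
}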

Finally logging is an important part of executing services:
ideally every bit of output a service generates should be logged
away. An init system should hence provide logging to daemons it
spawns right from the beginning, and connect stdout and stderr to
syslog or in some cases even /dev/kmsg which in many
cases makes a very useful replacement for syslog (embedded folks,
listen up!), especially in times where the kernel log buffer is
configured ridiculously large out-of-the-box.
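
A minimal sketch of the /dev/kmsg variant (typically needs root): before exec()ing the service, the supervisor can simply point stdout and stderr at the kernel log buffer.

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main(void) {
        int fd = open("/dev/kmsg", O_WRONLY);
        if (fd < 0) { perror("open /dev/kmsg"); return 1; }

        /* From now on everything written to stdout/stderr ends up in the
         * kernel log buffer, readable with dmesg. */
        dup2(fd, STDOUT_FILENO);
        dup2(fd, STDERR_FILENO);
        if (fd > STDERR_FILENO)
                close(fd);

        printf("hello from the kernel log buffer\n");
        fflush(stdout);
        return 0;
}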

On Upstart

To begin with, let me emphasize that I actually like the code
of Upstart, it is very well commented and easy to
follow. It’s certainly something other projects should learn
from (including my own).

That being said, I can’t say I agree with the general approach
of Upstart. But first, a bit more about the project:

Upstart does not share code with sysvinit, and its
functionality is a super-set of it, and provides compatibility to
some degree with the well-known SysV init scripts. Its main
feature is its event-based approach: starting and stopping of
processes is bound to “events” happening in the system, where an
“event” can be a lot of different things, such as: a network
interface becomes available or some other software has been
started.

Upstart does service serialization via these events: if the
syslog-started event is triggered this is used as an
indication to start D-Bus since it can now make use of Syslog. And
then, when dbus-started is triggered,
NetworkManager is started, since it may now use
D-Bus, and so on.

One could say that this way the actual logical dependency tree
that exists and is understood by the admin or developer is
translated and encoded into event and action rules: every logical
“a needs b” rule that the administrator/developer is aware of
becomes a “start a when b is started” plus “stop a when b is
stopped”. In some way this certainly is a simplification:
especially for the code in Upstart itself. However I would argue
that this simplification is actually detrimental. First of all,
the logical dependency system does not go away, the person who is
writing Upstart files must now translate the dependencies manually
into these event/action rules (actually, two rules for each
dependency). So, instead of letting the computer figure out what
to do based on the dependencies, the user has to manually
translate the dependencies into simple event/action rules. Also,
because the dependency information has never been encoded it is
not available at runtime, effectively meaning that an
administrator who tries to figure out why something
happened, i.e. why a is started when b is started, has no chance
of finding that out.

Furthermore, the event logic turns all dependencies on their
head. Instead of minimizing the
amount of work (which is something that a good init system should
focus on, as pointed out in the beginning of this blog story), it
actually maximizes the amount of work to do during
operations. Or in other words, instead of having a clear goal and
only doing the things it really needs to do to reach the goal, it
does one step, and then after finishing it, it does all
steps that possibly could follow it.

Or to put it simpler: the fact that the user just started D-Bus
is in no way an indication that NetworkManager should be started
too (but this is what Upstart would do). It’s right the other way
round: when the user asks for NetworkManager, that is definitely
an indication that D-Bus should be started too (which is certainly
what most users would expect, right?).

A good init system should start only what is needed, and that
on-demand. Either lazily or parallelized and in advance. However
it should not start more than necessary, particularly not
everything installed that could use that service.

Finally, I fail to see the actual usefulness of the event
logic. It appears to me that most events that are exposed in
Upstart actually are not punctual in nature, but have duration: a
service starts, is running, and stops. A device is plugged in, is
available, and is plugged out again. A mount point is in the
process of being mounted, is fully mounted, or is being
unmounted. A power plug is plugged in, the system runs on AC, and
the power plug is pulled. Only a minority of the events an init
system or process supervisor should handle are actually punctual,
most of them are tuples of start, condition, and stop. This
information is again not available in Upstart, because it focuses
on singular events and ignores durable dependencies.

Now, I am aware that some of the issues I pointed out above are
in some way mitigated by certain more recent changes in Upstart,
particularly condition based syntaxes such as start on
(local-filesystems and net-device-up IFACE=lo)
in Upstart
rule files. However, to me this appears mostly as an attempt to
fix a system whose core design is flawed.

Besides that Upstart does OK for babysitting daemons, even though
some choices might be questionable (see above), and there are certainly a lot
of missed opportunities (see above, too).

There are other init systems besides sysvinit, Upstart and
launchd. Most of them offer little of substance beyond Upstart or
sysvinit. The most interesting other contender is Solaris SMF,
which supports proper dependencies between services. However, in
many ways it is overly complex and, let’s say, a bit academic
with its excessive use of XML and new terminology for known
things. It is also closely bound to Solaris specific features such
as the contract system.

Putting it All Together: systemd

Well, this is another good time for a little pause, because
after I have hopefully explained above what I think a good PID 1
should be doing and what the current most used system does, we’ll
now come to where the beef is. So, go and refill your coffee mug
again. It’s going to be worth it.

You probably guessed it: what I suggested above as requirements
and features for an ideal init system is actually available now,
in a (still experimental) init system called systemd, and
which I hereby want to announce. Again, here’s the
code.
And here’s a quick rundown of its features, and the
rationale behind them:

systemd starts up and supervises the entire system (hence the
name…). It implements all of the features pointed out above and
a few more. It is based around the notion of units. Units
have a name and a type. Since their configuration is usually
loaded directly from the file system, these unit names are
actually file names. Example: a unit avahi.service is
read from a configuration file by the same name, and of course
could be a unit encapsulating the Avahi daemon. There are several
kinds of units:

  1. service: these are the most obvious kind of unit:
    daemons that can be started, stopped, restarted, reloaded. For
    compatibility with SysV we not only support our own
    configuration files for services, but also are able to read
    classic SysV init scripts, in particular we parse the LSB
    header, if it exists. /etc/init.d is hence not much
    more than just another source of configuration.
  2. socket: this unit encapsulates a socket in the
    file-system or on the Internet. We currently support AF_INET,
    AF_INET6, AF_UNIX sockets of the types stream, datagram, and
    sequential packet. We also support classic FIFOs as
    transport. Each socket unit has a matching
    service unit, that is started if the first connection
    comes in on the socket or FIFO. Example: nscd.socket
    starts nscd.service on an incoming connection.
  3. device: this unit encapsulates a device in the
    Linux device tree. If a device is marked for this via udev
    rules, it will be exposed as a device unit in
    systemd. Properties set with udev can be used as
    configuration source to set dependencies for device units.
  4. mount: this unit encapsulates a mount point in the
    file system hierarchy. systemd monitors all mount points as
    they come and go, and can also be used to mount or
    unmount mount-points. /etc/fstab is used here as an
    additional configuration source for these mount points, similar to
    how SysV init scripts can be used as additional configuration
    source for service units.
  5. automount: this unit type encapsulates an automount
    point in the file system hierarchy. Each automount
    unit has a matching mount unit, which is started
    (i.e. mounted) as soon as the automount directory is
    accessed.
  6. target: this unit type is used for logical
    grouping of units: instead of actually doing anything by itself
    it simply references other units, which thereby can be controlled
    together. Examples for this are: multi-user.target,
    which is a target that basically plays the role of run-level 5 on
    a classic SysV system, or bluetooth.target which is
    requested as soon as a bluetooth dongle becomes available and
    which simply pulls in bluetooth related services that otherwise
    would not need to be started: bluetoothd and
    obexd and suchlike.
  7. snapshot: similar to target units
    snapshots do not actually do anything themselves and their only
    purpose is to reference other units. Snapshots can be used to
    save/rollback the state of all services and units of the init
    system. Primarily it has two intended use cases: to allow the
    user to temporarily enter a specific state such as “Emergency
    Shell”, terminating current services, and provide an easy way to
    return to the state before, pulling up all services again that
    got temporarily pulled down. And to ease support for system
    suspending: still many services cannot correctly deal with
    system suspend, and it is often a better idea to shut them down
    before suspend, and restore them afterwards.

All these units can have dependencies between each other (both
positive and negative, i.e. ‘Requires’ and ‘Conflicts’): a device
can have a dependency on a service, meaning that as soon as a
device becomes available a certain service is started. Mounts get
an implicit dependency on the device they are mounted from. Mounts
also gets implicit dependencies to mounts that are their prefixes
(i.e. a mount /home/lennart implicitly gets a dependency
added to the mount for /home) and so on.

A short list of other features:

  1. For each process that is spawned, you may control: the
    environment, resource limits, working and root directory, umask,
    OOM killer adjustment, nice level, IO class and priority, CPU policy
    and priority, CPU affinity, timer slack, user id, group id,
    supplementary group ids, readable/writable/inaccessible
    directories, shared/private/slave mount flags,
    capabilities/bounding set, secure bits, CPU scheduler reset on
    fork, private /tmp name-space, cgroup control for
    various subsystems. Also, you can easily connect
    stdin/stdout/stderr of services to syslog, /dev/kmsg,
    arbitrary TTYs. If connected to a TTY for input systemd will make
    sure a process gets exclusive access, optionally waiting or enforcing
    it.
  2. Every executed process gets its own cgroup (currently by
    default in the debug subsystem, since that subsystem is not
    otherwise used and does not much more than the most basic
    process grouping), and it is very easy to configure systemd to
    place services in cgroups that have been configured externally,
    for example via the libcgroups utilities.
  3. The native configuration files use a syntax that closely
    follows the well-known .desktop files. It is a simple syntax for
    which parsers exist already in many software frameworks. Also, this
    allows us to rely on existing tools for i18n for service
    descriptions, and similar. Administrators and developers don’t
    need to learn a new syntax.
  4. As mentioned, we provide compatibility with SysV init
    scripts. We take advantage of LSB and Red Hat chkconfig headers
    if they are available. If they aren’t we try to make the best of
    the otherwise available information, such as the start
    priorities in /etc/rc.d. These init scripts are simply
    considered a different source of configuration, hence an easy
    upgrade path to proper systemd services is available. Optionally
    we can read classic PID files for services to identify the main
    pid of a daemon. Note that we make use of the dependency
    information from the LSB init script headers, and translate
    those into native systemd dependencies. Side note: Upstart is
    unable to harvest and make use of that information. Boot-up on a
    plain Upstart system with mostly LSB SysV init scripts will
    hence not be parallelized, a similar system running systemd
    however will. In fact, for Upstart all SysV scripts together
    make one job that is executed, they are not treated
    individually, again in contrast to systemd where SysV init
    scripts are just another source of configuration and are all
    treated and controlled individually, much like any other native
    systemd service.
  5. Similarly, we read the existing /etc/fstab
    configuration file, and consider it just another source of
    configuration. Using the comment= fstab option you can
    even mark /etc/fstab entries to become systemd
    controlled automount points.
  6. If the same unit is configured in multiple configuration
    sources (e.g. /etc/systemd/system/avahi.service exists,
    and /etc/init.d/avahi too), then the native
    configuration will always take precedence, the legacy format is
    ignored, allowing an easy upgrade path and packages to carry
    both a SysV init script and a systemd service file for a
    while.
  7. We support a simple templating/instance mechanism. Example:
    instead of having six configuration files for six gettys, we
    only have one getty@.service file which gets instantiated to
    getty@tty2.service and suchlike. The interface part can
    even be inherited by dependency expressions, i.e. it is easy to
    encode that a service dhcpcd@eth0.service pulls in
    avahi-autoipd@eth0.service, while leaving the
    eth0 string wild-carded.
  8. For socket activation we support full compatibility with the
    traditional inetd modes, as well as a very simple mode that
    tries to mimic launchd socket activation and is recommended for
    new services. The inetd mode only allows passing one socket to
    the started daemon, while the native mode supports passing
    arbitrary numbers of file descriptors. We also support one
    instance per connection, as well as one instance for all
    connections modes. In the former mode we name the cgroup the
    daemon will be started in after the connection parameters, and
    utilize the templating logic mentioned above for this. Example:
    sshd.socket might spawn services
    sshd@192.168.0.1-4711-192.168.0.2-22.service with a
    cgroup of sshd@.service/192.168.0.1-4711-192.168.0.2-22
    (i.e. the IP address and port numbers are used in the instance
    names. For AF_UNIX sockets we use PID and user id of the
    connecting client). This provides a nice way for the
    administrator to identify the various instances of a daemon and
    control their runtime individually. The native socket passing
    mode is very easily implementable in applications: if
    $LISTEN_FDS is set it contains the number of sockets
    passed and the daemon will find them sorted as listed in the
    .service file, starting from file descriptor 3 (a
    nicely written daemon could also use fstat() and
    getsockname() to identify the sockets in case it
    receives more than one). In addition we set $LISTEN_PID
    to the PID of the daemon that shall receive the fds, because
    environment variables are normally inherited by sub-processes and
    hence could confuse processes further down the chain. Even
    though this socket passing logic is very simple to implement in
    daemons, we will provide a BSD-licensed reference implementation
    that shows how to do this. We have ported a couple of existing
    daemons to this new scheme.
  9. We provide compatibility with /dev/initctl to a
    certain extent. This compatibility is in fact implemented with a
    FIFO-activated service, which simply translates these legacy
    requests to D-Bus requests. Effectively this means the old
    shutdown, poweroff and similar commands from
    Upstart and sysvinit continue to work with
    systemd.
  10. We also provide compatibility with utmp and
    wtmp. Possibly even to an extent that is far more
    than healthy, given how crufty utmp and wtmp
    are.
  11. systemd supports several kinds of
    dependencies between units. After/Before can be used to fix
    the ordering how units are activated. It is completely
    orthogonal to Requires and Wants, which
    express a positive requirement dependency, either mandatory, or
    optional. Then, there is Conflicts which
    expresses a negative requirement dependency. Finally, there are
    three further, less used dependency types.
  12. systemd has a minimal transaction system. Meaning: if a unit
    is requested to start up or shut down we will add it and all its
    dependencies to a temporary transaction. Then, we will
    verify if the transaction is consistent (i.e. whether the
    ordering via After/Before of all units is
    cycle-free). If it is not, systemd will try to fix it up, and
    removes non-essential jobs from the transaction that might
    remove the loop. Also, systemd tries to suppress non-essential
    jobs in the transaction that would stop a running
    service. Non-essential jobs are those which the original request
    did not directly include but which were pulled in by
    Wants type of dependencies. Finally we check whether
    the jobs of the transaction contradict jobs that have already
    been queued, and optionally the transaction is aborted then. If
    all worked out and the transaction is consistent and minimized
    in its impact it is merged with all already outstanding jobs and
    added to the run queue. Effectively this means that before
    executing a requested operation, we will verify that it makes
    sense, fixing it if possible, and only failing if it really cannot
    work.
  13. We record start/exit time as well as the PID and exit status
    of every process we spawn and supervise. This data can be used
    to cross-link daemons with their data in abrtd, auditd and
    syslog. Think of a UI that will highlight crashed daemons for
    you, and allows you to easily navigate to the respective UIs for
    syslog, abrt, and auditd that will show the data generated from
    and for this daemon on a specific run.
  14. We support reexecution of the init process itself at any
    time. The daemon state is serialized before the reexecution and
    deserialized afterwards. That way we provide a simple way to
    facilitate init system upgrades as well as handover from an
    initrd daemon to the final daemon. Open sockets and autofs
    mounts are properly serialized away, so that they stay
    connectible all the time, in a way that clients will not even
    notice that the init system reexecuted itself. Also, the fact
    that a big part of the service state is encoded anyway in the
    cgroup virtual file system would even allow us to resume
    execution without access to the serialization data. The
    reexecution code paths are actually mostly the same as the init
    system configuration reloading code paths, which
    guarantees that reexecution (which is probably more seldom
    triggered) gets similar testing as reloading (which is probably
    more common).
  15. Starting the work of removing shell scripts from the boot
    process we have recoded part of the basic system setup in C and
    moved it directly into systemd. Among that is mounting of the API
    file systems (i.e. virtual file systems such as /proc,
    /sys and /dev.) and setting of the
    host-name.
  16. Server state is introspectable and controllable via
    D-Bus. This is not complete yet but quite extensive.
  17. While we want to emphasize socket-based and bus-name-based
    activation, and we hence support dependencies between sockets and
    services, we also support traditional inter-service
    dependencies. We support multiple ways in which such a service can
    signal its readiness: by forking and having the start process
    exit (i.e. traditional daemonize() behaviour), as well
    as by watching the bus until a configured service name appears.
  18. There’s an interactive mode which asks for confirmation each
    time a process is spawned by systemd. You may enable it by
    passing systemd.confirm_spawn=1 on the kernel command
    line.
  19. With the systemd.default= kernel command line
    parameter you can specify which unit systemd should start on
    boot-up. Normally you’d specify something like
    multi-user.target here, but another choice could even
    be a single service instead of a target, for example
    out-of-the-box we ship a service emergency.service that
    is similar in its usefulness as init=/bin/bash, however
    has the advantage of actually running the init system, hence
    offering the option to boot up the full system from the
    emergency shell.
  20. There’s a minimal UI that allows you to
    start/stop/introspect services. It’s far from complete but
    useful as a debugging tool. It’s written in Vala (yay!) and goes
    by the name of systemadm.

It should be noted that systemd uses many Linux-specific
features, and does not limit itself to POSIX. That unlocks a lot
of functionality a system that is designed for portability to
other operating systems cannot provide.

Status

All the features listed above are already implemented. Right
now systemd can already be used as a drop-in replacement for
Upstart and sysvinit (at least as long as there aren’t too many
native upstart services yet. Thankfully most distributions don’t
carry too many native Upstart services yet.)

However, testing has been minimal, our version number is
currently at an impressive 0. Expect breakage if you run this in
its current state. That said, overall it should be quite stable
and some of us already boot their normal development systems with
systemd (in contrast to VMs only). YMMV, especially if you try
this on distributions we developers don’t use.

Where is This Going?

The feature set described above is certainly already
comprehensive. However, we have a few more things on our plate. I
don’t really like speaking too much about big plans but here’s a
short overview in which direction we will be pushing this:

We want to add at least two more unit types: swap
shall be used to control swap devices the same way we
already control mounts, i.e. with automatic dependencies on the
device tree devices they are activated from, and
suchlike. timer shall provide functionality similar to
cron, i.e. starts services based on time events, the
focus being both monotonic clock and wall-clock/calendar
events. (i.e. “start this 5h after it last ran” as well as “start
this every monday 5 am”)

More importantly however, it is also our plan to experiment with
systemd not only for optimizing boot times, but also to make it
the ideal session manager, to replace (or possibly just augment)
gnome-session, kdeinit and similar daemons. The problem sets of a
session manager and an init system are very similar: quick start-up
is essential and babysitting processes is the focus. Using the same
code for both uses hence suggests itself. Apple recognized that
and does just that with launchd. And so should we: socket and bus
based activation and parallelization is something session services
and system services can benefit from equally.

I should probably note that all three of these features are
already partially available in the current code base, but not
complete yet. For example, you can already run systemd just fine
as a normal user, and it will detect that it is run that way;
support for this mode has been available since the very beginning,
and is in the very core. (It is also exceptionally useful for
debugging! This works fine even without having the system
otherwise converted to systemd for booting.)

However, there are some things we probably should fix in the
kernel and elsewhere before finishing work on this: we
need swap status change notifications from the kernel similar to
how we can already subscribe to mount changes; we want a
notification when CLOCK_REALTIME jumps relative to
CLOCK_MONOTONIC; we want to allow normal processes to get
some init-like powers
; we need a well-defined
place where we can put user sockets
. None of these issues are
really essential for systemd, but they’d certainly improve
things.

You Want to See This in Action?

Currently, there are no tarball releases, but it should be
straightforward to check out the code from our
repository
. In addition, to have something to start with, here’s
a tarball with unit configuration files
that allows an
otherwise unmodified Fedora 13 system to work with systemd. We
have no RPMs to offer you for now.

An easier way is to download this Fedora 13 qemu image, which
has been prepared for systemd. In the grub menu you can select
whether you want to boot the system with Upstart or systemd. Note
that this system is only minimally modified. Service information
is read exclusively from the existing SysV init scripts. Hence it
will not take advantage of the full socket and bus-based
parallelization pointed out above, however it will interpret the
parallelization hints from the LSB headers, and hence boots faster
than the Upstart system, which in Fedora does not employ any
parallelization at the moment. The image is configured to output
debug information on the serial console, as well as writing it to
the kernel log buffer (which you may access with dmesg).
You might want to run qemu configured with a virtual
serial terminal. All passwords are set to systemd.

Even simpler than downloading and booting the qemu image is
looking at pretty screen-shots. Since an init system usually is
well hidden beneath the user interface, some shots of
systemadm and ps must do:

systemadm

That’s systemadm showing all loaded units, with more detailed
information on one of the getty instances.

ps

That’s an excerpt of the output of ps xaf -eo
pid,user,args,cgroup
showing how neatly the processes are
sorted into the cgroup of their service. (The fourth column is the
cgroup, the debug: prefix is shown because we use the
debug cgroup controller for systemd, as mentioned earlier. This is
only temporary.)

Note that both of these screenshots show an only minimally
modified Fedora 13 Live CD installation, where services are
exclusively loaded from the existing SysV init scripts. Hence,
this does not use socket or bus activation for any existing
service.

Sorry, no bootcharts or hard data on start-up times for the
moment. We’ll publish that as soon as we have fully parallelized
all services from the default Fedora install. Then, we’ll welcome
you to benchmark the systemd approach, and provide our own
benchmark data as well.

Well, presumably everybody will keep bugging me about this, so
here are two numbers I’ll tell you. However, they are completely
unscientific as they are measured for a VM (single CPU) and by
using the stop timer in my watch. Fedora 13 booting up with
Upstart takes 27s, with systemd we reach 24s (from grub to gdm,
same system, same settings, shorter value of two bootups, one
immediately following the other). Note however that this shows
nothing more than the speedup effect reached by using the LSB
dependency information parsed from the init script headers for
parallelization. Socket or bus based activation was not utilized
for this, and hence these numbers are unsuitable to assess the
ideas pointed out above. Also, systemd was set to debug verbosity
levels on a serial console. So again, this benchmark data has
barely any value.

Writing Daemons

An ideal daemon for use with systemd does a few things
differently than things were traditionally done. Later on, we will
publish a longer guide explaining and suggesting how to write a daemon for use
with systemd. Basically, things get simpler for daemon
developers:

  • We ask daemon writers not to fork or even double fork
    in their processes, but run their event loop from the initial process
    systemd starts for you. Also, don’t call setsid().
  • Don’t drop user privileges in the daemon itself, leave this
    to systemd and configure it in systemd service configuration
    files. (There are exceptions here. For example, for some daemons
    there are good reasons to drop privileges inside the daemon
    code, after an initialization phase that requires elevated
    privileges.)
  • Don’t write PID files
  • Grab a name on the bus
  • You may rely on systemd for logging: you are welcome to log
    whatever you need to stderr.
  • Let systemd create and watch sockets for you, so that socket
    activation works. Hence, interpret $LISTEN_FDS and
    $LISTEN_PID as described above (a minimal sketch of this
    follows right after this list).
  • Use SIGTERM for requesting shutdowns from your daemon.
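
Here is a minimal C sketch of the socket activation point from the list above, i.e. how a daemon could interpret $LISTEN_PID and $LISTEN_FDS; error handling is kept to a minimum, and the BSD-licensed reference implementation mentioned earlier will be more thorough.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* As described above, passed file descriptors start right after stderr. */
#define LISTEN_FDS_START 3

static int listen_fds(void) {
        const char *pid_str = getenv("LISTEN_PID");
        const char *fds_str = getenv("LISTEN_FDS");

        /* Only honour the variables if they are really meant for us. */
        if (!pid_str || !fds_str || (pid_t) atol(pid_str) != getpid())
                return 0;

        return atoi(fds_str);
}

int main(void) {
        int n = listen_fds();
        int i;

        if (n <= 0) {
                fprintf(stderr, "not socket activated; would now create sockets myself\n");
                return 0;
        }

        for (i = 0; i < n; i++) {
                int fd = LISTEN_FDS_START + i;
                /* A real daemon would add fd to its event loop and accept()
                 * connections from it; getsockname() can tell the sockets
                 * apart if more than one was passed. */
                fprintf(stderr, "received listening socket on fd %d\n", fd);
        }

        return 0;
}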

The list above is very similar to what Apple
recommends for daemons compatible with launchd
. It should be
easy to extend daemons that already support launchd
activation to support systemd activation as well.

Note that systemd also supports daemons not written in this style
perfectly well, if only for compatibility reasons (launchd has
only limited support for that). As mentioned, this even extends to
existing inetd capable daemons which can be used unmodified for
socket activation by systemd.

So, yes, should systemd prove itself in our experiments and get
adopted by the distributions it would make sense to port at least
those services that are started by default to use socket or
bus-based activation. We have
written proof-of-concept patches
, and the porting turned out
to be very easy. Also, we can leverage the work that has already
been done for launchd, to a certain extent. Moreover, adding
support for socket-based activation does not make the service
incompatible with non-systemd systems.

FAQs

Who’s behind this?
Well, the current code-base is mostly my work, Lennart
Poettering (Red Hat). However the design in all its details is
result of close cooperation between Kay Sievers (Novell) and
me. Other people involved are Harald Hoyer (Red Hat), Dhaval
Giani (Formerly IBM), and a few others from various
companies such as Intel, SUSE and Nokia.
Is this a Red Hat project?
No, this is my personal side project. Also, let me emphasize
this: the opinions reflected here are my own. They are not
the views of my employer, or Ronald McDonald, or anyone
else.
Will this come to Fedora?
If our experiments prove that this approach works out, and
discussions in the Fedora community show support for this, then
yes, we’ll certainly try to get this into Fedora.
Will this come to OpenSUSE?
Kay’s pursuing that, so something similar to the Fedora case applies here, too.
Will this come to Debian/Gentoo/Mandriva/MeeGo/Ubuntu/[insert your favourite distro here]?
That’s up to them. We’d certainly welcome their interest, and help with the integration.
Why didn’t you just add this to Upstart, why did you invent something new?
Well, the point of the part about Upstart above was to show
that the core design of Upstart is flawed, in our
opinion. Starting completely from scratch suggests itself if the
existing solution appears flawed in its core. However, note that
we took a lot of inspiration from Upstart’s code-base
otherwise.
If you love Apple launchd so much, why not adopt that?
launchd is a great invention, but I am not convinced that it
would fit well into Linux, nor that it is suitable for a system
like Linux with its immense scalability and flexibility to
numerous purposes and uses.
Is this an NIH project?
Well, I hope that I managed to explain in the text above why
we came up with something new, instead of building on Upstart or
launchd. We came up with systemd due to technical
reasons, not political reasons.
Don’t forget that it is Upstart that includes
a library called NIH
(which is kind of a reimplementation of glib) — not systemd!
Will this run on [insert non-Linux OS here]?
Unlikely. As pointed out, systemd uses many Linux specific
APIs (such as epoll, signalfd, libudev, cgroups, and numerous
more), a port to other operating systems appears to us as not
making a lot of sense. Also, we, the people involved are
unlikely to be interested in merging possible ports to other
platforms and work with the constraints this introduces. That said,
git supports branches and rebasing quite well, in case
people really want to do a port.
Actually portability is even more limited than just to other OSes: we require a very
recent Linux kernel, glibc, libcgroup and libudev. No support for
less-than-current Linux systems, sorry.
If folks want to implement something similar for other
operating systems, the preferred mode of cooperation is probably
that we help you identify which interfaces can be shared with
your system, to make life easier for daemon writers to support
both systemd and your counterpart. Probably, the focus should be
to share interfaces, not code.
I hear [fill one in here: the Gentoo boot system, initng,
Solaris SMF, runit, uxlaunch, …] is an awesome init system and
also does parallel boot-up, so why not adopt that?
Well, before we started this we actually had a very close
look at the various systems, and none of them did what we had in
mind for systemd (with the exception of launchd, of course). If
you cannot see that, then please read again what I wrote
above.

Contributions

We are very interested in patches and help. It should be common
sense that every Free Software project can only benefit from the
widest possible external contributions. That is particularly true
for a core part of the OS, such as an init system. We value your
contributions and hence do not require copyright assignment (Very
much unlike Canonical/Upstart
!). And also, we use git,
everybody’s favourite VCS, yay!

We are particularly interested in help getting systemd to work
on other distributions, besides Fedora and OpenSUSE. (Hey, anybody
from Debian, Gentoo, Mandriva, MeeGo looking for something to do?)
But even beyond that we are keen to attract contributors on every
level: we welcome C hackers and packagers, as well as folks interested
in writing documentation or contributing a logo.

Community

At this time we only have a source code
repository
and an IRC channel (#systemd on
Freenode). There’s no mailing list, web site or bug tracking
system. We’ll probably set something up on freedesktop.org
soon. If you have any questions or want to contact us otherwise we
invite you to join us on IRC!

Update: our GIT repository has moved.

How to convert a GIT SVN mirror into GIT upstream

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/projects/git-mirror-to-upstream.html

Yesterday I did the final steps to convert all my SVN repositories to GIT (including Avahi and PulseAudio). I had been running hot GIT mirrors of the SVN repositories for quite a while now. The last step was the switch to make them canonical upstream, and to disable the SVN repos.

For future Google reference, here are the steps that are necessary to make an SVN GIT mirror into a proper GIT repo:

# On the client:
$ git clone ssh://..../git/foobar foobar
$ cd foobar
$ git checkout trunk
$ git branch -m master
$ git push origin master
# This is a good time to edit the HEAD file on the server and replace its contents "ref: refs/heads/trunk" by "ref: refs/heads/master"
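# (Alternatively, running "git symbolic-ref HEAD refs/heads/master" in the bare
# repository on the server should achieve the same thing.)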
$ git push origin :trunk

This will basically replace ‘trunk’ by ‘master’, and make it
the default when clients clone the repository. This will however not rename
tags from the git-svn style to the GIT style. (Which I personally
think would be a bad idea anyway, BTW)

Removing the origin from the server’s config file is a good idea, too, since the repo is now canonical upstream.

Of course, afterwards you still need to create proper .gitignore
files for the repositories. Just taking the value of the old
svn:ignore property is a bad idea BTW, because .gitignore
lists patterns that are used for the directory it is placed in and
everything beneath, while svn:ignore is not applied
recursively.

And finally you need to remove all those $Id$ lines and suchlike
from all source files since they are kind of pointless on GIT. It is left as an
exercise to the user to craft a good sed or perl script to do this
automatically and recursively for an entire tree.
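
One way to approach that exercise might look roughly like the following. This is only a sketch, so review the diff before committing; such a simple pattern can also hit innocent lines that merely mention the keyword:

# Delete every line containing an SVN $Id$ keyword from all tracked files.
git ls-files -z | xargs -0 sed -i '/\$Id[$:]/d'
git diff            # sanity-check what got removed
git commit -a -m 'Remove obsolete SVN $Id$ keywords'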

Lazyweb, do you have a good idea how best to integrate mutt and
git-am? I want a key in mutt I can press which will ask me for a
GIT directory and then call git-am --interactive for the currently
selected email. Anyone got a good idea? Right now I am piping the mail from
mutt to git-am. But that sucks, because --interactive refuses to work when called like that
and because I cannot specify the git repo to apply this to.
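
One possible (untested) approach is a small wrapper script that mutt pipes the message to; the script asks for the repository on the terminal and then runs git am there. The script name, the prompt and the suggested key binding below are all invented for illustration:

#!/bin/sh
# mutt-git-am: read a patch mail on stdin, ask which repository to apply it
# to, then run git am there. Because the mail is buffered in a file, stdin is
# free again afterwards, so --interactive could even be passed to git am.
set -e

printf 'Apply patch in which git repository? ' > /dev/tty
read -r repo < /dev/tty

tmp=$(mktemp)
trap 'rm -f "$tmp"' EXIT
cat > "$tmp"

cd "$repo"
git am "$tmp"

A .muttrc binding along the lines of macro index ,a "<pipe-message>mutt-git-am<enter>" could then put this on a key.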

Virtually Reluctant

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2007/06/12/virtually-reluctant.html

Way back when User
Mode Linux (UML)
was the “only way” the Free Software
world did anything like virtualization, I was already skeptical.
Those of us who lived through the coming of age of Internet security
— with a remote root exploit for every day of the week —
became obsessed with the chroot and its ultimate limitations. Each
possible upgrade to a better, more robust virtual environment was met
with suspicion on the security front. I joined the many who doubted
that you could truly secure a machine that offered disjoint services
provisioned on the same physical machine. I’ve recently revisited
this position. I won’t say that Xen has completely changed my mind,
but I am open-minded enough again to experiment.

For more than a decade, I have used chroots as a mechanism to segment a
service that needed to run on a given box. In the old days
of ancient BINDs and sendmails, this was often the best we could do
when living with a program we didn’t fully trust to be clean of
remotely exploitable bugs.

I suppose those days gave us all a rather strange sense of computer
security. I constantly have the sense that two services running on
the same box always endanger each other in some fundamental way. It
therefore took me a while before I was comfortable with the resurgence
of virtualization.

However, what ultimately drew me in was the simple fact that modern
hardware is just too darn fast. It’s tough to get a machine these
days that isn’t ridiculously overpowered for most tasks you put in
front of it. CPUs sit idle; RAM sits empty. We should make more
efficient use of the hardware we have.

Even with that reality, I might have given up if it wasn’t so easy. I
found a good link about Debian on Xen, a useful entry in the Xen Wiki,
and some good network and LVM examples. I also quickly learned how to
use RAID/LVM together for disk redundancy inside Xen instances. I even
got bonded ethernet working with some help to add additional network
redundancy.

So, one Saturday morning, I headed into the office, and left that
afternoon with two virtual servers running. It helped that Xen 3.0 is
packaged properly for recent Ubuntu versions, and a few obvious
apt-get installs get you what you need on edgy and
feisty. In fact, I only struggled (and only just a bit) with the
network, but quickly discovered two important facts:

  • VIF network routing in my opinion is a bit easier to configure and
    more stable than VIF bridging, even if routing is a bit slower.
  • sysctl -w net.ipv4.conf.DEVICE.proxy_arp=1 is needed to
    make the network routing down into the instances work properly.

I’m not completely comfortable yet with the security of virtualization.
Of course, locking down the Dom0 is absolutely essential, because
that is where the keys to your virtual kingdom lie. I lock it down with
iptables so that only SSH from a few trusted hosts comes
in, and even services as fundamental as DNS can only be had from a few
trusted places. But, I still find myself imagining ways people can
bust through the instance kernels and find their way to the
hypervisor.
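
To make that a little more concrete, a ruleset of roughly the following shape does what the paragraph describes; the addresses, and the snippet as a whole, are invented for illustration and are not the actual configuration:

# Default-deny incoming traffic on the Dom0; keep loopback and established flows.
iptables -P INPUT DROP
iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT

# SSH only from a couple of trusted hosts.
iptables -A INPUT -p tcp --dport 22 -s 192.0.2.10 -j ACCEPT
iptables -A INPUT -p tcp --dport 22 -s 192.0.2.11 -j ACCEPT

# Only talk DNS to one trusted resolver.
iptables -A OUTPUT -p udp --dport 53 -d 192.0.2.53 -j ACCEPT
iptables -A OUTPUT -p udp --dport 53 -j DROP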

I’d really love to see a strong line-by-line code audit of the
hypervisor and related utilities to be sure we’ve got something we can
trust. However, in the meantime, I certainly have been sold on the
value of this approach, and am glad it’s so easy to set up.

My thoughts on the future of Gnome-VFS

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/projects/gnomevfs-future.html

One of the major construction sites in GNOME and all the other free
desktop environments is the VFS abstraction. Recently, there has been
some discussion about developing DVFS
as a replacement for the venerable Gnome-VFS system. Here are my 5 euro
cent on this issue (Yepp, I am not fully up-to-date on the whole DVFS
discussion, but during my flight from HEL to HAM I wrote this up,
without being necessarily too well informed, lacking an Internet
connection. Hence, if you find that I am an uninformed idiot, you’re of
course welcome to flame me!):

First of all, we have to acknowledge that Gnome-VFS never achieved any major
adoption besides some core (not even all) GNOME applications. The reasons are
many, among them: the API wasn’t all fun, using Gnome-VFS added another
dependency to applications, KDE uses a different abstraction (KIO), and many
others. Adoption was suboptimal, and due to that user experience was
suboptimal, too (to say the least).

One of the basic problems of Gnome-VFS is that it is a (somewhat) redundant
abstraction layer over yet another abstraction layer. Gnome-VFS makes available
an API that offers more or less the same functionality as the (most of the
time) underlying POSIX API. The POSIX API is relatively easy to use,
portable and very well accepted. The same is not true for
Gnome-VFS. Semantics of the translation between Gnome-VFS and POSIX are not
always that clear. Paths understood by Gnome-VFS (URLs) follow a different
model than those of the Linux kernel. Applications which understand Gnome-VFS
can deal with FTP and HTTP resources, while the majority of applications,
which do not link against Gnome-VFS, cannot. Integration of
Gnome-VFS-speaking and POSIX-speaking applications is difficult and most of the
time only partially implementable.

So, in short: On one side we have the POSIX API, which is a file system
abstraction API. And a (kernel-based) virtual file system behind it. And on the other
side we have the Gnome-VFS API which is also a file system abstraction API and
a virtual file system behind it. Hence, why did we decide to standardize on
Gnome-VFS, and not just on POSIX?

The major reason of course is that until recently accessing FTP,
HTTP and other protocol shares through the POSIX API was not doable
without special kernel patches. However, a while ago the FUSE system
has been merged into the Linux kernel and has been made available for
other operating systems as well, among them FreeBSD and MacOS X. This
allows implementing file system drivers in userspace. Currently there
are all kinds of these FUSE based file systems around, FTP and SSHFS
are only two of them. My very own fusedav tool
implements WebDAV for FUSE.

Another (*the* other?) major problem of the POSIX file system API is
its synchronous design. While that is usually not a problem for local
file systems and for high-speed network file systems such as NFS it
becomes a problem for slow network FSs such as HTTP or FTP. Having the
GUI block for various seconds while an application saves its documents
is certainly not user friendly. But, can this be fixed? Yes, definitely, it can!
Firstly, there already is the POSIX AIO interface — which however is
quite unfriendly to use (one reason is its use of Unix signals for
notification of completed IO operations). Secondly, the (Linux) kernel
people are working on a better asynchronous IO API (see the
syslets/fibrils discussion). Unfortunately it will take a while
before that new API will finally be available in upstream
kernels. However, there’s always the third solution: add an
asynchronous API entirely in userspace. This is doable in a clean (and
glib-ified) fashion: have a couple of worker threads which
(synchronously) execute the various POSIX file system functions and
add a nice, asynchronous API that can start and stop these threads,
feed them operations to execute, and so on.
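
To illustrate the idea, here is a minimal, purely hypothetical sketch (all names are invented; this is not a real or proposed API): a blocking stat() is executed in a GThreadPool and the result is handed back to the GLib main loop via g_idle_add(). On the GLib of that era you would additionally call g_thread_init(NULL) before using threads.

#include <glib.h>
#include <sys/stat.h>
#include <errno.h>

typedef struct {
    char *path;
    struct stat st;
    int error;                /* errno of the stat(), or 0 on success */
    GMainLoop *loop;          /* only so this toy example can quit */
} StatRequest;

static gboolean deliver_result(gpointer data) {
    StatRequest *req = data;

    /* Back in the main loop: hand the result to the application. */
    g_print("%s: %s\n", req->path, req->error ? g_strerror(req->error) : "ok");

    g_main_loop_quit(req->loop);
    g_free(req->path);
    g_free(req);
    return FALSE;             /* one-shot idle source */
}

static void worker(gpointer data, gpointer user_data) {
    StatRequest *req = data;

    /* The blocking POSIX call happens here, in a pool thread. */
    req->error = stat(req->path, &req->st) < 0 ? errno : 0;
    g_idle_add(deliver_result, req);
}

int main(int argc, char *argv[]) {
    GMainLoop *loop = g_main_loop_new(NULL, FALSE);
    GThreadPool *pool = g_thread_pool_new(worker, NULL, 4, FALSE, NULL);
    StatRequest *req = g_new0(StatRequest, 1);

    req->path = g_strdup(argc > 1 ? argv[1] : "/etc/passwd");
    req->loop = loop;
    g_thread_pool_push(pool, req, NULL);

    g_main_loop_run(loop);
    return 0;
}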

So, what’s the grand solution I propose for the desktop VFS mess? First, kick
Gnome-VFS entirely and don’t replace it. Instead write a small D-Bus-accessible
daemon that organizes a special directory ~/net/. Populate that
directory with subdirectories for all WebDAV, FTP, NFS and SMB shares that can
be found on the local network using both Avahi-based browsing and native SMB
browsing. Now use the Linux automounting interface on top of that directory and
automount the respective share every time someone wants to access it. For
shares that are not announced via Avahi/Samba, add some D-Bus API (and a nice
UI) for adding arbitrary shares. NFS and CIFS/SMB shares are mounted with the
fast, optimized kernel filesystem implementation; WebDAV and FTP on the other
hand are accessed via userspace FUSE-based file systems. The latter should also
integrate with D-BUS in some way, to query the user nicely for access
credentials and suchlike, with gnome-keyring support and everything.

~/net/ itself can — but probably doesn’t need to — be a FUSE
filesystem itself.

A shared library should be made available that implements a few
remaining things that are not directly available in the POSIX file
system API:

  • As mentioned, some nice Glib-ish asynchronous POSIX file system
    API wrapper
  • High-level file system operations such as copying, moving,
    deleting (trash!) which show a nice GUI when they are long-running
    operations.
  • An API to translate and set up URL <-> filesystem
    mappings, i.e. something that translates
    ftp://test.local/a/certain/path/ to
    ~/net/ftp:test.local/a/certain/path and vice versa (and
    probably also to a more user-friendly notation, maybe like “FTP Share
    on test.local” or similar); a tiny sketch of such a mapping follows
    after this list. (Needs to communicate with the ~/net/
    handling daemon to set up mappings if required)
  • Meta data extraction. It makes sense to integrate that with
    extended attribute support (EA) in the kernel file system layer, which should be used more often anyway.
  • Explicit mount operations (in contrast to implicit mounts, that
    are done through automounting) (this also needs to communicate with
    the ~/net/ daemon in some way)
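
Here is the tiny sketch of the URL <-> path mapping promised in the list above; it is purely illustrative, no such library exists and the function name is made up:

#include <glib.h>
#include <string.h>

/* "ftp://test.local/a/certain/path/"  ->  "$HOME/net/ftp:test.local/a/certain/path/" */
static char *url_to_net_path(const char *url) {
    const char *sep = strstr(url, "://");

    if (sep == NULL)
        return NULL;                         /* not a scheme://host/path URL */

    return g_strdup_printf("%s/net/%.*s:%s",
                           g_get_home_dir(),
                           (int) (sep - url), url, /* the scheme, e.g. "ftp" */
                           sep + 3);               /* host and path */
}

int main(void) {
    char *p = url_to_net_path("ftp://test.local/a/certain/path/");
    g_print("%s\n", p);
    g_free(p);
    return 0;
}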

Et voilà! Without a lot of new code you get a nice, asynchronous,
modern, well integrated file system, that doesn’t suck. (or at least,
it doesn’t suck as much as other solutions).

Also, this way we can escape the “abstraction trap”. Let’s KDE play
the abstraction game, maybe they’ll grow up eventually and learn that
abstracting abstracted abstraction layers is child’s play.

Yeah, sure, this proposed solution also has a few drawbacks, but so be it. Here’s a short, incomplete list:

  • The POSIX file system API sucks for file systems that don’t have “inodes” or that are attached to a specific user session. — Yes, sure, but both problems have been overcome by the FUSE project, at least partially.
  • Not that portable — Yes, but FUSE is now available for many systems besides Linux. The automount project is the bigger problem. But all you lose if you run this proposed system on those (let’s say “legacy”) systems that don’t have FUSE or automounting is access to FTP and WebDAV shares. So what? Local files can still be accessed.
  • Translating between URLs and $HOME/net/ based paths sucks — yepp, it does. But much less than not being able to access FTP/WebDAV shares from some apps but not from others, as we have it right now.
  • Bah, you suck — Yes, I do. On a straw, taking a nip from my caipirinha, right at the moment.

I guess I don’t have to list all the advantages of this solution, do I?

BTW, pumping massive amounts of data through D-Bus sucks anyway.

And no, I am not going to hack on this. Too busy with other stuff.

The plane is now landing in HAM; that shall conclude our small rant.

Update: No, I didn’t get a Caipirinha during my flight. I added that line
just before publishing the blog story, which was when I was drinking my
Caipirinha. In contrast to other people from the Free Software community I don’t
own my own private jet yet, with two stewardesses that might fix me a
Caipirinha.

Avahify Your Application!

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/projects/avahify-your-app.html

It has never been easier to add Zeroconf service discovery support to your GTK application!

The upcoming Avahi 0.6.18 will ship with a
new library libavahi-ui which contains a GTK UI dialog
AuiServiceDialog, a simple and easy-to-use dialog for
selecting Zeroconf services, similar in style to GtkFileChooserDialog and friends. This dialog should be used whenever the user has to enter an IP
server in a GTK GUI. For example:

  • Mail applications such as Evolution may use it to browse for POP3, POP3S, IMAP, IMAPS and SMTP servers.
  • VNC applications may use it to browse for VNC/RFB servers
  • Database clients such as Glom may use it to browse for PostgreSQL servers
  • FTP clients may use it to browse for FTP servers
  • RSS readers may use it to browse for local RSS feeds

So, what does it look like? Here’s a screenshot of a service dialog browsing for FTP, SFTP and WebDAV shares simultaneously:

Service Dialog

The dialog properly supports browsing in remote domains, browsing for
multiple service types at the same time (e.g. POP3 and POP3S), and
multi-homed services. It will also resolve the services if requested. Avahi
will ship a (very useful!) example tool zssh.c which,
if started from the command line, allows you to quickly browse for local SSH
servers and connect to one of those available. (There is a short Theora screencast of zssh;
please excuse the strange cursor, which seems to be a bug in Istanbul 0.2.1,
a tool that BTW is totally broken on multi-headed setups.)

A simple application making use of this dialog might look like this:

#include <gtk/gtk.h>
#include <avahi-ui/avahi-ui.h>

int main(int argc, char*argv[]) {
    GtkWidget *d;

    gtk_init(&argc, &argv);

    d = aui_service_dialog_new("Choose Web Service");
    aui_service_dialog_set_browse_service_types(AUI_SERVICE_DIALOG(d), "_http._tcp", "_https._tcp", NULL);

    if (gtk_dialog_run(GTK_DIALOG(d)) == GTK_RESPONSE_OK)
        g_message("Selected service name: %s; service type: %s; host name: %s; port: %u",
		aui_service_dialog_get_service_name(AUI_SERVICE_DIALOG(d)),
		aui_service_dialog_get_service_type(AUI_SERVICE_DIALOG(d)),
		aui_service_dialog_get_host_name(AUI_SERVICE_DIALOG(d)),
		aui_service_dialog_get_port(AUI_SERVICE_DIALOG(d)));
    else
        g_message("Canceled.");

    gtk_widget_destroy(d);

    return 0;
}

A more elaborate example is zssh.c. You may browse the full API online.

AuiServiceDialog is not perfect yet. It still lacks i18n and a11y
support. In addition it follows the HIG only very roughly. Patches welcome! I
am also very interested in feedback from more experienced GTK programmers,
since my experience with implementing GTK controls is rather limited. This is
my first GTK library which should really feel like a GTK API. So please, read
through the API and the implementation and send me your comments! Thank you!

If you want to integrate AuiServiceDialog into your application and
don’t want to wait for Avahi 0.6.18, just copy avahi-ui.h
and avahi-ui.c into your sources
and make sure to add avahi-client, avahi-glib and gtk+-2.0 to your pkg-config dependencies.
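
For reference, the compile line for an application built that way might look roughly like this (main.c and the output name are placeholders):

gcc -o service-chooser main.c avahi-ui.c \
    $(pkg-config --cflags --libs avahi-client avahi-glib gtk+-2.0)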