All posts by Lennart Poettering

libabc

2011-11-01 Lennart Poettering

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/libabc.html

At the Kernel Summit in Prague last week Kay Sievers and I led a session on
developing shared userspace libraries, for kernel hackers. More and more
userspace interfaces of the kernel (for example many which deal with storage,
audio, resource management, security, file systems or a number of other
subsystems) nowadays rely on a dedicated userspace component. As people who
work primarily in the plumbing layer of the Linux OS we noticed over and over
again that these libraries written by people who usually are at home on the
kernel side of things make the same mistakes repeatedly, thus making life for
the users of the libraries unnecessarily difficult. In our session we tried to
point out a number of these things, and in particular places where the usual
kernel hacking style translates badly into userspace shared library hacking.
Our hope is that maybe a few kernel developers have a look at our list of
recommendations and consider the points we are raising.

To make things easy we have put together an example skeleton library we
dubbed libabc, whose README
file includes all our points in terse form. It’s available on kernel.org:

The git repository and the README.

This list of recommendations draws inspiration from David Zeuthen’s and
Ulrich Drepper’s well known papers on the topic of writing shared libraries. In
the README linked above we try to distill this wealth of information into a
terse list of recommendations, with a couple of additions and with a strict
focus on a kernel hacker background.

Please have a look, and even if you are not a kernel hacker there might be
something useful to know in it, especially if you work on the lower layers of
our stack.

If you have any questions or additions, just ping us, or comment below!

Prague

2011-10-23 Lennart Poettering

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/linuxcon-europe.html

If you make it to Prague the coming week for the LinuxCon/ELCE/GStreamer/Kernel Summit/… superconference, make sure not to miss:

* The Linux Audio BoF with numerous Linux audio hackers, 5pm, on Sunday (23rd, i.e. today).
* Latest developments in PulseAudio by Arun Raghavan. 4pm, on Tuesday, GStreamer Summit
* Linux Kernel Developer Panel, a shared session of LinuxCon and ELCE. Panelists are Linus Torvalds, Alan Cox, Thomas Gleixner and Paul McKenney. Moderated by yours truly. 9:30am, on Wednesday
* systemd Administration in the Enterprise by Kay Sievers and yours truly. 4:15pm, on Wednesday, LinuxCon
* Integrating systemd: Booting Userspace in Less Than 1 Second by Koen Kooi. 11:15am, on Friday, ELCE

All of that at the Clarion Hotel. See you in Prague!

Plumbers Wishlist, The Second Edition

2011-10-20 Lennart Poettering

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/plumbers-wishlist-2.html

Two weeks ago we published a Plumber’s
Wishlist for Linux. So far, this has already created lively discussions in
the community (as reported on LWN among others), and patches for a few of the
items listed have already been posted (thanks a lot to those who worked on
this, your contributions are much appreciated!).

We have now prepared a second version of the wish list. It includes a number
of additions (tmpfs quota! hostname change notifications! and more!) and
updates to the previous items, including links to patches, and references to
other interesting material.

We hope to update this wishlist from time to time, so stay tuned!

And now, go and read the new wishlist!

Google doesn’t like my name

2011-10-17 Lennart Poettering

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/google-doesnt-like-my-name.html

Nice one, Google suspended my Google+ account because I created it under,
well, my name, which is “Lennart Poettering”, and Google+ thinks that wasn’t my
name, even though it says so in my passport, and almost every document I own
and I was never aware I had any other name. This is ridiculous. Google, give me
my name back! This is a really uncool move.

Your Questions for the Kernel Developer Panel at LinuxCon in Prague

2011-10-17 Lennart Poettering

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/kernel-hacker-panel.html

I am currently collecting questions for the kernel
developer panel at LinuxCon in Prague. If there’s something you’d like the
panelists to respond to, please post it on the
thread, and I’ll see what I can do. Thank you!

A Big Loss

2011-10-15 Lennart Poettering

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/a-big-loss.html

Google announced today that they’ll be shutting down Google Code Search in
January. I am quite sure that this would be a massive loss for the Free
Software community. The ability to learn from other people’s code is a key
idea of Free Software. There’s simply no better way to do that than with a
source code search engine. The day Google Code Search will be shut down will
be a sad day for the Free Software community.

Of course, there are a couple of alternatives around, but they all have one
thing in common: they, uh, don’t even remotely compare to the completeness,
performance and simplicity of the Google Code Search interface, and have
serious usability issues. (For example: koders.com is really really slow, and
splits up identifiers you search for at underscores, which kinda makes it
useless for looking for almost any kind of code.)

I think it must be of genuine interest to the Free Software community to
have a capable replacement for Google Code Search, for the day it is turned
off. In fact, it probably should be something the various foundations which
promote Free Software should be looking into, like the FSF or the Linux
Foundation. There are very few better ways to get Free Software into the heads
and minds of engineers than by examples — examples consisting of real life
code they can find with a source code search engine. I believe a source code
search engine is probably among the best vehicles to promote Free Software
towards engineers. In particular if it itself was Free Software (in contrast to
Google Code Search).

Ideally, all software available on web sites like SourceForge, Freshmeat, or
github should be indexed. But there’s also a chance for distributions here:
indexing the sources of all packages a distribution like Debian or Fedora
include would be a great tool for developers. In fact, a distribution offering
this functionality might benefit from such functionality, as it attracts
developer interest in the distribution.

It’s sad that Google Code Search will be gone soon. But maybe there’s
something positive in the bad news here, and a chance to create something better,
more comprehensive, that is free, and promotes our ideals better than Google
ever could. Maybe there’s a chance here for the Open Source foundations, for
the distributions and for the communities to create a better replacement!

Dresden, California, Poznan

2011-10-09 Lennart Poettering

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/photos/california.html

Hofkirche, Dresden, Saxony, Germany

Bastei, Saxon Switzerland, Saxony, Germany

Fürstenzug, Dresden, Saxony, Germany

Near California State Route 46, California, USA

Near Generals Highway, California, USA

Near Generals Highway, California, USA, a bit further down the road.

Parish Church in Poznan, Poland

A Plumber’s Wish List for Linux

2011-10-07 Lennart Poettering

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/plumbers-wishlist.html

Here’s a mail
we just sent to LKML, for your consideration. Enjoy:

Subject: A Plumber’s Wish List for Linux

We’d like to share our current wish list of plumbing layer features we
are hoping to see implemented in the near future in the Linux kernel and
associated tools. Some items we can implement on our own, others are not
our area of expertise, and we will need help getting them implemented.

Acknowledging that this wish list of ours only gets longer and not
shorter, even though we have implemented a number of other features on
our own in the previous years, we are posting this list here, in the
hope to find some help.

If you happen to be interested in working on something from this list or
able to help out, we’d be delighted. Please ping us in case you need
clarifications or more information on specific items.

Thanks,
Kay, Lennart, Harald, in the name of all the other plumbers


And here’s the wish list, in no particular order:

* (ioctl based?) interface to query and modify the label of a mounted
FAT volume:
A FAT label is implemented as a hidden directory entry in the file
system which needs to be renamed when changing the file system label;
this is impossible to do from userspace without unmounting. Hence we’d
like to see a kernel interface that is available on the mounted file
system mount point itself. Of course, bonus points if this new interface
can be implemented for other file systems as well, and also covers fs
UUIDs in addition to labels.

* CPU modaliases in /sys/devices/system/cpu/cpuX/modalias:
useful to allow module auto-loading of e.g. cpufreq drivers and KVM
modules. Andi Kleen has a patch to create the alias file itself. CPU
‘struct sysdev’ needs to be converted to ‘struct device’ and a ‘struct
bus_type cpu’ needs to be introduced to allow proper CPU coldplug event
replay at bootup. This is one of the last remaining places where
automatic hardware-triggered module auto-loading is not available. And
we’d like to see that fixed to make numerous ugly userspace work-arounds
to achieve the same go away.

* expose CAP_LAST_CAP somehow in the running kernel at runtime:
Userspace needs to know the highest valid capability of the running
kernel, which right now cannot reliably be retrieved from header files
only. The fact that this value cannot be detected properly right now
creates various problems for libraries compiled on newer header files
which are run on older kernels. They assume capabilities are available
which actually aren’t. Specifically, libcap-ng claims that all running
processes retain the higher capabilities in this case due to the
“inverted” semantics of CapBnd in /proc/$PID/status.

* export ‘struct device_type fb/fbcon’ of ‘struct class graphics’
Userspace wants to easily distinguish ‘fb’ and ‘fbcon’ from each other
without the need to match on the device name.

* allow changing argv[] of a process without mucking with environ[]:
Something like setproctitle() or a prctl() would be ideal. Of course it
is questionable if services like sendmail make use of this, but otoh for
services which fork but do not immediately exec() another binary being
able to rename these child processes in ps is of importance.

* module-init-tools: provide a proper libmodprobe.so from
module-init-tools:
Early boot tools, installers, driver install disks want to access
information about available modules to optimize bootup handling.

* fork throttling mechanism as basic cgroup functionality that is
available in all hierarchies independent of the controllers used:
This is important to implement race-free killing of all members of a
cgroup, so that cgroup member processes cannot fork faster than a cgroup
supervisor process could kill them. This needs to be recursive, so that
not only a cgroup but all its subgroups are covered as well.

* proper cgroup-is-empty notification interface:
The current call_usermodehelper() interface is an inefficient and
ugly hack. Tools would prefer anything more lightweight like a netlink,
poll() or fanotify interface.

* allow user xattrs to be set on files in the cgroupfs (and maybe
procfs?)

* simple, reliable and future-proof way to detect whether a specific pid
is running in a CLONE_NEWPID container, i.e. not in the root PID
namespace. Currently, a few ugly hacks are available to detect
this (for example a process wanting to know whether it is running in a
PID namespace could just look for a PID 2 being around and named
kthreadd which is a kernel thread only visible in the root namespace),
however all these solutions encode information and expectations that
better shouldn’t be encoded in a namespace test like this. This
functionality is needed in particular since the removal of the ns
cgroup controller which provided the namespace membership information to
user code.

* allow making use of the “cpu” cgroup controller by default without
breaking RT. Right now creating a cgroup in the “cpu” hierarchy that
shall be able to take advantage of RT is impossible for the generic case
since it needs an RT budget configured which is from a limited resource
pool. What we want is the ability to create cgroups in “cpu” whose
processes get a non-RT weight applied, but for RT take advantage of the
parent’s RT budget. We want the separation of RT and non-RT budget
assignment in the “cpu” hierarchy, because right now, you lose RT
functionality in it unless you assign an RT budget. This issue severely
limits the usefulness of “cpu” hierarchy on general purpose systems
right now.

* Add a timerslack cgroup controller, to allow increasing the timer
slack of user session cgroups when the machine is idle.

* An auxiliary meta data message for AF_UNIX called SCM_CGROUPS (or
something like that), i.e. a way to attach sender cgroup membership to
messages sent via AF_UNIX. This is useful in case services such as
syslog shall be shared among various containers (or service cgroups),
and the syslog implementation needs to be able to distinguish the
sending cgroup in order to separate the logs on disk. Of course
SCM_CREDENTIALS can be used to look up the PID of the sender followed by
a check in /proc/$PID/cgroup, but that is necessarily racy, and actually
a very real race in real life.

* SCM_COMM, with a similar use case as SCM_CGROUPS. This auxiliary
control message should carry the process name as available
in /proc/$PID/comm.
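
A few of the userspace work-arounds criticized in the list above can be sketched in a handful of lines of Python, which may help to see why they are unsatisfying. This is only an illustration under stated assumptions: all function names are made up for this example, the CapBnd guess only works for a process whose bounding set was never reduced, the PID 2 test is exactly the hack described in the namespace item, and the cgroup lookup is the racy SCM_CREDENTIALS plus /proc/$PID/cgroup combination from the SCM_CGROUPS item.

```python
import os
import socket
import struct

UCRED = struct.Struct("iII")  # layout of struct ucred: pid, uid, gid


def highest_cap_in_capbnd(status_text):
    """Guess the highest capability number from the CapBnd line of /proc/$PID/status.

    Only meaningful if the bounding set of that process was never reduced.
    """
    for line in status_text.splitlines():
        if line.startswith("CapBnd:"):
            mask = int(line.split(":", 1)[1].strip(), 16)
            # Index of the highest set bit, or None for an empty mask
            return mask.bit_length() - 1 if mask else None
    return None


def looks_like_root_pid_namespace(proc_root="/proc"):
    """The 'ugly hack': PID 2 named kthreadd is only visible in the root PID namespace."""
    try:
        with open(os.path.join(proc_root, "2", "comm")) as f:
            return f.read().strip() == "kthreadd"
    except OSError:
        return False


def recv_with_sender_pid(sock, bufsize=4096):
    """Receive one message from an AF_UNIX socket that has SO_PASSCRED enabled.

    Returns (data, sender_pid), with sender_pid None if no credentials arrived.
    """
    data, ancdata, _flags, _addr = sock.recvmsg(bufsize, socket.CMSG_SPACE(UCRED.size))
    for level, ctype, cdata in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_CREDENTIALS:
            pid, _uid, _gid = UCRED.unpack(cdata[:UCRED.size])
            return data, pid
    return data, None


def parse_cgroup_file(text):
    """Turn the content of /proc/$PID/cgroup into {controllers: cgroup path}."""
    result = {}
    for line in text.splitlines():
        if line.strip():
            _hierarchy_id, controllers, path = line.split(":", 2)
            result[controllers] = path
    return result


def sender_cgroups(pid):
    """Racy lookup: by now the sender may have exited, moved, or had its PID reused."""
    with open("/proc/%d/cgroup" % pid) as f:
        return parse_cgroup_file(f.read())
```

A syslog-like receiver would call recv_with_sender_pid() and then sender_cgroups() on the result. The window between those two calls is the race the wish list item wants to close.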

What You Need to Know When Becoming a Free Software Hacker

2011-10-06 Lennart Poettering

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/hinter-den-kulissen.html

Earlier today I gave a presentation at the Technical University Berlin about
things you need to know, things you should expect and things you shouldn’t
expect when you are aspiring to become a successful Free Software Hacker.

I have put my slides up on Google Docs in case you are interested, either
because you are the target audience (i.e. a university student) or because you
need inspiration for a similar talk about the same topic.

The first two slides are in German language, so skip over them. The
interesting bits are all in English. I hope it’s quite comprehensive (though of
course terse). Enjoy:

In case your feed reader/planet messes this up, here’s the non-embedded version.

Oh, and thanks to everybody who reviewed and suggested additions to the slides on Google+.

PulseAudio 1.0

2011-09-27 Lennart Poettering

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/pa-one-dot-zero.html

PulseAudio 1.0 is out now. It’s awesome. Get it while it is hot!

I’d like to thank Colin Guthrie and Arun Raghavan (and all the others involved) for getting this release out of the door!

systemd for Administrators, Part XI

2011-09-26 Lennart Poettering

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/inetd.html

Here’s the eleventh installment of my ongoing series on systemd for Administrators:

Converting inetd Services

In a previous episode of this series I covered how to convert a SysV
init script to a systemd unit file. In this story I hope to explain
how to convert inetd services into systemd units.

Let’s start with a bit of background. inetd has a long tradition as
one of the classic Unix services. As a superserver it listens on
an Internet socket on behalf of another service and then activates that
service on an incoming connection, thus implementing an on-demand
socket activation system. This allowed Unix machines with limited
resources to provide a large variety of services, without the need to
run processes and invest resources for all of them all of the
time. Over the years a number of independent implementations of inetd
have been shipped on Linux distributions, the most prominent being the
ones based on BSD inetd and xinetd. While inetd used to be installed
on most distributions by default, it nowadays is used only for very
few selected services and the common services are all run
unconditionally at boot, primarily for (perceived) performance
reasons.

One of the core features of systemd (and Apple’s launchd for that
matter) is socket activation, a scheme pioneered by inetd, however
back then with a different focus. Systemd-style socket activation focuses on
local sockets (AF_UNIX), not so much Internet sockets (AF_INET), even
though both are supported. And more importantly even, socket
activation in systemd is not primarily about the on-demand aspect that
was key in inetd, but more on increasing parallelization (socket
activation allows starting clients and servers of the socket at the
same time), simplicity (since the need to configure explicit
dependencies between services is removed) and robustness (since
services can be restarted or may crash without loss of connectivity of the
socket). However, systemd can also activate services on-demand when
connections are incoming, if configured that way.

Socket activation of any kind requires support in the services
themselves. systemd provides a very simple interface that services may
implement to provide socket activation, built around sd_listen_fds(). As such
it is already a very minimal, simple scheme. However, the
traditional inetd interface is even simpler. It allows passing only a
single socket to the activated service: the socket fd is simply
duplicated to STDIN and STDOUT of the process spawned, and that’s
already it. In order to provide compatibility systemd optionally
offers the same interface to processes, thus taking advantage of the
many services that already support inetd-style socket activation, but not yet
systemd’s native activation.
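
To make the difference between the two interfaces a bit more tangible, here is a minimal Python sketch of both sides. It is an illustration, not systemd's implementation: listen_fds() mimics what sd_listen_fds() does, relying on the LISTEN_PID and LISTEN_FDS environment variables and on passed descriptors starting at fd 3 (details of the native protocol that are not spelled out in this post), while serve_inetd_style() is a made-up echo service that simply treats STDIN and STDOUT as the connection.

```python
import os
import sys

SD_LISTEN_FDS_START = 3  # first passed fd in systemd's native protocol


def listen_fds(environ=None, pid=None):
    """Return the list of fds passed by the native protocol, like sd_listen_fds()."""
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    try:
        # The fds are only meant for us if LISTEN_PID matches our own PID
        if int(environ.get("LISTEN_PID", "")) != pid:
            return []
        count = int(environ.get("LISTEN_FDS", ""))
    except ValueError:
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


def serve_inetd_style(stdin=None, stdout=None):
    """inetd-style service: the connection socket is simply STDIN/STDOUT.

    This hypothetical service echoes every received line back to the client.
    """
    stdin = sys.stdin.buffer if stdin is None else stdin
    stdout = sys.stdout.buffer if stdout is None else stdout
    for line in stdin:
        stdout.write(line)
        stdout.flush()
```

The inetd-style variant needs no knowledge of sockets at all, which is why so many existing services already support it.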

Before we continue with a concrete example, let’s have a look at
three different schemes to make use of socket activation:

Socket activation for parallelization, simplicity,
robustness: sockets are bound during early boot and a singleton
service instance to serve all client requests is immediately started
at boot. This is useful for all services that are very likely used
frequently and continuously, and hence starting them early and in
parallel with the rest of the system is advisable. Examples: D-Bus,
Syslog.

On-demand socket activation for singleton services: sockets
are bound during early boot and a singleton service instance is
executed on incoming traffic. This is useful for services that are
seldom used, where it is advisable to save the resources and time at
boot and delay activation until they are actually needed. Example: CUPS.

On-demand socket activation for per-connection service
instances: sockets are bound during early boot and for each
incoming connection a new service instance is instantiated and the
connection socket (and not the listening one) is passed to it. This is
useful for services that are seldom used, and where performance is not
critical, i.e. where the cost of spawning a new service process for
each incoming connection is limited. Example: SSH.

The three schemes provide different performance characteristics. After
the service finishes starting up the performance provided by the first two
schemes is identical to a stand-alone service (i.e. one that is
started without a super-server, without socket activation), since the
listening socket is passed to the actual service, and code paths from
then on are identical to those of a stand-alone service and all
connections are processed exactly the same way as they are in a
stand-alone service. On the other hand, performance of the third scheme
is usually not as good: since for each connection a new service needs
to be started the resource cost is much higher. However, it also has a
number of advantages: for example client connections are better
isolated and it is easier to develop services activated this way.

For systemd primarily the first scheme is in focus, however the
other two schemes are supported as well. (In fact, the blog story in
which I covered the necessary code changes for systemd-style socket
activation was about a service of the second type, i.e. CUPS.) inetd
primarily focuses on the third scheme, however the second scheme is
supported too. (The first one isn’t. Presumably due to the focus on the
third scheme, inetd got its somewhat unfair reputation for being
“slow”.)

So much for the background, let’s cut to the beef now and show how an
inetd service can be integrated into systemd’s socket
activation. We’ll focus on SSH, a very common service that is widely
installed and used but on the vast majority of machines probably not
started more often than once an hour on average (and usually even much
less). SSH has supported inetd-style activation for a long time,
following the third scheme mentioned above. Since it is started only
every now and then and only with a limited number of connections at
the same time it is a very good candidate for this scheme as the extra
resource cost is negligible: if made socket-activatable SSH is
basically free as long as nobody uses it. And as soon as somebody logs
in via SSH it will be started and the moment he or she disconnects all
its resources are freed again. Let’s find out how to make SSH
socket-activatable in systemd taking advantage of the provided inetd
compatibility!

Here’s the configuration line used to hook up SSH with classic inetd:

ssh stream tcp nowait root /usr/sbin/sshd sshd -i

And the same as xinetd configuration fragment:

service ssh {
        socket_type = stream
        protocol = tcp
        wait = no
        user = root
        server = /usr/sbin/sshd
        server_args = -i
}

Most of this should be fairly easy to understand, as these two
fragments express very much the same information. The non-obvious
parts: the port number (22) is not configured in inetd configuration,
but indirectly via the service database in /etc/services: the
service name is used as lookup key in that database and translated to
a port number. This indirection via /etc/services has been
part of Unix tradition though has been getting more and more out of
fashion, and the newer xinetd hence optionally allows configuration
with explicit port numbers. The most interesting setting here is the
not very intuitively named nowait (resp. wait=no)
option. It configures whether a service is of the second
(wait) resp. third (nowait) scheme mentioned
above. Finally the -i switch is used to enable inetd mode in
SSH.
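
The indirection via /etc/services is easy to reproduce. The following Python sketch parses text in the usual /etc/services layout into a lookup table; parse_services() is a name made up for this illustration, and the sample layout (name, port/protocol, optional aliases, # comments) is assumed rather than taken from this post. In real code the standard library call socket.getservbyname("ssh", "tcp") performs the same lookup against the system database.

```python
def parse_services(text):
    """Parse /etc/services-style text into {(name, protocol): port}."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue  # not a "name port/protocol" entry
        port, protocol = fields[1].split("/", 1)
        if not port.isdigit():
            continue
        # The official name and all aliases map to the same port
        for name in [fields[0]] + fields[2:]:
            table.setdefault((name, protocol), int(port))
    return table
```

With such a table the service name from the inetd line, ssh, together with the protocol tcp, resolves to the port that is never written down in the inetd configuration itself.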

The systemd translation of these configuration fragments consists of the
following two units. First: sshd.socket is a unit encapsulating
information about a socket to listen on:

[Unit]
Description=SSH Socket for Per-Connection Servers

[Socket]
ListenStream=22
Accept=yes

[Install]
WantedBy=sockets.target

Most of this should be self-explanatory. A few notes:
Accept=yes corresponds to nowait. It’s hopefully
better named, referring to the fact that for nowait the
superserver calls accept() on the listening socket, where for
wait this is the job of the executed
service process. WantedBy=sockets.target is used to ensure that when
enabled this unit is activated at boot at the right time.
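
The question of who calls accept() can be sketched in a few lines of Python. This is a toy model under assumptions, not how systemd or inetd are implemented: make_listener(), dispatch() and the two service functions are hypothetical names, and both "services" just send one line and close.

```python
import socket


def make_listener(host="127.0.0.1", port=0):
    """Bind the listening socket up front, as the superserver would at boot."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(16)
    return listener


def per_connection_service(conn):
    """nowait / Accept=yes: the service receives an already accepted connection."""
    with conn:
        conn.sendall(b"handled by per-connection instance\n")


def singleton_service(listener):
    """wait / Accept=no: the service receives the listening socket and accepts itself."""
    conn, _peer = listener.accept()
    with conn:
        conn.sendall(b"handled by singleton service\n")


def dispatch(listener, accept_in_superserver):
    """Hand over one incoming connection according to the configured mode."""
    if accept_in_superserver:
        conn, _peer = listener.accept()  # the superserver calls accept()
        per_connection_service(conn)
    else:
        singleton_service(listener)      # the service calls accept()
```

In the first mode every connection gets its own short-lived handler, in the second mode the service owns the listening socket from then on.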

And here’s the matching service file sshd@.service:

[Unit]
Description=SSH Per-Connection Server

[Service]
ExecStart=-/usr/sbin/sshd -i
StandardInput=socket

This too should be mostly self-explanatory. Interesting is
StandardInput=socket, the option that enables inetd
compatibility for this service. StandardInput= may be used to
configure what STDIN of the service should be connected to
(see the man page for details). By setting it to socket we make sure
to pass the connection socket here, as expected in the simple inetd
interface. Note that we do not need to explicitly configure
StandardOutput= here, since by default the setting from
StandardInput= is inherited if nothing else is
configured. Important is the “-” in front of the binary name. This
ensures that the exit status of the per-connection sshd process is
forgotten by systemd. Normally, systemd will store the exit status of
all service instances that die abnormally. SSH will sometimes die
abnormally with an exit code of 1 or similar, and we want to make sure
that this doesn’t cause systemd to keep around information for
numerous previous connections that died this way (until this
information is forgotten with systemctl reset-failed).

sshd@.service is an instantiated service, as described in the preceding
installment of this series. For each incoming connection systemd
will instantiate a new instance of sshd@.service, with the
instance identifier named after the connection credentials.

You may wonder why in systemd configuration of an inetd service
requires two unit files instead of one. The reason for this is that to
simplify things we want to make sure that the relation between live
units and unit files is obvious, while at the same time we can order
the socket unit and the service units independently in the dependency
graph and control the units as independently as possible. (Think: this
allows you to shutdown the socket independently from the instances,
and each instance individually.)

Now, let’s see how this works in real life. If we drop these files
into /etc/systemd/system we are ready to enable the socket and
start it:

# systemctl enable sshd.socket
ln -s '/etc/systemd/system/sshd.socket' '/etc/systemd/system/sockets.target.wants/sshd.socket'
# systemctl start sshd.socket
# systemctl status sshd.socket
sshd.socket - SSH Socket for Per-Connection Servers
	  Loaded: loaded (/etc/systemd/system/sshd.socket; enabled)
	  Active: active (listening) since Mon, 26 Sep 2011 20:24:31 +0200; 14s ago
	Accepted: 0; Connected: 0
	  CGroup: name=systemd:/system/sshd.socket

This shows that the socket is listening, and so far no connections
have been made (Accepted: will show you how many connections
have been made in total since the socket was started,
Connected: how many connections are currently active.)

Now, let’s connect to this from two different hosts, and see which services are now active:

$ systemctl --full | grep ssh
[email protected]:22-172.31.0.4:47779.service  loaded active running       SSH Per-Connection Server
[email protected]:22-172.31.0.54:52985.service loaded active running       SSH Per-Connection Server
sshd.socket                                   loaded active listening     SSH Socket for Per-Connection Servers

As expected, there are now two service instances running, for the
two connections, and they are named after the source and destination
address of the TCP connection as well as the port numbers. (For
AF_UNIX sockets the instance identifier will carry the PID and UID of
the connecting client.) This allows us to individually introspect or
kill specific sshd instances, in case you want to terminate the
session of a specific client:

# systemctl kill [email protected]:22-172.31.0.4:47779.service

And that’s probably already most of what you need to know for
hooking up inetd services with systemd and how to use them afterwards.

In the case of SSH it is probably a good suggestion for most
distributions in order to save resources to default to this kind of
inetd-style socket activation, but provide a stand-alone unit file for
sshd as well which can be enabled optionally. I’ll soon file a
wishlist bug about this against our SSH package in Fedora.

A few final notes on how xinetd and systemd compare feature-wise,
and whether xinetd is fully obsoleted by systemd. The short answer
here is that systemd does not provide the full xinetd feature set and
that it does not fully obsolete xinetd. The longer answer is a bit
more complex: if you look at the multitude of options
xinetd provides you’ll notice that systemd does not compare. For
example, systemd does not come with built-in echo,
time, daytime or discard servers, and never
will include those. TCPMUX is not supported, and neither are RPC
services. However, you will also find that most of these are either
irrelevant on today’s Internet or have otherwise gone out of fashion. The
vast majority of inetd services do not directly take advantage of
these additional features. In fact, none of the xinetd services
shipped on Fedora make use of these options. That said, there are a
couple of useful features that systemd does not support, for example
IP ACL management. However, most administrators will probably agree
that firewalls are the better solution for these kinds of problems and
on top of that, systemd supports ACL management via tcpwrap for those
who indulge in retro technologies like this. On the other hand systemd
also provides numerous features xinetd does not provide,
starting with the individual control of instances shown above, or the
more expressive configurability of the execution
context for the instances. I believe that what systemd provides is
quite comprehensive, comes with little legacy cruft but should provide
you with everything you need. And if there’s something systemd does
not cover, xinetd will always be there to fill the void as
you can easily run it in conjunction with systemd. For the
majority of uses systemd should cover what is necessary, and allows
you to cut down on the components required to build your system. In
a way, systemd brings back the functionality of classic Unix inetd and
turns it again into a center piece of a Linux system.

And that’s all for now. Thanks for reading this long piece. And
now, get going and convert your services over! Even better, do this
work in the individual packages upstream or in your distribution!

systemd for Administrators, Part X

2011-09-26 Lennart Poettering

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/instances.html

Here’s the tenth installment of my ongoing series on systemd for Administrators:

Instantiated Services

Most services on Linux/Unix are singleton services: there’s
usually only one instance of Syslog, Postfix, or Apache running on a
specific system at the same time. On the other hand some select
services may run in multiple instances on the same host. For example,
an Internet service like the Dovecot IMAP service could run in
multiple instances on different IP ports or different local IP
addresses. A more common example that exists on all installations is
getty, the mini service that runs once for each TTY and
presents a login prompt on it. On most systems this service is
instantiated once for each of the first six virtual consoles
tty1 to tty6. On some servers depending on
administrator configuration or boot-time parameters an additional
getty is instantiated for a serial or virtualizer console. Another
common instantiated service in the systemd world is fsck, the
file system checker that is instantiated once for each block device
that needs to be checked. Finally, in systemd socket activated
per-connection services (think classic inetd!) are also implemented
via instantiated services: a new instance is created for each incoming
connection. In this installment I hope to explain a bit how systemd
implements instantiated services and how to take advantage of them as
an administrator.

If you followed the previous episodes of this series you are
probably aware that services in systemd are named according to the
pattern foobar.service, where foobar is an
identification string for the service, and .service simply a
fixed suffix that is identical for all service units. The definition files
for these services are searched for in /etc/systemd/system
and /lib/systemd/system (and possibly other directories) under this name. For
instantiated services this pattern is extended a bit: the service name becomes
foobar@quux.service where foobar is the
common service identifier, and quux the instance
identifier. Example: serial-getty@ttyS2.service is the serial
getty service instantiated for ttyS2.

Service instances can be created dynamically as needed. Without
further configuration you may easily start a new getty on a serial
port simply by invoking a systemctl start command for the new
instance:

# systemctl start serial-getty@ttyUSB0.service

If a command like the above is run, systemd will first look for a
unit configuration file by the exact name you requested. If this
service file is not found (and usually it isn’t if you use
instantiated services like this) then the instance id is removed from
the name and a unit configuration file by the resulting
template name is searched for. In other words, in the above example,
if the precise serial-getty@ttyUSB0.service unit file cannot
be found, serial-getty@.service is loaded instead. This unit
template file will hence be common for all instances of this
service. For the serial getty we ship a template unit file in systemd
(/lib/systemd/system/serial-getty@.service) that looks
something like this:

[Unit]
Description=Serial Getty on %I
BindTo=dev-%i.device
After=dev-%i.device systemd-user-sessions.service

[Service]
ExecStart=-/sbin/agetty -s %I 115200,38400,9600
Restart=always
RestartSec=0

(Note that the unit template file we actually ship along with
systemd for the serial gettys is a bit longer. If you are interested,
have a look at the actual
file which includes additional directives for compatibility with
SysV, to clear the screen and remove previous users from the TTY
device. To keep things simple I have shortened the unit file to the
relevant lines here.)

This file looks mostly like any other unit file, with one
distinction: the specifiers %I and %i are used at
multiple locations. At unit load time %I and %i are
replaced by systemd with the instance identifier of the service. In
our example above, if a service is instantiated as
serial-getty@ttyUSB0.service the specifiers %I and
%i will be replaced by ttyUSB0. If you introspect
the instantiated unit with systemctl status serial-getty@ttyUSB0.service you will see these replacements
having taken place:

$ systemctl status serial-getty@ttyUSB0.service
serial-getty@ttyUSB0.service - Getty on ttyUSB0
	  Loaded: loaded (/lib/systemd/system/serial-getty@.service; static)
	  Active: active (running) since Mon, 26 Sep 2011 04:20:44 +0200; 2s ago
	Main PID: 5443 (agetty)
	  CGroup: name=systemd:/system/serial-getty@.service/ttyUSB0
		  └ 5443 /sbin/agetty -s ttyUSB0 115200,38400,9600
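The lookup order described above can be sketched in a few lines of Python. This is only an illustration of the naming rule, not how systemd itself is implemented; the function names candidate_unit_names and expand_raw_instance are made up for this example, and only the raw %i specifier is handled here.

```python
from typing import List


def candidate_unit_names(unit_name: str) -> List[str]:
    """Return the unit file names to try, in lookup order."""
    prefix, at, rest = unit_name.partition("@")
    if not at:
        # Plain singleton unit such as foobar.service: only one candidate.
        return [unit_name]
    # The instance id may itself contain dots, so split off the suffix
    # at the last dot.
    instance, dot, suffix = rest.rpartition(".")
    if not dot:
        raise ValueError(f"{unit_name!r} has no unit type suffix")
    if not instance:
        raise ValueError(f"{unit_name!r} is a template and needs an instance id")
    # First the exact name, then the template with the instance id removed.
    return [unit_name, f"{prefix}@.{suffix}"]


def expand_raw_instance(text: str, unit_name: str) -> str:
    """Replace %i in text with the instance id exactly as written."""
    rest = unit_name.partition("@")[2]
    instance = rest.rpartition(".")[0]
    return text.replace("%i", instance)


print(candidate_unit_names("serial-getty@ttyUSB0.service"))
print(expand_raw_instance("BindTo=dev-%i.device", "serial-getty@ttyUSB0.service"))
```

Passing a bare template name such as serial-getty@.service raises an error in this sketch, mirroring the rule that a template cannot be started without an instance identifier.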

And that is already the core idea of instantiated services in
systemd. As you can see systemd provides a very simple templating
system, which can be used to dynamically instantiate services as
needed. To make effective use of this, a few more notes:

You may instantiate these services on-the-fly in
.wants/ symbolic links in the file system. For example, to
make sure the serial getty on ttyUSB0 is started
automatically at every boot, create a symlink like this:

# ln -s /lib/systemd/system/serial-getty@.service /etc/systemd/system/getty.target.wants/serial-getty@ttyUSB0.service

systemd will instantiate the symlinked unit file with the
instance name specified in the symlink name.

You cannot instantiate a unit template without specifying an
instance identifier. In other words systemctl start serial-getty@.service will necessarily fail since the instance
name was left unspecified.

Sometimes it is useful to opt-out of the generic template
for one specific instance. For these cases make use of the fact that
systemd always searches first for the full instance file name before
falling back to the template file name: make sure to place a unit file
under the fully instantiated name in /etc/systemd/system and
it will override the generic templated version for this specific
instance.

The unit file shown above uses %i at some places and
%I at others. You may wonder what the difference between
these specifiers is. %i is replaced by the exact characters
of the instance identifier. For %I on the other hand the
instance identifier is first passed through a simple unescaping
algorithm. In the case of a simple instance identifier like
ttyUSB0 there is no effective difference. However, if the
device name includes one or more slashes (“/”) this cannot be
part of a unit name (or Unix file name). Before such a device name can
be used as instance identifier it needs to be escaped so that “/”
becomes “-” and most other special characters (including “-”) are
replaced by “\xAB” where AB is the ASCII code of the character in
hexadecimal notation [1]. Example: to refer to a USB serial port by its
bus path we want to use a port name like
serial/by-path/pci-0000:00:1d.0-usb-0:1.4:1.1-port0. The
escaped version of this name is
serial-by\x2dpath-pci\x2d0000:00:1d.0\x2dusb\x2d0:1.4:1.1\x2dport0. %I
will then refer to the former, %i to the latter. Effectively this
means %i is useful wherever it is necessary to refer to other
units, for example to express additional dependencies. On the other
hand %I is useful for usage in command lines, or inclusion in
pretty description strings. Let’s check how this looks with the above unit file:

# systemctl start 'serial-getty@serial-by\x2dpath-pci\x2d0000:00:1d.0\x2dusb\x2d0:1.4:1.1\x2dport0.service'
# systemctl status 'serial-getty@serial-by\x2dpath-pci\x2d0000:00:1d.0\x2dusb\x2d0:1.4:1.1\x2dport0.service'
serial-getty@serial-by\x2dpath-pci\x2d0000:00:1d.0\x2dusb\x2d0:1.4:1.1\x2dport0.service - Serial Getty on serial/by-path/pci-0000:00:1d.0-usb-0:1.4:1.1-port0
	  Loaded: loaded (/lib/systemd/system/serial-getty@.service; static)
	  Active: active (running) since Mon, 26 Sep 2011 05:08:52 +0200; 1s ago
	Main PID: 5788 (agetty)
	  CGroup: name=systemd:/system/serial-getty@.service/serial-by\x2dpath-pci\x2d0000:00:1d.0\x2dusb\x2d0:1.4:1.1\x2dport0
		  └ 5788 /sbin/agetty -s serial/by-path/pci-0000:00:1d.0-usb-0:1.4:1.1-port0 115200,38400,9600

As we can see, while the instance identifier is the escaped
string, the command line and the description string actually use the
unescaped version, as expected.
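To make the relationship between %i and %I a bit more tangible, here is a minimal Python sketch of the escaping and unescaping described above. It is not systemd's actual code. The set of characters left untouched (ASCII letters, digits, “:”, “_” and “.”) is an assumption chosen so that the bus path example above round-trips, and the function names are made up for this illustration.

```python
import re
import string

# Assumption: characters passed through unchanged. The example in the text
# shows that ":" and "." survive; "_" is a guess.
_SAFE = set(string.ascii_letters + string.digits + ":_.")


def escape_instance(name: str) -> str:
    """Turn a device name into a string usable as instance identifier."""
    if not name.isascii():
        raise ValueError("this sketch only handles ASCII names")
    out = []
    for ch in name:
        if ch == "/":
            out.append("-")                    # "/" becomes "-"
        elif ch in _SAFE:
            out.append(ch)                     # kept as-is
        else:
            out.append("\\x%02x" % ord(ch))    # e.g. "-" becomes \x2d
    return "".join(out)


def unescape_instance(escaped: str) -> str:
    """Reverse escape_instance(); roughly what %I yields from %i."""
    def replace(match):
        if match.group(0) == "-":
            return "/"
        return chr(int(match.group(1), 16))
    return re.sub(r"\\x([0-9a-fA-F]{2})|-", replace, escaped)


path = "serial/by-path/pci-0000:00:1d.0-usb-0:1.4:1.1-port0"
print(escape_instance(path))
print(unescape_instance(escape_instance(path)))
```

In this picture %i corresponds to the output of escape_instance() and %I to the result of feeding that back through unescape_instance().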

(Side note: there are more specifiers available than just
%i and %I, and many of them are actually
available in all unit files, not just templates for service
instances. For more details see the man
page which includes a full list and terse explanations.)

And at this point this shall be all for now. Stay tuned for a
follow-up article on how instantiated services are used for
inetd-style socket activation.

Footnotes

[1] Yupp, this escaping algorithm doesn’t really result in
particularly pretty escaped strings, but then again, most escaping
algorithms don’t help readability. The algorithm we used here is
inspired by what udev does in a similar case, with one change. In the
end, we had to pick something. If you plan to comment on the
escaping algorithm please also mention where you live so that I can
come around and paint your bike shed yellow with blue stripes. Thanks!

Boot/Init LPC MC Summary at LWN

2011-09-17 Lennart Poettering

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/lwn-lpc-2011.html

Make sure to read the summary of the Boot & Init
Microconf at the Linux Plumbers Conference 2011 in Santa Rosa, CA. It was a
fantastic conference (at the social event we took buses from the appetizers to
the mains…), and this summary should give you quite a good idea what
we discussed there. Highly recommended read.

systemd US Tour Dates

2011-09-01 Lennart Poettering

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/us-tour-dates.html

Kay Sievers, Harald Hoyer and I will tour the US in the next weeks. If you
have any questions on systemd, udev or dracut (or any of the related
technologies), then please do get in touch with us on the following occasions:

Linux Plumbers Conference, Santa Rosa, CA, Sep 7-9th
Google, Googleplex, Mountain View, CA, Sep 12th
Red Hat, Westford, MA, Sep 13-14th

As usual LPC is going to rock, so make sure to be there!

How to Write syslog Daemons Which Cooperate Nicely With systemd

2011-08-31 Lennart Poettering

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/syslog.html

I just finished putting together a text on the systemd wiki explaining what
to do to write a syslog service that is nicely integrated with systemd, and
does all the right things. It’s supposed to be a checklist for all syslog
hackers:

Read it now.

rsyslog already implements everything on this list afaics, and that’s
pretty cool. If other implementations want to catch up, please consider
following these recommendations, too.

I put this together since I have changed systemd 35 to set
StandardOutput=syslog as default, so that all stdout/stderr of all
services automatically ends up in syslog. And since that change requires some
(minimal) changes to all syslog implementations I decided to document this all
properly (if you are curious: they need to set StandardOutput=null to
opt out of this default in order to avoid logging loops).

Anyway, please have a peek and comment if you spot a mistake or
something I forgot. Or if you have questions, just ask.

How to Behave Nicely in the cgroup Trees

2011-08-19 Lennart Poettering

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/pax-cgroups.html

The Linux cgroup hierarchies of the various kernel controllers are a shared
resource. Recently many components of Linux userspace started making use of these
hierarchies. To keep the various programs from stepping on each other’s
toes while manipulating this shared resource we have put together a list of
recommendations. Programs following these guidelines should work together
nicely without interfering with other users of the hierarchies.

These guidelines are available in the systemd wiki. I’d be very interested in
feedback, and would like to ask you to ping me in case we forgot something or left something too vague.

And please, if you are writing software that interfaces with the cgroup
tree consider following these recommendations. Thank you.

The Desktop Summit Wiki Is Full Of Interesting Stuff

2011-08-02 Lennart Poettering

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/ds-wiki.html

Just wanted to draw your attention to the Desktop Summit Wiki. If you
are attending the Desktop Summit in Berlin you might find some
interesting information in the Wiki.

If you are arriving by plane and want to share a ride (even
S-Bahn trains/bus) from either of the two airports, consider adding your name to this list. It’s
still a bit empty (since I just set it up 3min ago) but that’ll hopefully
change quickly.
Some information on getting around in Berlin (i.e. which public transport tickets to buy)
Where to get a SIM card for your phone
Some sights to see
Where to get wasted
Where to eat

Go to the main page of
the Wiki here. You are welcome to edit and add additional information to
the Wiki. To edit the Wiki authenticate with the same credentials you used to
sign up for the conference at the Desktop Summit web site.

See you on Friday!

Desktop Summit Announcements, Part II

2011-08-01 Lennart Poettering

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/desktop-summit-announce2.html

Read the first part of the announcements.

And now there are more exciting announcements:

The Panel on Copyright Assignment has been announced, featuring SUSE’s Michael Meeks,
Canonical’s Mark Shuttleworth and Bradley Kuhn from the Software Freedom Conservancy. This
session will be moderated by GNOME’s Karen Sandler.
The fifth and final keynote interview has been published, with Nick
Richards from Intel.
The conference attendee policy has been published.

Only 5 days are now left until the beginning of the conference. The first event
will already take place on Friday August 5th, at c-base at U/S Jannowitzbrücke,
starting at 4pm. The conference programme itself will begin on Saturday August
6th, 10am (though do come earlier, for registration, if you didn’t register at
the c-base event already!). Note that the primary entrance to the Desktop
Summit is in the north-eastern corner of the main building of Humboldt
University. That’s on Dorotheenstr./Hegelplatz, and not on Unter den
Linden.

See you on Friday at c-base!

Desktop Summit Announcements

2011-07-22 Lennart Poettering

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/desktop-summit-announce.html

In case you missed them, there have been a couple of exciting announcements
around the Desktop Summit in Berlin, Germany.

The three keynotes have been announced.
Interviews with the keynote speakers have been published: Thomas Thwaites, Claire Rowland, Dirk Hohndel, Stuart Jarvis.
The Desktop Summit T-Shirt design has been announced.
The Desktop Summit social events have been announced. One is on an island! In the river Spree! In summer! In Berlin! How awesome is that?
The BoF and workshop schedule has been published.

And there will be more exciting announcements coming!

See you in 14 days! Oh, and if you still haven’t registered, do so now. It’s free, and if you don’t register you might not get on the WLAN at the conference right away.

systemd for Administrators, Part IX

2011-07-18 Lennart Poettering

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/on-etc-sysinit.html

Here’s the ninth installment of my ongoing series on systemd for Administrators:

On /etc/sysconfig and /etc/default

So, here’s a bit of an opinion piece on the /etc/sysconfig/ and
/etc/default directories that exist on the various distributions in
one form or another, and why I believe their use should be faded out. Like
everything I say on this blog what follows is just my personal opinion, and not
the gospel and has nothing to do with the position of the Fedora project or my
employer. The topic of /etc/sysconfig has been coming up in
discussions over and over again. I hope with this blog story I can explain a
bit what we as systemd upstream think about these files.

A few lines about the historical context: I wasn’t around when
/etc/sysconfig was introduced; suffice it to say it has been around on Red Hat
and SUSE distributions for a long, long time. Eventually /etc/default was
introduced on Debian with very similar semantics. Many other distributions know
a directory with similar semantics too; most of them call it one or the
other. In fact, even other Unix-OSes sported a directory like this. (Such
as SCO. If you are interested in the details, I am sure a Unix greybeard of
your trust can fill in what I am leaving vague here.) So, even though a
directory like this has been known widely on Linuxes and Unixes, it never has
been standardized, neither in POSIX nor in LSB/FHS. These directories very much
are something where distributions distinguish themselves from each other.

The semantics of /etc/default and /etc/sysconfig are only very
loosely defined. What almost all files stored in these directories have in common
though is that they are sourceable shell scripts which primarily consist of
environment variable assignments. Most of the files in these directories are
sourced by the SysV init scripts of the same name. The Debian
Policy Manual (9.3.2) and the Fedora Packaging
Guidelines suggest this use of the directories, however both distributions
also have files in them that do not follow this scheme, i.e. that do not have a
matching SysV init script, or are not even shell scripts at all.

Why have these files been introduced? On SysV systems services are started
via init scripts in /etc/rc.d/init.d (or a similar directory).
/etc/ is (these days) considered the place where system configuration
is stored. Originally these init scripts were subject to customization by the
administrator. But as they grew and became complex most distributions no longer
considered them true configuration files, but more just a special kind of program.
To make customization easy and guarantee a safe upgrade path the customizable
bits hence have been moved to separate configuration files, which the init
scripts then source.

Let’s have a quick look at what kind of configuration you can do with these
files. Here’s a short, non-comprehensive list of various things that can be
configured via environment settings in these source files, which I found browsing
through the directories on a Fedora and a Debian machine:

Additional command line parameters for the daemon binaries
Locale settings for a daemon
Shutdown time-out for a daemon
Shutdown mode for a daemon
System configuration like system locale, time zone information, console keyboard
Redundant system configuration, like whether the RTC is in local timezone
Firewall configuration data, not in shell format (!)
CPU affinity for a daemon
Settings unrelated to boot, for example including information how to install a new kernel package, how to configure nspluginwrap or whether to do library prelinking
Whether a specific service should be started or not
Networking configuration
Which kernel modules to statically load
Whether to halt or power-off on shutdown
Access modes for device nodes (!)
A description string for the SysV service (!)
The user/group ID, umask to run specific daemons as
Resource limits to set for a specific daemon
OOM adjustment to set for a specific daemon

Now, let’s go where the beef is: what’s wrong with /etc/sysconfig
(resp. /etc/default)? Why might it make sense to fade out use of these
files in a systemd world?

For the majority of these files the reason for having them simply does not
exist anymore: systemd unit files are not programs like SysV init scripts
were. Unit files are simple, declarative descriptions, that usually do not consist of more
than 6 lines or so. They can easily be generated, parsed without a Bourne
interpreter and understood by the reader. Also, they are very easy to modify:
just copy them from /lib/systemd/system to
/etc/systemd/system and edit them there, where they will not be
modified by the package manager. The need to separate code and configuration
that was the original reason to introduce these files does not exist anymore,
as systemd unit files do not include code. These files hence now are a solution
looking for a problem that no longer exists.
They are inherently distribution-specific. With systemd we hope to encourage
standardization between distributions. Part of this is that we want unit files to be
supplied by upstream, and not just added by the packager, as has usually
been done in the SysV world. Since the location of the directory and the
available variables in the files are very different on each distribution,
supporting /etc/sysconfig files in upstream unit files is not
feasible. Configuration stored in these files works against de-balkanization of
the Linux platform.
Many settings are fully redundant in a systemd world. For example, various
services support configuration of the process credentials like the user/group
ID, resource limits, CPU affinity or the OOM adjustment settings. However, these settings are
supported only by some SysV init scripts, and often have different names if
supported in multiple of them. OTOH in systemd, all these settings are
available equally and uniformly for all services, with the same configuration
option in unit files.
Unit files know a large number of easy-to-use process context settings,
that are more comprehensive than what most /etc/sysconfig files offer.
A number of these settings are entirely questionable. For example, the
aforementioned configuration option for the user/group ID a service runs as is
primarily something the distributor has to take care of. There is little for
administrators to gain by changing these settings, and only the distributor has the
broad overview to make sure that UID/GID and name collisions do not
happen.
The file format is not ideal. Since the files are usually sourced as shell
scripts, parse errors are very hard to decipher and are not logged alongside the
other configuration problems of the services. Generally, unknown variable
assignments simply have no effect but this is not warned about. This makes
these files harder to debug than necessary.
Configuration files sourced from shell scripts are subject to the execution
parameters of the interpreter, and it has many: settings like IFS or LANG tend
to modify drastically how shell scripts are parsed and understood. This makes
them fragile.
Interpretation of these files is slow, since it requires spawning of a
shell, which adds at least one process for each service to be spawned at boot.
Often, files in /etc/sysconfig are used to “fake” configuration
files for daemons which do not support configuration files natively. This is
done by glueing together command line arguments from these variable assignments
that are then passed to the daemon. In general proper, native configuration
files in these daemons are the much prettier solution however. Command line
options like “-k”, “-a” or “-f” are not self-explanatory and have a very
cryptic syntax. Moreover the same switches in many daemons have (due to the
limited vocabulary) often very much contradicting effects. (On one daemon
-f might cause the daemon to daemonize, while on another one this
option turns exactly this behaviour off.) Command lines generally cannot include
sensible comments which most configuration files however can.
A number of configuration settings in /etc/sysconfig are entirely
redundant: for example, on many distributions it can be controlled via
/etc/sysconfig files whether the RTC is in UTC or local time. Such an
option already exists however in the 3rd line of the /etc/adjtime file
(which is known on all distributions). Adding a second, redundant,
distribution-specific option overriding this is hence needless and complicates
things for no benefit.
Many of the configuration settings in /etc/sysconfig allow
disabling services. By this they basically become a second level of
enabling/disabling over what the init system already offers: when a service is
enabled with systemctl enable or chkconfig on these settings
override this, and turn the daemon off even though the init system was
configured to start it. This of course is very confusing to the
user/administrator, and brings virtually no benefit.
For options like the configuration of static kernel modules to load: there
are nowadays usually much better ways to load kernel modules at boot. For
example, most modules may now be autoloaded by udev when the right hardware is
found. This goes very far, and even includes ACPI and other high-level
technologies. One of the very few exceptions where we currently do not do
kernel module autoloading is CPU feature and model based autoloading which
however will be supported soon too. And even if your specific module cannot be
auto-loaded there’s usually a better way to statically load it, for example by
sticking it in /etc/modules-load.d so that the administrator can check
a standardized place for all statically loaded modules.
Last but not least, /etc already is intended to be the place for system
configuration (“Host-specific system configuration” according to FHS). A
subdirectory beneath it called sysconfig to place system configuration
in is hence entirely redundant, already on the language level.
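As a small illustration of the RTC point above: reading the UTC/local setting directly from /etc/adjtime needs no distribution-specific file at all. The following Python sketch assumes the usual three-line layout of that file, where the third line reads either UTC or LOCAL. The function name is made up for this example, and falling back to UTC when the line is missing is an assumption of the sketch.

```python
def rtc_in_local_time(adjtime_text: str) -> bool:
    """Return True if the third line of the adjtime contents says LOCAL."""
    lines = adjtime_text.splitlines()
    if len(lines) < 3:
        # Assumption: treat a missing or short file as "RTC is in UTC".
        return False
    return lines[2].strip() == "LOCAL"


def read_rtc_setting(path: str = "/etc/adjtime") -> bool:
    """Read the file if it exists and evaluate its third line."""
    try:
        with open(path, encoding="ascii") as f:
            return rtc_in_local_time(f.read())
    except FileNotFoundError:
        return False


print(rtc_in_local_time("0.0 0 0.0\n0\nLOCAL\n"))
```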

What to use instead? Here are a few recommendations of what to do with these
files in the long run in a systemd world:

Just drop them without replacement. If they are fully redundant (like the
local/UTC RTC setting) this should be a relatively easy way out (well,
ignoring the need for compatibility). If systemd natively supports an
equivalent option in the unit files there is no need to duplicate these
settings in sysconfig files. For a list of execution options you may
set for a service check out the respective man pages: systemd.exec(5)
and systemd.service(5).
If your setting simply adds another layer where a service can be disabled,
remove it to keep things simple. There’s no need to have multiple ways to
disable a service.
Find a better place for them. For configuration of the system locale or
system timezone we hope to gently push distributions into the right direction,
for more details see previous
episode of this series.
Turn these settings into native settings of the daemon. If necessary add
support for reading native configuration files to the daemon. Thankfully, most
of the stuff we run on Linux is Free Software, so this can relatively easily be
done.

Of course, there’s one very good reason for supporting these files for a bit
longer: compatibility for upgrades. But that is really the only one I could
come up with. It’s reason enough to keep compatibility for a while, but I think
it is a good idea to phase out usage of these files at least in new packages.

If compatibility is important, then systemd will still allow you to read
these configuration files even if you otherwise use native systemd unit files.
If your sysconfig file only knows simple options,
EnvironmentFile=-/etc/sysconfig/foobar may be used to import the
settings into the environment and use them to put together command lines. (See systemd.exec(5) for more information about this option.) If
you need a programming language to make sense of these settings, then use a
programming language like shell. For example, place a short shell script in
/usr/lib/<your package>/ which reads these files for
compatibility, and then exec’s the actual daemon binary. Then spawn
this script instead of the actual daemon binary with ExecStart= in the
unit file.
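Such a compatibility wrapper does not have to be written in shell. Here is a minimal sketch in Python instead, which also avoids handing the file to a shell interpreter. Everything specific in it is hypothetical: the file /etc/sysconfig/foobar, the variable FOOBAR_OPTIONS and the daemon path /usr/sbin/foobard merely stand in for whatever your package uses. It only understands plain KEY=VALUE lines and comments, and, unlike a sourced shell fragment, it reports unknown variables.

```python
import os
import shlex
import sys
from typing import Dict, List

KNOWN_KEYS = {"FOOBAR_OPTIONS"}  # hypothetical variable name


def parse_simple_env_file(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines; blank lines and # comments are skipped."""
    settings = {}
    for lineno, line in enumerate(text.splitlines(), 1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"line {lineno}: expected KEY=VALUE")
        # shlex removes surrounding quotes the way a shell would.
        settings[key.strip()] = " ".join(shlex.split(value))
    return settings


def unknown_keys(settings: Dict[str, str]) -> List[str]:
    """List variables the wrapper does not understand."""
    return sorted(set(settings) - KNOWN_KEYS)


def build_argv(settings: Dict[str, str], daemon: str) -> List[str]:
    """Put together the daemon command line from the parsed settings."""
    return [daemon] + shlex.split(settings.get("FOOBAR_OPTIONS", ""))


def main() -> None:
    path = "/etc/sysconfig/foobar"  # hypothetical compatibility file
    text = ""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            text = f.read()
    settings = parse_simple_env_file(text)
    for key in unknown_keys(settings):
        print(f"warning: ignoring unknown setting {key}", file=sys.stderr)
    argv = build_argv(settings, "/usr/sbin/foobard")  # hypothetical daemon
    os.execv(argv[0], argv)  # replace the wrapper with the daemon itself
```

A real wrapper would end by calling main(), and the unit file would point ExecStart= at the wrapper rather than at the daemon binary.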

And this is all for now. Thank you very much
for your interest.

Noise