Authenticated Boot and Disk Encryption on Linux

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/authenticated-boot-and-disk-encryption-on-linux.html

The Strange State of Authenticated Boot and Disk Encryption on Generic Linux Distributions

TL;DR: Linux has been supporting Full Disk Encryption (FDE) and
technologies such as UEFI SecureBoot and TPMs for a long
time. However, the way they are set up by most distributions is not as
secure as it should be, and in some ways quite frankly weird. In
fact, right now, your data is probably more secure if stored on
current ChromeOS, Android, Windows or MacOS devices, than it is on
typical Linux distributions.

Generic Linux distributions (i.e. Debian, Fedora, Ubuntu, …) adopted
Full Disk Encryption (FDE) more than 15 years ago, with the
LUKS/cryptsetup infrastructure. It was a big step forward to a more
secure environment. Almost ten years ago the big distributions started
adding UEFI SecureBoot to their boot process. Support for Trusted
Platform Modules (TPMs) has been added to the distributions a long
time ago as well — but even though many PCs/laptops these days have
TPM chips on-board it’s generally not used in the default setup of
generic Linux distributions.

How these technologies currently fit together on generic Linux
distributions doesn’t really make too much sense to me — and falls
short of what they could actually deliver. In this story I’d like to
have a closer look at why I think that, and what I propose to do about it.

The Basic Technologies

Let’s have a closer look at what these technologies actually deliver:

  1. LUKS/dm-crypt/cryptsetup provide disk encryption, and optionally
    data authentication. Disk encryption means that reading the data in
    clear-text form is only possible if you possess a secret of some
    form, usually a password/passphrase. Data authentication means that
    no one can make changes to the data on disk unless they possess a
    secret of some form. Most distributions only enable the former
    though — the latter is a more recent addition to LUKS/cryptsetup,
    and is not used by default on most distributions (though it
    probably should be). Closely related to LUKS/dm-crypt is
    dm-verity (which can authenticate immutable volumes) and
    dm-integrity (which can authenticate writable volumes, among
    other things).

  2. UEFI SecureBoot provides mechanisms for authenticating boot loaders
    and other pre-OS binaries before they are invoked. If those boot
    loaders then authenticate the next step of booting in a similar
    fashion there’s a chain of trust which can ensure that only code
    that has some level of trust associated with it will run on the
    system. Authentication of boot loaders is done via cryptographic
    signatures: the OS/boot loader vendors cryptographically sign their
    boot loader binaries. The cryptographic certificates that may be
    used to validate these signatures are then signed by Microsoft, and
    since Microsoft’s certificates are basically built into all of
    today’s PCs and laptops this will provide some basic trust chain:
    if you want to modify the boot loader of a system you must have
    access to the private key used to sign the code (or to the private
    keys further up the certificate chain).

  3. TPMs do many things. For this text we’ll focus on one facet: they can
    be used to protect secrets (for example for use in disk encryption,
    see above), that are released only if the code that booted the host
    can be authenticated in some form. This works roughly like this:
    every component that is used during the boot process (i.e. code,
    certificates, configuration, …) is hashed with a cryptographic hash
    function before it is used. The resulting hash is written to some
    small volatile memory the TPM maintains that is write-only (the so
    called Platform Configuration Registers, “PCRs”): each step of the
    boot process will write hashes of the resources needed by the next
    part of the boot process into these PCRs. The PCRs cannot be
    written freely: the hashes written are combined with what is
    already stored in the PCRs (also through hashing), and the result
    of that then replaces the previous value. Effectively this means:
    only if every component involved in the boot matches expectations
    do the hash values exposed in the TPM PCRs match the expected
    values too. And if you then use those values to unlock the secrets you
    want to protect you can guarantee that the key is only released to
    the OS if the expected OS and configuration is booted. The process
    of hashing the components of the boot process and writing that to
    the TPM PCRs is called “measuring”. What’s also important to
    mention is that the secrets are not only protected by these PCR
    values but encrypted with a “seed key” that is generated on the TPM
    chip itself, and cannot leave the TPM (at least so goes the
    theory). The idea is that you cannot read out a TPM’s seed key, and
    thus you cannot duplicate the chip: unless you possess the
    original, physical chip you cannot retrieve the secret it might be
    able to unlock for you. Finally, TPMs can enforce a limit on unlock
    attempts per time (“anti-hammering”): this makes it hard to brute
    force things: if you can only execute a certain number of unlock
    attempts within some specific time then brute forcing will be
    prohibitively slow.
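
To make the “measuring” logic a bit more tangible, here is a minimal Python sketch of how a PCR value evolves as components are measured. It only simulates the hash chaining in software: the component names are placeholders, the choice of SHA-256 is an assumption, and a real TPM does this inside the chip.

```python
import hashlib

PCR_SIZE = 32  # size of one register in a SHA-256 PCR bank


def extend(pcr: bytes, component: bytes) -> bytes:
    """Return the new PCR value after measuring `component`.

    The old value is never overwritten directly: it is hashed together
    with the digest of the new component.
    """
    digest = hashlib.sha256(component).digest()
    return hashlib.sha256(pcr + digest).digest()


def measure_boot(components: list[bytes]) -> bytes:
    """Measure all components in order, starting from a zeroed register."""
    pcr = bytes(PCR_SIZE)
    for component in components:
        pcr = extend(pcr, component)
    return pcr


# Placeholder byte strings standing in for the real boot components.
expected = measure_boot([b"shim", b"boot loader", b"kernel", b"initrd"])
tampered = measure_boot([b"shim", b"boot loader", b"kernel", b"initrd+backdoor"])

print("expected PCR:", expected.hex())
print("tampered PCR:", tampered.hex())
print("secret would be released:", expected == tampered)
```

Since every value depends on everything measured before it, changing (or reordering) a single component yields a different final value, and a secret bound to the expected value stays locked.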

How Linux Distributions use these Technologies

As mentioned already, Linux distributions adopted the first two
of these technologies widely, the third one not so much.

So typically, here’s how the boot process of Linux distributions works
these days:

  1. The UEFI firmware invokes a piece of code called “shim” (which is
    stored in the EFI System Partition — the “ESP” — of your system),
    that more or less is just a list of certificates compiled into code
    form. The shim is signed with the aforementioned Microsoft key,
    that is built into all PCs/laptops. This list of certificates then
    can be used to validate the next step of the boot process. The shim
    is measured by the firmware into the TPM. (Well, the shim can do a
    bit more than what I describe here, but this is outside of the
    focus of this article.)

  2. The shim then invokes a boot loader (often Grub) that is signed by
    a private key owned by the distribution vendor. The boot loader is
    stored in the ESP as well, plus some other places (i.e. possibly a
    separate boot partition). The corresponding certificate is included
    in the list of certificates built into the shim. The boot loader
    components are also measured into the TPM.

  3. The boot loader then invokes the kernel and passes it an initial
    RAM disk image (initrd), which contains initial userspace code. The
    kernel itself is signed by the distribution vendor too. It’s also
    validated via the shim. The initrd is not validated, though
    (!). The kernel is measured into the TPM, the initrd sometimes too.

  4. The kernel unpacks the initrd image, and invokes what is contained
    in it. Typically, the initrd then asks the user for a password for
    the encrypted root file system. The initrd then uses that to set up
    the encrypted volume. No code authentication or TPM measurements
    take place.

  5. The initrd then transitions into the root file system. No code
    authentication or TPM measurements take place.

  6. When the OS itself is up the user is prompted for their user name,
    and their password. If correct, this will unlock the user account:
    the system is now ready to use. At this point no code
    authentication, no TPM measurements take place. Moreover, the
    user’s password is not used to unlock any data, it’s used only to
    allow or deny the login attempt — the user’s data has already been
    decrypted a long time ago, by the initrd, as mentioned above.

What you’ll notice here of course is that code validation happens for
the shim, the boot loader and the kernel, but not for the initrd or
the main OS code anymore. TPM measurements might go one step further:
the initrd is measured sometimes too, if you are lucky. Moreover, you
might notice that the disk encryption password and the user password
are inquired by code that is not validated, and is thus not safe from
external manipulation. You might also notice that even though TPM
measurements of boot loader/OS components are done nothing actually
ever makes use of the resulting PCRs in the typical setup.
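
The gap can be spelled out with a small Python model of the boot chain described above. This is just a sketch that encodes the list as data; the `Stage` type and the helper names are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stage:
    name: str
    authenticated: bool  # is the code verified before it runs?
    measured: str        # "yes", "sometimes" or "no"


# The typical boot chain of a generic distribution, as described above.
TYPICAL_BOOT = [
    Stage("shim", True, "yes"),
    Stage("boot loader", True, "yes"),
    Stage("kernel", True, "yes"),
    Stage("initrd", False, "sometimes"),
    Stage("root file system", False, "no"),
]


def trusted_prefix(chain: list[Stage]) -> list[str]:
    """Return the stages that run before the first unauthenticated one."""
    prefix = []
    for stage in chain:
        if not stage.authenticated:
            break
        prefix.append(stage.name)
    return prefix


def first_gap(chain: list[Stage]) -> Optional[str]:
    """Return the first stage that is not authenticated, or None."""
    for stage in chain:
        if not stage.authenticated:
            return stage.name
    return None


print("trust chain covers:", trusted_prefix(TYPICAL_BOOT))
print("first unauthenticated stage:", first_gap(TYPICAL_BOOT))
```

Everything after the first gap runs with full privileges but without any verification, which is why the initrd is the interesting target.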

Attack Scenarios

Of course, before determining whether the setup described above makes
sense or not, one should have an idea what one actually intends to
protect against.

The most basic attack scenario to focus on is probably that you want
to be reasonably sure that if someone steals your laptop that contains
all your data then this data remains confidential. The model described
above probably delivers that to some degree: the full disk encryption
when used with a reasonably strong password should make it hard for
the laptop thief to access the data. The data is as secure as the
password used is strong. The attacker might attempt to brute force the
password, thus if the password is not chosen carefully the attacker
might be successful.

Two more interesting attack scenarios go something like this:

  1. Instead of stealing your laptop the attacker takes the harddisk
    from your laptop while you aren’t watching (e.g. while you went for
    a walk and left it at home or in your hotel room), makes a copy of
    it, and then puts it back. You’ll never notice they did that. The
    attacker then analyzes the data in their lab, maybe trying to brute
    force the password. In this scenario you won’t even know that your
    data is at risk, because for you nothing changed — unlike in the
    basic scenario above. If the attacker manages to break your
    password they have full access to the data included on it,
    i.e. everything you so far stored on it, but not necessarily on
    what you are going to store on it later. This scenario is worse
    than the basic one mentioned above, for the simple fact that you
    won’t know that you might be attacked. (This scenario could be
    extended further: maybe the attacker has a chance to watch you type
    in your password or so, effectively lowering the password strength.)

  2. Instead of stealing your laptop the attacker takes the harddisk
    from your laptop while you aren’t watching, inserts backdoor code
    on it, and puts it back. In this scenario you won’t know your data
    is at risk, because physically everything is as before. What’s
    really bad though is that the attacker gets access to anything you
    do on your laptop, both the data already on it, and whatever you
    will do in the future.

I think in particular this backdoor attack scenario is something we
should be concerned about. We know for a fact that attacks like that
happen all the time (Pegasus, industry espionage, …), hence we should
make them hard.

Are we Safe?

So, does the scheme so far implemented by generic Linux distributions
protect us against the latter two scenarios? Unfortunately not at
all. Because distributions set up disk encryption the way they do, and
only bind it to a user password, an attacker can easily duplicate the
disk, and then attempt to brute force your password. What’s worse:
since code authentication ends at the kernel — and the initrd is not
authenticated anymore —, backdooring is trivially easy: an attacker
can change the initrd any way they want, without having to fight any
kind of protections. And given that FDE unlocking is implemented in
the initrd, and it’s the initrd that asks for the encryption password
things are just too easy: an attacker could trivially easily insert
some code that picks up the FDE password as you type it in and send it
wherever they want. And not just that: since once they are in they are
in, they can do anything they like for the rest of the system’s
lifecycle, with full privileges — including installing backdoors for
versions of the OS or kernel that are installed on the device in the
future, so that their backdoor remains open for as long as they like.

That is sad of course. It’s particularly sad given that the other
popular OSes all address this much better. ChromeOS, Android, Windows
and MacOS all have way better built-in protections against attacks
like this. And it’s why one can certainly claim that your data is
probably better protected right now if you store it on those OSes than
it is on generic Linux distributions.

(Yeah, I know that there are some niche distros which do this better,
and some hackers hack their own. But I care about general purpose
distros here, i.e. the big ones, that most people base their work on.)

Note that there are more problems with the current setup. For example,
it’s really weird that during boot the user is queried for an FDE
password which actually protects their data, and then once the system
is up they are queried again – now asking for a username, and another
password. And the weird thing is that this second authentication that
appears to be user-focused doesn’t really protect the user’s data
anymore — at that moment the data is already unlocked and
accessible. The username/password query is supposed to be useful in
multi-user scenarios of course, but how does that make any sense,
given that these multiple users would all have to know a disk
encryption password that unlocks the whole thing during the FDE step,
and thus they have access to every user’s data anyway if they make an
offline copy of the harddisk?

Can we do better?

Of course we can, and that is what this story is actually supposed to
be about.

Let’s first figure out what the minimal issues we should fix are (at
least in my humble opinion):

  1. The initrd must be authenticated before being booted into. (And
    measured unconditionally.)

  2. The OS binary resources (i.e. /usr/) must be authenticated before
    being booted into. (But don’t need to be encrypted, since everyone
    has the same anyway, there’s nothing to hide here.)

  3. The OS configuration and state (i.e. /etc/ and /var/) must be
    encrypted, and authenticated before they are used. The encryption
    key should be bound to the TPM device; i.e. system data should be
    locked to a security concept belonging to the system, not the user.

  4. The user’s home directory (i.e. /home/lennart/ and similar) must
    be encrypted and authenticated. The unlocking key should be bound
    to a user password or user security token (FIDO2 or PKCS#11 token);
    i.e. user data should be locked to a security concept belonging to
    the user, not the system.

Or to summarize this differently:

  1. Every single component of the boot
    process and OS needs to be authenticated, i.e. all of shim (done),
    boot loader (done), kernel (done), initrd (missing so far), OS binary
    resources (missing so far), OS configuration and state (missing so
    far), the user’s home (missing so far).

  2. Encryption is necessary for the OS configuration and state (bound
    to TPM), and for the user’s home directory (bound to a user
    password or user security token).

In Detail

Let’s see how we can achieve the above in more detail.

How to Authenticate the initrd

At the moment initrds are generated on the installed host via scripts
(dracut and similar) that try to figure out a minimal set of binaries
and configuration data to build an initrd that contains just enough to
be able to find and set up the root file system. What is included in
the initrd hence depends highly on the individual installation and its
configuration. Pretty likely no two initrds generated that way will be
fully identical due to this. This model clearly has benefits: the
initrds generated this way are very small and minimal, and support
exactly what is necessary for the system to boot, and not less or
more. It comes with serious drawbacks too though: the generation
process is fragile and sometimes more akin to black magic than
following clear rules: the generator script natively has to understand
a myriad of storage stacks to determine what needs to be included and
what not. It also means that authenticating the image is hard: given
that each individual host gets a different specialized initrd, it
means we cannot just sign the initrd with the vendor key like we sign
the kernel. If we want to keep this design we’d have to figure out
some other mechanism (e.g. a per-host signature key – that is
generated locally; or by authenticating it with a message
authentication code bound to the TPM). While these approaches are
certainly thinkable, I am not convinced they actually are a good idea
though: locally and dynamically generated per-host initrds is
something we probably should move away from.

If we move away from locally generated initrds, things become a lot
simpler. If the distribution vendor generates the initrds on their
build systems then it can be attached to the kernel image itself, and
thus be signed and measured along with the kernel image, without any
further work. This simplicity is simply lovely. Besides robustness and
reproducibility this gives us an easy route to authenticated initrds.

But of course, nothing is really that simple: working with
vendor-generated initrds means that we can’t adjust them anymore to
the specifics of the individual host: if we pre-build the initrds and
include them in the kernel image in immutable fashion then it becomes
harder to support complex, more exotic storage or to parameterize it
with local network server information, credentials, passwords, and so
on. Now, for my simple laptop use-case these things don’t matter,
there’s no need to extend/parameterize things, laptops and their
setups are not that wildly different. But what to do about the cases
where we want both: extensibility to cover for less common storage
subsystems (iscsi, LVM, multipath, drivers for exotic hardware…) and

Here’s a proposal how to achieve that: let’s build a basic initrd into
the kernel as suggested, but then do two things to make this scheme
both extensible and parameterizable, without compromising security.

  1. Let’s define a way how the basic initrd can be extended with
    additional files, which are stored in separate “extension
    images”. The basic initrd should be able to discover these extension
    images, authenticate them and then activate them, thus extending
    the initrd with additional resources on-the-fly.

  2. Let’s define a way how we can safely pass additional parameters to
    the kernel/initrd (and actually the rest of the OS, too) in an
    authenticated (and possibly encrypted) fashion. Parameters in this
    context can be anything specific to the local installation,
    i.e. server information, security credentials, certificates, SSH
    server keys, or even just the root password that shall be able to
    unlock the root account in the initrd …

In such a scheme we should be able to deliver everything we are
looking for:

  1. We’ll have a full trust chain for the code: the boot loader will
    authenticate and measure the kernel and basic initrd. The initrd
    extension images will then be authenticated by the basic initrd.

  2. We’ll have authentication for all the parameters passed to the initrd.

This so far sounds very unspecific? Let’s make it more specific by
looking closer at the components I’d suggest to be used for this

  1. For a few months now, the systemd suite has contained a subsystem
    implementing system extensions (v248). System extensions are
    ultimately just disk images (for example a squashfs file system in
    a GPT envelope) that can extend an underlying OS tree. Extending
    in this regard means they simply add additional files and
    directories into the OS tree, i.e. below /usr/. For a longer
    explanation see systemd-sysext(8). When a system extension is
    activated it is simply mounted and then merged into the main /usr/
    tree via a read-only overlayfs mount. Now what’s particularly nice
    about them in the context we are talking about here is that the
    extension images may carry dm-verity authentication data, and
    PKCS#7 signatures (once the pending work is merged, that is,
    i.e. v250).

  2. The systemd suite also contains a concept called service
    “credentials”. These are small pieces of information passed to
    services in a secure way. One key feature of these credentials is
    that they can be encrypted and authenticated in a very simple way
    with a key bound to the TPM (v250). See the systemd credentials
    documentation for details. They are great for safely storing SSL
    private keys and similar on your system, but they also come in
    handy for parameterizing
    initrds: an encrypted credential is just a file that can only be
    decoded if the right TPM is around with the right PCR values set.

  3. The systemd suite contains a component called systemd-stub(7).
    It’s an EFI stub, i.e. a small piece of code that is attached to a
    kernel image, and turns the kernel image into a regular EFI binary
    that can be directly executed by the firmware (or a boot
    loader). This stub has a number of nice features (for example, it
    can show a boot splash before invoking the Linux kernel itself and
    such). Once this work is merged (v250) the stub will support one
    more feature: it will automatically search for system extension
    image files and credential files next to the kernel image file,
    measure them and pass them on to the main initrd of the host.

Putting this together we have a nice way to provide fully authenticated
kernel images, initrd images and initrd extension images; as well as
encrypted and authenticated parameters via the credentials logic.

How would a distribution actually make use of this? A distribution
vendor would pre-build the basic initrd, and glue it into the kernel
image, and sign that as a whole. Then, for each supposed extension of
the basic initrd (e.g. one for iscsi support, one for LVM, one for
multipath, …), the vendor would use a tool such as
mkosi to build an extension image,
i.e. a GPT disk image containing the files in squashfs format, a
Verity partition that authenticates it, plus a PKCS#7 signature
partition that validates the root hash for the dm-verity partition,
and that can be checked against a key provided by the boot loader or
main initrd. Then, any parameters for the initrd will be encrypted
using systemd-creds encrypt. The resulting encrypted credentials and
the initrd extension images are
then simply placed next to the kernel image in the ESP (or boot
partition). Done.

This checks all boxes: everything is authenticated and measured, the
credentials also encrypted. Things remain extensible and modular, can
be pre-built by the vendor, and installation is as simple as dropping
in one file for each extension and/or credential.
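
The “drop in one file” model can be illustrated with a short Python sketch that looks for sidecar files next to a kernel image and hashes each one, the way a measuring step would. This is not how systemd-stub is implemented: the flat directory layout, the `.raw` and `.cred` suffixes and all file names are assumptions made for this example.

```python
import hashlib
import tempfile
from pathlib import Path

# Assumed naming convention, for this sketch only.
SIDECAR_KINDS = {".cred": "credential", ".raw": "extension image"}


def discover_sidecars(kernel_image: Path) -> list[tuple[str, str, str]]:
    """Return (file name, kind, SHA-256 hex digest) for every sidecar
    file found in the same directory as the kernel image."""
    found = []
    # Sorting gives a stable order, so the sequence of measurements is
    # reproducible from one boot to the next.
    for path in sorted(kernel_image.parent.iterdir()):
        kind = SIDECAR_KINDS.get(path.suffix)
        if kind is None or not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        found.append((path.name, kind, digest))
    return found


# Demo with a throwaway directory standing in for the ESP.
with tempfile.TemporaryDirectory() as tmp:
    esp = Path(tmp)
    kernel = esp / "linux.efi"
    kernel.write_bytes(b"kernel + basic initrd")
    (esp / "iscsi.raw").write_bytes(b"extension image payload")
    (esp / "rootpw.cred").write_bytes(b"encrypted credential blob")
    (esp / "notes.txt").write_bytes(b"unrelated file, ignored")
    for name, kind, digest in discover_sidecars(kernel):
        print(f"{kind:16} {name:12} sha256={digest[:16]}")
```

The digests computed here are what would be fed into the PCRs; authenticating the extension images and decrypting the credentials happens later, in the initrd.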

How to Authenticate the Binary OS Resources

Let’s now have a look at how to authenticate the Binary OS resources,
i.e. the stuff you find in /usr/, i.e. the stuff traditionally
shipped to the user’s system via RPMs or DEBs.

I think there are three relevant ways to authenticate this:

  1. Make /usr/ a dm-verity volume. dm-verity is a concept
    implemented in the Linux kernel that provides authenticity to
    read-only block devices: every read access is cryptographically
    verified against a top-level hash value. This top-level
    hash is typically a 256bit value that you can either encode in the
    kernel image you are using, or cryptographically sign (which is
    particularly nice once the signature support mentioned above is
    merged). I think this is actually the best approach since it makes
    the /usr/ tree entirely immutable in a very simple way. However,
    this also means that the whole of /usr/ needs to be updated at
    once, i.e. the traditional rpm/apt based update logic cannot work
    in this mode.

  2. Make /usr/ a dm-integrity volume. dm-integrity is a concept
    provided by the Linux kernel that offers integrity guarantees to
    writable block devices, i.e. in some ways it can be considered to be
    a bit like dm-verity while permitting write access. It can be
    used in three ways, one of which I think is particularly relevant
    here. The first way is with a simple hash function in “stand-alone”
    mode: this is not too interesting here, it just provides greater
    data safety for file systems that don’t hash check their files’ data
    on their own. The second way is in combination with dm-crypt,
    i.e. with disk encryption. In this case it adds authenticity to
    confidentiality: only if you know the right secret you can read and
    make changes to the data, and any attempt to make changes without
    knowing this secret key will be detected as IO error on next read
    by those in possession of the secret (more about this below). The
    third way is the one I think is most interesting here: in
    “stand-alone” mode, but with a keyed hash function
    (e.g. HMAC). What’s this good for? This provides authenticity
    without encryption: if you make changes to the disk without knowing
    the secret this will be noticed on the next read attempt of the
    data and result in IO errors. This mode provides what we want
    (authenticity) and doesn’t do what we don’t need (encryption). Of
    course, the secret key for the HMAC must be provided somehow, I
    think ideally by the TPM.

  3. Make /usr/ a dm-crypt (LUKS) + dm-integrity volume. This
    provides both authenticity and encryption. The latter isn’t
    typically needed for /usr/ given that it generally contains no
    secret data: anyone can download the binaries off the Internet
    anyway, and the sources too. By encrypting this you’ll waste CPU
    cycles, but beyond that it doesn’t hurt much. (Admittedly, some
    people might want to hide the precise set of packages they have
    installed, since it of course does reveal a bit of information
    about you: i.e. what you are working on, maybe what your job is –
    think: if you are a hacker you have hacking tools installed – and
    similar). Going this way might simplify things in some cases, as it
    means you don’t have to distinguish “OS binary resources” (i.e.
    /usr/) and “OS configuration and state” (i.e. /etc/ + /var/,
    see below), and just make it the same volume. Here too, the secret
    key must be provided somehow, I think ideally by the TPM.
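
To illustrate the first option: the idea of verifying every read against a single top-level hash can be sketched in Python as a hash tree over fixed-size blocks. This is a simplification for illustration only. It uses a plain binary tree without salt, which is not the on-disk format dm-verity actually uses, and the block size and helper names are assumptions.

```python
import hashlib

BLOCK_SIZE = 4096


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(image: bytes) -> list[list[bytes]]:
    """Return all levels of the hash tree: leaf hashes first, root last.

    `image` must not be empty.
    """
    blocks = [image[i:i + BLOCK_SIZE] for i in range(0, len(image), BLOCK_SIZE)]
    level = [h(block) for block in blocks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # pair an odd last node with itself
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def verify_block(block: bytes, index: int, levels: list[list[bytes]], root: bytes) -> bool:
    """Check one block against the trusted root hash.

    The stored tree levels are untrusted; only `root` is. The hash is
    recomputed from the block upwards, using stored sibling hashes.
    """
    node = h(block)
    for level in levels[:-1]:
        sibling_index = index ^ 1
        sibling = level[sibling_index] if sibling_index < len(level) else node
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


# Five blocks with different contents, standing in for a /usr/ image.
image = b"".join(bytes([i]) * BLOCK_SIZE for i in range(5))
levels = build_tree(image)
root = levels[-1][0]  # this is the value that gets signed or embedded

block = image[2 * BLOCK_SIZE:3 * BLOCK_SIZE]
print("original block verifies:", verify_block(block, 2, levels, root))
print("modified block verifies:", verify_block(b"\xff" + block[1:], 2, levels, root))
```

Only the root hash needs to be trusted. The tree itself can live on the same untrusted disk, since any change to a block or to a stored hash makes the recomputed root differ.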

All three approaches are valid. The first approach has my primary
sympathies, but for distributions not willing to abandon client-side
updates via RPM/dpkg this is not an option, in which case I would
propose the other two approaches for these cases.
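
For the keyed-hash variant, here is a toy Python sketch of what “authenticity without encryption” means for a writable device: every sector carries an HMAC tag, and a read fails if the tag doesn’t match. The class, the sector size and the decision to mix the sector number into the tag are assumptions of this sketch, not a description of dm-integrity’s actual on-disk layout.

```python
import hashlib
import hmac

SECTOR_SIZE = 512


class AuthenticatedDisk:
    """Toy block device keeping one HMAC-SHA256 tag per sector."""

    def __init__(self, key: bytes, sectors: int):
        self._key = key
        self.data = [bytes(SECTOR_SIZE) for _ in range(sectors)]
        self.tags = [self._tag(i, self.data[i]) for i in range(sectors)]

    def _tag(self, index: int, payload: bytes) -> bytes:
        # Mixing in the sector number keeps a valid sector from being
        # copied to a different location (a choice made for this sketch).
        message = index.to_bytes(8, "little") + payload
        return hmac.new(self._key, message, hashlib.sha256).digest()

    def write(self, index: int, payload: bytes) -> None:
        if len(payload) != SECTOR_SIZE:
            raise ValueError("payload must be exactly one sector")
        self.data[index] = payload
        self.tags[index] = self._tag(index, payload)

    def read(self, index: int) -> bytes:
        payload = self.data[index]
        if not hmac.compare_digest(self.tags[index], self._tag(index, payload)):
            raise OSError(f"integrity check failed for sector {index}")
        return payload


# The key is a placeholder; in the scheme discussed here it would be
# provided by the TPM.
disk = AuthenticatedDisk(key=b"placeholder key", sectors=4)
disk.write(1, b"A" * SECTOR_SIZE)
print("clean read ok:", disk.read(1) == b"A" * SECTOR_SIZE)

# An offline attacker edits the raw data without knowing the key.
disk.data[1] = b"B" * SECTOR_SIZE
try:
    disk.read(1)
except OSError as err:
    print("detected:", err)
```

The data stays readable by anyone with access to the disk, but nobody without the key can produce a matching tag, so modifications surface as I/O errors on the next read.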

The LUKS encryption key (and in case of dm-integrity standalone mode
the key for the keyed hash function) should be bound to the TPM. Why
the TPM for this? You could also use a user password, a FIDO2 or
PKCS#11 security token — but I think TPM is the right choice: why
that? To reduce the requirement for repeated authentication, i.e. that
you first have to provide the disk encryption password, and then you
have to login, providing another password. It should be possible that
the system boots up unattended and then only one authentication prompt
is needed to unlock the user’s data properly. The TPM provides a way
to do this in a reasonably safe and fully unattended way. Also, when
we stop considering just the laptop use-case for a moment: on servers
interactive disk encryption prompts don’t make much sense — the fact
that TPMs can provide secrets without this requiring user interaction
and thus the ability to work in entirely unattended environments is
quite desirable. Note that LUKS volume unlocking
as implemented by systemd (v248) provides native support for
authentication via password, via TPM2, via PKCS#11 or via FIDO2, so
the choice is ultimately all yours.

How to Encrypt/Authenticate OS Configuration and State

Let’s now look at the OS configuration and state, i.e. the stuff in
/etc/ and /var/. It probably makes sense to not consider these two
hierarchies independently but instead just consider this to be the
root file system. If the OS binary resources are in a separate file
system it is then mounted onto the /usr/ sub-directory of the root
file system.

The OS configuration and state (or: root file system) should be both
encrypted and authenticated: it might contain secret keys, user
passwords, privileged logs and similar. This data matters and contains
plenty of data that should remain confidential.

The encryption of choice here is dm-crypt (LUKS) + dm-integrity
similar as discussed above, again with the key bound to the TPM.

If the OS binary resources are protected the same way it is safe to
merge these two volumes and have a single partition for both (see above).

How to Encrypt/Authenticate the User’s Home Directory

The data in the user’s home directory should be encrypted, and bound
to the user’s preferred token of authentication (i.e. a password or
FIDO2/PKCS#11 security token). As mentioned, in the traditional mode
of operation the user’s home directory is not individually encrypted,
but only encrypted because FDE is in use. The encryption key for that
is a system wide key though, not a per-user key. And I think that’s a
problem, as mentioned (and probably not even generally understood by
our users). We should correct that and ensure that the user’s password
is what unlocks the user’s data.

In the systemd suite we provide a service, systemd-homed
(v245), that implements this in a safe way: each user gets its own LUKS
volume stored in a loopback file in /home/, and this is enough to
synthesize a user account. The encryption password for this volume is
the user’s account password, thus it’s really the password provided at
login time that unlocks the user’s data. systemd-homed also supports
other mechanisms of authentication, in particular PKCS#11/FIDO2
security tokens. It also provides support for other storage back-ends
(such as fscrypt), but I’d always suggest to use the LUKS back-end
since it’s the only one providing the comprehensive confidentiality
guarantees one wants for a UNIX-style home directory.

Note that there’s one special caveat here: if the user’s home
directory (e.g. /home/lennart/) is encrypted and authenticated, what
about the file system this data is stored on, i.e. /home/ itself? If
that dir is part of the root file system this would result in
double encryption: first the data is encrypted with the TPM root file
system key, and then again with the per-user key. Such double
encryption is a waste of resources, and unnecessary. I’d thus suggest
to make /home/ its own dm-integrity volume with a HMAC, keyed by
the TPM. This means the data stored directly in /home/ will be
authenticated but not encrypted. That’s good not only for performance,
but also has practical benefits: it allows extracting the encrypted
volume of the various users in case the TPM key is lost, as a way to
recover from dead laptops or similar.

Why authenticate /home/, if it only contains per-user home
directories that are authenticated on their own anyway? That’s a
valid question: it’s because the kernel file system maintainers made
clear that Linux file system code is not considered safe against rogue
disk images, and is not tested for that; this means before you mount
anything you need to establish trust in some way because otherwise
there’s a risk that the act of mounting might exploit your kernel.

Summary of Resources and their Protections

So, let’s now put this all together. Here’s a table showing the
various resources we deal with, and how I think they should be
protected (in my idealized world).

| Resource | Needs Authentication | Needs Encryption | Suggested Technology | Validation/Encryption Keys/Certificates acquired via | Stored where |
|---|---|---|---|---|---|
| Shim | yes | no | SecureBoot signature verification | firmware certificate database | ESP |
| Boot loader | yes | no | ditto | firmware certificate database/shim | ESP/boot partition |
| Kernel | yes | no | ditto | ditto | ditto |
| initrd | yes | no | ditto | ditto | ditto |
| initrd parameters | yes | yes | systemd TPM encrypted credentials | TPM | ditto |
| initrd extensions | yes | no | systemd-sysext with Verity+PKCS#7 signatures | firmware/initrd certificate database | ditto |
| OS binary resources | yes | no | dm-verity | root hash linked into kernel image, or firmware/initrd certificate database | top-level partition |
| OS configuration and state | yes | yes | dm-crypt (LUKS) + dm-integrity | TPM | top-level partition |
| /home/ itself | yes | no | dm-integrity with HMAC | TPM | top-level partition |
| User home directories | yes | yes | dm-crypt (LUKS) + dm-integrity in loopback files | User password/FIDO2/PKCS#11 security token | loopback file inside /home partition |

This should provide all the desired guarantees: everything is
authenticated, and the individualized per-host or per-user data
is also encrypted. No double encryption takes place. The encryption
keys/verification certificates are stored/bound to the most appropriate

Does this address the three attack scenarios mentioned earlier? I
think so, yes. The basic attack scenario I described is addressed by
the fact that /var/, /etc/ and /home/*/ are encrypted. Brute
forcing the former two is harder than in the status quo ante model,
since a high entropy key is used instead of one derived from a user
provided password. Moreover, the “anti-hammering” logic of the TPM
will make brute forcing prohibitively slow. The home directories are
protected by the user’s password or ideally a personal FIDO2/PKCS#11
security token in this model. Of course, a password isn’t better
security-wise than the status quo ante. But given the FIDO2/PKCS#11
support built into systemd-homed it should be easier to lock down
the home directories securely.

Binding encryption of /var/ and /etc/ to the TPM also addresses
the first of the two more advanced attack scenarios: a copy of the
harddisk is useless without the physical TPM chip, since the seed key
is sealed into that. (And even if the attacker had the chance to watch
you type in your password, it won’t help unless they have access
to the TPM chip.) For the home directory this attack is not addressed
as long as a plain password is used. However, since binding home
directories to FIDO2/PKCS#11 tokens is built into systemd-homed
things should be safe here too — provided the user actually possesses
and uses such a device.

The backdoor attack scenario is addressed by the fact that every
resource in play now is authenticated: it’s hard to backdoor the OS if
there’s no component that isn’t verified by signature keys or TPM
secrets the attacker hopefully doesn’t know.

For general purpose distributions that focus on updating the OS per
RPM/dpkg the idealized model above won’t work out, since (as
mentioned) this implies an immutable /usr/, and thus requires
updating /usr/ via an atomic update operation. For such distros a
setup like the following is probably more realistic, but see above.

| Resource | Needs Authentication | Needs Encryption | Suggested Technology | Validation/Encryption Keys/Certificates acquired via | Stored where |
|---|---|---|---|---|---|
| Shim | yes | no | SecureBoot signature verification | firmware certificate database | ESP |
| Boot loader | yes | no | ditto | firmware certificate database/shim | ESP/boot partition |
| Kernel | yes | no | ditto | ditto | ditto |
| initrd | yes | no | ditto | ditto | ditto |
| initrd parameters | yes | yes | systemd TPM encrypted credentials | TPM | ditto |
| initrd extensions | yes | no | systemd-sysext with Verity+PKCS#7 signatures | firmware/initrd certificate database | ditto |
| OS binary resources, configuration and state | yes | yes | dm-crypt (LUKS) + dm-integrity | TPM | top-level partition |
| /home/ itself | yes | no | dm-integrity with HMAC | TPM | top-level partition |
| User home directories | yes | yes | dm-crypt (LUKS) + dm-integrity in loopback files | User password/FIDO2/PKCS#11 security token | loopback file inside /home partition |

This means there’s only one root file system that contains all of
/etc/, /var/ and /usr/.

Recovery Keys

When binding encryption to TPMs one problem that arises is what
strategy to adopt if the TPM is lost, due to hardware failure: if I
need the TPM to unlock my encrypted volume, what do I do if I need the
data but lost the TPM?

The answer here is supporting recovery keys (this is similar to how
other OSes approach this). Recovery keys are pretty much the same
concept as passwords. The main difference being that they are computer
generated rather than user-chosen. Because of that they typically have
much higher entropy (which makes them more annoying to type in, i.e.
you want to use them only when you must, not day-to-day). By having
higher entropy they are useful in combination with TPM, FIDO2 or
PKCS#11 based unlocking: unlike a combination with passwords they do
not compromise the higher strength of protection that
TPM/FIDO2/PKCS#11 based unlocking is supposed to provide.

Current versions of
implement a recovery key concept in an attempt to address this
problem. You may enroll any combination of TPM chips, PKCS#11 tokens,
FIDO2 tokens, recovery keys and passwords on the same LUKS
volume. When enrolling a recovery key it is generated and shown on
screen both in text form and as QR code you can scan off screen if you
like. The idea is to write down/store this recovery key in a safe place so
that you can use it when you need it. Note that such recovery keys can
be entered wherever a LUKS password is requested, i.e. after
generation they behave pretty much the same as a regular password.
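
To illustrate why computer-generated recovery keys do not weaken the stronger unlocking mechanisms, here is a small Python sketch that generates a random key and compares its entropy with that of a short random password. The alphabet and the grouping into dash-separated blocks are assumptions picked for this example; they are not meant to reproduce the exact format any particular tool prints.

```python
import math
import secrets

ALPHABET = "abcdefghijklmnopqrstuvwxyz234567"  # 32 symbols, 5 bits each (illustrative)


def generate_recovery_key(groups: int = 8, group_len: int = 8) -> str:
    """Generate a random key as dash-separated groups (format is hypothetical)."""
    chars = [secrets.choice(ALPHABET) for _ in range(groups * group_len)]
    return "-".join(
        "".join(chars[i : i + group_len]) for i in range(0, len(chars), group_len)
    )


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy of a uniformly random string of the given length."""
    return length * math.log2(alphabet_size)


recovery_key = generate_recovery_key()
recovery_bits = entropy_bits(len(ALPHABET), 64)  # 64 symbols at 5 bits each
password_bits = entropy_bits(26, 8)              # 8 random lowercase letters
print(recovery_key)
print(f"recovery key: {recovery_bits:.0f} bits, password: {password_bits:.1f} bits")
```

Even a truly random eight-letter password stays below 40 bits here, and passwords people actually choose tend to be worse than that, which is why the generated key is the better companion for TPM/FIDO2/PKCS#11 unlocking.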

TPM PCR Brittleness

Locking devices to TPMs and enforcing a PCR policy with this
(i.e. configuring the TPM key to be unlockable only if certain PCRs
match certain values, and thus requiring the OS to be in a certain
state) brings a problem with it: TPM PCR brittleness. If the key you
want to unlock with the TPM requires the OS to be in a specific state
(i.e. that all OS components’ hashes match certain expectations or
similar) then doing OS updates might have the effect of making your
key inaccessible: the OS updates will cause the code to change, and
thus the hashes of the code, and thus certain PCRs. (Thankfully, you
enrolled a recovery key, as described above, so this doesn’t mean you
lost your data, right?).
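
The brittleness follows directly from how a PCR is updated: it can only be "extended", i.e. the new value is a hash over the old value and the digest of what is being measured. The following Python sketch models that with SHA256; the component contents are made-up stand-ins, and real firmware measures many more things than shown here.

```python
import hashlib


def pcr_extend(pcr: bytes, measurement: bytes) -> bytes:
    """Extend a PCR: new value = SHA256(old value || SHA256(measurement))."""
    digest = hashlib.sha256(measurement).digest()
    return hashlib.sha256(pcr + digest).digest()


def measure_boot(components: list[bytes]) -> bytes:
    """Fold a sequence of measured components into one PCR value."""
    pcr = bytes(32)  # the boot-time PCRs start out as all zeroes
    for component in components:
        pcr = pcr_extend(pcr, component)
    return pcr


# Hypothetical component contents, standing in for real boot binaries.
before_update = measure_boot([b"bootloader v1", b"kernel 5.14", b"initrd A"])
after_update = measure_boot([b"bootloader v1", b"kernel 5.15", b"initrd A"])
same_again = measure_boot([b"bootloader v1", b"kernel 5.14", b"initrd A"])

print(before_update.hex())
print(after_update.hex())
```

Booting the same components again reproduces the same value, but changing any single component yields an unrelated value, so a key bound to the old value no longer unlocks.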

To address this I’d suggest three strategies:

  1. Most importantly: don’t actually use the TPM PCRs that contain code
    hashes. There are actually multiple PCRs
    each containing measurements of different aspects of the boot
    process. My recommendation is to bind keys to PCR 7 only, a PCR
    that contains measurements of the UEFI SecureBoot certificate
    databases. Thus, the keys will remain accessible as long as these
    databases remain the same, and updates to code will not affect it
    (updates to the certificate databases will, and they do happen too,
    though hopefully much less frequently than code updates). Does this
    reduce security? Not much, no, because the code that’s run is after
    all not just measured but also validated via code signatures, and
    those signatures are validated with the aforementioned certificate
    databases. Thus binding an encrypted TPM key to PCR 7 should
    enforce a similar level of trust in the boot/OS code as binding it
    to a PCR with hashes of specific versions of that code. i.e. using
    PCR 7 means you say “every code signed by these vendors is allowed
    to unlock my key” while using a PCR that contains code hashes means
    “only this exact version of my code may access my key”.

  2. Use LUKS key management to enroll multiple versions of the TPM keys
    in relevant volumes, to support multiple versions of the OS code
    (or multiple versions of the certificate database, as discussed
    above). Specifically: whenever an update is done that might result
    in changing the relevant PCRs, pre-calculate the new PCRs, and enroll
    them in an additional LUKS slot on the relevant volumes. This means
    that the unlocking keys tied to the TPM remain accessible in both
    states of the system. Eventually, once rebooted after the update,
    remove the old slots.

  3. If these two strategies didn’t work out (maybe because the
    OS/firmware was updated outside of OS control, or the update
    mechanism was aborted at the wrong time) and the TPM PCRs changed
    unexpectedly, and the user now needs to use their recovery key to
    get access to the OS back, let’s handle this gracefully and
    automatically reenroll the current TPM PCRs at boot, after the
    recovery key checked out, so that for future boots everything is in
    order again.
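
The second strategy can be sketched as a toy model in Python. Nothing here talks to a TPM or to LUKS: the "wrapping" is a plain XOR with a hash of the PCR value, and the slot names and PCR states are assumptions chosen purely to show the enroll, reboot, remove sequence.

```python
import hashlib
import secrets
from typing import Optional


def wrap(volume_key: bytes, pcr_value: bytes) -> bytes:
    """Toy key wrapping: XOR the key with a hash of the PCR value."""
    pad = hashlib.sha256(b"toy-wrap" + pcr_value).digest()
    return bytes(a ^ b for a, b in zip(volume_key, pad))


def try_unlock(slots: dict, current_pcr: bytes, check: bytes) -> Optional[bytes]:
    """Return the volume key if any slot opens under the current PCR value."""
    for wrapped in slots.values():
        candidate = wrap(wrapped, current_pcr)  # XOR is its own inverse
        if hashlib.sha256(candidate).digest() == check:
            return candidate
    return None


volume_key = secrets.token_bytes(32)
check = hashlib.sha256(volume_key).digest()  # stands in for a stored key digest

old_pcr = hashlib.sha256(b"PCR state before update").digest()
new_pcr = hashlib.sha256(b"PCR state after update").digest()

slots = {"old": wrap(volume_key, old_pcr)}
slots["new"] = wrap(volume_key, new_pcr)  # pre-enroll before rebooting

unlock_before_reboot = try_unlock(slots, old_pcr, check)
unlock_after_reboot = try_unlock(slots, new_pcr, check)

del slots["old"]  # drop the old slot once the update has been booted
unlock_old_state_later = try_unlock(slots, old_pcr, check)
```

While both slots exist the volume unlocks in either system state; once the old slot is removed only the updated state can unlock it.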

Other approaches can work too: for example, some OSes simply remove
TPM PCR policy protection of disk encryption keys altogether
immediately before OS or firmware updates, and then reenable it right
after. Of course, this opens a time window where the key bound to the
TPM is much less protected than people might assume. I’d try to avoid
such a scheme if possible.

Anything Else?

So, given that we are talking about idealized systems: I personally
actually think the ideal OS would be much simpler, and thus more
secure than this:

I’d try to ditch the Shim, and instead focus on enrolling the
distribution vendor keys directly in the UEFI firmware certificate
list. This is actually supported by all firmwares too. This has
various benefits: it’s no longer necessary to bind everything to
Microsoft’s root key, you can just enroll your own stuff and thus make
sure only what you want to trust is trusted and nothing else. To make
an approach like this easier, we have been working on doing automatic
enrollment of these keys from the systemd-boot boot loader, see
this work in progress for
. This way the
Firmware will authenticate the boot loader/kernel/initrd without any
further component for this in place.

I’d also not bother with a separate boot partition, and just use the
ESP for everything. The ESP is required anyway by the firmware, and is
good enough for storing the few files we need.


Can I implement all of this in my distribution today?

Probably not. While the big issues have mostly been addressed there’s
a lot of integration work still missing. As you might have seen I
linked some PRs that haven’t even been merged into our tree yet, and
definitely not been released yet or even entered the distributions.

Will this show up in Fedora/Debian/Ubuntu soon?

I don’t know. I am making a proposal how these things might work, and
am working on getting various building blocks for this into
shape. What the distributions do is up to them. But even if they don’t
follow the recommendations I make 100%, or don’t want to use the
building blocks I propose I think it’s important they start thinking
about this, and yes, I think they should be thinking about defaulting
to setups like this.

Work for measuring/signing initrds on Fedora has been started,
here’s a slide deck with some information about

But isn’t a TPM evil?

Some corners of the community tried (unfortunately successfully to
some degree) to paint TPMs/Trusted Computing/SecureBoot as generally
evil technologies that stop us from using our systems the way we
want. That idea is rubbish though, I think. We should focus on what it
can deliver for us (and that’s a lot I think, see above), and
appreciate the fact we can actually use it to kick out perceived evil
empires from our devices instead of being subjected to them. Yes, the
way SecureBoot/TPMs are defined puts you in the driver seat if you
want — and you may enroll your own certificates to keep out everything
you don’t like.

What if my system doesn’t have a TPM?

TPMs are becoming quite ubiquitous, in particular as the upcoming
Windows versions will require them. In general I think we should focus
on modern, fully equipped systems when designing all this, and then
find fall-backs for more limited systems. Frankly it feels as if so
far the design approach for all this was the other way round: try to
make the new stuff work like the old rather than the old like the new
(I mean, to me it appears this thinking is the main raison d’être for
the Grub boot loader).

More specifically, on the systems where we have no TPM we ultimately
cannot provide the same security guarantees as for those which
have. So depending on the resource to protect we should fall back to
different TPM-less mechanisms. For example, if we have no TPM then the
root file system should probably be encrypted with a user provided
password, typed in at boot as before. And for the encrypted boot
credentials we probably should simply not encrypt them, and place them
in the ESP unencrypted.

Effectively this means: without TPM you’ll still get protection regarding the
basic attack scenario, as before, but not the other two.

What if my system doesn’t have UEFI?

Many of the mechanisms explained above taken individually do not
require UEFI. But of course the chain of trust suggested above requires
something like UEFI SecureBoot. If your system lacks UEFI it’s
probably best to find work-alikes to the technologies suggested above,
but I doubt I’ll be able to help you there.

rpm/dpkg already cryptographically validates all packages at installation time (gpg), why would I need more than that?

This type of package validation happens once: at the moment of
installation (or update) of the package, but not anymore when the data
installed is actually used. Thus when an attacker manages to modify
the package data after installation and before use they can make any
change they like without this ever being noticed. Such package download
validation does address certain attack scenarios
(i.e. man-in-the-middle attacks on network downloads), but it doesn’t
protect you from attackers with physical access, as described in the
attack scenarios above.
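
The gap between "validated at installation" and "validated at use" can be shown with a few lines of Python. This is only a sketch of the principle: the file name and payload are made up, and a plain SHA256 comparison stands in for the package manager's signature check.

```python
import hashlib
import os
import tempfile


def sha256_file(path: str) -> str:
    """Return the hex SHA256 digest of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


with tempfile.TemporaryDirectory() as root:
    binary = os.path.join(root, "usr-bin-tool")  # hypothetical installed file

    # Installation: the payload is checked once against the trusted digest.
    payload = b"#!/bin/sh\necho hello\n"
    trusted_digest = hashlib.sha256(payload).hexdigest()
    with open(binary, "wb") as f:
        f.write(payload)
    verified_at_install = sha256_file(binary) == trusted_digest

    # Later, offline: an attacker with disk access appends to the file.
    with open(binary, "ab") as f:
        f.write(b"echo backdoor\n")

    # Install-time validation never runs again, so nothing notices.
    # Only a check at time of use against the trusted digest catches it.
    verified_at_use = sha256_file(binary) == trusted_digest

print(verified_at_install, verified_at_use)
```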

Systems such as ostree aren’t better than rpm/dpkg regarding this
BTW, their data is not validated on use either, but only during
download or when processing tree checkouts.

The key point is that the scheme explained here provides offline
protection for the data “at rest” — even someone with physical access
to your device cannot easily make changes that aren’t noticed on next
use. rpm/dpkg/ostree provide online protection only: as long as the
system remains up, and all OS changes are done through the intended
program code-paths, and no one has physical access everything should
be good. In today’s world I am sure this is not good enough though. As
mentioned most modern OSes provide offline protection for the data at
rest in one way or another. Generic Linux distributions are terribly
behind on this.

This is all so desktop/laptop focused, what about servers?

I am pretty sure servers should provide similar security guarantees as
outlined above. In a way servers are a much simpler case: there are no
users and no interactivity. Thus the discussion of /home/ and what
it contains and of user passwords doesn’t matter. However, the
authenticated initrd and the unattended TPM-based encryption I think
are very important for servers too, in a trusted data center
environment. It provides security guarantees so far not given by Linux
server OSes.

I’d like to help with this, or discuss/comment on this

Submit patches or reviews through
GitHub. General discussion about
this is best done on the systemd mailing list.

The Wondrous World of Discoverable GPT Disk Images

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/the-wondrous-world-of-discoverable-gpt-disk-images.html

TL;DR: Tag your GPT partitions with the right, descriptive partition
types, and the world will become a better place.

A number of years ago we started the Discoverable Partitions
Specification, which defines GPT
partition type UUIDs and partition flags for the various partitions
Linux systems typically deal with. Before the specification all Linux
partitions usually just used the same type, basically saying “Hey, I
am a Linux partition” and not much else. With this specification the
GPT partition type, flags and label system becomes a lot more
expressive, as it can tell you:

  1. What kind of data a partition contains (i.e. is this swap data, a file system or Verity data?)
  2. What the purpose/mount point of a partition is (i.e. is this a /home/ partition or a root file system?)
  3. What CPU architecture a partition is intended for (i.e. is this a root partition for x86-64 or for aarch64?)
  4. Shall this partition be mounted automatically? (i.e. without specifically being configured via /etc/fstab)
  5. And if so, shall it be mounted read-only?
  6. And if so, shall the file system be grown to its enclosing partition size, if smaller?
  7. Which partition contains the newer version of the same data (i.e. multiple root file systems, with different versions)
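
As a rough illustration of how much a tool can learn from the partition table alone, here is a Python sketch that maps a partition's type UUID and flags to the questions in the list above. The UUIDs and flag bit positions are quoted from my recollection of the specification and should be checked against it before being relied upon; the Partition class and describe() helper are made up for this example.

```python
from dataclasses import dataclass

# Type UUIDs as recalled from the specification; verify before real use.
PARTITION_TYPES = {
    "c12a7328-f81f-11d2-ba4b-00a0c93ec93b": ("esp", None),
    "4f68bce3-e8cd-4db1-96e7-fbcaf984b709": ("root", "x86-64"),
    "b921b045-1df0-41c3-af44-4c6f280d3fae": ("root", "aarch64"),
    "933ac7e1-2eb4-4f13-b844-0e14e2aef915": ("home", None),
    "0657fd6d-a4ab-43c4-84e5-0933c84b4f4f": ("swap", None),
}

# GPT attribute bits as recalled from the specification; verify before real use.
FLAG_GROWFS = 1 << 59
FLAG_READ_ONLY = 1 << 60
FLAG_NO_AUTO = 1 << 63


@dataclass
class Partition:
    type_uuid: str
    flags: int = 0
    label: str = ""


def describe(p: Partition) -> dict:
    """Derive purpose, architecture and mount policy from GPT metadata only."""
    purpose, arch = PARTITION_TYPES.get(p.type_uuid.lower(), ("unknown", None))
    return {
        "purpose": purpose,
        "arch": arch,
        "automount": not (p.flags & FLAG_NO_AUTO),
        "read_only": bool(p.flags & FLAG_READ_ONLY),
        "grow": bool(p.flags & FLAG_GROWFS),
    }


root = Partition("4F68BCE3-E8CD-4DB1-96E7-FBCAF984B709", FLAG_READ_ONLY, "fooOS_1.2")
info = describe(root)
print(info)
```

Note that no file system had to be opened to answer any of these questions.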

By embedding all of this information inside the GPT partition table
disk images become self-descriptive: without requiring any other
source of information (such as /etc/fstab) if you look at a
compliant GPT disk image it is clear how an image is put together and
how it should be used and mounted. This self-descriptiveness in
particular breaks one philosophical weirdness of traditional Linux
installations: the original source of information which file system
the root file system is, typically is embedded in the root file system
itself, in /etc/fstab. Thus, in a way, in order to know what the
root file system is you need to know what the root file system is. 🤯
🤯 🤯

(Of course, the way this recursion is traditionally broken up is by
then copying the root file system information from /etc/fstab into
the boot loader configuration, resulting in a situation where the
primary source of information for this — i.e. /etc/fstab — is
actually mostly irrelevant, and the secondary source — i.e. the copy
in the boot loader — becomes the configuration that actually matters.)

Today, the GPT partition type UUIDs defined by the specification have
been adopted quite widely, by distributions and their installers, as
well as a variety of partitioning tools and other tools.

In this article I want to highlight how the various tools the
systemd project provides make use of the
concepts the specification introduces.

But before we start with that, let’s underline why tagging partitions
with these descriptive partition type UUIDs (and the associated
partition flags) is a good thing, besides the philosophical points
made above.

  1. Simplicity: in particular OS installers become simpler — adjusting
    /etc/fstab as part of the installation is not necessary anymore,
    as the partitioning step already put all information into place for
    assembling the system properly at boot. i.e. installing doesn’t
    mean that you always have to get fdisk and /etc/fstab into
    place, the former suffices entirely.

  2. Robustness: since partition tables mostly remain static after
    installation the chance of corruption is much lower than if the
    data is stored in file systems (e.g. in /etc/fstab). Moreover by
    associating the metadata directly with the objects it describes the
    chance of things getting out of sync is reduced. (i.e. if you lose
    /etc/fstab, or forget to rerun your initrd builder you still know
    what a partition is supposed to be just by looking at it.)

  3. Programmability: if partitions are self-descriptive it’s much
    easier to automatically process them with various tools. In fact,
    this blog story is mostly about that: various systemd tools can
    naturally process disk images prepared like this.

  4. Alternative entry points: on traditional disk images, the boot
    loader needs to be told which kernel command line option root= to
    use, which then provides access to the root file system, where
    /etc/fstab is then found which describes the rest of the file
    systems. Where precisely root= is configured for the boot loader
    highly depends on the boot loader and distribution used, and is
    typically encoded in a Turing complete programming language
    (Grub…). This makes it very hard to automatically determine the
    right root file system to use, to implement alternative entry points
    to the system. By alternative entry points I mean other ways to boot
    the disk image, specifically for running it as a systemd-nspawn
    container — but this extends to other mechanisms where the boot
    loader may be bypassed to boot up the system, for example qemu
    when configured without a boot loader.

  5. User friendliness: it’s simply a lot nicer for the user looking at
    a partition table if the partition table explains what is what,
    instead of just saying “Hey, this is a Linux partition!” and
    nothing else.

Uses for the concept

Now that we cleared up the Why?, let’s have a closer look at how this is
currently used and exposed in systemd’s various components.

Use #1: Running a disk image in a container

If a disk image follows the Discoverable Partition Specification then
systemd-nspawn has all it needs to just boot it up. Specifically, if you have a GPT
disk image in a file foobar.raw and you want to boot it up in a
container, just run systemd-nspawn -i foobar.raw -b, and that’s it
(you can specify a block device like /dev/sdb too if you like). It
becomes easy and natural to prepare disk images that can be booted
either on a physical machine, inside a virtual machine manager or
inside such a container manager: the necessary meta-information is
included in the image, easily accessible before actually looking into
its file systems.

Use #2: Booting an OS image on bare-metal without /etc/fstab or kernel command line root=

If a disk image follows the specification in many cases you can remove
/etc/fstab (or never even install it) — as the basic information
needed is already included in the partition table. The
systemd-gpt-auto-generator logic implements automatic discovery of the root file system as well
as all auxiliary file systems. (Note that the former requires an
initrd that uses systemd, some more conservative distributions do not
support that yet, unfortunately). Effectively this means you can boot
up a kernel/initrd with an entirely empty kernel command line, and the
initrd will automatically find the root file system (by looking for a
suitably marked partition on the same drive the EFI System Partition
was found on).

(Note, if /etc/fstab or root= exist and contain relevant
information they always take precedence over the automatic logic. This
is in particular useful for tweaking things by specifying additional mount
options and such.)

Use #3: Mounting a complex disk image for introspection or manipulation

tool may be used to introspect and manipulate OS disk images that
implement the specification. If you pass the path to a disk image (or
block device) it will extract various bits of useful information from
the image (e.g. what OS is this? what partitions to mount?) and display it.

With the --mount switch a disk image (or block device) can be
mounted to some location. This is useful for looking at what is inside
it, or changing its contents. This will dissect the image and then
automatically mount all contained file systems matching their GPT
partition description to the right places, so that you subsequently
could chroot into it. (But why chroot if you can just use systemd-nspawn? 😎)

Use #4: Copying files in and out of a disk image

tool also has two switches --copy-from and --copy-to which allow
copying files out of or into a compliant disk image, taking all
included file systems and the resulting mount hierarchy into account.

Use #5: Running services directly off a disk image

setting in service unit files accepts paths to compliant disk images
(or block device nodes), and can mount them automatically, running
service binaries directly off them (in chroot() style). In fact,
this is the base for the Portable
concept of systemd.

Use #6: Provisioning disk images

systemd provides various tools that can run operations provisioning
disk images in an “offline” mode. Specifically:


With the --image= switch
can directly operate on a disk image, and for example create all
directories and other inodes defined in its declarative configuration
files included in the image. This can be useful for example to set up
the /var/ or /etc/ tree according to such configuration before
first boot.


Similarly, the --image= switch of
tells the tool to read the declarative system user specifications
included in the image and synthesize system users from them, writing
them to the /etc/passwd (and related) files in the image. This is
useful for provisioning these users before the first boot, for example
to ensure UID/GID numbers are pre-allocated, and such allocations not
delayed until first boot.


The --image= switch of
may be used to provision a fresh machine ID into
of a disk image, before first boot.


The --image= switch of
may be used to set various basic system settings (such as root
password, locale information, hostname, …) on the specified disk
image, before booting it up.

Use #7: Extracting log information

The journalctl switch --image= may be used to show the journal log data included in
a disk image (or, as usual, the specified block device). This is very
useful for analyzing failed systems offline, as it gives direct access
to the logs without any further, manual analysis.

Use #8: Automatic repartitioning/growing of file systems

The systemd-repart tool may be used to repartition a disk or image in a declarative and
additive way. One primary use-case for it is to run during boot on
physical or VM systems to grow the root file system to the disk size,
or to add in, format, encrypt, populate additional partitions at boot.

With its --image= switch the tool may operate on compliant disk
images in offline mode: it will then read the partition
definitions that shall be grown or created off the image itself, and
then apply them to the image. This is particularly useful in
combination with the --size= switch, which allows growing disk images to the
specified size.

Specifically, consider the following work-flow: you download a
minimized disk image foobar.raw that contains only the minimized
root file system (and maybe an ESP, if you want to boot it on
bare-metal, too). You then run systemd-repart --image=foobar.raw --size=15G
to enlarge the image to 15G, based on the declarative
rules defined in the
drop-in files included in the image (this means this can grow the root
partition, and/or add in more partitions, for example for /srv or
so, maybe encrypted with a locally generated key or so). Then, you
proceed to boot it up with systemd-nspawn --image=foobar.raw -b, making
use of the full 15G.

Versioning + Multi-Arch

Disk images implementing this specification can carry OS executables in one of three ways:

  1. Only a root file system

  2. Only a /usr/ file system (in which case the root file system is automatically picked as tmpfs).

  3. Both a root and a /usr/ file system (in which case the two are
    combined, the /usr/ file system mounted into the root file system,
    and the former possibly in read-only fashion).

They may also contain OS executables for different architectures,
permitting “multi-arch” disk images that can safely boot up on
multiple CPU architectures. As the root and /usr/ partition type
UUIDs are specific to architectures this is easily done by including
one such partition for x86-64, and another for aarch64. If the
image is now used on an x86-64 system automatically the former
partition is used, on aarch64 the latter.

Moreover, these OS executables may be contained in different versions,
to implement a simple versioning scheme: when tools such as
systemd-nspawn or systemd-gpt-auto-generator dissect a disk image,
and they find two or more root or /usr/ partitions of the same type
UUID, they will automatically pick the one whose GPT partition label
(a 36 character free-form string every GPT partition may have) is the
newest according to strverscmp()
(OK, truth be told, we don’t use strverscmp() as-is, but a modified
version with some more modern syntax and semantics, but conceptually

This logic makes it possible to implement a very simple and natural A/B update
scheme: an updater can drop multiple versions of the OS into separate
root or /usr/ partitions, always updating the partition label to the
version included there-in once the download is complete. All of the
tools described here will then honour this, and always automatically
pick the newest version of the OS.
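
Here is a minimal Python sketch of how such a selection could work, combining the per-architecture filtering with the "newest label wins" rule. The partition records, labels and the version_key() helper are assumptions for illustration; in particular version_key() is just a simple natural-order comparison and does not reproduce the modified strverscmp() semantics the real tools use.

```python
import re


def version_key(label: str):
    """Split a label into text and integer runs so that 1.10 sorts after 1.9."""
    parts = re.findall(r"\d+|\D+", label)
    return [(1, int(p)) if p.isdigit() else (0, p) for p in parts]


def pick_root(partitions, arch):
    """Pick the newest root partition for the given architecture, or None."""
    candidates = [
        p for p in partitions if p["purpose"] == "root" and p["arch"] == arch
    ]
    return max(candidates, key=lambda p: version_key(p["label"]), default=None)


# Hypothetical contents of a multi-arch A/B image.
partitions = [
    {"purpose": "root", "arch": "x86-64", "label": "fooOS_1.9"},
    {"purpose": "root", "arch": "x86-64", "label": "fooOS_1.10"},
    {"purpose": "root", "arch": "aarch64", "label": "fooOS_1.10"},
    {"purpose": "home", "arch": None, "label": "home"},
]

chosen = pick_root(partitions, "x86-64")
print(chosen)
```

An updater following this scheme would write the new OS version into the partition that is not in use and only then set its label, so that a half-written partition is never picked.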


When building modern OS appliances, security is highly
relevant. Specifically, offline security matters: an attacker with
physical access should have a difficult time modifying the OS in a way
that isn’t noticed. i.e. think of a car or a cell network base
station: these appliances are usually parked/deployed in environments
attackers can get physical access to: it’s essential that in this case
the OS itself is sufficiently protected, so that the attacker cannot just
mount the OS file system image, make modifications (inserting a
backdoor, spying software or similar) and the system otherwise
continues to run without this being immediately detected.

A great way to implement offline security is via Linux’ dm-verity
subsystem: it securely binds immutable disk IO to a single,
short trusted hash value: if an attacker manages to offline modify the
disk image the modified disk image won’t match the trusted hash
anymore, and will not be trusted anymore (depending on policy this
then just results in IO errors being generated, or automatic

The Discoverable Partitions Specification declares how to include
Verity validation data in disk images, and how to relate them to the file
systems they protect, thus making it very easy to deploy and work with
such protected images. For example systemd-nspawn supports a
--root-hash= switch, which accepts the Verity root hash and then
will automatically assemble dm-verity with this, automatically
matching up the payload and verity partitions. (Alternatively, just
place a .roothash file next to the image file).
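
To show how a whole image can be bound to one short hash, here is a simplified hash-tree sketch in Python. It only demonstrates the principle: the real dm-verity format additionally uses a salt and a specific on-disk layout, and the block size, fan-out and image contents below are assumptions for this example.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative data block size


def merkle_root(data: bytes, fanout: int = 128) -> bytes:
    """Hash fixed-size blocks, then hash groups of hashes until one remains."""
    level = [
        hashlib.sha256(data[i : i + BLOCK_SIZE]).digest()
        for i in range(0, len(data), BLOCK_SIZE)
    ] or [hashlib.sha256(b"").digest()]
    while len(level) > 1:
        level = [
            hashlib.sha256(b"".join(level[i : i + fanout])).digest()
            for i in range(0, len(level), fanout)
        ]
    return level[0]


image = bytes(range(256)) * 16 * 300  # 300 blocks of illustrative content
trusted_root = merkle_root(image)

tampered = bytearray(image)
tampered[123456] ^= 0x01  # flip a single bit offline
tampered_root = merkle_root(bytes(tampered))

print(trusted_root.hex())
print(tampered_root.hex())
```

A single flipped bit anywhere in the image changes the root hash, so trusting the 32 byte root is enough to trust the whole image. The tree structure is what allows blocks to be checked individually as they are read, rather than hashing the entire image up front.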


The above already is a powerful tool set for working with disk
images. However, there are some more areas I’d like to extend this
logic to:


Similar to the other tools mentioned above,
(which is a tool to interface with the boot loader, and install/update
systemd’s own EFI boot loader
should learn a --image= switch, to make installation of the boot
loader on disk images easy and natural. It would automatically find
the ESP and other relevant partitions in the image, and copy the boot
loader binaries into them (or update them).


Similar to the existing journalctl --image= logic the coredumpctl
tool should also gain an --image= switch for extracting coredumps
from compliant disk images. The combination of journalctl --image=
and coredumpctl --image= would make it exceptionally easy to work
with OS disk images of appliances and extracting logging and debugging
information from them after failures.

And that’s all for now. Please refer to the specification and the man
pages for further details. If your distribution’s installer does not
yet tag the GPT partitions it creates with the right GPT type UUIDs,
consider asking them to do so.

Thank you for your time.

File Descriptor Limits

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/file-descriptor-limits.html

TL;DR: don’t use select() + bump the RLIMIT_NOFILE soft limit to
the hard limit in your modern programs.

The primary way to reference, allocate and pin runtime OS resources on
Linux today is through file descriptors (“fds”). Originally they were used to
reference open files and directories and maybe a bit more, but today
they may be used to reference almost any kind of runtime resource in
Linux userspace, including open devices, memory
and even processes (with the new
system call). In a way, the philosophically skewed UNIX concept of
“everything is a file” through the proliferation of fds actually
acquires a bit of sensible meaning: “everything has a file
descriptor” is certainly a much better motto to adopt.

Because of this proliferation of fds, non-trivial modern programs tend
to have to deal with substantially more fds at the same time than they
traditionally did. Today, you’ll often encounter real-life programs
that have a few thousand fds open at the same time.

As with most runtime resources on Linux, limits are enforced on file
descriptors: once you hit the resource limit configured via
RLIMIT_NOFILE, any attempt to allocate more is refused with the EMFILE
error until you close a couple of those you already have open.

Because fds weren’t such a universal concept traditionally, the limit
of RLIMIT_NOFILE used to be quite low. Specifically, when the Linux
kernel first invokes userspace it still sets RLIMIT_NOFILE to a low
value of 1024 (soft) and 4096 (hard). (Quick explanation: the soft
limit is what matters and causes the EMFILE issues, the hard limit
is a secondary limit that processes may bump their soft limit to — if
they like — without requiring further privileges to do so. Bumping the
limit further would require privileges, however.) A limit of 1024 fds
made fds a scarce resource: APIs tried to be careful with using fds,
since you simply couldn’t have that many of them at the same
time. This resulted in some questionable coding decisions and
concepts at various places: often secondary descriptors that are very
similar to fds — but were not actually fds — were introduced
(e.g. inotify watch descriptors), simply to exempt them from the low
limits enforced on true fds. Or code tried to aggressively close fds
when not absolutely needing them (e.g. ftw()/nftw()), losing the
nice + stable “pinning” effect of open fds.

Worse, though, is that certain OS level APIs were designed with only
the low limits in mind. The worst offender is the BSD/POSIX select()
system call: it only works with fds in the numeric range of 0…1023
(aka FD_SETSIZE-1). If you have an fd outside of this range, tough
luck: select() won’t work, and only if you are lucky you’ll detect
that and can handle it somehow.

Linux fds are exposed as simple integers, and for most calls it is
guaranteed that the lowest unused integer is allocated for new
fds. Thus, as long as the RLIMIT_NOFILE soft limit is set to 1024
everything remains compatible with select(): the resulting fds will
also be below 1024. Yay. If we’d bump the soft limit above this
threshold though and at some point in time an fd higher than the
threshold is allocated, this fd would not be compatible with
select() anymore.
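To make the contrast a bit more concrete, here is a minimal sketch in Python of waiting on fds with poll() instead of select(). The helper names wait_readable() and fits_in_fd_set() are made up for illustration, and the default of 1024 simply mirrors the FD_SETSIZE value discussed above. poll() takes a list of fds rather than a fixed-size bitmap, so the numeric value of an fd doesn't matter to it.

```python
import os
import select


def wait_readable(fds, timeout_ms):
    """Return the fds from `fds` that become readable within timeout_ms.

    Uses poll(), which has no upper bound on the numeric fd value.
    """
    poller = select.poll()
    for fd in fds:
        poller.register(fd, select.POLLIN)
    # poll() returns a list of (fd, event mask) pairs
    return [fd for fd, events in poller.poll(timeout_ms) if events & select.POLLIN]


def fits_in_fd_set(fd, fd_setsize=1024):
    """True if `fd` could be stored in a select() fd_set of the given size."""
    return 0 <= fd < fd_setsize
```

Code that still has to call select() somewhere could use a check like fits_in_fd_set() to fail early and loudly instead of misbehaving later.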

Because of that, indiscriminately increasing the soft RLIMIT_NOFILE
resource limit today for every userspace process is problematic: as
long as there’s userspace code still using select() doing so will
risk triggering hard-to-handle, hard-to-debug errors all over the place.

However, given the nowadays ubiquitous use of fds for all
kinds of resources (did you know, an eBPF program is an fd? and a
cgroup too? and attaching an eBPF program to a cgroup is another fd? …),
we’d really like to raise the limit anyway. 🤔

So before we continue thinking about this problem, let’s make the
problem more complex (…uh, I mean… “more exciting”) first. Having just
one hard and one soft per-process limit on fds is boring. Let’s add
more limits on fds to the mix. Specifically on Linux there are two
system-wide sysctls: fs.nr_open and fs.file-max. (Don’t ask me why
one uses a dash and the other an underscore, or why there are two of
them…) On today’s kernels they kinda lost their relevance. They had
some originally, because fds weren’t accounted by any other
counter. But today, the kernel tracks fds mostly as small pieces of
memory allocated on userspace requests — because that’s ultimately
what they are —, and thus charges them to the memory accounting done

So now, we have four limits (actually: five if you count the memory
accounting) on the same kind of resource, and all of them make a
resource artificially scarce that we don’t want to be scarce. So what
to do?
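If you are curious where a given system currently stands, the four limits can be collected with a few lines of Python. This is only a sketch: the function name read_fd_limits() and its proc_sys_fs parameter are invented here, and it assumes the two sysctls are exposed as files named nr_open and file-max below /proc/sys/fs, as they usually are on Linux.

```python
import resource
from pathlib import Path


def read_fd_limits(proc_sys_fs="/proc/sys/fs"):
    """Collect the per-process rlimits and the two system-wide sysctls."""
    base = Path(proc_sys_fs)
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return {
        "rlimit_soft": soft,
        "rlimit_hard": hard,
        # sysctl names map to file names below /proc/sys/fs
        "fs.nr_open": int((base / "nr_open").read_text()),
        "fs.file-max": int((base / "file-max").read_text()),
    }
```

The directory is a parameter mostly so that the function can be pointed at a copy of the files, for example when testing.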

Back in systemd v240 already (i.e. 2019) we decided to do something
about it. Specifically:

  • Automatically at boot we’ll now bump the two sysctls to their
    maximum, making them effectively ineffective. This one was easy. We
    got rid of two pretty much redundant knobs. Nice!

  • The RLIMIT_NOFILE hard limit is bumped substantially to 512K. Yay,
    cheap fds! You may have an fd, and you, and you as well,
    everyone may have an fd!

  • But … we left the soft RLIMIT_NOFILE limit at 1024. We weren’t
    quite ready to break all programs still using select() in 2019
    yet. But it’s not as bad as it might sound I think: given the hard
    limit is bumped every program can easily opt-in to a larger number
    of fds, by setting the soft limit to the hard limit early on —
    without requiring privileges.

So effectively, with this approach fds should be much less scarce (at
least for programs that opt into that), and the limits should be much
easier to configure, since there are only two knobs now one really
needs to care about:

  • Configure the RLIMIT_NOFILE hard limit to the maximum number of
    fds you actually want to allow a process.

  • In the program code then either bump the soft to the hard limit, or
    not. If you do, you basically declare “I understood the problem, I
    promise to not use select(), drown me in fds please!”. If you don’t
    then effectively everything remains as it always was.

Apparently this approach worked, since the negative feedback on the
change was even scarcer than fds traditionally were (ha, fun!). We got
reports from pretty much only two projects that were bitten by the
change (one being a JVM implementation): they already bumped their
soft limit automatically to their hard limit during program
initialization, and then allocated an array with one entry per
possible fd. With the new high limit this resulted in one massive
allocation that traditionally was just a few K, and this caused memory
checks to be hit.
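The failure mode is easy to avoid if per-fd bookkeeping is sized by the fds actually in use rather than by the limit. A rough Python sketch of that idea follows; the FdTable class is purely hypothetical and not taken from any of the affected projects.

```python
class FdTable:
    """Per-fd state keyed by fd number, growing only with fds actually in use.

    Memory use depends on the number of attached fds, not on RLIMIT_NOFILE.
    """

    def __init__(self):
        self._entries = {}

    def attach(self, fd, state):
        self._entries[fd] = state

    def detach(self, fd):
        # Returns the removed state, or None if the fd was never attached
        return self._entries.pop(fd, None)

    def get(self, fd):
        return self._entries.get(fd)

    def __len__(self):
        return len(self._entries)
```

With a structure like this a hard limit of 512K costs nothing until a program really opens that many fds.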

Anyway, here’s the take away of this blog story:

  • Don’t use select() anymore in 2021. Use poll(), epoll,
    io_uring, …, but for heaven’s sake don’t use select(). It might
    have been all the rage in the 1990s but it doesn’t scale and is
    simply not designed for today’s programs. I wish the man page of
    select() would make clearer how icky it is and that there are
    plenty of preferable APIs.

  • If you hack on a program that potentially uses a lot of fds, add
    some simple code somewhere to its start-up that bumps the
    RLIMIT_NOFILE soft limit
    to the hard limit. But if you do this, you have to make sure your
    code (and any code that you link to from it) refrains from using
    select(). (Note: there’s at least one glibc NSS plugin using
    select() internally. Given that NSS modules can end up being
    loaded into pretty much any process such modules should probably
    be considered just buggy.)

  • If said program you hack on forks off foreign programs, make sure to
    reset the RLIMIT_NOFILE soft limit back to 1024 for them. Just
    because your program might be fine with fds >= 1024 it doesn’t
    mean that those foreign programs are. And unfortunately
    RLIMIT_NOFILE is inherited down the process tree unless explicitly
    set.
And that’s all I have for today. I hope this was enlightening.

Unlocking LUKS2 volumes with TPM2, FIDO2, PKCS#11 Security Hardware on systemd 248

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/unlocking-luks2-volumes-with-tpm2-fido2-pkcs11-security-hardware-on-systemd-248.html

TL;DR: It’s now easy to unlock your LUKS2 volume with a FIDO2
security token (e.g. YubiKey, Nitrokey FIDO2, AuthenTrend
ATKey.Pro). And TPM2 unlocking is easy now too.

Blogging is a lot of work, and a lot less fun than hacking. I mostly
focus on the latter because of that, but from time to time I guess
stuff is just too interesting to not be blogged about. Hence here,
finally, another blog story about exciting new features in systemd.

With the upcoming systemd v248 the systemd-cryptsetup
component of systemd (which is responsible for assembling encrypted
volumes during boot) gained direct support for unlocking encrypted
storage with three types of security hardware:

  1. Unlocking with FIDO2 security tokens (well, at least with those
    which implement the hmac-secret extension; most do). i.e. your
    YubiKeys (series 5 and above), Nitrokey FIDO2, AuthenTrend
    ATKey.Pro and such.

  2. Unlocking with TPM2 security chips (pretty ubiquitous on non-budget
    PC hardware)

  3. Unlocking with PKCS#11 security tokens, i.e. your smartcards and
    older YubiKeys (the ones that implement PIV). (Strictly speaking
    this was supported on older systemd already, but was a lot more

For completeness’ sake, let’s keep in mind that the component also
allows unlocking with these more traditional mechanisms:

  1. Unlocking interactively with a user-entered passphrase (i.e. the
    way most people probably already deploy it, supported since
    about forever)

  2. Unlocking via key file on disk (optionally on removable media
    plugged in at boot), supported since forever.

  3. Unlocking via a key acquired through trivial
    AF_UNIX/SOCK_STREAM socket IPC. (Also new in v248)

  4. Unlocking via recovery keys. These are pretty much the same
    thing as a regular passphrase (and in fact can be entered wherever
    a passphrase is requested) — the main difference being that they
    are always generated by the computer, and thus have guaranteed high
    entropy, typically higher than user-chosen passphrases. They are
    generated in a way they are easy to type, in many cases even if the
    local key map is misconfigured. (Also new in v248)

In this blog story, let’s focus on the first three items, i.e. those
that talk to specific types of hardware for implementing unlocking.

To make working with security tokens and TPM2 easy, a new, small tool
was added to the systemd tool set:
systemd-cryptenroll. Its
only purpose is to make it easy to enroll your security token/chip of
choice into an encrypted volume. It works with any LUKS2 volume, and
embeds a tiny bit of meta-information into the LUKS2 header with
parameters necessary for the unlock operation.

Unlocking with FIDO2

So, let’s see how this fits together in the FIDO2 case. Most likely
this is what you want to use if you have one of these fancy FIDO2 tokens
(which need to implement the hmac-secret extension, as
mentioned). Let’s say you already have your LUKS2 volume set up, and
previously unlocked it with a simple passphrase. Plug in your token,
and run:

# systemd-cryptenroll --fido2-device=auto /dev/sda5

(Replace /dev/sda5 with the underlying block device of your volume).

This will enroll the key as an additional way to unlock the volume,
and embed all necessary information for it in the LUKS2 volume
header. Before we can unlock the volume with this at boot, we need to
allow FIDO2 unlocking via
/etc/crypttab. For
that, find the right entry for your volume in that file, and edit it
like so:

myvolume /dev/sda5 - fido2-device=auto

Replace myvolume and /dev/sda5 with the right volume name, and
underlying device of course. Key here is the fido2-device=auto
option you need to add to the fourth column in the file. It tells
systemd-cryptsetup to use the FIDO2 metadata now embedded in the
LUKS2 header, wait for the FIDO2 token to be plugged in at boot
(utilizing systemd-udevd, …) and unlock the volume with it.
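For readers who haven't looked at /etc/crypttab closely: each line has up to four whitespace-separated columns (volume name, underlying device, key file, options), and the options column is a comma-separated list. The following Python sketch parses one such line. It is a simplified illustration with hypothetical names, not the parser systemd uses, and it ignores escaping rules.

```python
from typing import Dict, NamedTuple, Optional, Union


class CrypttabEntry(NamedTuple):
    name: str
    device: str
    key_file: Optional[str]
    options: Dict[str, Union[str, bool]]


def parse_crypttab_line(line):
    """Parse one crypttab line; returns None for blank lines and comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("need at least a volume name and a device")
    # "-" and "none" both mean "no key file"
    key_file = fields[2] if len(fields) > 2 and fields[2] not in ("-", "none") else None
    options = {}
    if len(fields) > 3:
        for opt in fields[3].split(","):
            key, sep, value = opt.partition("=")
            options[key] = value if sep else True
    return CrypttabEntry(fields[0], fields[1], key_file, options)
```

Feeding it the example line from above yields an entry whose options contain fido2-device set to auto and whose key file is unset.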

And that’s it already. Easy-peasy, no?

Note that all of this doesn’t modify the FIDO2 token itself in any
way. Moreover you can enroll the same token in as many volumes as you
like. Since all enrollment information is stored in the LUKS2 header
(and not on the token) there are no bounds on any of this. (OK, well,
admittedly, there’s a cap on LUKS2 key slots per volume, i.e. you
can’t enroll more than a bunch of keys per volume.)

Unlocking with PKCS#11

Let’s now have a closer look at how the same works with a PKCS#11
compatible security token or smartcard. For this to work, you need a
device that can store an RSA key pair. I figure most security
tokens/smartcards that implement PIV qualify. How you actually get the
keys onto the device might differ though. Here’s how you do this for
any YubiKey that implements the PIV feature:

# ykman piv reset
# ykman piv generate-key -a RSA2048 9d pubkey.pem
# ykman piv generate-certificate --subject "Knobelei" 9d pubkey.pem
# rm pubkey.pem

(This chain of commands erases what was stored in the PIV feature of
your token before, so be careful!)

For tokens/smartcards from other vendors a different series of
commands might work. Once you have a key pair on it, you can enroll it
with a LUKS2 volume like so:

# systemd-cryptenroll --pkcs11-token-uri=auto /dev/sda5

Just like the same command’s invocation in the FIDO2 case this enrolls
the security token as an additional way to unlock the volume; any
passphrases you already have enrolled remain enrolled.

For the PKCS#11 case you need to edit your /etc/crypttab entry like this:

myvolume /dev/sda5 - pkcs11-uri=auto

If you have a security token that implements both PKCS#11 PIV and
FIDO2 I’d probably enroll it as FIDO2 device, given it’s the more
contemporary, future-proof standard. Moreover, it requires no special
preparation in order to get an RSA key onto the device: FIDO2 keys
typically just work.

Unlocking with TPM2

Most modern (non-budget) PC hardware (and other kinds of hardware too)
nowadays comes with a TPM2 security chip. In many ways a TPM2 chip is
a smartcard that is soldered onto the mainboard of your system. Unlike
your usual USB-connected security tokens you thus cannot remove them
from your PC, which means they address quite a different security
scenario: they aren’t immediately comparable to a physical key you can
take with you that unlocks some door; rather, they are a key you leave
at the door that refuses to be turned by anyone but you.

Even though this sounds a lot weaker than the FIDO2/PKCS#11 model, TPM2
still brings benefits for securing your systems: because the
cryptographic key material stored in TPM2 devices cannot be extracted
(at least that’s the theory), if you bind your hard disk encryption to
it, it means attackers cannot just copy your disk and analyze it
offline — they always need access to the TPM2 chip too to have a
chance to acquire the necessary cryptographic keys. Thus, they can
still steal your whole PC and analyze it, but they cannot just copy
the disk without you noticing and analyze the copy.

Moreover, you can bind the ability to unlock the harddisk to specific
software versions: for example you could say that only your trusted
Fedora Linux can unlock the device, but not any arbitrary OS some
hacker might boot from a USB stick they plugged in. Thus, if you trust
your OS vendor, you can entrust storage unlocking to the vendor’s OS
together with your TPM2 device, and thus can be reasonably sure
intruders cannot decrypt your data unless they both hack your OS
vendor and steal/break your TPM2 chip.

Here’s how you enroll your LUKS2 volume with your TPM2 chip:

# systemd-cryptenroll --tpm2-device=auto --tpm2-pcrs=7 /dev/sda5

This looks almost as straightforward as the two earlier
systemd-cryptenroll command lines, if it weren’t for the
--tpm2-pcrs= part. With that option you can specify to which TPM2
PCRs you want to bind the enrollment. TPM2 PCRs are a set of
(typically 24) hash values that every TPM2 equipped system at boot
calculates from all the software that is invoked during the boot
sequence, in a secure, unfakable way (this is called
“measurement”). If you bind unlocking to a specific value of a
specific PCR you thus require that the system follow the same
sequence of software at boot to re-acquire the disk encryption
key. Sounds complex? Well, that’s because it is.
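A toy model might make measurement a little less abstract. A PCR cannot be written directly; it can only be "extended", meaning the new value is a hash over the old value concatenated with the digest of whatever is being measured. The Python sketch below models just that operation, assuming SHA-256 and an all-zero start value. The function names are made up, and real firmware decides what gets measured into which PCR.

```python
import hashlib

PCR_SIZE = hashlib.sha256().digest_size  # 32 bytes for a SHA-256 PCR bank


def pcr_extend(pcr, data):
    """Return the new PCR value after measuring `data` into `pcr`."""
    measurement = hashlib.sha256(data).digest()
    return hashlib.sha256(pcr + measurement).digest()


def measure_boot(components):
    """Fold a sequence of boot components into a single PCR value."""
    pcr = bytes(PCR_SIZE)  # assumption: the PCR starts out as all zeroes
    for component in components:
        pcr = pcr_extend(pcr, component)
    return pcr
```

Because each step hashes over the previous value, the final PCR only matches if the same components were measured in the same order. Swap the kernel or the boot loader and the value differs, which is what lets the TPM2 refuse to release the key.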

For now, let’s see how we have to modify your /etc/crypttab to
unlock via TPM2:

myvolume /dev/sda5 - tpm2-device=auto

This part is easy again: the tpm2-device= option is what tells
systemd-cryptsetup to use the TPM2 metadata from the LUKS2 header
and to wait for the TPM2 device to show up.

Bonus: Recovery Key Enrollment

FIDO2, PKCS#11 and TPM2 security tokens and chips pair well with
recovery keys: since you don’t need to type in your password every day
anymore it makes sense to get rid of it, and instead enroll a
high-entropy recovery key you then print out or scan off screen and
store in a safe, physical location. I.e. forget about good ol’
passphrase-based unlocking, go for FIDO2 plus recovery key instead!
Here’s how you do it:

# systemd-cryptenroll --recovery-key /dev/sda5

This will generate a key, enroll it in the LUKS2 volume, show it to
you on screen and generate a QR code you may scan off screen if you
like. The key has high entropy, and can be entered wherever you can
enter a passphrase. Because of that you don’t have to modify
/etc/crypttab to make the recovery key work.
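To illustrate what "high entropy but easy to type" can look like, here is a small Python sketch that generates such a key. It is not systemd-cryptenroll's implementation: the 256 bit size, the grouping into blocks of eight and the modhex-style alphabet (letters that sit in the same place on many keyboard layouts) are assumptions chosen for the example.

```python
import secrets

# Assumption: a 16-letter "modhex"-style alphabet, one letter per 4 bits
MODHEX = "cbdefghijklnrtuv"


def make_recovery_key(n_bytes=32, group=8):
    """Generate a random key and render it as dash-separated letter groups."""
    raw = secrets.token_bytes(n_bytes)  # cryptographically secure randomness
    chars = "".join(MODHEX[b >> 4] + MODHEX[b & 0x0F] for b in raw)
    return "-".join(chars[i:i + group] for i in range(0, len(chars), group))
```

With the defaults this yields eight groups of eight letters, carrying 256 bits of randomness, which is far more than a typical user-chosen passphrase.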


There’s still plenty of room for further improvement in all of this. In
particular for the TPM2 case: what the text above doesn’t really
mention is that binding your encrypted volume unlocking to specific
software versions (i.e. kernel + initrd + OS versions) actually sucks
hard: if you naively update your system to newer versions you might
lose access to your TPM2 enrolled keys (which isn’t terrible, after
all you did enroll a recovery key — right? — which you then can use
to regain access). To solve this some more integration with
distributions would be necessary: whenever they upgrade the system
they’d have to make sure to enroll the TPM2 again — with the PCR
hashes matching the new version. And whenever they remove an old
version of the system they need to remove the old TPM2
enrollment. Alternatively TPM2 also knows a concept of signed PCR
hash values. In this mode the distro could just ship a set of PCR
signatures which would unlock the TPM2 keys. (But quite frankly I
don’t really see the point: whether you drop in a signature file on
each system update, or enroll a new set of PCR hashes in the LUKS2
header doesn’t make much of a difference). Either way, to make TPM2
enrollment smooth some more integration work with your distribution’s
system update mechanisms needs to happen. And yes, because of this OS
updating complexity the example above — where I referenced your trusty
Fedora Linux — doesn’t actually work IRL (yet? hopefully…). Nothing
updates the enrollment automatically after you initially enrolled it,
hence after the first kernel/initrd update you have to manually
re-enroll things again, and again, and again … after every update.

The TPM2 could also be used for other kinds of key policies, which we
might look into adding later too. For example, Windows uses TPM2 stuff to
allow short (4 digits or so) “PINs” for unlocking the harddisk,
i.e. kind of a low-entropy password you type in. The reason this is
reasonably safe is that in this case the PIN is passed to the TPM2
which enforces that not more than some limited amount of unlock
attempts may be made within some time frame, and that after too many
attempts the PIN is invalidated altogether. This makes dictionary
attacks harder (which would normally be easier given the short length
of the PINs).


(BTW: Yubico sent me two YubiKeys for testing, Nitrokey a Nitrokey
FIDO2, and AuthenTrend three ATKey.Pro tokens, thank you! — That’s why
you see all those references to YubiKey/Nitrokey/AuthenTrend devices
in the text above: it’s the hardware I had to test this with. That
said, I also tested the FIDO2 stuff with a SoloKey I bought, where it
also worked fine. And yes, you!, other vendors!, who might be reading
this, please send me your security tokens for free, too, and I
might test things with them as well. No promises though. And I am not
going to give them back, if you do, sorry. ;-))

ASG! 2019 CfP Re-Opened!

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/asg-2019-cfp-re-opened.html

The All Systems Go! 2019 Call for Participation Re-Opened for ONE DAY!

Due to popular request we have re-opened the Call for Participation
(CFP) for All Systems Go! 2019 for one
day. It will close again TODAY, on the 15th of July 2019, midnight
Central European Summer Time! If you missed the deadline so far, we’d
like to invite you to submit your proposals for consideration to the
CFP submission site.
(And yes, this is the last extension, there are not going to be any
more extensions.)

All Systems Go! is everybody’s favourite low-level Userspace Linux
conference, taking place in Berlin, Germany on September 20-22, 2019.

For more information please visit our conference website.

Walkthrough for Portable Services in Go

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/walkthrough-for-portable-services-in-go.html

Portable Services Walkthrough (Go Edition)

A few months ago I posted a blog story with a walkthrough of systemd
Portable Services. The
example service given was written in C, and the image was built with
mkosi. In this blog story I’d
like to revisit the exercise, but this time focus on a different
aspect: modern programming languages like Go and Rust push users a lot
more towards static linking of libraries than the usual dynamic
linking preferred by C (at least in the way C is used by traditional
Linux distributions).

Static linking means we can greatly simplify image building: if we
don’t have to link against shared libraries during runtime we don’t
have to include them in the portable service image. And that means
pretty much all need for building an image from a Linux distribution
of some kind goes away as we’ll have next to no dependencies that
would require us to rely on a distribution package manager or
distribution packages. In fact, as it turns out, we only need as few
as three files in the portable service image to be fully functional.

So, let’s have a closer look at how such an image can be put
together. All of the following is available in this git
repository.

A Simple Go Service

Let’s start with a simple Go service, an HTTP service that simply
counts how often a page from it is requested. Here are the sources:
— note that I am not a seasoned Go programmer, hence please be

The service implements systemd’s socket activation protocol, and thus
can receive bound TCP listener sockets from systemd, using the
$LISTEN_PID and $LISTEN_FDS environment variables.
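The receiving side of that protocol is small enough to sketch. The version below is in Python rather than Go, purely for illustration, and the function name listen_fds() is made up. What it relies on is part of the protocol: passed sockets start at fd 3, $LISTEN_FDS says how many there are, and $LISTEN_PID must match the PID of the receiving process.

```python
import os

SD_LISTEN_FDS_START = 3  # first passed fd; 0, 1 and 2 are stdin/stdout/stderr


def listen_fds(environ=None, pid=None):
    """Return the fd numbers passed in via socket activation, if any."""
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    try:
        # The variables are only meant for us if $LISTEN_PID is our own PID
        if int(environ.get("LISTEN_PID", "")) != pid:
            return []
        count = int(environ.get("LISTEN_FDS", ""))
    except ValueError:
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))
```

A service would then wrap each returned fd in a socket object (in Python, socket.socket(fileno=fd)) and start accepting connections on it, instead of binding a listening socket of its own.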

The service will store the counter data in the directory indicated in
the $STATE_DIRECTORY environment variable, which happens to be an
environment variable current systemd versions set based on the
StateDirectory= setting in service files.
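As a rough idea of how a service can keep a counter there, consider the following Python sketch (the walkthrough's real service is written in Go, and the function name and the file name "counter" are invented here). It assumes $STATE_DIRECTORY holds a single path, which is the case when only one directory is configured.

```python
import os
from pathlib import Path


def bump_counter(state_dir=None, name="counter"):
    """Increment a counter file below the state directory and return the new value."""
    state_dir = state_dir or os.environ["STATE_DIRECTORY"]
    path = Path(state_dir) / name
    try:
        count = int(path.read_text())
    except (FileNotFoundError, ValueError):
        count = 0  # first request, or an unreadable counter file
    count += 1
    # Write to a temporary file and rename, so readers never see a partial file
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(str(count))
    os.replace(tmp, path)
    return count
```

Since systemd creates the directory and sets the variable before the service starts, the service itself never needs to know where on the host its state ends up.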

Two Simple Unit Files

When a service shall be managed by systemd a unit file is
required. Since the service we are putting together shall be socket
activatable, we even have two:
portable-walkthrough-go.service
(the description of the service binary itself) and
portable-walkthrough-go.socket
(the description of the sockets to listen on for the service).

These units are not particularly remarkable: the .service file
primarily contains the command line to invoke and a StateDirectory=
setting to make sure the service when invoked gets its own private
state directory under /var/lib/ (and the $STATE_DIRECTORY
environment variable is set to the resulting path). The .socket file
simply lists 8088 as TCP/IP port to listen on.

An OS Description File

OS images (and that includes portable service images) generally should
include an os-release file. Usually, that is provided by the
distribution. Since we are building an image without any distribution
let’s write our own version of such a file. Later on we can use the
portablectl inspect command to have a look at this metadata of our
image.

Putting it All Together

The four files described above are already every file we need to build
our image. Let’s now put the portable service image together. For that
I’ve written a
Makefile. It
contains two relevant rules: the first one builds the static binary
from the Go program sources. The second one then puts together a
squashfs file system combining the following:

  1. The compiled, statically linked service binary
  2. The two systemd unit files
  3. The os-release file
  4. A couple of empty directories such as /proc/, /sys/, /dev/
    and so on that need to be over-mounted with the respective kernel
    API file system. We need to create them as empty directories here
    since Linux insists on directories to exist in order to over-mount
    them, and since the image we are building is going to be an
    immutable read-only image (squashfs) these directories cannot be
    created dynamically when the portable image is mounted.
  5. Two empty files /etc/resolv.conf and /etc/machine-id that can
    be over-mounted with the same files from the host.
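The list above can be turned into a staging step along the following lines. The walkthrough itself does this in a Makefile; this Python sketch only illustrates the resulting tree. The target paths (usr/bin, usr/lib/systemd/system, usr/lib/os-release), the directories beyond /proc/, /sys/ and /dev/, and the function name are assumptions made for the example.

```python
import shutil
from pathlib import Path

# Empty directories for kernel API file systems and friends (partly assumed)
MOUNT_POINT_DIRS = ["proc", "sys", "dev", "run", "tmp", "var/tmp"]
# Empty files that get over-mounted with the host's versions
BIND_MOUNT_FILES = ["etc/resolv.conf", "etc/machine-id"]


def stage_image_tree(root, binary, unit_files, os_release):
    """Lay out the files of a minimal portable service image below `root`."""
    root = Path(root)
    bin_dir = root / "usr/bin"
    unit_dir = root / "usr/lib/systemd/system"
    mount_dirs = [root / name for name in MOUNT_POINT_DIRS]
    for directory in [bin_dir, unit_dir, root / "etc"] + mount_dirs:
        directory.mkdir(parents=True, exist_ok=True)
    shutil.copy2(binary, bin_dir / Path(binary).name)
    for unit in unit_files:
        shutil.copy2(unit, unit_dir / Path(unit).name)
    shutil.copy2(os_release, root / "usr/lib/os-release")
    for name in BIND_MOUNT_FILES:
        (root / name).touch()
    return root
```

The staged directory would then be packed into a squashfs file system (for example with mksquashfs) to obtain the final image.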

And that’s already it. After a quick make we’ll have our portable
service image portable-walkthrough-go.raw and are ready to go.

Trying it out

Let’s now attach the portable service image to our host system:

# portablectl attach ./portable-walkthrough-go.raw
(Matching unit files with prefix 'portable-walkthrough-go'.)
Created directory /etc/systemd/system.attached.
Created directory /etc/systemd/system.attached/portable-walkthrough-go.socket.d.
Written /etc/systemd/system.attached/portable-walkthrough-go.socket.d/20-portable.conf.
Copied /etc/systemd/system.attached/portable-walkthrough-go.socket.
Created directory /etc/systemd/system.attached/portable-walkthrough-go.service.d.
Written /etc/systemd/system.attached/portable-walkthrough-go.service.d/20-portable.conf.
Created symlink /etc/systemd/system.attached/portable-walkthrough-go.service.d/10-profile.conf → /usr/lib/systemd/portable/profile/default/service.conf.
Copied /etc/systemd/system.attached/portable-walkthrough-go.service.
Created symlink /etc/portables/portable-walkthrough-go.raw → /home/lennart/projects/portable-walkthrough-go/portable-walkthrough-go.raw.

The portable service image is now attached to the host, which means we
can now go and start it (or even enable it):

# systemctl start portable-walkthrough-go.socket

Let’s see if our little web service works, by doing an HTTP request on port 8088:

# curl localhost:8088
Hello! You are visitor #1!

Let’s try this again, to check if it counts correctly:

# curl localhost:8088
Hello! You are visitor #2!

Nice! It worked. Let’s now stop the service again, and detach the image again:

# systemctl stop portable-walkthrough-go.service portable-walkthrough-go.socket
# portablectl detach portable-walkthrough-go
Removed /etc/systemd/system.attached/portable-walkthrough-go.service.
Removed /etc/systemd/system.attached/portable-walkthrough-go.service.d/10-profile.conf.
Removed /etc/systemd/system.attached/portable-walkthrough-go.service.d/20-portable.conf.
Removed /etc/systemd/system.attached/portable-walkthrough-go.service.d.
Removed /etc/systemd/system.attached/portable-walkthrough-go.socket.
Removed /etc/systemd/system.attached/portable-walkthrough-go.socket.d/20-portable.conf.
Removed /etc/systemd/system.attached/portable-walkthrough-go.socket.d.
Removed /etc/portables/portable-walkthrough-go.raw.
Removed /etc/systemd/system.attached.

And there we go, the portable image file is detached from the host again.

A Couple of Notes

  1. Of course, this is a simplistic example: in real life services will
    be more than one compiled file, even when statically linked. But
    you get the idea, and it’s very easy to extend the example above to
    include any additional, auxiliary files in the portable service
    image.

  2. The service is very nicely sandboxed during runtime: while it runs
    as regular service on the host (and you thus can watch its logs or
    do resource management on it like you would do for all other
    systemd services), it runs in a very restricted environment under a
    dynamically assigned UID that ceases to exist when the service is
    stopped again.

  3. Originally I wanted to make the service not only socket activatable
    but also implement exit-on-idle, i.e. add a logic so that the
    service terminates on its own when there’s no ongoing HTTP
    connection for a while. I couldn’t figure out how to do this
    race-freely in Go though, but I am sure an interested reader might
    want to add that? By combining socket activation with exit-on-idle
    we can turn this project into an exercise of putting together an
    extremely resource-friendly and robust service architecture: the
    service is started only when needed and terminates when no longer
    needed. This would allow packing services at a much higher density
    even on systems with few resources.

  4. While the basic concepts of portable services have been around
    since systemd 239, it’s best to try the above with systemd 241 or
    newer since the portable service logic received a number of fixes
    since then.

Further Reading

A low-level document introducing Portable Services is shipped along
with systemd.

Please have a look at the blog story from a few months ago
that did something very similar with a service written in C.

There are also relevant manual pages:

Brand-new books from The MagPi and HackSpace magazine

Post Syndicated from Rob Zwetsloot original https://www.raspberrypi.org/blog/book-of-making-1-magpi-projects-book-4/

Hey folks, Rob from The MagPi here! Halloween is over and November has just begun, which means CHRISTMAS IS ALMOST HERE! It’s never too early to think about Christmas — I start in September, the moment mince pies hit shelves.


What most people seem to dread about Christmas is finding the right gifts, so I’m here to help you out. We’ve just released two new books: our Official Raspberry Pi Projects Book volume 4, and the brand-new Book of Making volume 1 from the team at HackSpace magazine!

Book of Making volume 1

Spoiler alert: it’s a book full of making

The Book of Making volume 1 contains 50 of the very best projects from HackSpace magazine, including awesome project showcases and amazing guides for building your own incredible creations. Expect to encounter trebuchets, custom drones, a homemade tandoori oven, and much more! And yes, there are some choice Raspberry Pi projects as well.

The Official Raspberry Pi Projects Book volume 4

More projects, more guides, and more reviews!

Volume 4 of the Official Raspberry Pi Projects Book is once again jam-packed with Raspberry Pi goodness in its 200 pages, with projects, build guides, reviews, and a little refresher for beginners to the world of Raspberry Pi. Whether you’re new to Pi or have every single model, there’s something in there for you, no matter your skill level.

Free shipping? Worldwide??

You can buy the Book of Making and the Official Raspberry Pi Projects Book volume 4 right now from the Raspberry Pi Press Store, and here’s the best part: they both have free worldwide shipping! They also roll up pretty neatly, in case you want to slot them into someone’s Christmas stocking. And you can also find them at our usual newsagents.

Both books are available as free PDF downloads, so you can try before you buy. When you purchase any of our publications, you contribute toward the hard work of the Raspberry Pi Foundation, so why not double your giving this holiday season by helping us put the power of digital making into the hands of people all over the world?

Anyway, that’s it for now — I’m off for more mince pies!

The post Brand-new books from The MagPi and HackSpace magazine appeared first on Raspberry Pi.

ASG! 2018 Tickets

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/asg-2018-tickets.html

All Systems Go! 2018 Tickets Selling Out Quickly!

Buy your tickets for All Systems Go! 2018
soon; they are quickly selling out!
The conference takes place on September 28-30, in Berlin, Germany, in
a bit over two weeks.

Why should you attend? If you are interested in low-level Linux
userspace, then All Systems Go! is the right conference for you. It
covers all topics relevant to foundational open-source Linux
technologies. For details on the covered topics see our schedule for day #1
and for day #2.

For more information please visit our conference website.

See you in Berlin!

ASG! 2018 CfP Closes TODAY

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/asg-2018-cfp-closes-today.html

The All Systems Go! 2018 Call for Participation Closes TODAY!

The Call for Participation (CFP) for All Systems Go! will close TODAY,
on 30th of July! We’d like to invite you to submit your proposals for
consideration to the CFP submission site.

All Systems Go! is everybody’s favourite low-level Userspace Linux
conference, taking place in Berlin, Germany on September 28-30, 2018.

For more information please visit our conference website.

ASG! 2018 CfP Closes Soon

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/asg-2018-cfp-closes-soon.html

The All Systems Go! 2018 Call for Participation Closes in One Week!

The Call for Participation (CFP) for All Systems Go! will close in one
week, on 30th of July! We’d like to invite you to submit your
proposals for consideration to the CFP submission site.

Notification of acceptance and non-acceptance will go out within 7
days of the closing of the CFP.

All topics relevant to foundational open-source Linux technologies are
welcome. In particular, however, we are looking for proposals
including, but not limited to, the following topics:

  • Low-level container executors and infrastructure
  • IoT and embedded OS infrastructure
  • BPF and eBPF filtering
  • OS, container, IoT image delivery and updating
  • Building Linux devices and applications
  • Low-level desktop technologies
  • Networking
  • System and service management
  • Tracing and performance measuring
  • IPC and RPC systems
  • Security and Sandboxing

While our focus is definitely more on the user-space side of things,
talks about kernel projects are welcome, as long as they have a clear
and direct relevance for user-space.

For more information please visit our conference website.

Walkthrough for Portable Services

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/walkthrough-for-portable-services.html

Portable Services with systemd v239


systemd v239 contains a great number of new features. One of them is
first-class support for Portable Services. In this blog story I’d like
to shed some light on what they are and why they might be interesting
for your application.

What are “Portable Services”?

The “Portable Service” concept takes inspiration from classic
chroot() environments as well as container management and brings a
number of their features to more regular system service management.

While the definition of what a “container” really is is hotly debated,
I figure people can generally agree that the “container” concept
primarily provides two major features:

  1. Resource bundling: a container generally brings its own file system
    tree along, bundling any shared libraries and other resources it
    might need along with the main service executables.

  2. Isolation and sand-boxing: a container operates in a name-spaced
    environment that is relatively detached from the host. Besides
    living in its own file system namespace it usually also has its own
    user database, process tree and so on. Access from the container to
    the host is limited with various security technologies.

Of these two concepts the first one is also what traditional UNIX
chroot() environments are about.

Both resource bundling and isolation/sand-boxing are concepts systemd
has implemented to varying degrees for a long time. Specifically,
RootDirectory= and RootImage= have been around for a long time, and so
have the various sand-boxing features systemd provides. The Portable
Services concept builds on that, putting these features together in a
new, integrated way to make them more accessible and usable.

OK, so what precisely is a “Portable Service”?

Much like a container image, a portable service on disk can be just a
directory tree that contains service executables and all their
dependencies, in a hierarchy resembling the normal Linux directory
hierarchy. A portable service can also be a raw disk image, containing
a file system containing such a tree (which can be mounted via a
loop-back block device), or multiple file systems (in which case they
need to follow the Discoverable Partitions Specification and be
located within a GPT partition table). Regardless of whether the
portable service on disk is a simple directory tree or a raw disk
image, let’s call this concept the portable service image.

Such images can be generated with any tool typically used for the
purpose of installing OSes inside some directory, for example dnf
or debootstrap. There are very few requirements made
on these trees, except the following two:

  1. The tree should carry systemd unit files for relevant services
    in them.

  2. The tree should carry /usr/lib/os-release (or /etc/os-release)
    OS release information.

Of course, as you might notice, OS trees generated from any of today’s
big distributions generally qualify for these two requirements without
any further modification, as pretty much all of them adopted
/usr/lib/os-release and tend to ship their major services with
systemd unit files.
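
The two requirements can be checked mechanically. The following is a minimal sketch in Python, not part of systemd: the function names and the list of unit directories it scans are assumptions made for illustration only.

```python
from pathlib import Path

# Assumption: directories inside the tree where unit files are looked for.
UNIT_DIRS = ("usr/lib/systemd/system", "etc/systemd/system")


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in single or double quotes.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        fields[key] = value
    return fields


def check_tree(root):
    """Return (os_release_fields, service_units) or raise ValueError."""
    root = Path(root)
    for candidate in ("usr/lib/os-release", "etc/os-release"):
        path = root / candidate
        if path.is_file():
            fields = parse_os_release(path.read_text())
            break
    else:
        raise ValueError("no os-release file found in tree")

    units = []
    for unit_dir in UNIT_DIRS:
        directory = root / unit_dir
        if directory.is_dir():
            units += sorted(p.name for p in directory.iterdir()
                            if p.suffix == ".service")
    if not units:
        raise ValueError("no service unit files found in tree")
    return fields, units
```

Calling check_tree() on a freshly generated OS tree either returns the parsed os-release fields and the service units found, or raises if one of the two requirements is not met.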

A portable service image generated like this can be “attached” or
“detached” from a host:

  1. “Attaching” an image to a host is done through the new
    portablectl attach command. This command dissects the image,
    reading the os-release information, and searching for unit files
    in it. It then copies relevant unit files out of the image and into
    /etc/systemd/system/. After that it augments any copied service
    unit files in two ways: a drop-in adding a RootDirectory= or
    RootImage= line is added in so that even though the unit files
    are now available on the host when started they run the referenced
    binaries from the image. It also symlinks in a second drop-in which
    is called a “profile”, which is supposed to carry additional
    security settings to enforce on the attached services, to ensure
    the right amount of sand-boxing.

  2. “Detaching” an image from the host is done through portablectl
    detach. It reverses the steps above: the unit files copied out are
    removed again, and so are the two drop-in files generated for them.
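
To make the first drop-in more concrete, here is a minimal sketch in Python of what such a file could contain. The function name is hypothetical, and the rule "a .raw suffix means RootImage=, anything else means RootDirectory=" is an assumption derived from the description above; the drop-in written by the real tool may carry further settings.

```python
def portable_dropin(image_path):
    """Render a minimal drop-in that points a service at its image.

    Assumption: raw disk images end in ".raw"; everything else is
    treated as a plain directory tree.
    """
    if image_path.endswith(".raw"):
        key = "RootImage"
    else:
        key = "RootDirectory"
    return f"[Service]\n{key}={image_path}\n"
```

For a directory tree the sketch emits a RootDirectory= line, for a raw image a RootImage= line, which is what makes the copied unit run its binaries from inside the image.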

While a portable service is attached its relevant unit files are made
available on the host like any others: they will appear in systemctl,
you can enable and disable them, you can start them and stop them. You
can extend them with systemctl edit. You can introspect them. You can
apply resource management to them like to any other service, and you
can process their logs like any other service and so on. That’s
because they really are native systemd services, except that they have
a ‘twist’, if you will: they have tougher security by default and
store their resources in a root directory or image.
And that’s already the essence of what Portable Services are.

A couple of interesting points:

  1. Even though the focus is on shipping service unit files in
    portable service images, you can actually ship timer units, socket
    units, target units, path units in portable services too. This
    means you can very naturally do time, socket and path based
    activation. It’s also entirely fine to ship multiple service units
    in the same image, in case you have more complex applications.

  2. This concept introduces zero new metadata. Unit files are an
    existing concept, as are os-release files, and — in case you opt
    for raw disk images — GPT partition tables are already established
    too. This also means existing tools to generate images can be
    reused for building portable service images to a large degree as no
    completely new artifact types need to be generated.

  3. Because the Portable Service concept introduces zero new metadata
    and just builds on existing security and resource bundling
    features of systemd it’s implemented in a set of distinct tools,
    relatively disconnected from the rest of systemd. Specifically, the
    main user-facing command is portablectl and the actual operations
    are implemented in systemd-portabled.service. If
    you so will, portable services are a true add-on to systemd, just
    making a specific work-flow nicer to use than with the basic
    operations systemd otherwise provides. Also note that
    systemd-portabled provides bus APIs accessible to any program
    that wants to interface with it, portablectl is just one tool
    that happens to be shipped along with systemd.

  4. Since Portable Services are a feature we only added very recently
    we wanted to keep some freedom to make changes still. Due to that
    we decided to install the portablectl command into
    /usr/lib/systemd/ for now, so that it does not appear in $PATH
    by default. This means, for now you have to invoke it with a full
    path: /usr/lib/systemd/portablectl. We expect to move it into
    /usr/bin/ very soon though, and make it a fully supported
    interface of systemd.

  5. You may wonder which unit files contained in a portable service
    image are the ones considered “relevant” and are actually copied
    out by the portablectl attach operation. Currently, this is
    derived from the image name. Let’s say you have an image stored in
    a directory /var/lib/portables/foobar_4711/ (or alternatively in
    a raw image /var/lib/portables/foobar_4711.raw). In that case the
    unit files copied out match the pattern foobar*.service,
    foobar*.socket, foobar*.target, foobar*.path,

  6. The Portable Services concept does not define any specific method
    how images get on the deployment machines, that’s entirely up to
    administrators. You can just scp them there, or wget them. You
    could even package them as RPMs and then deploy them with dnf if
    you feel adventurous.

  7. Portable service images can reside in any directory you
    like. However, if you place them in /var/lib/portables/ then
    portablectl will find them easily and can show you a list of
    images you can attach and suchlike.

  8. Attaching a portable service image can be done persistently, so
    that it remains attached on subsequent boots (which is the default),
    or it can be attached only until the next reboot, by passing
    --runtime to portablectl.

  9. Because portable service images are ultimately just regular OS
    images, it’s natural and easy to build a single image that can be
    used in three different ways:

    1. It can be attached to any host as a portable service image.

    2. It can be booted as OS container, for example in a container
      manager like systemd-nspawn.

    3. It can be booted as host system, for example on bare metal or
      in a VM manager.

    Of course, to qualify for the latter two the image needs to
    contain more than just the service binaries, the os-release file
    and the unit files. To be bootable in an OS container manager such as
    systemd-nspawn the image needs to contain an init system of some
    form, for example
    systemd. To
    be bootable on bare metal or as VM it also needs a boot loader of
    some form, for example


In the previous section the “profile” concept was briefly
mentioned. Since they are a major feature of the Portable Services
concept, they deserve some focus. A “profile” is ultimately just a
pre-defined drop-in file for unit files that are attached to a
host. They are supposed to mostly contain sand-boxing and security
settings, but may actually contain any other settings, too. When a
portable service is attached a suitable profile has to be selected. If
none is selected explicitly, the default profile called default is
used. systemd ships with four different profiles out of the box:

  1. The default profile provides a medium level of security. It
    contains settings to drop capabilities, enforce system call
    filters, restrict many kernel interfaces and mount various file
    systems read-only.

  2. The strict profile is similar to the default profile, but
    generally uses the most restrictive sand-boxing settings. For
    example networking is turned off and access to AF_NETLINK sockets
    is prohibited.

  3. The trusted profile is the least strict of them all. In fact it
    makes almost no restrictions at all. A service run with this
    profile has basically full access to the host system.

  4. The nonetwork profile is mostly identical to default, but also
    turns off network access.

Note that the profile is selected at the time the portable service
image is attached, and it applies to all service files attached, in
case multiple are shipped in the same image. Thus, the sand-boxing
restrictions to enforce are selected by the administrator attaching the
image and not the image vendor.

Additional profiles can be defined easily by the administrator, if
needed. We might also add additional profiles sooner or later to be
shipped with systemd out of the box.
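
Since a profile is just a drop-in, an administrator-defined one is easy to generate. The sketch below is a hypothetical helper in Python, not anything systemd ships: the profile contents are an illustrative choice of existing systemd.exec settings, and where a custom profile has to be installed is not covered in this post.

```python
def render_profile(settings):
    """Render a [Service] drop-in from a mapping of directive -> value."""
    lines = ["[Service]"]
    for directive, value in settings.items():
        lines.append(f"{directive}={value}")
    return "\n".join(lines) + "\n"


# Hypothetical site-specific profile; the selection of settings is an
# assumption for illustration, not one of the shipped profiles.
LOCKED_DOWN = {
    "ProtectSystem": "strict",
    "ProtectHome": "yes",
    "PrivateNetwork": "yes",
    "NoNewPrivileges": "yes",
}
```

render_profile(LOCKED_DOWN) yields the text of a drop-in that could serve as the service.conf of such a profile.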

What’s the use-case for this? If I have containers, why should I bother?

Portable Services are primarily intended to cover use-cases where code
should feel more like “extensions” to the host system rather than live
in disconnected, separate worlds. The profile concept is
supposed to be tunable to the exact right amount of integration or
isolation needed for an application.

In the container world the concept of “super-privileged containers”
has been touted a lot, i.e. containers that run with full
privileges. It’s precisely that use-case that portable services are
intended for: extensions to the host OS, that default to isolation,
but can optionally get as much access to the host as needed, and can
naturally take benefit of the full functionality of the host. The
concept should hence be useful for all kinds of low-level system
software that isn’t shipped with the OS itself but needs varying
degrees of integration with it. Besides servers and appliances this
should be particularly interesting for IoT and embedded devices.

Because portable services are just a relatively small extension to the
way system services are otherwise managed, they can be treated like
regular services for almost all use-cases: they will appear alongside
regular services in all tools that can introspect systemd unit data,
and can be managed the same way when it comes to logging, resource
management, runtime life-cycles and so on.

Portable services are a very generic concept. While the original
use-case is OS extensions, it’s of course entirely up to you and other
users to use them in a suitable way of your choice.


Let’s have a look how this all can be used. We’ll start with building
a portable service image from scratch, before we attach, enable and
start it on a host.

Building a Portable Service image

As mentioned, you can use any tool you like that can create OS trees
or raw images for building Portable Service images, for example
debootstrap or dnf --installroot=. For this example walkthrough
run we’ll use mkosi, which is
ultimately just a fancy wrapper around dnf and debootstrap but
makes a number of things particularly easy when repetitively building
images from source trees.

I have pushed everything necessary to reproduce this walkthrough
locally to a GitHub repository. Let’s check it out:

$ git clone https://github.com/systemd/portable-walkthrough.git

Let’s have a look in the repository:

  1. First of all, walkthroughd.c
    is the main source file of our little service. To keep things
    simple it’s written in C, but it could be in any language of your
    choice. The daemon as implemented won’t do much: it just starts up
    and waits for SIGTERM, at which point it will shut down. It’s
    ultimately useless, but hopefully illustrates how this all fits
    together. The C code has no dependencies besides libc.

  2. walkthroughd.service
    is a systemd unit file that starts our little daemon. It’s a simple
    service, hence the unit file is trivial.

  3. Makefile
    is a short make build script to build the daemon binary. It’s
    pretty trivial, too: it just takes the C file and builds a binary
    from it. It can also install the daemon. It places the binary in
    /usr/local/lib/walkthroughd/walkthroughd (why not in
    /usr/local/bin? because it’s not a user-facing binary but a system
    service binary), and its unit file in
    /usr/local/lib/systemd/system/walkthroughd.service. If you want to
    test the daemon on the host you can now simply run make and then
    ./walkthroughd in order to check everything works.

  4. mkosi.default
    is a file that tells mkosi how to build the image. We opt for a
    Fedora-based image here (but we might as well have used Debian
    here, or any other supported distribution). We need no particular
    packages during runtime (after all we only depend on libc), but
    during the build phase we need gcc and make, hence these are the
    only packages we list in BuildPackages=.

  5. mkosi.build
    is a shell script that is invoked during mkosi’s build logic. All
    it does is invoke make and make install to build and install
    our little daemon, and afterwards it extends the
    distribution-supplied /etc/os-release file with an additional
    field that describes our portable service a bit.
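
For readers who want a feel for what the daemon from item 1 does without reading the C source, here is a rough Python analogue. It is not the code from the repository, just a sketch of the same behaviour: start up, wait for SIGTERM, shut down. The "Shutting down." message is an assumption.

```python
import signal
import threading

# Set by the signal handler once SIGTERM arrives.
stop = threading.Event()


def handle_sigterm(signum, frame):
    """Signal handler: request an orderly shutdown."""
    stop.set()


def main():
    # Must be called from the main thread, as required by signal.signal().
    signal.signal(signal.SIGTERM, handle_sigterm)
    print("Initializing.", flush=True)
    stop.wait()  # block until SIGTERM has been received
    print("Shutting down.", flush=True)


if __name__ == "__main__":
    main()
```

Run it in one terminal and send it SIGTERM with kill from another to see it exit cleanly, which is also how systemd stops a service by default.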

Let’s now use this to build the portable service image. For that we
use the mkosi tool. It’s sufficient to invoke it without parameters to
build the first image: it will automatically discover mkosi.default
and mkosi.build which tell it what to do. (Note that if you work on a
project like this for a longer time, mkosi -if is probably the better
command to use, as it speeds up building substantially by using an
incremental build mode). mkosi will download the necessary RPMs, and
put them all together. It will build our little daemon inside the
image and after all that’s done it will output the resulting image:
walkthroughd_1.raw.

Because we opted to build a GPT raw disk image in mkosi.default this
file is actually a raw disk image containing a GPT partition
table. You can use fdisk -l walkthroughd_1.raw to enumerate the
partition table. You can also use systemd-nspawn -i walkthroughd_1.raw
to explore the image quickly if you need.
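
If you want to verify programmatically that a file is a GPT disk image, a quick check is to look for the GPT header signature. The sketch below is a hypothetical Python helper and assumes 512-byte logical sectors, so that the header in LBA 1 starts at byte offset 512.

```python
def has_gpt_header(path, sector_size=512):
    """Return True if the file carries a GPT header signature in LBA 1.

    Assumption: the image uses `sector_size`-byte logical sectors.
    """
    with open(path, "rb") as image:
        image.seek(sector_size)  # skip the protective MBR in LBA 0
        return image.read(8) == b"EFI PART"
```

This only checks the signature; it does not validate the header checksum or parse the partition entries the way fdisk does.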

Using the Portable Service Image

Now that we have a portable service image, let’s see how we can
attach, enable and start the service included within it.

First, let’s attach the image:

# /usr/lib/systemd/portablectl attach ./walkthroughd_1.raw
(Matching unit files with prefix 'walkthroughd'.)
Created directory /etc/systemd/system/walkthroughd.service.d.
Written /etc/systemd/system/walkthroughd.service.d/20-portable.conf.
Created symlink /etc/systemd/system/walkthroughd.service.d/10-profile.conf → /usr/lib/systemd/portable/profile/default/service.conf.
Copied /etc/systemd/system/walkthroughd.service.
Created symlink /etc/portables/walkthroughd_1.raw → /home/lennart/projects/portable-walkthrough/walkthroughd_1.raw.

The command will show you exactly what it has been doing: it just
copied the main service file out, and added the two drop-ins, as
described above.
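
The "Matching unit files with prefix" line in the output reflects the name-based selection described earlier. A minimal sketch of that selection in Python follows; the helper names are hypothetical, and the rule that the prefix is the image name up to the first underscore is an assumption drawn from the foobar_4711 example, not a statement about the actual implementation.

```python
import fnmatch
import os

# Only the suffixes named in the text above; the list there is cut off,
# so further unit types may be matched by the real tool.
UNIT_SUFFIXES = (".service", ".socket", ".target", ".path")


def image_prefix(image_path):
    """Derive the unit name prefix from an image path (assumed rule)."""
    name = os.path.basename(image_path.rstrip("/"))
    if name.endswith(".raw"):
        name = name[: -len(".raw")]
    return name.split("_", 1)[0]


def matching_units(image_path, unit_names):
    """Return the unit names that would be considered relevant."""
    prefix = image_prefix(image_path)
    patterns = [f"{prefix}*{suffix}" for suffix in UNIT_SUFFIXES]
    return [unit for unit in unit_names
            if any(fnmatch.fnmatch(unit, pattern) for pattern in patterns)]
```

With this rule, an image called walkthroughd_1.raw selects walkthroughd.service but leaves unrelated units shipped in the same OS tree alone.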

Let’s see if the unit is now available on the host, just like a regular unit, as promised:

# systemctl status walkthroughd.service
● walkthroughd.service - A simple example service
   Loaded: loaded (/etc/systemd/system/walkthroughd.service; disabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/walkthroughd.service.d
           └─10-profile.conf, 20-portable.conf
   Active: inactive (dead)

Nice, it worked. We see that the unit file is available and that
systemd correctly discovered the two drop-ins. The unit is neither
enabled nor started however. Yes, attaching a portable service image
doesn’t imply enabling or starting. It just means the unit files
contained in the image are made available to the host. It’s up to the
administrator to then enable them (so that they are automatically
started when needed, for example at boot), and/or start them (in case
they shall run right-away).

Let’s now enable and start the service in one step:

# systemctl enable --now walkthroughd.service
Created symlink /etc/systemd/system/multi-user.target.wants/walkthroughd.service → /etc/systemd/system/walkthroughd.service.

Let’s check if it’s running:

# systemctl status walkthroughd.service
● walkthroughd.service - A simple example service
   Loaded: loaded (/etc/systemd/system/walkthroughd.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/walkthroughd.service.d
           └─10-profile.conf, 20-portable.conf
   Active: active (running) since Wed 2018-06-27 17:55:30 CEST; 4s ago
 Main PID: 45003 (walkthroughd)
    Tasks: 1 (limit: 4915)
   Memory: 4.3M
   CGroup: /system.slice/walkthroughd.service
           └─45003 /usr/local/lib/walkthroughd/walkthroughd

Jun 27 17:55:30 sigma walkthroughd[45003]: Initializing.

Perfect! We can see that the service is now enabled and running. The daemon is running as PID 45003.

Now that we verified that all is good, let’s stop, disable and detach the service again:

# systemctl disable --now walkthroughd.service
Removed /etc/systemd/system/multi-user.target.wants/walkthroughd.service.
# /usr/lib/systemd/portablectl detach ./walkthroughd_1.raw
Removed /etc/systemd/system/walkthroughd.service.
Removed /etc/systemd/system/walkthroughd.service.d/10-profile.conf.
Removed /etc/systemd/system/walkthroughd.service.d/20-portable.conf.
Removed /etc/systemd/system/walkthroughd.service.d.
Removed /etc/portables/walkthroughd_1.raw.

And finally, let’s see that it’s really gone:

# systemctl status walkthroughd
Unit walkthroughd.service could not be found.

Perfect! It worked!

I hope the above gets you started with Portable Services. If you have
further questions, please contact our mailing list.

Further Reading

A more low-level document explaining details is shipped
along with systemd.

There are also relevant manual pages:

For further information about mkosi see its homepage.

Microsoft acquires GitHub

Post Syndicated from corbet original https://lwn.net/Articles/756443/rss

Here’s the press release announcing Microsoft’s agreement to acquire
GitHub for a mere $7.5 billion. “GitHub will retain its developer-first
ethos and will operate independently to provide an open platform for all
developers in all industries. Developers will continue to be able to use
the programming languages, tools and operating systems of their choice for
their projects — and will still be able to deploy their code to any
operating system, any cloud and any device.”

Build your own weather station with our new guide!

Post Syndicated from Richard Hayler original https://www.raspberrypi.org/blog/build-your-own-weather-station/

One of the most common enquiries I receive at Pi Towers is “How can I get my hands on a Raspberry Pi Oracle Weather Station?” Now the answer is: “Why not build your own version using our guide?”

Build Your Own weather station kit assembled

Tadaaaa! The BYO weather station fully assembled.

Our Oracle Weather Station

In 2016 we sent out nearly 1000 Raspberry Pi Oracle Weather Station kits to schools from around the world who had applied to be part of our weather station programme. In the original kit was a special HAT that allows the Pi to collect weather data with a set of sensors.

The original Raspberry Pi Oracle Weather Station HAT

We designed the HAT to enable students to create their own weather stations and mount them at their schools. As part of the programme, we also provide an ever-growing range of supporting resources. We’ve seen Oracle Weather Stations in great locations with huge differences in climate, and they’ve even recorded the effects of a solar eclipse.

Our new BYO weather station guide

We only had a single batch of HATs made, and unfortunately we’ve given nearly* all the Weather Station kits away. Not only are the kits really popular, we also receive lots of questions about how to add extra sensors or how to take more precise measurements of a particular weather phenomenon. So today, to satisfy your demand for a hackable weather station, we’re launching our Build your own weather station guide!

Build Your Own Raspberry Pi weather station

Fun with meteorological experiments!

Our guide suggests the use of many of the sensors from the Oracle Weather Station kit, so you can build a station that’s as close as possible to the original. As you know, the Raspberry Pi is incredibly versatile, and we’ve made it easy to hack the design in case you want to use different sensors.

Many other tutorials for Pi-powered weather stations don’t explain how the various sensors work or how to store your data. Ours goes into more detail. It shows you how to put together a breadboard prototype, it describes how to write Python code to take readings in different ways, and it guides you through recording these readings in a database.
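
To give a flavour of the "record readings in a database" step, here is a minimal Python sketch. It is not taken from the guide: it uses SQLite from the standard library as a stand-in, and the table layout, function names, and sensor names are assumptions for illustration.

```python
import sqlite3
import time


def open_db(path=":memory:"):
    """Open (or create) a readings database with one simple table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings ("
        "taken_at REAL NOT NULL, sensor TEXT NOT NULL, value REAL NOT NULL)"
    )
    return conn


def record(conn, sensor, value, taken_at=None):
    """Store one reading; the timestamp defaults to the current time."""
    if taken_at is None:
        taken_at = time.time()
    conn.execute("INSERT INTO readings VALUES (?, ?, ?)",
                 (taken_at, sensor, value))
    conn.commit()


def latest(conn, sensor):
    """Return the most recent value for a sensor, or None if there is none."""
    row = conn.execute(
        "SELECT value FROM readings WHERE sensor = ? "
        "ORDER BY taken_at DESC LIMIT 1",
        (sensor,),
    ).fetchone()
    return None if row is None else row[0]
```

In a real station, record() would be called from the loop that polls each sensor, with a file path instead of the in-memory default so that readings survive a reboot.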

Build Your Own Raspberry Pi weather station on a breadboard

There’s also a section on how to make your station weatherproof. And in case you want to move past the breadboard stage, we also help you with that. The guide shows you how to solder together all the components, similar to the original Oracle Weather Station HAT.

Who should try this build

We think this is a great project to tackle at home, at a STEM club, Scout group, or CoderDojo, and we’re sure that many of you will be chomping at the bit to get started. Before you do, please note that we’ve designed the build to be as straight-forward as possible, but it’s still fairly advanced both in terms of electronics and programming. You should read through the whole guide before purchasing any components.

Build Your Own Raspberry Pi weather station – components

The sensors and components we’re suggesting balance cost, accuracy, and ease of use. Depending on what you want to use your station for, you may wish to use different components. Similarly, the final soldered design in the guide may not be the most elegant, but we think it is achievable for someone with modest soldering experience and basic equipment.

You can build a functioning weather station without soldering with our guide, but the build will be more durable if you do solder it. If you’ve never tried soldering before, that’s OK: we have a Getting started with soldering resource plus video tutorial that will walk you through how it works step by step.

Prototyping HAT for Raspberry Pi weather station sensors

For those of you who are more experienced makers, there are plenty of different ways to put the final build together. We always like to hear about alternative builds, so please post your designs in the Weather Station forum.

Our plans for the guide

Our next step is publishing supplementary guides for adding extra functionality to your weather station. We’d love to hear which enhancements you would most like to see! Our current ideas under development include adding a webcam, making a tweeting weather station, adding a light/UV meter, and incorporating a lightning sensor. Let us know which of these is your favourite, or suggest your own amazing ideas in the comments!

*We do have a very small number of kits reserved for interesting projects or locations: a particularly cool experiment, a novel idea for how the Oracle Weather Station could be used, or places with specific weather phenomena. If you have such a project in mind, please send a brief outline to [email protected], and we’ll consider how we might be able to help you.

The post Build your own weather station with our new guide! appeared first on Raspberry Pi.

Protecting coral reefs with Nemo-Pi, the underwater monitor

Post Syndicated from Janina Ander original https://www.raspberrypi.org/blog/coral-reefs-nemo-pi/

The German charity Save Nemo works to protect coral reefs, and they are developing Nemo-Pi, an underwater “weather station” that monitors ocean conditions. Right now, you can vote for Save Nemo in the Google.org Impact Challenge.

Nemo-Pi — Save Nemo

Save Nemo

The organisation says there are two major threats to coral reefs: divers, and climate change. To make diving safer for reefs, Save Nemo installs buoy anchor points where diving tour boats can anchor without damaging corals in the process.

reef damaged by anchor
boat anchored at buoy

In addition, they provide dos and don’ts for how to behave on a reef dive.

The Nemo-Pi

To monitor the effects of climate change, and to help divers decide whether conditions are right at a reef while they’re still on shore, Save Nemo is also in the process of perfecting Nemo-Pi.

Nemo-Pi schematic — Nemo-Pi — Save Nemo

This Raspberry Pi-powered device is made up of a buoy, a solar panel, a GPS device, a Pi, and an array of sensors. Nemo-Pi measures water conditions such as current, visibility, temperature, carbon dioxide and nitrogen oxide concentrations, and pH. It also uploads its readings live to a public webserver.

Inside the Nemo-Pi device — Save Nemo

The Save Nemo team is currently doing long-term tests of Nemo-Pi off the coast of Thailand and Indonesia. They are also working on improving the device’s power consumption and durability, and testing prototypes with the Raspberry Pi Zero W.

web dashboard — Nemo-Pi — Save Nemo

The web dashboard showing live Nemo-Pi data

Long-term goals

Save Nemo aims to install a network of Nemo-Pis at shallow reefs (up to 60 metres deep) in South East Asia. Then diving tour companies can check the live data online and decide day-to-day whether tours are feasible. This will lower the impact of humans on reefs and help the local flora and fauna survive.

Coral reefs with fishes

A healthy coral reef

Nemo-Pi data may also be useful for groups lobbying for reef conservation, and for scientists and activists who want to shine a spotlight on the awful effects of climate change on sea life, such as coral bleaching caused by rising water temperatures.

Bleached coral

A bleached coral reef

Vote now for Save Nemo

If you want to help Save Nemo in their mission today, vote for them to win the Google.org Impact Challenge:

  1. Head to the voting web page
  2. Click “Abstimmen” in the footer of the page to vote
  3. Click “JA” in the footer to confirm

Voting is open until 6 June. You can also follow Save Nemo on Facebook or Twitter. We think this organisation is doing valuable work, and that their projects could be expanded to reefs across the globe. It’s fantastic to see the Raspberry Pi being used to help protect ocean life.

The post Protecting coral reefs with Nemo-Pi, the underwater monitor appeared first on Raspberry Pi.

Randomly generated, thermal-printed comics

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/random-comic-strip-generation-vomit-comic-robot/

Python code creates curious, wordless comic strips at random, spewing them from the thermal printer mouth of a laser-cut body reminiscent of Disney Pixar’s WALL-E: meet the Vomit Comic Robot!

The age of the thermal printer!

Thermal printers allow you to instantly print photos, data, and text using a few lines of code, with no need for ink. More and more makers are using this handy, low-maintenance bit of kit for truly creative projects, from Pierre Muth’s tiny PolaPi-Zero camera to the sound-printing Waves project by Eunice Lee, Matthew Zhang, and Bomani McClendon (and our own Secret Santa Babbage).

Vomiting robots

Interaction designer and developer Cadin Batrack, whose background is in game design and interactivity, has built the Vomit Comic Robot, which creates “one-of-a-kind comics on demand by processing hand-drawn images through a custom software algorithm.”

The robot is made up of a Raspberry Pi 3, a USB thermal printer, and a handful of LEDs.

Comic Vomit Robot Cadin Batrack's Raspberry Pi comic-generating thermal printer machine

At the press of a button, Processing code selects one of a set of Cadin’s hand-drawn empty comic grids and then randomly picks images from a library to fill in the gaps.

Each image is associated with data that allows the code to fit it correctly into the available panels. Cadin says about the concept behind his build:

Although images are selected and placed randomly, the comic panel format suggests relationships between elements. Our minds create a story where there is none in an attempt to explain visuals created by a non-intelligent machine.

The Raspberry Pi saves the final image as a high-resolution PNG file (so that Cadin can sell prints on thick paper via Etsy), and a Python script sends it to be vomited up by the thermal printer.
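To make the selection step concrete, here is a minimal Python sketch of the idea: pick a grid at random, then fill each panel with a random image whose size data matches. This is not Cadin’s code (his generator is written in Processing), and the grid names, panel sizes, and file names below are all made up for illustration.

```python
import random

# Hypothetical data: each grid lists the size of its panels, and each image
# is filed under the panel size it fits. None of these names come from the
# real project.
GRIDS = {
    "three_wide": ["small", "small", "small"],
    "one_big_two_small": ["large", "small", "small"],
}
IMAGES = {
    "small": ["cat.png", "cloud.png", "robot.png", "tree.png"],
    "large": ["city.png", "ocean.png"],
}


def generate_comic(rng=random):
    """Pick a random grid, then fill each panel with a random matching image."""
    grid_name = rng.choice(sorted(GRIDS))
    panels = [(size, rng.choice(IMAGES[size])) for size in GRIDS[grid_name]]
    return grid_name, panels


if __name__ == "__main__":
    grid, panels = generate_comic()
    print(grid)
    for size, image in panels:
        print(f"  {size} panel: {image}")
```

Passing in a seeded `random.Random` instance makes a strip reproducible, which is handy if you want to reprint one you liked.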


For more about the Vomit Comic Robot, check out Cadin’s blog. If you want to recreate it, you can find the info you need in the Imgur album he has put together.

We ❤ cute robots

We have a soft spot for cute robots here at Pi Towers, and of course we make no exception for the Vomit Comic Robot. If, like us, you’re a fan of adorable bots, check out Mira, the tiny interactive robot by Alonso Martinez, and Peeqo, the GIF bot by Abhishek Singh.

Mira, the tiny interactive robot by Alonso Martinez

The post Randomly generated, thermal-printed comics appeared first on Raspberry Pi.

Project Floofball and more: Pi pet stuff

Post Syndicated from Janina Ander original https://www.raspberrypi.org/blog/project-floofball-pi-pet-stuff/

It’s a public holiday here today (yes, again). So, while we indulge in the traditional pastime of barbecuing stuff (ourselves, mainly), here’s a little trove of Pi projects that cater for our various furry friends.

Project Floofball

Nicole Horward created Project Floofball for her hamster, Harold. It’s an IoT hamster wheel that uses a Raspberry Pi and a magnetic door sensor to log how far Harold runs.

Project Floofball: an IoT hamster wheel


You can follow Harold’s runs in real time on his ThingSpeak channel, and you’ll find photos of the build on Imgur. Nicole’s Python code, as well as her template for the laser-cut enclosure that houses the wiring and LCD, are available on the hamster wheel’s GitHub repo.
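If you’re wondering how a door sensor turns into a distance, the arithmetic is simple: each time the magnet passes the sensor counts as one turn of the wheel, and each turn covers one wheel circumference. The sketch below is not Nicole’s code. The 20 cm wheel diameter is an assumed value, and the sensor readings are passed in as a plain list of booleans so the example runs without any hardware.

```python
import math

# Assumed wheel size for illustration; measure your own wheel.
WHEEL_DIAMETER_M = 0.20


def count_revolutions(samples):
    """Count open-to-closed transitions in a sequence of sensor readings.

    Each reading is True while the magnet is next to the sensor. Counting
    only the transitions avoids counting one pass several times.
    """
    count = 0
    previous = False
    for closed in samples:
        if closed and not previous:
            count += 1
        previous = closed
    return count


def distance_run(revolutions, wheel_diameter_m=WHEEL_DIAMETER_M):
    """Distance in metres: revolutions times the wheel circumference."""
    return revolutions * math.pi * wheel_diameter_m


if __name__ == "__main__":
    readings = [False, True, True, False, True, False]
    turns = count_revolutions(readings)
    print(f"{turns} turns, {distance_run(turns):.2f} m")
```

On a real build, the readings would come from a GPIO pin rather than a list, and the running total is what you would post to a service such as ThingSpeak.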

A live-streaming pet feeder

JaganK3 used to work long hours, which meant he couldn’t be there to feed his dog on time. He found that he couldn’t buy an automated feeder in India without paying a lot to import one, so he made one himself. It uses a Raspberry Pi to control a motor that turns a dispensing valve in a hopper full of dry food, giving his dog a portion of food at set times.

A transparent cylindrical hopper of dry dog food, with a motor that can turn a dispensing valve at the lower end. The motor is connected to a Raspberry Pi in a plastic case. Hopper, motor, Pi, and wiring are all mounted on a board on the wall.

He also added a web cam for live video streaming, because he could. Find out more in JaganK3’s Instructable for his pet feeder.
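The “set times” part of a feeder like this comes down to checking, every so often, whether a feed time has passed since the last check. Here is a minimal sketch of that check, assuming a hypothetical schedule of 8:00 and 18:00 and a loop that stays within one day. It is not taken from JaganK3’s Instructable, and the motor control is left out entirely.

```python
from datetime import time

# Hypothetical feeding schedule; the real feeder's times are not given.
FEED_TIMES = [time(8, 0), time(18, 0)]


def due_feeds(last_check, now, feed_times=FEED_TIMES):
    """Return the feed times that fall after last_check and up to now.

    Both arguments are datetime.time values from the same day, so this
    simple comparison does not handle a check that spans midnight.
    """
    return [t for t in feed_times if last_check < t <= now]


if __name__ == "__main__":
    # One feed became due between 07:59 and 08:00.
    for feed in due_feeds(time(7, 59), time(8, 0)):
        print(f"Dispense a portion for the {feed.strftime('%H:%M')} feed")
```

Comparing against the previous check, rather than testing whether the clock reads exactly 8:00, means a portion is not skipped if the loop runs a little late.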

Shark laser cat toy

Sam Storino, meanwhile, is using a Raspberry Pi to control a laser-pointer cat toy with a goshdarned SHARK (which is kind of what I’d expect from the guy who made the steampunk-looking cat feeder a few weeks ago). The idea is to keep his cats interested and active within the confines of a compact city apartment.

Raspberry Pi Automatic Cat Laser Pointer Toy


If I were a cat, I would definitely be entirely happy with this. Find out more on Sam’s website.

And there’s more

Michel Parreno has written a series of articles to help you monitor and feed your pet with Raspberry Pi.

All of these makers are generous in acknowledging the tutorials and build logs that helped them with their projects. It’s lovely to see the Raspberry Pi and maker community working like this, and I bet their projects will inspire others too.

Now, if you’ll excuse me, I’m late for a barbecue.

The post Project Floofball and more: Pi pet stuff appeared first on Raspberry Pi.