systemd for Developers III

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/journal-submit.html

Here’s the third episode of of my
systemd for Developers series.

Logging to the Journal

In a recent blog
story
intended for administrators I shed some light on how to use
the journalctl(1)
tool to browse and search the systemd journal. In this blog story for developers
I want to explain a little how to get log data into the systemd
Journal in the first place.

The good thing is that getting log data into the Journal is not
particularly hard, since there’s a good chance the Journal already
collects it anyway and writes it to disk. The journal collects:

  1. All data logged via libc syslog()
  2. The data from the kernel logged with printk()
  3. Everything written to STDOUT/STDERR of any system service

This covers pretty much all of the traditional log output of a
Linux system, including messages from the kernel initialization phase,
the initial RAM disk, the early boot logic, and the main system
runtime.

syslog()

Let’s have a quick look how syslog() is used again. Let’s
write a journal message using this call:

#include <syslog.h>

int main(int argc, char *argv[]) {
        syslog(LOG_NOTICE, "Hello World");
        return 0;
}

This is C code, of course. Many higher level languages provide APIs
that allow writing local syslog messages. Regardless which language
you choose, all data written like this ends up in the Journal.

Let’s have a look how this looks after it has been written into the
journal (this is the JSON
output
journalctl -o json-pretty generates):

{
        "_BOOT_ID" : "5335e9cf5d954633bb99aefc0ec38c25",
        "_TRANSPORT" : "syslog",
        "PRIORITY" : "5",
        "_UID" : "500",
        "_GID" : "500",
        "_AUDIT_SESSION" : "2",
        "_AUDIT_LOGINUID" : "500",
        "_SYSTEMD_CGROUP" : "/user/lennart/2",
        "_SYSTEMD_SESSION" : "2",
        "_SELINUX_CONTEXT" : "unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023",
        "_MACHINE_ID" : "a91663387a90b89f185d4e860000001a",
        "_HOSTNAME" : "epsilon",
        "_COMM" : "test-journal-su",
        "_CMDLINE" : "./test-journal-submit",
        "SYSLOG_FACILITY" : "1",
        "_EXE" : "/home/lennart/projects/systemd/test-journal-submit",
        "_PID" : "3068",
        "SYSLOG_IDENTIFIER" : "test-journal-submit",
        "MESSAGE" : "Hello World!",
        "_SOURCE_REALTIME_TIMESTAMP" : "1351126905014938"
}

This nicely shows how the Journal implicitly augmented our little
log message with various meta data fields which describe in more
detail the context our message was generated from. For an explanation
of the various fields, please refer to systemd.journal-fields(7)

printf()

If you are writing code that is run as a systemd service, generating journal
messages is even easier:

#include <stdio.h>

int main(int argc, char *argv[]) {
        printf("Hello World\n");
        return 0;
}

Yupp, that’s easy, indeed.

The printed string in this example is logged at a default log
priority of LOG_INFO[1]. Sometimes it is useful to change
the log priority for such a printed string. When systemd parses
STDOUT/STDERR of a service it will look for priority values enclosed
in < > at the beginning of each line[2], following the scheme
used by the kernel’s printk() which in turn took
inspiration from the BSD syslog network serialization of messages. We
can make use of this systemd feature like this:

#include <stdio.h>

#define PREFIX_NOTICE "<5>"

int main(int argc, char *argv[]) {
        printf(PREFIX_NOTICE "Hello World\n");
        return 0;
}

Nice! Logging with nothing but printf() but we still get
log priorities!

This scheme works with any programming language, including, of course, shell:

#!/bin/bash

echo "<5>Hellow world"

Native Messages

Now, what I explained above is not particularly exciting: the
take-away is pretty much only that things end up in the journal if
they are output using the traditional message printing APIs. Yaaawn!

Let’s make this more interesting, let’s look at what the Journal
provides as native APIs for logging, and let’s see what its benefits
are. Let’s translate our little example into the 1:1 counterpart
using the Journal’s logging API sd_journal_print(3):

#include <systemd/sd-journal.h>

int main(int argc, char *argv[]) {
        sd_journal_print(LOG_NOTICE, "Hello World");
        return 0;
}

This doesn’t look much more interesting than the two examples
above, right? After compiling this with `pkg-config --cflags
--libs libsystemd-journal`
appended to the compiler parameters,
let’s have a closer look at the JSON representation of the journal
entry this generates:

 {
        "_BOOT_ID" : "5335e9cf5d954633bb99aefc0ec38c25",
        "PRIORITY" : "5",
        "_UID" : "500",
        "_GID" : "500",
        "_AUDIT_SESSION" : "2",
        "_AUDIT_LOGINUID" : "500",
        "_SYSTEMD_CGROUP" : "/user/lennart/2",
        "_SYSTEMD_SESSION" : "2",
        "_SELINUX_CONTEXT" : "unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023",
        "_MACHINE_ID" : "a91663387a90b89f185d4e860000001a",
        "_HOSTNAME" : "epsilon",
        "CODE_FUNC" : "main",
        "_TRANSPORT" : "journal",
        "_COMM" : "test-journal-su",
        "_CMDLINE" : "./test-journal-submit",
        "CODE_FILE" : "src/journal/test-journal-submit.c",
        "_EXE" : "/home/lennart/projects/systemd/test-journal-submit",
        "MESSAGE" : "Hello World",
        "CODE_LINE" : "4",
        "_PID" : "3516",
        "_SOURCE_REALTIME_TIMESTAMP" : "1351128226954170"
}

This looks pretty much the same, right? Almost! I highlighted three new
lines compared to the earlier output. Yes, you guessed it, by using
sd_journal_print() meta information about the generating
source code location is implicitly appended to each
message[3], which is helpful for a developer to identify
the source of a problem.

The primary reason for using the Journal’s native logging APIs is a
not just the source code location however: it is to allow
passing additional structured log messages from the program into the
journal. This additional log data may the be used to search the
journal for, is available for consumption for other programs, and
might help the administrator to track down issues beyond what is
expressed in the human readable message text. Here’s and example how
to do that with sd_journal_send():

#include <systemd/sd-journal.h>
#include <unistd.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
        sd_journal_send("MESSAGE=Hello World!",
                        "MESSAGE_ID=52fb62f99e2c49d89cfbf9d6de5e3555",
                        "PRIORITY=5",
                        "HOME=%s", getenv("HOME"),
                        "TERM=%s", getenv("TERM"),
                        "PAGE_SIZE=%li", sysconf(_SC_PAGESIZE),
                        "N_CPUS=%li", sysconf(_SC_NPROCESSORS_ONLN),
                        NULL);
        return 0;
}

This will write a log message to the journal much like the earlier
examples. However, this times a few additional, structured fields are
attached:

{
        "__CURSOR" : "s=ac9e9c423355411d87bf0ba1a9b424e8;i=5930;b=5335e9cf5d954633bb99aefc0ec38c25;m=16544f875b;t=4ccd863cdc4f0;x=896defe53cc1a96a",
        "__REALTIME_TIMESTAMP" : "1351129666274544",
        "__MONOTONIC_TIMESTAMP" : "95903778651",
        "_BOOT_ID" : "5335e9cf5d954633bb99aefc0ec38c25",
        "PRIORITY" : "5",
        "_UID" : "500",
        "_GID" : "500",
        "_AUDIT_SESSION" : "2",
        "_AUDIT_LOGINUID" : "500",
        "_SYSTEMD_CGROUP" : "/user/lennart/2",
        "_SYSTEMD_SESSION" : "2",
        "_SELINUX_CONTEXT" : "unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023",
        "_MACHINE_ID" : "a91663387a90b89f185d4e860000001a",
        "_HOSTNAME" : "epsilon",
        "CODE_FUNC" : "main",
        "_TRANSPORT" : "journal",
        "_COMM" : "test-journal-su",
        "_CMDLINE" : "./test-journal-submit",
        "CODE_FILE" : "src/journal/test-journal-submit.c",
        "_EXE" : "/home/lennart/projects/systemd/test-journal-submit",
        "MESSAGE" : "Hello World!",
        "_PID" : "4049",
        "CODE_LINE" : "6",
        "MESSAGE_ID" : "52fb62f99e2c49d89cfbf9d6de5e3555",
        "HOME" : "/home/lennart",
        "TERM" : "xterm-256color",
        "PAGE_SIZE" : "4096",
        "N_CPUS" : "4",
        "_SOURCE_REALTIME_TIMESTAMP" : "1351129666241467"
}

Awesome! Our simple example worked! The five meta data fields we
attached to our message appeared in the journal. We used sd_journal_send()
for this which works much like sd_journal_print() but takes a
NULL terminated list of format strings each followed by its
arguments. The format strings must include the field name and a ‘=’
before the values.

Our little structured message included seven fields. The first three we passed are well-known fields:

  1. MESSAGE= is the actual human readable message part of the structured message.
  2. PRIORITY= is the numeric message priority value as known from BSD syslog formatted as an integer string.
  3. MESSAGE_ID= is a 128bit ID that identifies our specific
    message call, formatted as hexadecimal string. We randomly generated
    this string with journalctl --new-id128. This can be used by
    applications to track down all occasions of this specific
    message. The 128bit can be a UUID, but this is not a requirement or enforced.

Applications may relatively freely define additional fields as they
see fit (we defined four pretty arbitrary ones in our example). A
complete list of the currently well-known fields is available in systemd.journal-fields(7).

Let’s see how the message ID helps us finding this message and all
its occasions in the journal:

$ journalctl MESSAGE_ID=52fb62f99e2c49d89cfbf9d6de5e3555
-- Logs begin at Thu, 2012-10-18 04:07:03 CEST, end at Thu, 2012-10-25 04:48:21 CEST. --
Oct 25 03:47:46 epsilon test-journal-se[4049]: Hello World!
Oct 25 04:40:36 epsilon test-journal-se[4480]: Hello World!

Seems I already invoked this example tool twice!

Many messages systemd itself generates have
message IDs
. This is useful for example, to find all occasions
where a program dumped core (journalctl
MESSAGE_ID=fc2e22bc6ee647b6b90729ab34a250b1
), or when a user
logged in (journalctl
MESSAGE_ID=8d45620c1a4348dbb17410da57c60c66
). If your application
generates a message that might be interesting to recognize in the
journal stream later on, we recommend attaching such a message ID to
it. You can easily allocate a new one for your message with journalctl
--new-id128
.

This example shows how we can use the Journal’s native APIs to
generate structured, recognizable messages. You can do much more than
this with the C API. For example, you may store binary data in journal
fields as well, which is useful to attach coredumps or hard disk SMART
states to events where this applies. In order to make this blog story
not longer than it already is we’ll not go into detail about how to do
this, an I ask you to check out sd_journal_send(3)
for further information on this.

Python

The examples above focus on C. Structured logging to the Journal is
also available from other languages. Along with systemd itself we ship
bindings for Python. Here’s an example how to use this:

from systemd import journal
journal.send('Hello world')
journal.send('Hello, again, world', FIELD2='Greetings!', FIELD3='Guten tag')

Other binding exist for Node.js,
PHP, Lua.

Portability

Generating structured data is a very useful feature for services to
make their logs more accessible both for administrators and other
programs. In addition to the implicit structure the Journal
adds to all logged messages it is highly beneficial if the various
components of our stack also provide explicit structure
in their messages, coming from within the processes themselves.

Porting an existing program to the Journal’s logging APIs comes
with one pitfall though: the Journal is Linux-only. If non-Linux
portability matters for your project it’s a good idea to provide an
alternative log output, and make it selectable at compile-time.

Regardless which way to log you choose, in all cases we’ll forward
the message to a classic syslog daemon running side-by-side with the
Journal, if there is one. However, much of the structured meta data of
the message is not forwarded since the classic syslog protocol simply
has no generally accepted way to encode this and we shouldn’t attempt
to serialize meta data into classic syslog messages which might turn
/var/log/messages into an unreadable dump of machine
data. Anyway, to summarize this: regardless if you log with
syslog(), printf(), sd_journal_print() or
sd_journal_send(), the message will be stored and indexed by
the journal and it will also be forwarded to classic syslog.

And that’s it for today. In a follow-up episode we’ll focus on
retrieving messages from the Journal using the C API, possibly
filtering for a specific subset of messages. Later on, I hope to give
a real-life example how to port an existing service to the Journal’s
logging APIs. Stay tuned!

Footnotes

[1] This can be changed with the SyslogLevel= service
setting. See systemd.exec(5)
for details.

[2] Interpretation of the < > prefixes of logged lines
may be disabled with the SyslogLevelPrefix= service setting. See systemd.exec(5)
for details.

[3] Appending the code location to the log messages can be
turned off at compile time by defining
-DSD_JOURNAL_SUPPRESS_LOCATION.