All posts by Lennart Poettering

Everybody Loves Pretty Graphics

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/the-linux-audio-stack.html

As a kind of follow-up to my Guide to Linux Sound APIs, here are some pretty
graphics I just drew. (At least “pretty” to the degree my limited drawing
abilities allow.) It’s a block diagram depicting the Linux audio stack. A lot
of people have already drawn something similar, and often enough the result was
horribly complicated and, in its conclusion, disappointing. So, here’s my try:

Linux Audio Stack

The components interface with each other across the horizontal lines. The
vertical lines separate unrelated components. The drawing only includes
modern, supported APIs and systems as described in the aforementioned blog
article. It (hopefully) shows that things in the Linux audio world are not
all that bad at all and we have workable answers for most questions without
too much complexity, although they might not entirely make everyone overly
happy.

In an outburst of bias I completely omitted KDE-specific technologies from
this drawing. I guess even if I had included them it’d be called biased
anyway, so why bother? Also, they would have distracted the reader and complicated the
drawing considerably due to KDE’s affection for pluggable backends. So: if you
care about KDE, please ignore this diagram.

A Guide Through The Linux Sound API Jungle

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/guide-to-sound-apis.html

At the Audio MC at the Linux Plumbers Conference one
thing became very clear: it is very difficult for programmers to
figure out which audio API to use for which purpose and which API not
to use when doing audio programming on Linux. So here’s my try to
guide you through this jungle:

What do you want to do?

I want to write a media-player-like application!
Use GStreamer! (Unless your focus is only KDE, in which case Phonon might be an alternative.)
I want to add event sounds to my application!
Use libcanberra, install your sound files according to the XDG Sound Theming/Naming Specifications! (Unless your focus is only KDE in which case KNotify might be an alternative although it has a different focus.)
I want to do professional audio programming, hard-disk recording, music synthesizing, MIDI interfacing!
Use JACK and/or the full ALSA interface.
I want to do basic PCM audio playback/capturing!
Use the safe ALSA subset.
I want to add sound to my game!
Use the audio API of SDL for full-screen games, libcanberra for simple games with standard UIs such as Gtk+.
I want to write a mixer application!
Use the layer you want to support directly: if you want to support enhanced desktop software mixers, use the PulseAudio volume control APIs. If you want to support hardware mixers, use the ALSA mixer APIs.
I want to write audio software for the plumbing layer!
Use the full ALSA stack.
I want to write audio software for embedded applications!
For technical appliances the safe ALSA subset is usually a good choice. This, however, depends highly on your use case.

You want to know more about the different sound APIs?

GStreamer
GStreamer is the de-facto
standard media streaming system for Linux desktops. It supports decoding and
encoding of audio and video streams. You can use it for a wide range of
purposes from simple audio file playback to elaborate network
streaming setups. GStreamer supports a wide range of CODECs and audio
backends. GStreamer is not particularly suited for basic PCM playback
or low-latency/realtime applications. GStreamer is portable and not
limited in its use to Linux. Among the supported backends are ALSA, OSS, PulseAudio. [Programming Manuals and References]
libcanberra
libcanberra is an abstract event sound API. It implements the XDG
Sound Theme and Naming Specifications. libcanberra is a blessed
GNOME dependency, but itself has no dependency on GNOME/Gtk/GLib and can be
used with other desktop environments as well. In addition to an easy
interface for playing sound files, libcanberra provides caching
(which is very useful for networked thin clients) and allows passing
of various meta data to the underlying audio system which then can be
used to enhance user experience (such as positional event sounds) and
for improving accessibility. libcanberra supports multiple backends
and is portable beyond Linux. Among the supported backends are ALSA, OSS, PulseAudio, GStreamer. [API Reference]
JACK
JACK is a sound system for
connecting professional audio production applications and hardware
output. Its focus is low latency and application interconnection. It
is not useful for normal desktop or embedded use. It is not an API
that is particularly useful if all you want to do is simple PCM
playback. JACK supports multiple backends, although ALSA is best
supported. JACK is portable beyond Linux. Among the supported backends are ALSA, OSS. [API Reference]
Full ALSA
ALSA is the Linux API
for doing PCM playback and recording. ALSA is very focused on
hardware devices, although other backends are supported as well (to a
limited degree, see below). ALSA as a name is used both for the Linux
audio kernel drivers and a user-space library that wraps these. ALSA — the library — is
comprehensive, and portable (to a limited degree). The full ALSA API
can appear very complex and is large. However it supports almost
everything modern sound hardware can provide. Some of the
functionality of the ALSA API is limited in its use to actual hardware
devices supported by the Linux kernel (in contrast to software sound
servers and sound drivers implemented in user-space such as those for
Bluetooth and FireWire audio — among others) and Linux specific
drivers. [API Reference]
Safe ALSA
Only a subset of the full ALSA API works on all backends ALSA
supports. It is highly recommended to stick to this safe subset
if you do ALSA programming to keep programs portable, future-proof and
compatible with sound servers, Bluetooth audio and FireWire audio. See
below for more details about which functions of ALSA are considered
safe. The safe ALSA API is a suitable abstraction for basic,
portable PCM playback and recording — not just for ALSA kernel driver
supported devices. Among the supported backends are ALSA kernel driver
devices, OSS, PulseAudio, JACK.
Phonon and KNotify
Phonon is a high-level
abstraction for media streaming systems such as GStreamer, but goes a
bit further than that. It supports multiple backends. KNotify is a
system for “notifications”, which goes beyond mere event
sounds. However it does not support the XDG Sound Theming/Naming
Specifications at this point, and also doesn’t support caching or
passing of event meta-data to an underlying sound system. KNotify
supports multiple backends for audio playback via Phonon. Both APIs
are KDE/Qt specific and should not be used outside of KDE/Qt
applications. [Phonon API Reference] [KNotify API Reference]
SDL
SDL is a portable API
primarily used for full-screen game development. Among other things it
includes a portable audio interface. Among others, SDL supports OSS,
PulseAudio and ALSA as backends. [API Reference]
PulseAudio
PulseAudio is a sound system
for Linux desktops and embedded environments that runs in user-space
and (usually) on top of ALSA. PulseAudio supports network
transparency, per-application volumes, spatial event sounds, allows
switching of sound streams between devices on-the-fly, policy
decisions, and many other high-level operations. PulseAudio adds a glitch-free
audio playback model to the Linux audio stack. PulseAudio is not
useful in professional audio production environments. PulseAudio is
portable beyond Linux. PulseAudio has a native API and also supports
the safe subset of ALSA, in addition to limited,
LD_PRELOAD-based OSS compatibility. Among others PulseAudio supports
OSS and ALSA as backends and provides connectivity to JACK. [API Reference]
OSS
The Open Sound System is a
low-level PCM API supported by a variety of Unixes including Linux. It
started out as the standard Linux audio system and is supported on
current Linux kernels in the API version 3 as OSS3. OSS3 is considered
obsolete and has been fully replaced by ALSA. A successor to OSS3
called OSS4 is available but plays virtually no role on Linux and is
not supported in standard kernels or by any of the relevant
distributions. The OSS API is very low-level, based around direct
kernel interfacing using ioctl()s. It is hence awkward to use and
can practically not be virtualized for usage on non-kernel audio
systems like sound servers (such as PulseAudio) or user-space sound
drivers (such as Bluetooth or FireWire audio). OSS3’s timing model
cannot properly be mapped to software sound servers at all, and is
also problematic on non-PCI hardware such as USB audio. Also, OSS does
not do sample type conversion, remapping or resampling if
necessary. This means that clients that properly want to support OSS
need to include a complete set of converters/remappers/resamplers for
the case when the hardware does not natively support the requested
sampling parameters. With modern sound cards it is very common to
support only S32LE samples at 48 kHz and nothing else. If an OSS client
assumes it can always play back S16LE samples at 44.1 kHz it will thus
fail. OSS3 is portable to other Unix-like systems, although various
differences apply. OSS also doesn’t support surround sound and other
functionality of modern sound systems properly. OSS should be
considered obsolete and not be used in new applications.
ALSA and
PulseAudio have limited LD_PRELOAD-based compatibility with OSS. [Programming Guide]
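
The OSS entry above notes that a client has to carry its own converters. To give a rough idea of what even the simplest of these looks like, here is a minimal sketch that widens signed 16-bit samples to signed 32-bit samples in native byte order. The function name s16_to_s32 is made up for this illustration; a real client would additionally need endianness handling, channel remapping and resampling (for example from 44.1 kHz to 48 kHz), none of which is shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
 * Widen native-endian signed 16-bit samples to signed 32-bit samples.
 * The 16-bit value ends up in the upper half of the 32-bit word, so
 * full scale stays full scale. Multiplying instead of shifting avoids
 * left-shifting a negative value. */
void s16_to_s32(const int16_t *in, int32_t *out, size_t n_samples) {
	size_t i;

	for (i = 0; i < n_samples; i++)
		out[i] = (int32_t) in[i] * 65536;
}
```

With ALSA or PulseAudio this kind of code lives in the sound system rather than in every application.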

All sound systems and APIs listed above are supported in all
relevant current distributions. For libcanberra support the newest
development release of your distribution might be necessary.

All sound systems and APIs listed above are suitable for
development for commercial (read: closed source) applications, since
they are licensed under LGPL or more liberal licenses or no client
library is involved.

You want to know why and when you should use a specific sound API?

GStreamer
GStreamer is best used for very high-level needs: i.e. you want to
play an audio file or video stream and do not care about all the tiny
details down to the PCM or codec level.
libcanberra
libcanberra is best used when adding sound feedback to user input
in UIs. It can also be used to play simple sound files for
notification purposes.
JACK
JACK is best used in professional audio production and where interconnecting applications is required.
Full ALSA
The full ALSA interface is best used for software on the “plumbing layer” or when you want to make use of very specific hardware features, which might be needed for audio production purposes.
Safe ALSA
The safe ALSA interface is best used for software that wants to output/record basic PCM data from hardware devices or software sound systems.
Phonon and KNotify
Phonon and KNotify should only be used in KDE/Qt applications and only for high-level media playback, resp. simple audio notifications.
SDL
SDL is best used in full-screen games.
PulseAudio
For now, the PulseAudio API should be used only for applications
that want to expose sound-server-specific functionality (such as
mixers) or when a PCM output abstraction layer is already available in
your application and it thus makes sense to add an additional backend
to it for PulseAudio to keep the stack of audio layers minimal.
OSS
OSS should not be used for new programs.

You want to know more about the safe ALSA subset?

Here’s a list of DOS and DONTS in the ALSA API if you care that
your application stays future-proof and works fine with
non-hardware backends or backends for user-space sound drivers such as
Bluetooth and FireWire audio. Some of these recommendations apply for
people using the full ALSA API as well, since some functionality
should be considered obsolete for all cases.

If your application’s code does not follow these rules, you must have
a very good reason for that. Otherwise your code should simply be considered
broken!

DONTS:

  • Do not use “async handlers”, e.g. via
    snd_async_add_pcm_handler() and friends. Asynchronous
    handlers are implemented using POSIX signals, which is a very
    questionable use of them, especially from libraries and plugins. Even
    when you don’t want to limit yourself to the safe ALSA subset
    it is highly recommended not to use this functionality. Read
    this for a longer explanation why signals for audio IO are
    evil.
  • Do not parse the ALSA configuration file yourself or with
    any of the ALSA functions such as snd_config_xxx(). If you
    need to enumerate audio devices use snd_device_name_hint()
    (and related functions). That
    is the only API that also supports enumerating non-hardware audio
    devices and audio devices with drivers implemented in userspace.
  • Do not parse any of the files from
    /proc/asound/. Those files only include information about
    kernel sound drivers — user-space plugins are not listed there. Also,
    the set of kernel devices might differ from the way they are presented
    in user-space. (I.e. sub-devices are mapped in different ways to
    actual user-space devices such as surround51 and suchlike.)
  • Do not rely on stable device indexes from ALSA. Nowadays
    they depend on the initialization order of the drivers during boot-up
    time and are thus not stable.
  • Do not use the snd_card_xxx() APIs. For
    enumerating use snd_device_name_hint() (and related
    functions). snd_card_xxx() is obsolete. It will only list
    kernel hardware devices. User-space devices such as sound servers and
    Bluetooth audio are not included. snd_card_load() is
    completely obsolete these days.
  • Do not hard-code device strings, especially not
    hw:0 or plughw:0 or even dmix — these devices define no channel
    mapping and are mapped to raw kernel devices. It is highly recommended
    to use exclusively default as device string. If specific
    channel mappings are required the correct device strings should be
    front for stereo, surround40 for Surround 4.0,
    surround41, surround51, and so on. Unfortunately at
    this point ALSA does not define standard device names with channel
    mappings for non-kernel devices. This means default may only
    be used safely for mono and stereo streams. You should probably prefix
    your device string with plug: to make sure ALSA transparently
    reformats/remaps/resamples your PCM stream for you if the
    hardware/backend does not support your sampling parameters
    natively.
  • Do not assume that any particular sample type is supported
    except the following ones: U8, S16_LE, S16_BE, S32_LE, S32_BE,
    FLOAT_LE, FLOAT_BE, MU_LAW, A_LAW.
  • Do not use snd_pcm_avail_update() for
    synchronization purposes. It should be used exclusively to query the
    number of frames that may be written/read right now. Do not use
    snd_pcm_delay() to query the fill level of your playback
    buffer. It should be used exclusively for synchronisation
    purposes. Make sure you fully understand the difference, and note that
    the two functions return values that are not necessarily directly
    connected!
  • Do not assume that the mixer controls always know dB information.
  • Do not assume that all devices support MMAP style buffer access.
  • Do not assume that the hardware pointer inside the (possibly mmaped) playback buffer is the actual position of the sample in the DAC. There might be an extra latency involved.
  • Do not try to recover with your own code from ALSA error conditions such as buffer under-runs. Use snd_pcm_recover() instead.
  • Do not touch buffering/period metrics unless you have
    specific latency needs. Develop defensively, handling correctly the
    case when the backend cannot fulfill your buffering metrics
    requests. Be aware that the buffering metrics of the playback buffer
    only indirectly influence the overall latency in many
    cases, i.e. setting the buffer size to a fixed value might actually result in
    practical latencies that are much higher.
  • Do not assume that snd_pcm_rewind() is available, that it works, or to what degree it works.
  • Do not assume that the time when a PCM stream can receive
    new data is strictly dependent on the sampling and buffering
    parameters and the resulting average throughput. Always make sure to
    supply new audio data to the device when it asks for it by signalling
    “writability” on the fd. (And similarly for capturing.)
  • Do not use the “simple” interface snd_spcm_xxx().
  • Do not use any of the functions marked as “obsolete”.
  • Do not use the timer, midi, rawmidi, hwdep subsystems.
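
The device string rules in the list above can be sketched as a small helper. This is only an illustration under stated assumptions: the function name alsa_device_for_channels is made up, the mapping covers just the names mentioned in the list (default for mono and stereo, surround40, surround41, surround51), and the plug: prefix is always added so that ALSA converts the stream if the backend does not support the sampling parameters natively.

```c
#include <stdio.h>

/* Hypothetical helper, not part of alsa-lib: pick an ALSA device string
 * for a channel count, following the naming in the list above. Returns 0
 * on success, -1 if no standard name is known for that channel count or
 * if buf is too small. */
int alsa_device_for_channels(unsigned channels, char *buf, size_t size) {
	const char *name;
	int n;

	switch (channels) {
	case 1:
	case 2:
		name = "default";     /* only safe for mono and stereo */
		break;
	case 4:
		name = "surround40";
		break;
	case 5:
		name = "surround41";
		break;
	case 6:
		name = "surround51";
		break;
	default:
		return -1;
	}

	/* The plug: prefix asks ALSA to reformat/remap/resample if needed. */
	n = snprintf(buf, size, "plug:%s", name);
	return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

The resulting string would then be handed to snd_pcm_open() instead of a hard-coded hw:0 or plughw:0.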

DOS:

  • Use snd_device_name_hint() for enumerating audio devices.
  • Use snd_smixer_xx() instead of raw snd_ctl_xxx()
  • For synchronization purposes use snd_pcm_delay().
  • For checking buffer playback/capture fill level use snd_pcm_avail_update().
  • Use snd_pcm_recover() to recover from errors returned by any of the ALSA functions.
  • If possible use the largest buffer sizes the device supports to maximize power saving and drop-out safety. Use snd_pcm_rewind() if you need to react to user input quickly.
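
Both snd_pcm_avail_update() and snd_pcm_delay() report their results in frames, not in bytes or time units, so applications usually convert the values before using them. The following is a minimal sketch of those conversions; the helper names frames_to_bytes and frames_to_usec are made up for this example (alsa-lib ships its own snd_pcm_frames_to_bytes(), which should be preferred when a PCM handle is at hand).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */

/* Number of bytes occupied by `frames` frames of interleaved PCM data
 * (one frame holds one sample for every channel). */
size_t frames_to_bytes(size_t frames, unsigned channels, unsigned bytes_per_sample) {
	return frames * channels * bytes_per_sample;
}

/* Playback time covered by `frames` frames, in microseconds. */
uint64_t frames_to_usec(uint64_t frames, unsigned rate_hz) {
	return frames * UINT64_C(1000000) / rate_hz;
}
```

For example, a value from snd_pcm_avail_update() would go through frames_to_bytes() to size the next write, while a value from snd_pcm_delay() would go through frames_to_usec() for audio/video synchronization.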

FAQ

What about ESD and NAS?
ESD and NAS are obsolete, both as APIs and as sound daemons. Do not develop for them any further.
ALSA isn’t portable!
That’s not true! Actually the user-space library is relatively portable, it even includes a backend for OSS sound devices. There is no real reason that would disallow using the ALSA libraries on other Unixes as well.
Portability is key to me! What can I do?
Unfortunately no truly portable (i.e. to Win32) PCM API is
available right now that I could truly recommend. The systems shown
above are more or less portable at least to Unix-like operating
systems. That does not mean however that there are suitable backends
for all of them available. If you care about portability to Win32 and
MacOS you probably have to find a solution outside of the
recommendations above, or contribute the necessary
backends/portability fixes. None of the systems (with the exception of
OSS) is truly bound to Linux or Unix-like kernels.
What about PortAudio?
I don’t think that PortAudio is a very good API for Unix-like operating systems. I cannot recommend it, but it’s your choice.
Oh, why do you hate OSS4 so much?
I don’t hate anything or anyone. I just don’t think OSS4 is a
serious option, especially not on Linux. On Linux, it is also
completely redundant due to ALSA.
You idiot, you have no clue!
You are right, I totally don’t. But that doesn’t hinder me from recommending things. Ha!
Hey I wrote/know this tiny new project which is an awesome abstraction layer for audio/media!
Sorry, that’s not sufficient. I only list software here that is known to be sufficiently relevant and sufficiently well maintained.

Final Words

Of course these recommendations are very basic and are only intended to
point you in the right direction. For each use-case different necessities
apply and hence options that I did not consider here might become
viable. It’s up to you to decide how much of what I wrote here
actually applies to your application.

This summary only includes software systems that are considered
stable and universally available at the time of writing. In the
future I hope to introduce a more suitable and portable replacement
for the safe ALSA subset of functions. I plan to update this text
from time to time to keep things up-to-date.

If you feel that I forgot a use case or an important API, then
please contact me or leave a comment. However, I think the summary
above is sufficiently comprehensive and if an entry is missing I most
likely deliberately left it out.

(Also note that I am upstream for both PulseAudio and libcanberra and made some minor contributions to ALSA, GStreamer and some of the other systems listed above. Yes, I am biased.)

Oh, and please syndicate this, digg it. I’d like this guide to be well known all around the Linux community. Thank you!

My take on the Plumbers Conference

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/lpc-summary.html

I just came back from the Linux Plumbers Conference. As some of you might
know, I was running an MC about audio there. Don Marti attended the track and
wrote up an interesting article over at LWN. It’s a recommended read, including
the immense number of comments it has already resulted in. (I will try to reply
to all comments coming up. In case you have questions, just post them over at LWN.)

I must really say though that calling that article “It’s a mess” and
highlighting my critical comments on the situation this way makes me feel
slightly uncomfortable. Sure, we have some issues to fix and it’s the
words I chose at the conference — but it’s only part of the story. Things are
not really all that bad, and we have enough good stuff to focus on.

I enjoyed LPC, and especially the audio MC, a lot. The discussions during the
MC were lively, focused and very enlightening. Much more so than at other
conferences I have been to, the information flow was two-way: instead of just
having a speaker who talks about stuff and attendees who listen, here
all talks were very interactive. A lot of people in the audience had
something to say, and the others benefited from it.

LPC organization was flawless, Portland is awesome. The food was good, too.
To summarize: I am happy, very happy! I look forward to another iteration next year and hope we’ll be able to
have an audio MC then, too.

LPC organizers: rock on! Takashi, Jonathan: thank you very much for your
participation in the Audio MC!

(If you are not subscribed to LWN but want to read the article linked above, ping me, I can hand out a few free links. Alternatively, wait for Thursday and it will be available for free.)

New libcanberra backends

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/canberra-oh-eight.html

I released libcanberra 0.8 a
few hours ago. Biggest changes are some portability fixes for Solaris/FreeBSD,
inclusion of an OSS backend (contributed by Joe Marcus Clarke) and a
GStreamer backend (contributed by Marc-André Lureau). This will hopefully
dispel certain doubts regarding libcanberra.

Oh, and libcanberra now has a homepage.

PulseAudio on Transifex

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/pa-on-tx.html

Thanks to Dimitris Glezos,
PulseAudio and its auxiliary tools are
now available on Fedora’s Transifex for
translation. If you want to contribute translations, please submit them via
Transifex, which will then result in direct commits to our upstream source
code repositories, without further delay or workload on my side. Submissions by
other means (bug report, mail …) will no longer be accepted.

Submit your translations now for PulseAudio, for the volume control, and for
the preferences dialog. And while we are at it, Avahi’s waiting for your
translations, too.

Scott,

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/apple-development-platform.html

in contrast to what you say the Apple audio stack (CoreAudio) is far less
streamlined than it might appear at first sight. The different APIs that make
up the Apple audio stack are far more redundant than you might think. Also,
they are different in programming style, and you can list at least as many
separate components for different areas of audio with different API/naming
styles as you just did for the Linux audio stack.

Listing two components of the Linux audio stack that are considered
obsolete these days, and listing one item twice doesn’t really help making your
post unassailable.

Having said that, yes, our Linux audio stack is still chaotic,
redundant, badly documented and incomplete. You are very welcome to help fix
this. But just doing a bit of PR and sticking a single name on the sum of it all doesn’t
even touch the real problems we have with the audio APIs on Linux.

Free software development is in its very essence distributed. The fact that
our APIs sometimes appear a bit higgledy-piggledy is probably just an
inevitable consequence of this.

String Pools

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/string-pools.html

In part 2.4.3 of Ulrich Drepper’s excellent How To Write Shared Libraries
(which unfortunately is a bit out-of-date these days) Ulrich
suggests replacing arrays of constant strings by a single concatenated string
plus an index lookup table, to avoid unnecessary relocations during startup of
ELF programs. Maintaining this string pool is however troublesome:
it is hard to read and difficult to edit. In appendix B Ulrich
lists an example C excerpt which contains some code for simplifying the
maintenance of such string pools, after an idea from Bruno Haible. In my
opinion however that suggestion is not that much simpler, and requires
splitting off the actual strings into a separate source file. Ugly!
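
To make the basic idea concrete, here is a hand-written sketch of a string pool with an offset table. It is only an illustration of the general technique, not code taken from Drepper’s paper or from any tool; the names pool, offsets and pool_get are made up.

```c
#include <stddef.h>

/* All strings live in one array, so there are no per-string pointers
 * that would need to be relocated at startup. */
static const char pool[] =
	"waldo\0"
	"uxknurz\0"
	"foobar\0"
	"fubar";

/* Byte offset of each string inside the pool. These have to be kept
 * in sync with the pool by hand. */
static const unsigned short offsets[] = { 0, 6, 14, 21 };

/* Return entry i, or NULL when i is out of range. */
const char *pool_get(size_t i) {
	if (i >= sizeof(offsets) / sizeof(offsets[0]))
		return NULL;
	return pool + offsets[i];
}
```

Every edit to a string shifts all later offsets, which is exactly the maintenance burden described above.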

Some Free Software uses string pools to speed up relocation, e.g. GTK+.
Some development tools like gperf contain support for string pools.

None of the solutions for maintaining string pools that I could find on the Internet was exactly
beautiful. They were either completely manual, manual plus a validity-checking
tool, or very, very cumbersome. Googling around I was unable to find a satisfactory tool for this purpose[1].

After Diego Petteno complained about
my heavy use of arrays of constant strings in libatasmart I sat down to
change the situation, and wrote strpool.c,
a simple parser for a very, very minimal subset of C, written in plain ANSI C.
It looks for two special comment markers /* %STRINGPOOLSTART% */ and
/* %STRINGPOOLSTOP% */, moves all immediate strings between those
markers into a common string pool and rewrites the input with the strings
replaced by indexes. Code accessing those strings must use the
special _P() macro. With these minimal changes to a
source file, passing it through strpool.c will automatically rewrite
it to a string-poolized version. The nice thing about this is that the
necessary changes in the source are minimal, and the code stays compilable with
and without passing it through the strpool.c preprocessor.

Here’s an example. First the original non-string-poolized version:

#include <stdio.h>

static const char* const table[] = {
	"waldo",
	"uxknurz",
	"foobar",
	"fubar"
};

int main(int argc, char* argv[]) {
	printf("%s\n", table[2]);
	return 0;
}

For later use with strpool.c we change it like this:

#include <stdio.h>

#ifndef STRPOOL
#define _P(x) x
#endif

/* %STRINGPOOLSTART% */
static const char* const table[] = {
	"waldo",
	"uxknurz",
	"foobar",
	"fubar"
};
/* %STRINGPOOLSTOP% */

int main(int argc, char* argv[]) {
	printf("%s\n", _P(table[2]));
	return 0;
}

When passed through strpool.c this will be rewritten as:

/* Saved 3 relocations, saved 0 strings (0 b) due to suffix compression. */
static const char _strpool_[] =
	"waldo\0"
	"uxknurz\0"
	"foobar\0"
	"fubar\0";
#ifndef STRPOOL
#define STRPOOL
#endif
#ifndef _P
#define _P(x) (_strpool_ + ((x) - (const char*) 1))
#endif

#include <stdio.h>

#ifndef STRPOOL
#define _P(x) x
#endif

/* %STRINGPOOLSTART% */
static const char* const table[] = {
	((const char*) 1),
	((const char*) 7),
	((const char*) 15),
	((const char*) 22)
};
/* %STRINGPOOLSTOP% */

int main(int argc, char* argv[]) {
	printf("%s\n", _P(table[2]));
	return 0;
}

All three versions can be compiled directly with gcc. However, the version
that was passed through strpool.c reduces the number of
relocations for the table array from 4 to 1. That isn’t much of a
difference, but the larger your tables are, the more relevant the difference in
the number of necessary relocations gets.

A more realistic example is atasmart.c which after being preprocessed with strpool.c looks like this. In this specific example the number of necessary startup relocations goes down from > 100 to 9.

I am not sure if the parser is 100% correct, but it works fine with all sources I tried. It even does suffix compression like gcc does for normal strings too.

Footnotes

[1] Or maybe I just suck at googling? Does anyone have a suggestion for such a tool?

Linux Plumbers Conference CFP Extended!

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/plumbersconf-2.html

The Call for Papers for the Linux Plumbers Conference in September in
Portland, Oregon has been extended until July 31st 2008. It’s a conference
about the core infrastructure of Linux systems: the part of the system where
userspace and the kernel interface. It’s the first conference where the focus
is specifically on getting together the kernel people who work on the
userspace interfaces and the userspace people who have to deal with kernel
interfaces. It’s supposed to be a place where all the people doing
infrastructure work sit down and talk, so that each side better understands
the requirements and needs of the other, and where we can work
towards fixing the major problems we currently have with our lower-level
APIs.

I am running the Audio microconf of the Plumbers Conference. Audio
infrastructure on Linux is still heavily fragmented. Pro, desktop and embedded worlds are
almost completely separate worlds. While we have quite good driver support the
user experience is far from perfect, mostly because our infrastructure is
so balkanized. Join us at the Plumbers Conference and help to fix this! If you are doing audio infrastructure work on Linux, make sure to attend and submit a paper!

Sign up soon! Send in your paper early! The conference is expected to sell out pretty quickly!

See you in Portland!

PulseAudio FUD

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/jeffrey-stedfast.html

Jeffrey Stedfast

Jeffrey Stedfast seems to have made it his new hobby to bash PulseAudio.
In a series of very negative blog postings he flamed my software and hence me
in best NotZed-like fashion. Particularly interesting in this case is the
fact that he apologized to me privately on IRC for this behaviour shortly
after his first posting when he was criticized on #gnome-hackers,
only to continue flaming and bashing in more blog posts shortly after. Flaming
is very much part of the Free Software community I guess. A lot of people do
it from time to time (including me). But maybe there are better places for
this than Planet Gnome. And maybe doing it for days is not particularly nice.
And maybe flaming sucks in the first place anyway.

Regardless of what I think about Jeffrey and his behaviour on Planet Gnome,
let’s have a look at his trophies, the five “bugs” he posted:

  1. Not directly related to PulseAudio itself. Also, finding errors in code that is related to esd is not exactly the most difficult thing in the world.
  2. The same theme.
  3. Fixed 3 months ago. It is certainly not my fault that this isn’t available in Jeffrey’s distro.
  4. A real, valid bug report. Fixed in git a while back, but not available in any released version. May only be triggered under heavy load or with a bad high-latency scheduler.
  5. A valid bug, but not really in PulseAudio. Mostly caused because the ALSA API and PA API don’t really match 100%.

OK, Jeffrey found a real bug, but I wouldn’t say this is really enough to make all the fuss about. Or is it?

Why PulseAudio?

Jeffrey wrote something about a ‘solution looking for a problem’ when
speaking of PulseAudio. While that was certainly not a nice thing to say, it
does tell me one thing: I apparently didn’t manage to communicate well
enough why I am doing PulseAudio in the first place. So, why am I doing it then?

  • There’s so much more a good audio system needs to provide than just the
    most basic mixing functionality. Per-application volumes, moving streams
    between devices during playback, positional event sounds (i.e. click on the
    left side of the screen, have the sound event come out through the left
    speakers), secure session-switching support, monitoring of sound playback
    levels, rescuing playback streams to other audio devices on hot unplug,
    automatic hotplug configuration, automatic up/downmixing stereo/surround,
    high-quality resampling, network transparency, sound effects, simultaneous
    output to multiple sound devices are all features PA provides right now, and
    what you don’t get without it. It also provides the infrastructure for
    upcoming features like volume-follows-focus, automatic attenuation of music on
    signal on VoIP stream, UPnP media renderer support, Apple RAOP support,
    mixing/volume adjustments with dynamic range compression, adaptive volume of
    event sounds based on the volume of music streams, jack sensing, switching
    between stereo/surround/spdif during runtime, …
  • And even for the most basic mixing functionality plain ALSA/dmix is not
    really everlasting happiness. Due to the way it works all clients are forced
    to use the same buffering metrics all the time, which means all clients are
    limited in their wakeup/latency settings. You will burn more CPU than
    necessary this way, keep the risk of drop-outs unnecessarily high and still
    not be able to make clients with low-latency requirements happy. ‘Glitch-Free’
    PulseAudio fixes all this. Quite frankly I believe that ‘glitch-free’
    PulseAudio is the single most important killer feature that should be enough
    to convince everyone why PulseAudio is the right thing to do. Maybe people
    actually don’t know that they want this. But they absolutely do, especially
    the embedded people: if used properly it is a must for power-saving during
    audio playback. It’s a pity that you cannot directly see from the user
    interface how awesome this feature is.[1]
  • PulseAudio provides compatibility with a lot of sound systems/APIs that bare ALSA
    or bare OSS don’t provide.
  • And last but not least, I love breaking Jeffrey’s audio. It’s just soo much fun, you really have to try it! 😉
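To put some numbers on the buffering point in the list above, here is a minimal sketch in C. It only does the arithmetic: how often a client has to wake up for a given period size, and how much latency a given buffer adds. The function names and the 48000 Hz example rate are illustrative assumptions, not anything taken from dmix or PulseAudio.

```c
/* Sketch only: names and example values are assumptions for illustration. */

/* Wakeups per second needed to refill the device in chunks of
 * period_frames frames at the given sample rate. */
double wakeups_per_second(unsigned rate_hz, unsigned period_frames) {
    return (double) rate_hz / (double) period_frames;
}

/* Latency in milliseconds contributed by a completely filled buffer
 * of buffer_frames frames at the given sample rate. */
double buffer_latency_ms(unsigned rate_hz, unsigned buffer_frames) {
    return 1000.0 * (double) buffer_frames / (double) rate_hz;
}
```

At 48000 Hz a period of 1024 frames means roughly 47 wakeups per second, while a 2 s buffer (96000 frames) refilled once per period would need only one wakeup every two seconds but adds 2000 ms of latency. With one fixed setting shared by all clients, somebody always loses: either the music player wakes up far too often, or the VoIP client gets far too much latency.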

If you want to know more about why I think that PulseAudio is an important part of the modern Linux desktop audio stack, please read my slides from FOSS.in 2007.

Misconceptions

Many people (like Jeffrey) wonder why have software mixing at all if you
have hardware mixing? The thing is, hardware mixing is a thing of the past,
modern soundcards don’t do it anymore. SIMD CPU extensions like SSE have been
invented precisely for doing things like mixing in software. Modern sound
cards these days are kind of “dumbed down”, high-quality DACs. They don’t do
mixing anymore, many modern chips don’t even do volume control anymore.
Remember the days when having a Wavetable chip was a killer feature of a
sound card? Those days are gone, today wavetable synthesis is done almost
exclusively in software, and that’s exactly what happened to hardware mixing
too. And it is good that way. With software mixing it is much easier to do
fancier stuff like DRC (dynamic range compression), which will increase the quality of mixing. And modern CPUs provide
all the necessary SIMD instruction sets to implement this efficiently.
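For readers who have never looked at it, the core of software mixing is tiny. The following is a minimal sketch in plain C of mixing two signed 16-bit streams with saturation. It is not PulseAudio’s actual mixing code, and the function names are made up for this example. It is a scalar loop; a real implementation would do several samples at once with SIMD instructions and would add per-stream volumes.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp a 32-bit intermediate sum to the signed 16-bit sample range,
 * so that loud passages clip instead of wrapping around. */
int16_t saturate_s16(int32_t v) {
    if (v > INT16_MAX)
        return INT16_MAX;
    if (v < INT16_MIN)
        return INT16_MIN;
    return (int16_t) v;
}

/* Mix n samples of streams a and b into out (hypothetical helper). */
void mix_s16(const int16_t *a, const int16_t *b, int16_t *out, size_t n) {
    for (size_t i = 0; i < n; i++)
        out[i] = saturate_s16((int32_t) a[i] + (int32_t) b[i]);
}
```

The saturation step is what keeps two loud streams from wrapping around into noise. Something like DRC would replace that hard clamp with a smarter gain computation, which is much easier to do in software than in fixed-function hardware.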

Other people believe that JACK would be a better solution for the problem.
This is nonsense. JACK has been designed for a very different purpose. It is
optimized for low latency inter-application communication. It requires
floating point samples, it knows nothing about channel mappings, it depends on
every client to behave correctly. And so on, and so on. It is a sound server
for audio production. For desktop applications it is however not well suited.
For a desktop saving power is very important, one application misbehaving
shouldn’t have an effect on other applications’ playback; converting from/to
FP all the time is not going to help battery life either. Please understand
that for the purpose of pro audio you can make completely different
compromises than you can on the desktop. For example, while having
‘glitch-free’ is great for embedded and desktop use, it makes no sense at all
for pro audio, and would only hurt performance. So, please stop
bringing up JACK again and again. It’s just not the right tool for desktop
audio, and this opinion is shared by the JACK developers themselves.
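To show what “converting from/to FP all the time” means in practice, here is a minimal sketch in C of the two conversions a 16-bit desktop stream would go through on its way into and out of a float-only sound server. The function names and the scaling by 32768 are assumptions chosen for this example, not JACK’s or PulseAudio’s actual code.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert signed 16-bit samples to floats in the range [-1.0, 1.0). */
void s16_to_float(const int16_t *in, float *out, size_t n) {
    for (size_t i = 0; i < n; i++)
        out[i] = (float) in[i] / 32768.0f;
}

/* Convert floats back to signed 16-bit samples, clamping values that
 * fall outside the representable range and rounding to nearest. */
void float_to_s16(const float *in, int16_t *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        float v = in[i] * 32768.0f;
        if (v > 32767.0f)
            v = 32767.0f;
        if (v < -32768.0f)
            v = -32768.0f;
        out[i] = (int16_t) (v < 0.0f ? v - 0.5f : v + 0.5f);
    }
}
```

Neither loop is expensive on its own, but on a desktop where most sources and most hardware speak 16-bit integers, both would run for every sample of every stream, all the time.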

Jeffrey thinks that audio mixing does not belong in userspace. Which is
basically what OSS4 tries to do: mixing in kernel space. However, the future
of PCM audio is floating point. Mixing floating point samples in kernel space is problematic because (at least on Linux) FP in kernel space is a no-no.
Also, the kernel people made clear more than once that maths/decoding/encoding like this
should happen in userspace. Quite honestly, doing the mixing in kernel space
is probably one of the primary reasons why I think that OSS4 is a bad idea.
The fancier your mixing gets (i.e. including resampling, upmixing, downmixing,
DRC, …) the more difficulties you will have moving such complex,
time-intensive code into the kernel.

Not every time your audio breaks is it PulseAudio’s fault alone. For
example, the original flame of Jeffrey’s was about the low volume that he
experienced when running PA. This is mostly due to the suckish way we
initialize the default volumes of ALSA sound cards. Most distributions have
simple scripts that initialize ALSA sound card volumes to fixed values like
75% of the available range, without understanding what the range or the
controls actually mean. This is actually a very bad thing to do. Integrated
USB speakers for example tend to export the full amplification range via the
mixer controls. 75% for them is incredibly loud. For other hardware (like
apparently Jeffrey’s) it is too low in volume. How to fix this has been
discussed on the ALSA mailing list, but no final solution has been presented
yet. Nonetheless, the fact that the volume was too low, is completely
unrelated to PulseAudio.
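Why a fixed 75% is a bad default becomes obvious with a bit of arithmetic. The sketch below, in C, describes a mixer control by its raw range and the dB values at both ends, and computes what 75% of the raw range means in dB. The struct, the function names and the example ranges are hypothetical, and the linear-in-dB mapping is a simplifying assumption; real hardware may map raw values to dB differently.

```c
/* Hypothetical description of a mixer control: a raw integer range plus
 * the dB values at the two ends of that range. */
struct mixer_range {
    long raw_min, raw_max;
    double db_min, db_max;
};

/* Raw control value at the given percentage of the raw range. */
long raw_at_percent(const struct mixer_range *r, double percent) {
    return r->raw_min +
        (long) ((r->raw_max - r->raw_min) * percent / 100.0 + 0.5);
}

/* dB value for a raw control value, assuming the control is linear in dB
 * across its range (an assumption made for this sketch). */
double db_at_raw(const struct mixer_range *r, long raw) {
    return r->db_min + (r->db_max - r->db_min) *
        (double) (raw - r->raw_min) / (double) (r->raw_max - r->raw_min);
}
```

Take two made-up controls that both run from raw 0 to 100: one spans -50 dB to 0 dB, the other spans -50 dB to +30 dB because it also exposes the amplifier. 75% lands at -12.5 dB on the first and at +10 dB on the second. Same script, same percentage, and one device is quiet while the other is painfully loud.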

PulseAudio interfaces with lower-level technologies like ALSA on one hand,
and with high-level applications on the other hand. Those systems are not
perfect. Especially closed-source applications tend to do very evil things
with the audio APIs (Flash!) that are very hard to support on virtualized
sound systems such as PulseAudio [2]. However, things are getting better. My list of issues I found in ALSA
is getting shorter. Many applications have already been fixed.

The reflex “my audio is broken it must be PulseAudio’s fault” is certainly
easy to come up with, but it certainly is not always right.

Also note that — like many areas in Free Software — development of the
desktop audio stack on Linux is a bit understaffed. AFAIK there are only two
people working on ALSA full-time and only me working on PulseAudio and other
userspace audio infrastructure, assisted by a few others who supply code and patches
from time to time, some more and some less.

More Breakage to Come

I now tried to explain why the audio experience on systems with PulseAudio
might not be as good as some people hoped, but what about the future? To be
frank: the next version of PulseAudio (0.9.11) will break even more things.
The ‘glitch-free’ stuff mentioned above uses quite a few features of the
underlying ALSA infrastructure that apparently no one has been using before,
and which just don’t work properly yet on all drivers. And there are quite a
few drivers around, and I only have a very limited set of hardware to test
with. Already I know that some of the most popular drivers (USB and HDA)
do not work entirely correctly with ‘glitch-free’.

So you ask why I plan to release this code knowing that it will break
things? Well, it works on some hardware/drivers properly, and for the others I
know work-arounds to get things to work. And 0.9.11 has been delayed for too
long already. Also I need testing from a bigger audience. And it is not so
much 0.9.11 that is buggy, it is the code it is based on. ‘Glitch-free’ PA
0.9.11 is going to be part of Fedora 10. Fedora has always been more bleeding
edge than other distributions. Picking 0.9.11 just like that for an
‘LTS’ release might however not be a good idea.

So, please bear with me when I release 0.9.11. Snapshots have already
been available in Rawhide for a while, and hell didn’t freeze over.

The Distributions’ Role in the Game

Some distributions did a better job adopting PulseAudio than others. On the
good side I certainly have to list Mandriva, Debian[3], and
Fedora[4]. OTOH Ubuntu didn’t exactly do a stellar job. They didn’t
do their homework. Adopting PA in a distribution is a fair amount of work,
given that it interfaces with so many different things at so many different
places. The integration with other systems is crucial. The information was all
out there, communicated on the wiki, the mailing lists and on the PA IRC
channel. But if you join and hang around on neither, then you won’t get the
memo. To my surprise when Ubuntu adopted PulseAudio they moved it into one of their
‘LTS’ releases right away [5]. Which I guess can be called gutsy,
given that I work for Red Hat and PulseAudio is not part of RHEL
at this time. I get a lot of flak from Ubuntu users, and I am pretty sure the
vast majority of it is undeserved and not my fault.

Why Jeffrey’s distro of choice (SUSE?) didn’t package pavucontrol 0.9.6
although it was released months ago I don’t know. But there’s certainly no reason to whine about
that to me and bash me for it.

Having said all this — it’s easy to point to other software’s faults or
other people’s failures. So, admitting this, PulseAudio is certainly not
bug-free, far from that. It’s a relatively complex piece of software
(threading, real-time, lock-free, sensitive to timing, …), and every
software has its bugs. In some workloads they might be easier to find than in
others. And I am working on fixing those which are found. I won’t forget any
bug report, but the order and priority I work on them is still mostly up to me
I guess, right? There’s still a lot of work to do in desktop audio, it will
take some time to get things completely right and complete.

Calls for “audio should just work ™” are often heard. But if you don’t
want to stick with a sound system that was state of the art in the 90’s for
all times, then I fear things *will have* to break from time to time. And
Jeffrey, I have no idea what you are actually hacking on. Some people
mentioned something with Evolution. If that’s true, then quite honestly,
“email should just work”, too, shouldn’t it? Evolution is not exactly
famous for its legendary bug-freeness and stability, or did I miss something?
Maybe you should be the one to start with making things “just work”, especially since
Evolution has been around for much longer already.

Back to Work

Now that I responded to Jeffrey’s FUD I think we all can go back to work
and end this flamefest! I wish people would actually try to understand
things before writing an insulting rant — without the slightest clue — but
with words like “clusterfuck”. I’d like to thank all the people who commented
on Jeffrey’s blog and basically already said what I wrote here
now.

So, and now I am off hacking on PulseAudio a bit more, or should
I say in Jeffrey’s words: on my clusterfuck that is an epic fail and that no desktop user needs?

Footnotes

[1] BTW ‘glitch-free’ is nothing I invented, other OSes have been doing something
like this for quite a while (Vista, Mac OS). On Linux however, PulseAudio is
the first and only implementation (at least to my knowledge).

[2] In fact, Flash 9 cannot be made to work fully on PulseAudio.
This is because the way Flash destructs its driver backends is racy.
Unfixably racy, from external code. Jeffrey complained about Flash instability
in his second post. This is unfair to PulseAudio, because I cannot fix this.
This is like complaining that X crashes when you use binary-only
fglrx.

[3] To Debian’s standards at least. Since development of Debian is
very distributed the integration of such a system as PulseAudio is much more
difficult since it touches so many different packages in the system that are
kind of private property by a lot of different maintainers with different
views on things.

[4] I maintain the Fedora stuff myself, so I might be a bit biased on this one… 😉

[5] I guess Ubuntu sees that this was a bit too much too early, too.
At least that’s how I understood my invitation to UDS in Prague. Since that
summit I haven’t heard anything from them anymore, though.

The Thing with Planet Fedora

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/fedora-people.html

A while ago
I posted a story on my blog which then appeared on Fedora Planet. In it
I expressed my doubts on the usefulness of the planet, due to its low
signal-to-noise ratio, due to the babel-like mix of languages. As a response to
this posting I got a lot of really dumb comments, both directly on the blog
story and by email. I was called “intolerant”, a “Nazi”, “stupid”, that I
should “revise my geography”, that I should go “fuck myself”, that I apparently
thought that the “world was USA property” [1]. Back then I thought
that there were just a few morons on the periphery of the community. But now, since this
incident happened, I have started to wonder if we might actually have a bigger
problem in the community.

I guess this is a good opportunity to pimp David Airlie’s alternative
Fedora aggregator which I find a very useful replacement for Fedora Planet.

Footnotes

[1] I am wondering though why people think that I am a monoglot
American? I am not. Neither monoglot, nor American. And if suggesting that I
was was intended as an insult, then I can only say that it insulted me far
less than the insulter might have thought…

Being Smart

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/being-smart.html

Last weekend I set myself the task to write an ATA S.M.A.R.T. (i.e. hard
disk health monitoring) reader and parser. After spending some time reading
all kinds of T13 and T10 docs and a bit of hacking I now present
you the following new software:

  • libatasmart: a lean, small and clean implementation of an ATA S.M.A.R.T.
    reading and parsing library. It’s fairly comprehensive, however I only support
    a subset of the full S.M.A.R.T. set of functions: those parts which made sense to
    me, not the esoteric stuff. Here’s the API and here’s the README.
  • skdump: a little tool that produces output similar to smartctl -a,
    but uses libatasmart.
  • sktest: a little tool for starting/aborting S.M.A.R.T. self-tests, based on libatasmart
  • gnome-disk-health-service: a little wrapper around
    libatasmart that exports its entire functionality via D-Bus, so that
    unprivileged processes can introspect a drive’s health records, including
    temperature, number of bad sectors and suchlike. This is written in Vala, which
    BTW is awesome for doing D-Bus services. Actually after having done this once now I
    really hope I will never have to write a D-Bus server without Vala again. I
    also wrote a Vala .vapi file for libatasmart which is shipped in
    its tarball.
  • gnome-disk-health: a little tool that reads the S.M.A.R.T.
    data from g-d-h-s and presents it in a pretty dialog. Includes support for
    viewing attributes and starting self-tests and stuff. Also written with
    Vala.

Why? You might ask what the point of all this stuff is where
smartmontools already
exists. What I’d like to see on future GNOME desktops is that as soon as a
disk starts to fail a notification bubble pops up warning the user about this
fact, and suggesting that he makes backups and replaces the disk. For a tight
integration into the desktop, a S.M.A.R.T. implementation that is small, and not
C++, and a library (i.e. embeddable into other software with a sane interface)
is highly preferable. Also, stuff like distribution installers should link
against libatasmart to warn the user about old, and defective disks
before he even starts the installation on them. (Hey, anaconda developers! That means you! It’s a tiny library, and all you need to do is a single call: int sk_disk_smart_status(SkDisk *d, SkBool *good);)

Please note that I certainly don’t plan to replace smartmontools.
libatasmart will always implement only a subset of S.M.A.R.T. If you want
the full set of functionality then please refer to smartmontools.

Where’s this going? I plan to fully maintain libatasmart
(including skdump and sktest) for the future. However
g-d-h and g-d-h-s will probably just bitrot in my repository
— unless someone else wants to pick this up and maintain it. The reason my
further interest in those tools is rather limited is that in the long run we
will hopefully see davidz’s DeviceKit-disks (screenshot)
changed to use this library for health monitoring. Then DK-d will export the
S.M.A.R.T. info on the bus, and a separate daemon would not be necessary anymore.
DK-d provides a single interface for all kinds of health parameters for
storage, including RAID health and suchlike. I thus think this is the way
forward and not g-d-h-s. (That should, of course, not stop anyone from stepping up
and taking up maintainership of g-d-h/g-d-h-s if he wants to. There might be good
reasons for doing so. Maybe because you need something to do, or because you
want a S.M.A.R.T. solution for the desktop now, and not wait until DeviceKit gets
pushed into all the distros).

So, here’s where you can get this stuff:

git://git.0pointer.de/libatasmart.git

git://git.0pointer.de/gnome-disk-health.git

Browse the GIT repos.

I will roll a 0.1 tarball of libatasmart soon. I’d be thankful if people could run
skdump on their disks and check if its output is basically the same as
smartctl -a‘s. Especially people with BE machines.

Of course the most important part of a software announcement is always the screenshot:

Smart-Ass!

return -ETOOMANYDOTS;

On Version Control Systems

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/on-version-control-systems.html

Here’s what I have to say about today’s state of version control systems in Free Software:

We shouldn’t forget that a VC system is just a development tool. Preferring
one over the other is nothing that has any direct influence on code quality, it
doesn’t make your algorithms perform any better, or your applications look
prettier. It’s just a tool. As such it should just do its job and get out of the
way. A programmer should have religious arguments about code quality, about
algorithms or about UIs, but what he certainly should not have is religious
arguments over the feature set of specific VCSes[1].

Does this mean it doesn’t matter at all which VCS to choose? No, of course
it does matter a lot. The step from traditional VCSes to DVCS is a major one, an
important one. Starting a fresh new Free Software project today and choosing
CVS or SVN is anachronistic at best.

Which leaves of course the question, which DVCS to pick. If you take the
“get out of the way” requirement seriously then there can only be one answer to
the question: GIT. Why? It certainly (still) has a steep learning curve, and a
steeper one than most other VC systems. But what is even harder to learn than
GIT is learning all of GIT, Mercurial, Monotone, Bizarre^H^H^H^H^H^H^HBazaar,
Darcs, Arch, SVK at the same time. If every project picked a different VCS
system, and you’d want to contribute to more than just a single project, then
you’d have to learn them all. And learning them all means learning them all not
very well. And needing to learn them all means scaring people away who don’t
want to learn yet another VCS just to check out your code. Fragmentation in use of VCSes for Free Software projects hinders development.

Which brings me to the main point I want to raise with this blog story:

It is much more important to make contributing to Free Software projects
easy by choosing a VCS everyone knows well — than it is to make it easy by
choosing a VCS that everyone could learn easily.

So, and which VCS is it that has a chance of qualifying as “everyone knows
well” and is a DVCS? I would say there is only one answer to the question: GIT.
Sure, there are some high-profile projects using HG (Mozilla, Java, Solaris),
but my impression is that the vast majority of projects that are central to
free desktops do use GIT.

Certainly, some DVCSes might be nicer than others, there might be areas
where GIT is lacking in comparison to others, but those differences are tiny.
What matters more is not scaring contributors away by making it hard for them
to contribute by requiring them to learn yet another VCS.

Yes, with CVS, SVN and GIT I think I have learned enough VC systems for now.
My hunger for learning further ones is exactly zero. Let me just code, and
don’t make it hard for me by asking me to learn your favourite one, please.

Or in other, frank words, if you start a new Open Source project today, and you
don’t choose GIT as VCS then you basically ask potential
contributors to go away.

ALSA recently switched from Mercurial to GIT. That was a good move.

So, please stop discussing which DVCS is the best one. It doesn’t matter. Picking one
that everyone knows is far more important.

That’s all I have to say.

Footnotes

[1] Of course, unless he himself develops a VC system.

FOMS 2009 CFP

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/foms-2009.html

And here’s another conference CFP, this time for Foundations of Open Media
Software 2009 (FOMS). It’s simply the best conference about multimedia on
free systems. Period.

It’s the third iteration now, and the first two were
plain awesome, so don’t miss this one. It happens in Hobart, Tasmania, next to linux.conf.au 2009.

FOMS Logo

Send in your paper! Attend! Spread the word!

Linux Plumbers Conference CFP

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/projects/plumbersconf.html

The Call for Papers for
the Linux Plumbers Conference
in September in Portland is out now. It’s a conference about the core
infrastructure of Linux systems: the part of the system where userspace and the
kernel interface. It’s the first conference where the focus is specifically on
getting together the kernel people who work on the userspace interfaces and the
userspace people who have to deal with kernel interfaces. It’s supposed to be a
place where all the people doing infrastructure work sit down and talk, so that
each other understands better what the requirements and needs of the other are,
and where we can work towards fixing the major problems we currently have with
our lower-level APIs.

I am running the Audio microconf of the Plumbers Conference. Audio
infrastructure on Linux is still heavily fragmented. Pro, desktop and embedded worlds are
almost completely separate worlds. While we have quite good driver support the
user experience is far from perfect, mostly because our infrastructure is
so balkanized. Join us at the Plumbers Conference and help to fix this! If you are doing audio infrastructure work on Linux, make sure to attend or — even better — submit a paper!

Sign up soon! Send in your paper early! The conference is expected to sell out pretty quickly!

Plumbers Logo

See you in Portland!