Tag Archives: VoIP

Gettys: The Blind Men and the Elephant

Post Syndicated from corbet original https://lwn.net/Articles/747084/rss

Jim Gettys provides an extensive look at the FQ_CoDel queue-management
algorithm as a big piece of the solution to bufferbloat problems. “Simple
‘request/response’ or time based protocols are preferentially scheduled
relative to bulk data transport. This means that your VOIP packets, your
TCP handshakes, cryptographic associations, your button press in your game,
your DHCP or other basic network protocols all get preferential service
without the complexity of extensive packet classification, even under very
heavy load of other ongoing flows. Your phone call can work well despite
large downloads or video use.”
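The preferential treatment Gettys describes comes from fq_codel's two-tier deficit round robin: flows that have just become active are served before long-running bulk flows. Here is a toy Python sketch of that idea (a deliberately simplified model, not the kernel implementation; all names are invented for illustration, and the per-queue CoDel dropping is omitted):

```python
from collections import deque

QUANTUM = 1514  # roughly one MTU worth of credit per round, as in fq_codel

class Flow:
    def __init__(self, name):
        self.name = name
        self.packets = deque()  # queued packet sizes, in bytes
        self.deficit = 0

class FQScheduler:
    """Toy model of fq_codel's two-tier round robin: flows that just
    became active ("new") are served before long-running bulk ("old")
    flows, which is what gives sparse request/response traffic its
    low latency."""
    def __init__(self):
        self.flows = {}
        self.new_flows = deque()
        self.old_flows = deque()

    def enqueue(self, name, size):
        f = self.flows.get(name)
        if f is None:
            f = self.flows[name] = Flow(name)
        if f not in self.new_flows and f not in self.old_flows:
            f.deficit = QUANTUM
            self.new_flows.append(f)   # fresh flows jump the line
        f.packets.append(size)

    def dequeue(self):
        while self.new_flows or self.old_flows:
            src = self.new_flows or self.old_flows
            f = src[0]
            if not f.packets:
                src.popleft()          # flow drained; forget it
                continue
            if f.deficit <= 0:
                f.deficit += QUANTUM   # out of credit: rotate to old list
                src.popleft()
                self.old_flows.append(f)
                continue
            size = f.packets.popleft()
            f.deficit -= size
            return f.name, size
        return None
```

In this model a small VoIP packet arriving in the middle of a large download is dequeued ahead of the remaining bulk packets, because its flow enters the new-flows list while the exhausted bulk flow rotates to the old-flows list.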

Security Flaws in 4G VoLTE

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/06/security_flaws_1.html

Research paper: “Subscribers remote geolocation and tracking using 4G VoLTE enabled Android phone,” by Patrick Ventuzelo, Olivier Le Moal, and Thomas Coudray.

Abstract: VoLTE (Voice over LTE) is a technology implemented by many operators over the world. Unlike previous 2G/3G technologies, VoLTE offers the possibility to use the end-to-end IP networks to handle voice communications. This technology uses VoIP (Voice over IP) standards over IMS (IP Multimedia Subsystem) networks. In this paper, we will first introduce the basics of VoLTE technology. We will then demonstrate how to use an Android phone to communicate with VoLTE networks and what normal VoLTE communications look like. Finally, we will describe different issues and implementations’ problems. We will present vulnerabilities, both passive and active, and attacks that can be done using VoLTE Android smartphones to attack subscribers and operators’ infrastructures. Some of these vulnerabilities are new and not previously disclosed: they may allow an attacker to silently retrieve private pieces of information on targeted subscribers, such as their geolocation.

News article. Slashdot thread.

Inmates Secretly Build and Network Computers while in Prison

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/05/inmates_secretl.html

This is kind of amazing:

Inmates at a medium-security Ohio prison secretly assembled two functioning computers, hid them in the ceiling, and connected them to the Marion Correctional Institution’s network. The hard drives were loaded with pornography, a Windows proxy server, VPN, VOIP and anti-virus software, the Tor browser, password hacking and e-mail spamming tools, and the open source packet analyzer Wireshark.

Another article.

Clearly there’s a lot about prison security, or the lack thereof, that I don’t know. This article reveals some of it.

2016-12-31 year in review

Post Syndicated from Vasil Kolev original https://vasil.ludost.net/blog/?p=3335

And it's time for the standard yearly recap…

– I'm going to be a father. I think that's quite a surprise for everyone, even a little for Elena 🙂
– We pulled off OpenFest 2016 – a lot of work, a lot of things; overall it ate up a serious part of my time for the year;
– I had a fair amount of work around initLab, which is still surviving;
– I ran a few BGP workshops (and maybe I should run another one soon, while I still can);
– I worked on the video for FOSDEM 2016, and will again next year (we have built quite an interesting setup; anyone interested can see it on github);
– Basically, to test the FOSDEM gear, we went and did video/audio recording and streaming of State of the Map 2016;
– I gave one talk (at PlovdivConf) and planned another on theoretical politics, which I couldn't finish (in general I now have at least 5 half-written talks pending);
– Pieter Hintjens died;
– Hard to believe, but I didn't write anything more this year…

And so it stays on record, here is the list of my pending talks:
– “Theoretical politics” (voting/decision-making systems, trade-offs, etc.);
– Debug workshop;
– VoIP/asterisk workshop (or a whole course);
– Applying systems-administration principles in real life;
– A presentation of initLab with lots of pictures;
– Cryptography for journalists;
– Setting up and maintaining your own mail server;

I think I had ideas for a few more. One fine day I may finish writing at least one or two of them and post them on the blog, since time for giving talks is hard to find 🙂
(and I would really like to systematize something on the topic of systems made of people, among other things, but that won't happen in the next year or two)

Eavesdropping on Typing Over Voice-Over-IP

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2016/10/eavesdropping_o_6.html

Interesting research: “Don’t Skype & Type! Acoustic Eavesdropping in Voice-Over-IP”:

Abstract: Acoustic emanations of computer keyboards represent a serious privacy issue. As demonstrated in prior work, spectral and temporal properties of keystroke sounds might reveal what a user is typing. However, previous attacks assumed relatively strong adversary models that are not very practical in many real-world settings. Such strong models assume: (i) adversary’s physical proximity to the victim, (ii) precise profiling of the victim’s typing style and keyboard, and/or (iii) significant amount of victim’s typed information (and its corresponding sounds) available to the adversary.

In this paper, we investigate a new and practical keyboard acoustic eavesdropping attack, called Skype & Type (S&T), which is based on Voice-over-IP (VoIP). S&T relaxes prior strong adversary assumptions. Our work is motivated by the simple observation that people often engage in secondary activities (including typing) while participating in VoIP calls. VoIP software can acquire acoustic emanations of pressed keystrokes (which might include passwords and other sensitive information) and transmit them to others involved in the call. In fact, we show that very popular VoIP software (Skype) conveys enough audio information to reconstruct the victim’s input – keystrokes typed on the remote keyboard. In particular, our results demonstrate that, given some knowledge of the victim’s typing style and the keyboard, the attacker attains top-5 accuracy of 91.7% in guessing a random key pressed by the victim. (The accuracy goes down to a still alarming 41.89% if the attacker is oblivious to both the typing style and the keyboard.) Finally, we provide evidence that the Skype & Type attack is robust to various VoIP issues (e.g., Internet bandwidth fluctuations and presence of voice over keystrokes), thus confirming the feasibility of this attack.
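The attack pipeline the abstract describes boils down to: capture keystroke audio from the VoIP channel, extract acoustic features per keystroke, and rank candidate keys by similarity to profiled examples. A deliberately crude sketch of that shape follows (nearest-centroid ranking over toy features on synthetic "keystroke" signals; the paper's actual features and classifiers are far more sophisticated, and every name here is invented for illustration):

```python
import math

def keystroke_features(samples):
    """Crude acoustic features for one keystroke sound: zero-crossing
    count (tracks the dominant frequency) and RMS energy. The paper
    uses much richer spectral features; this only demos the shape."""
    zc = sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return (zc, rms)

def train_centroids(labeled):
    """labeled: dict key -> list of recordings; average the features
    to get one profile ("centroid") per key."""
    cents = {}
    for key, sounds in labeled.items():
        feats = [keystroke_features(s) for s in sounds]
        cents[key] = tuple(sum(f[i] for f in feats) / len(feats)
                           for i in range(2))
    return cents

def top_k(cents, sound, k=5):
    """Rank candidate keys by feature distance -- the same setup as
    the paper's 'top-5 accuracy' metric."""
    f = keystroke_features(sound)
    return sorted(cents, key=lambda key:
                  sum((a - b) ** 2 for a, b in zip(cents[key], f)))[:k]

def tone(freq, n=512, rate=8000):
    """Stand-in for a recorded keystroke: a pure sine burst."""
    return [math.sin(2 * math.pi * freq * i / rate) for i in range(n)]
```

With a profile trained per key, the attacker only needs the ranked guesses to be right in the top few, which is exactly what the reported top-5 numbers measure.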

News article.

2016-04-22

Post Syndicated from Vasil Kolev original https://vasil.ludost.net/blog/?p=3300

I should write more regularly, so things don't turn into a hodgepodge like the one below.

As usual, I'm on a debugging streak. From the past week, the stranger things I can recall are:
– a build of an Android image (for something made and written by (incompetent) Chinese developers);
– Java/groovy;
– Python;
– And the usual amount of VoIP kludges.
To complete the picture, I'm thinking of sitting down to bring up the VAX that's sitting around in initLab.

Tonight I went to a concert by “band H.”, people who play Tool. They did a decent job, even though their rhythm limped at times (which is not surprising – Tool are surprisingly nasty to play).
(for some reason there were 3 concerts that same day – band H., Smallman and Irfan; the choice wasn't easy)
(a random person recognized me at the concert and said how much he appreciates the various projects, like initLab, that we do)

I finally sat down to set up letsencrypt certificates for the things on marla, and managed to roll them out to half of them before I hit letsencrypt's resource limit. I'll finish the rest next week. I scripted it with acme-tiny, based on something Petko had sketched for the lab; it turns out to be relatively simple (if the config is laid out properly) to parse the apache config and work out exactly which certificates need to be generated for whom.
(I found a bucketload of things that are no longer hosted with me and cleaned them up)

I have collected the results from the test of the FOSDEM equipment (a comparison of the stream recording and the encoded result, from our setup and from FOSDEM's), and overall, with a bit more tweaking, this may turn out to be easy enough for smaller conferences (the ones we don't drag 6-7-8 people of the team to).

These days they uploaded something interesting to opendata.government.bg again (this time – the entire commercial register from 2008 until now) and once again overloaded the poor virtual machine (which is a debian on top of microsoft hyper-v). I'm considering some magic so it can survive such things more easily, since the interesting data isn't expected to get any scarcer.

For anyone interested, the snooker world championship is on, and it's decently entertaining. The other day we were riding the metro, tethered through my phone, watching the ongoing match on a tablet…
(long live technology)

And I can't remember anything else happening to me.
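The apache-config parsing mentioned above can be surprisingly small. A hypothetical sketch (not Petko's actual script; it assumes a tidy config where every vhost carries its own ServerName/ServerAlias lines):

```python
import re

def cert_domains(apache_conf_text):
    """Collect, per VirtualHost, the ServerName plus any ServerAlias
    entries -- i.e. the set of names one certificate should cover."""
    certs = []
    for vhost in re.findall(r"<VirtualHost[^>]*>(.*?)</VirtualHost>",
                            apache_conf_text, re.S):
        names = re.findall(r"^\s*ServerName\s+(\S+)", vhost, re.M)
        # ServerAlias may list several names on one line
        names += [n for line in
                  re.findall(r"^\s*ServerAlias\s+(.+)$", vhost, re.M)
                  for n in line.split()]
        if names:
            certs.append(names)
    return certs
```

Each returned list is then one ACME order: the first name as the subject, the rest as SANs.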

Be Sure to Comment on FCC’s NPRM 14-28

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2014/06/04/fcc-14-28.html

I remind everyone today, particularly USA Citizens, to be sure to comment
on the FCC’s Notice of Proposed Rulemaking (NPRM) 14-28. They even did a
sane thing and provided an email address you can write to rather than
using their poorly designed web forums, but PC Magazine published
relatively complete instructions for other ways.
The deadline isn’t for a while yet, but it’s worth getting it done so you
don’t forget. Below is my letter in case anyone is interested.

Dear FCC Commissioners,

I am writing in response to NPRM 14-28 — your request for comments regarding
the “Open Internet”.

I am a trained computer scientist and I work in the technology industry.
(I’m a software developer and software freedom activist.) I have subscribed
to home network services since 1989, starting with the Prodigy service, and
switching to Internet service in 1991. Initially, I used a PSTN single-pair
modem and eventually upgraded to DSL in 1999. I still have a DSL line, but
it’s sadly not much faster than the one I had in 1999, and I explain below
why.

In fact, I’ve watched the situation get progressively worse, not better,
since the Telecommunications Act of 1996. While my download speeds are
a little bit faster than they were in the late 1990s, I now pay
substantially more for only small increases in upload speed, even in
major urban markets. In short, it’s become increasingly more difficult
to actually purchase true Internet connectivity service anywhere in the
USA. But first, let me explain what I mean by “true Internet
connectivity”.

The Internet was created as a peer-to-peer medium where all nodes were
equal. In the original design of the Internet, every device has its own
IP address and, if the user wanted, that device could be addressed
directly and fully by any other device on the Internet. For its part,
the network between the two nodes was intended to merely move the
packets between those nodes as quickly as possible — treating all those
packets the same way, and analyzing those packets only with publicly
available algorithms that everyone agreed were correct and fair.

Of course, the companies who typically appeal to (or even fight) the FCC
want the true Internet to simply die. They seek to turn the promise of
a truly peer-to-peer network of equality into a traditional broadcast
medium that they control. They frankly want to manipulate the Internet
into a mere television broadcast system (with the only improvement to
that being “more stations”).

Because of this, the three following features of the Internet —
inherent in its design — are now extremely difficult for
individual home users to purchase at reasonable cost from so-called
“Internet providers” like Time Warner, Verizon, and Comcast:

A static IP address, which allows the user to be a true, equal node on
the Internet. (And, related: IPv6 addresses, which could end the claim
that static IP addresses are a precious resource.)

An unfiltered connection, that allows the user to run their own
webserver, email server and the like. (Most of these companies block TCP
ports 80 and 25 at the least, and usually many more ports, too).

Reasonable choices between the upload/download speed tradeoff.

For example, in New York, I currently pay nearly $150/month to an
independent ISP just to have a static, unfiltered IP address with 10
Mbps down and 2 Mbps up. I work from home and the 2 Mbps up is
incredibly slow for modern usage. However, I still live in the Slowness
because upload speeds greater than that are extremely price-restrictive
from any provider.

In other words, these carriers have designed their networks to
prioritize all downloading over all uploading, and to purposely place
the user behind many levels of Network Address Translation and network
filtering. In this environment, many Internet applications simply do
not work (or require complex work-arounds that disable key features).
As an example: true diversity in VoIP accessibility and service has
almost entirely been superseded by proprietary single-company services
(such as Skype) because SIP, designed by the IETF (in part) for VoIP
applications, did not fully anticipate that nearly every user would be
behind NAT and unable to use SIP without complex work-arounds.

I believe this disastrous situation centers around problems with the
Telecommunications Act of 1996. While the ILECs are theoretically
required to license network infrastructure fairly at bulk rates to
CLECs, I’ve frequently seen — both professionally and personally —
wars waged against CLECs by ILECs. CLECs simply can’t offer their own
types of services that merely “use” the ILECs’ connectivity. The
technical restrictions placed by ILECs force CLECs to offer the same
style of service the ILEC offers, and at a higher price (to cover
their additional overhead in dealing with the ILECs)! It’s no wonder
there are hardly any CLECs left.

Indeed, in my 25 year career as a technologist, I’ve seen many nasty
tricks by Verizon here in NYC, such as purposeful work-slowdowns in
resolution of outages and Verizon technicians outright lying to me and
to CLEC technicians about the state of their network. For my part, I
stick with one of the last independent ISPs in NYC, but I suspect they
won’t be able to keep their business going for long. Verizon either (a)
buys up any CLEC that looks too powerful, or, (b) if Verizon can’t buy
them, Verizon slowly squeezes them out of business with dirty tricks.

The end result is that we don’t have real options for true Internet
connectivity for home or on-site business use. I’m already priced
out of getting a 10 Mbps upload with a static IP and all ports usable.
I suspect within 5 years, I’ll be priced out of my current 2 Mbps upload
with a static IP and all ports usable.

I realize the problems that most users are concerned about on this issue
relate to their ability to download bytes from third-party companies
like Netflix. Therefore, it’s all too easy for Verizon to play out this
argument as if it’s big companies vs. big companies.

However, the real fallout from the current system is that the cost for
personal Internet connectivity that allows individuals equal existence
on the network is so high that few bother. The consequence, thus, is
that only those who are heavily involved in the technology industry even
know what types of applications would be available if everyone had a
static IP with all ports usable and equal upload and download speeds
of 10 Mbps or higher.

Yet, that’s the exact promise of network connectivity that I was taught
about as an undergraduate in Computer Science in the early 1990s. What
I see today is the dystopian version of the promise. My generation of
computer scientists have been forced to constrain their designs of
Internet-enabled applications to fit a model that the network carriers
dictate.

I realize you can’t possibly fix all these social ills in the network
connectivity industry with one rule-making, but I hope my comments have
perhaps given a slightly different perspective of what you’ll hear from
most of the other commenters on this issue. I thank you for reading my
comments and would be delighted to talk further with any of your staff
about these issues at your convenience.

Sincerely,

Bradley M. Kuhn,
a citizen of the USA since birth, currently living in New York, NY.

PulseAudio and Jack

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/projects/when-pa-and-when-not.html


One thing became very clear to me during my trip to the Linux Audio Conference 2010
in Utrecht: even many pro audio folks are not sure what Jack does that PulseAudio doesn’t do and what
PulseAudio does that Jack doesn’t do; why they are not competing, why
you cannot replace one by the other, and why merging them (at least in
the short term) might not make immediate sense. In other words, why
millions of phones in this world run PulseAudio and not Jack, and why
a music studio running PulseAudio is crack.

To light this up a bit and for future reference I’ll try to explain in the
following text why there is this separation between the two systems and why this isn’t
necessarily bad. This is mostly a written-up version of (parts of) my slides
from LAC, so if you attended that event you might find little new, but I hope
it is interesting nonetheless.

This is mostly written from my perspective as a hacker working on
consumer audio stuff (more specifically having written most of
PulseAudio), but I am sure most pro audio folks would agree with the
points I raise here, and have more things to add. What I explain below
is in no way comprehensive, just a list of a couple of points I think
are the most important, as they touch the very core of both
systems (and we ignore all the toppings here, i.e. sound effects, yadda, yadda).

First of all let’s clear up the background of the sound server use cases here:

Consumer Audio (i.e. PulseAudio) vs. Pro Audio (i.e. Jack):

– Consumer: Reducing power usage is a defining requirement; most systems are battery powered (laptops, cell phones). Pro: Power usage is usually not an issue; power comes out of the wall.

– Consumer: Must support latencies low enough for telephony and games, but also covers high-latency uses such as movie and music playback (2 s of latency is a good choice). Pro: Minimal latencies are a defining requirement.

– Consumer: System is highly dynamic, with applications starting/stopping and hardware added and removed all the time. Pro: System is usually static in its configuration during operation.

– Consumer: User is usually not proficient in the technologies used.[1] Pro: User is usually a professional and knows audio technology and computers well.

– Consumer: User is not necessarily the administrator of his machine and might have limited access. Pro: User usually administrates his own machines and has root privileges.

– Consumer: Audio is just one use of the system among many, and often just a background job. Pro: Audio is the primary purpose of the system.

– Consumer: Hardware tends to have limited resources and be crappy and cheap. Pro: Hardware is powerful, expensive and high quality.

Of course, things are often not as black and white as this; there are uses
that fall in the middle of these two areas.

From the table above a few conclusions may be drawn:

A consumer sound system must support both low and high latency operation.
Since low latencies mean high CPU load and hence high power
consumption[2] (Heisenberg…), a system should always run with the
highest latency possible, but the lowest latency necessary.

Since the consumer system is highly dynamic in its use, latencies must be
adjusted dynamically too. That makes a design such as PulseAudio’s timer-based scheduling important.

A pro audio system’s primary optimization target is low latency. Low
power usage, dynamically changeable configuration (i.e. a short drop-out while you
change your pipeline is acceptable) and user-friendliness may be sacrificed for
that.

For large buffer sizes a zero-copy design suggests itself: since data
blocks are large the cache pressure can be considerably reduced by zero-copy
designs. Only for large buffers is the cost of passing pointers around
considerably smaller than the cost of passing around the data itself (or the
other way round: if your audio data has the same size as your pointers, then
passing pointers around is useless extra work).
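The zero-copy point can be made concrete with a toy reference-counted buffer: every consumer shares a pointer to the same block, and only the final unref releases the data. This is purely an illustration of the principle under simplified assumptions — real implementations (e.g. PulseAudio's memblocks) live in shared memory with atomic reference counts:

```python
class MemBlock:
    """A reference-counted chunk of audio data. Pipelines pass these
    blocks (really just pointers) around instead of copying samples,
    which is what makes large-buffer operation cheap."""
    def __init__(self, data):
        self.data = data
        self.refs = 1

    def ref(self):
        self.refs += 1
        return self

    def unref(self):
        self.refs -= 1
        if self.refs == 0:
            self.data = None  # stand-in for freeing the block

class Stream:
    """A consumer that queues shared blocks rather than copies."""
    def __init__(self):
        self.queue = []

    def push(self, block):
        self.queue.append(block.ref())   # share, don't copy

    def pop(self):
        return self.queue.pop(0)
```

One block pushed to two streams is never duplicated; the data survives until the producer and both consumers have dropped their references.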

On a resource constrained system the ideal audio pipeline does not touch
and convert the data passed along it unnecessarily. That makes it important to
support natively the sample types and interleaving modes of the audio source or
destination.

A consumer system needs to simplify the view on the hardware and hide its
complexity: hide redundant mixer elements, or merge them while making use of
the hardware capabilities, and extending it in software so that the same
functionality is provided on all hardware. A production system should not hide
or simplify the hardware functionality.

A consumer system should not drop out when a client misbehaves or the
configuration changes (OTOH if it happens in exceptional cases it is not disastrous
either). A synchronous pipeline is hence not advisable; clients need to supply
their data asynchronously.

In a pro audio system a drop-out during reconfiguration is acceptable,
during operation unacceptable.

In consumer audio we need to make compromises on resource usage,
which pro audio does not have to commit to. Example: a pro audio
system can issue memlock() with little limitations since the
hardware is powerful (i.e. a lot of RAM available) and audio is the
primary purpose. A consumer audio system cannot do that because that
call practically makes memory unavailable to other applications,
increasing their swap pressure. And since audio is not the primary
purpose of the system and resources limited we hence need to find a
different way.

Jack has been designed for low latencies, where synchronous
operation is advisable, meaning that a misbehaving client can stall
the entire pipeline. Changes of the pipeline or latencies usually
result in drop-outs in one way or the other, since the entire pipeline
is reconfigured, from the hardware to the various clients. Jack only
supports FLOAT32 samples and non-interleaved audio channels (and that
is a good thing). Jack does not employ reference-counted zero-copy
buffers. It does not try to simplify the hardware mixer in any
way.

PulseAudio OTOH can deal with varying latencies, dynamically
adjusting to the lowest latency any of the connected clients
needs. Client communication is fully asynchronous; a single client
cannot stall the entire pipeline. PulseAudio supports a variety of PCM
formats and channel setups. PulseAudio’s design is heavily based on
reference-counted zero-copy buffers that are passed around, even
between processes, instead of the audio data itself. PulseAudio tries
to simplify the hardware mixer as suggested above.

Now, the two paragraphs above hopefully show how Jack is more
suitable for the pro audio use case and PulseAudio more for the
consumer audio use case. One question arises, though: can we marry
the two approaches? Yes, we probably can; MacOS has a unified approach
for both uses. However, it is not clear this would be a good
idea. First of all, a system with the complexities introduced by
sample format/channel mapping conversion, as well as dynamically
changing latencies and pipelines, and asynchronous behaviour would
certainly be much less attractive to pro audio developers. In fact,
that Jack limits itself to synchronous, FLOAT32-only,
non-interleaved-only audio streams is one of the big features of its
design. Marrying the two approaches would compromise that. A merged
solution would probably not be well received by the community.

But it goes even further than this: what would the use case for
this be? After all, most of the time, you don’t want your event
sounds, your Youtube, your VoIP and your Rhythmbox mixed into the new
record you are producing. Hence a clear separation between the two
worlds might even be handy?

Also, let’s not forget that we lack the manpower to even create
such an audio chimera.

So, where to from here? Well, I think we should put the focus on
cooperation instead of amalgamation: teach PulseAudio to go out of the
way as soon as Jack needs access to the device, and optionally make
PulseAudio a normal JACK client while both are running. That way, the
user has the option to use the PulseAudio supplied streams, but
normally does not see them in his pipeline. The first part of this has
already been implemented: Jack2 and PulseAudio do not fight for the
audio device, a friendly handover takes place. Jack takes precedence,
PulseAudio takes the back seat. The second part is still missing: you
still have to manually hookup PulseAudio to Jack if you are interested
in its streams. If both are implemented starting Jack basically has
the effect of replacing PulseAudio’s core with the Jack core, while
still providing full compatibility with PulseAudio clients.

And that I guess is all I have to say on the entire Jack and
PulseAudio story.

Oh, one more thing, while we are at clearing things up: some news
sites claim that PulseAudio’s not necessarily stellar reputation in
some parts of the community comes from Ubuntu and other distributions
having integrated it too early. Well, let me stress here explicitly,
that while they might have made a mistake or two in packaging
PulseAudio and I publicly pointed that out (and probably not in a too
friendly way), I do believe that the point in time they adopted it was
right. Why? Basically, it’s a chicken and egg problem. If it is not
used in the distributions it is not tested, and there is no pressure
to get fixed what then turns out to be broken: in PulseAudio itself,
and in both the layers on top and below of it. Don’t forget that
pushing a new layer into an existing stack will break a lot of
assumptions that the neighboring layers made. Doing this must
break things. Most Free Software projects could probably use more
developers, and that is particularly true for Audio on Linux. And
given that that is how it is, pushing the feature in at that point in
time was the right thing to do. Or in other words, if the features are
right, and things do work correctly as far as the limited test base
the developers control shows, then one day you need to push into the
distributions, even if this might break setups and software that
previously has not been tested, unless you want to stay stuck in your
development indefinitely. So yes, Ubuntu, I think you did well with
adopting PulseAudio when you did.

Footnotes

[1] Side note: yes, consumers tend not to know what dB is, and expect
volume settings in “percentages”, a mostly meaningless unit in
audio. This even spills into projects like VLC or Amarok which expose
linear volume controls (which is a really bad idea).

[2] In case you are wondering why that is the case: if the latency is
low the buffers must be sized smaller. And if the buffers are sized smaller
then the CPU will have to wake up more often to fill them up for the same
playback time. This drives up the CPU load since less actual payload can be
processed for the amount of housekeeping that the CPU has to do during each
buffer iteration. Also, frequent wake-ups make it impossible for the CPU to go
to deeper sleep states. Sleep states are the primary way for modern CPUs
to save power.
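The arithmetic behind this footnote is easy to check. Assuming a 48 kHz stream and a double-buffered (two-period) setup — illustrative numbers only, not any particular driver's defaults:

```python
def wakeups_per_second(latency_s, sample_rate=48000, periods=2):
    """With a total buffer sized for the target latency and split into
    `periods` fill intervals, the CPU must wake up once per period to
    refill the buffer in time."""
    buffer_frames = latency_s * sample_rate   # total frames buffered
    period_frames = buffer_frames / periods   # frames per refill
    return sample_rate / period_frames        # refills per second

# 2 s of buffered music playback: one wakeup per second
assert wakeups_per_second(2.0) == 1.0
# 10 ms VoIP-grade latency: 200 wakeups per second
assert wakeups_per_second(0.010) == 200.0
```

Going from movie-playback latency to telephony latency thus costs a couple of orders of magnitude in wakeup frequency, which is exactly why a consumer system wants the highest latency it can get away with.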

PulseAudio and Jack

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/projects/when-pa-and-when-not.html

#nocomments yes

One thing became very clear to me during my trip to the Linux Audio Conference 2010
in Utrecht: even many pro audio folks are not sure what Jack does that PulseAudio doesn’t do and what
PulseAudio does that Jack doesn’t do; why they are not competing, why
you cannot replace one by the other, and why merging them (at least in
the short term) might not make immediate sense. In other words, why
millions of phones on this world run PulseAudio and not Jack, and why
a music studio running PulseAudio is crack.

To light this up a bit and for future reference I’ll try to explain in the
following text why there is this seperation between the two systems and why this isn’t
necessarily bad. This is mostly a written up version of (parts of) my slides
from LAC
, so if you attended that event you might find little new, but I hope
it is interesting nonetheless.

This is mostly written from my perspective as a hacker working on
consumer audio stuff (more specifically having written most of
PulseAudio), but I am sure most pro audio folks would agree with the
points I raise here, and have more things to add. What I explain below
is in no way comprehensive, just a list of a couple of points I think
are the most important, as they touch the very core of both
systems (and we ignore all the toppings here, i.e. sound effects, yadda, yadda).

First of all let’s clear up the background of the sound server use cases here:

Consumer Audio (i.e. PulseAudio)Pro Audio (i.e. Jack)
Reducing power usage is a defining requirement, most systems are battery powered (Laptops, Cell Phones).Power usage usually not an issue, power comes out of the wall.
Must support latencies low enough for telephony and
games. Also covers high latency uses, such as movie and music playback
(2s of latency is a good choice).
Minimal latencies are a
definining requirement.
System is highly dynamic, with applications starting/stopping, hardware added and removed all the time.System is usually static in its configuration during operation.
User is usually not proficient in the technologies used.[1]User is usually a professional and knows audio technology and computers well.
User is not necessarily the administrator of his machine, might have limited access.User usually administrates his own machines, has root privileges.
Audio is just one use of the system among many, and often just a background job.Audio is the primary purpose of the system.
Hardware tends to have limited resources and be crappy and cheap.Hardware is powerful, expensive and high quality.

Of course, things are often not as black and white like this, there are uses
that fall in the middle of these two areas.

From the table above a few conclusions may be drawn:

  • A consumer sound system must support both low and high latency operation.
    Since low latencies mean high CPU load and hence high power
    consumption[2] (Heisenberg…), a system should always run with the
    highest latency latency possible, but the lowest latency necessary.
  • Since the consumer system is highly dynamic in its use latencies must be
    adjusted dynamically too. That makes a design such as PulseAudio’s timer-based scheduling important.
  • A pro audio system’s primary optimization target is low latency. Low
    power usage, dynamic changeble configuration (i.e. a short drop-out while you
    change your pipeline is acceptable) and user-friendliness may be sacrificed for
    that.
  • For large buffer sizes a zero-copy design suggests itself: since data
    blocks are large the cache pressure can be considerably reduced by zero-copy
    designs. Only for large buffers the cost of passing pointers around is
    considerable smaller than the cost of passing around the data itself (or the
    other way round: if your audio data has the same size as your pointers, then
    passing pointers around is useless extra work).
  • On a resource constrained system the ideal audio pipeline does not touch
    and convert the data passed along it unnecessarily. That makes it important to
    support natively the sample types and interleaving modes of the audio source or
    destination.
  • A consumer system needs to simplify the view on the hardware, hide the its
    complexity: hide redundant mixer elements, or merge them while making use of
    the hardware capabilities, and extending it in software so that the same
    functionality is provided on all hardware. A production system should not hide
    or simplify the hardware functionality.
  • A consumer system should not drop out when a client misbehaves or the
    configuration changes (OTOH if it happens in exceptional cases it is not
    disastrous either). A synchronous pipeline is hence not advisable; clients
    need to supply their data asynchronously.
  • In a pro audio system a drop-out during reconfiguration is acceptable,
    during operation unacceptable.
  • In consumer audio we need to make compromises on resource usage
    which pro audio does not have to make. Example: a pro audio
    system can call mlock() with few limitations since the
    hardware is powerful (i.e. a lot of RAM available) and audio is the
    primary purpose. A consumer audio system cannot do that because that
    call practically makes memory unavailable to other applications,
    increasing their swap pressure. And since audio is not the primary
    purpose of the system and resources are limited, we need to find a
    different way.
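The zero-copy point above can be illustrated with a toy reference-counted memory block. The names and structure here are purely illustrative, not PulseAudio's actual memblock implementation:

```python
class MemBlock:
    """Toy reference-counted audio block: consumers share one buffer
    instead of copying the payload (hypothetical, not PA's real API)."""
    def __init__(self, data: bytes):
        self.data = data          # the actual audio payload, stored once
        self.refs = 1             # reference count

    def ref(self):
        self.refs += 1
        return self

    def unref(self):
        self.refs -= 1
        if self.refs == 0:
            self.data = None      # last owner gone: release the payload

# One 64 KiB block handed to three sinks: three pointer-sized references
# move around instead of 3 x 64 KiB of copied audio data.
block = MemBlock(b"\x00" * 65536)
sinks = [block.ref() for _ in range(3)]
assert all(s.data is block.data for s in sinks)   # no copies made
```

This is why zero-copy only pays off for large buffers: the bookkeeping above costs roughly the same regardless of payload size, so for tiny fragments it is pure overhead.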

Jack has been designed for low latencies, where synchronous
operation is advisable, meaning that a misbehaving client can stall
the entire pipeline. Changes of the pipeline or latencies usually
result in drop-outs in one way or the other, since the entire pipeline
is reconfigured, from the hardware to the various clients. Jack only
supports FLOAT32 samples and non-interleaved audio channels (and that
is a good thing). Jack does not employ reference-counted zero-copy
buffers. It does not try to simplify the hardware mixer in any
way.

PulseAudio OTOH can deal with varying latencies, dynamically
adjusting to the lowest latency any of the connected clients
needs. Client communication is fully asynchronous, a single client
cannot stall the entire pipeline. PulseAudio supports a variety of PCM
formats and channel setups. PulseAudio’s design is heavily based on
reference-counted zero-copy buffers that are passed around, even
between processes, instead of the audio data itself. PulseAudio tries
to simplify the hardware mixer as suggested above.
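The dynamic latency adjustment described above boils down to a simple policy. This is a hypothetical sketch of that policy, not PulseAudio's real code; the hardware limits are made-up example values:

```python
def effective_sink_latency_ms(client_requests_ms, hw_min_ms=1.0, hw_max_ms=2000.0):
    """Pick the sink latency as described in the text: the lowest latency
    any connected client asks for, clamped to what the hardware can do."""
    if not client_requests_ms:
        return hw_max_ms            # no clients: run as lazily as possible
    wanted = min(client_requests_ms)
    return max(hw_min_ms, min(wanted, hw_max_ms))

# A music player happy with 2 s of buffering, then a VoIP call joining:
assert effective_sink_latency_ms([2000.0]) == 2000.0
assert effective_sink_latency_ms([2000.0, 20.0]) == 20.0   # VoIP client wins
assert effective_sink_latency_ms([]) == 2000.0             # idle: max latency
```

When the VoIP client disconnects again, the sink can drift back up to the music player's relaxed latency, which is exactly the "highest latency possible, lowest latency necessary" rule from the list above.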

Hopefully this comparison shows how Jack is more
suitable for the pro audio use case and PulseAudio more for the
consumer audio use case. One question arises, though: can we marry
the two approaches? Yes, we probably can; MacOS has a unified approach
for both uses. However, it is not clear this would be a good
idea. First of all, a system with the complexities introduced by
sample format/channel mapping conversion, as well as dynamically
changing latencies and pipelines, and asynchronous behaviour would
certainly be much less attractive to pro audio developers. In fact,
that Jack limits itself to synchronous, FLOAT32-only,
non-interleaved-only audio streams is one of the big features of its
design. Marrying the two approaches would corrupt that. A merged
solution would probably not have good standing in the community.

But it goes even further than this: what would the use case for
this be? After all, most of the time, you don’t want your event
sounds, your Youtube, your VoIP and your Rhythmbox mixed into the new
record you are producing. Hence a clear separation between the two
worlds might even be handy?

Also, let’s not forget that we lack the manpower to even create
such an audio chimera.

So, where to from here? Well, I think we should put the focus on
cooperation instead of amalgamation: teach PulseAudio to get out of the
way as soon as Jack needs access to the device, and optionally make
PulseAudio a normal JACK client while both are running. That way, the
user has the option to use the PulseAudio supplied streams, but
normally does not see them in his pipeline. The first part of this has
already been implemented: Jack2 and PulseAudio do not fight for the
audio device, a friendly handover takes place. Jack takes precedence,
PulseAudio takes the back seat. The second part is still missing: you
still have to manually hook up PulseAudio to Jack if you are interested
in its streams. If both are implemented, starting Jack basically has
the effect of replacing PulseAudio’s core with the Jack core, while
still providing full compatibility with PulseAudio clients.

And that I guess is all I have to say on the entire Jack and
PulseAudio story.

Oh, one more thing, while we are at clearing things up: some news
sites claim that PulseAudio’s not necessarily stellar reputation in
some parts of the community comes from Ubuntu and other distributions
having integrated it too early. Well, let me stress here explicitly,
that while they might have made a mistake or two in packaging
PulseAudio and I publicly pointed that out (and probably not in a too
friendly way), I do believe that the point in time they adopted it was
right. Why? Basically, it’s a chicken and egg problem. If it is not
used in the distributions it is not tested, and there is no pressure
to get fixed what then turns out to be broken: in PulseAudio itself,
and in both the layers on top and below of it. Don’t forget that
pushing a new layer into an existing stack will break a lot of
assumptions that the neighboring layers made. Doing this must
break things. Most Free Software projects could probably use more
developers, and that is particularly true for Audio on Linux. And
given that that is how it is, pushing the feature in at that point in
time was the right thing to do. Or in other words, if the features are
right, and things do work correctly as far as the limited test base
the developers control shows, then one day you need to push into the
distributions, even if this might break setups and software that
previously has not been tested, unless you want to stay stuck in your
development indefinitely. So yes, Ubuntu, I think you did well with
adopting PulseAudio when you did.

Footnotes

[1] Side note: yes, consumers tend not to know what dB is, and expect
volume settings in “percentages”, a mostly meaningless unit in
audio. This even spills into projects like VLC or Amarok which expose
linear volume controls (which is a really bad idea).
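To illustrate why linear "percentage" sliders are a bad idea: amplitude and perceived loudness are related logarithmically. A sketch of the standard conversion (generic audio math, not taken from VLC's or Amarok's code):

```python
import math

def db_to_amplitude(db):
    """Linear amplitude (voltage) ratio for a given dB value: 10^(dB/20)."""
    return 10.0 ** (db / 20.0)

def amplitude_to_db(amp):
    """Inverse mapping: dB value for a linear amplitude ratio."""
    return 20.0 * math.log10(amp)

# "50%" on a linear amplitude slider is only about -6 dB, which does not
# sound anywhere near half as loud; "half as loud" perceptually is roughly
# -10 dB, i.e. ~31.6% linear amplitude.
assert abs(amplitude_to_db(0.5) - (-6.0206)) < 0.001
assert abs(db_to_amplitude(-10.0) - 0.3162) < 0.001
```

So a linear slider crams almost all of the audible dynamic range into its bottom few percent, which is exactly why dB-based controls are preferable.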

[2] In case you are wondering why that is the case: if the latency is
low the buffers must be sized smaller. And if the buffers are sized smaller
then the CPU will have to wake up more often to fill them up for the same
playback time. This drives up the CPU load since less actual payload can be
processed for the amount of housekeeping that the CPU has to do during each
buffer iteration. Also, frequent wake-ups make it impossible for the CPU to go
to deeper sleep states. Sleep states are the primary way for modern CPUs
to save power.
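The relationship in footnote [2] is easy to quantify; a small illustrative calculation:

```python
def wakeups_per_second(rate_hz, frames_per_buffer):
    """How often the CPU must wake to refill the playback buffer, per
    footnote [2]: smaller buffers (lower latency) mean proportionally
    more wakeups and less deep sleep."""
    return rate_hz / frames_per_buffer

# At 44.1 kHz, a 64-frame buffer (~1.5 ms of latency) forces roughly 689
# wakeups per second; a 44100-frame buffer (1 s of latency) needs just one.
assert round(wakeups_per_second(44100, 64)) == 689
assert wakeups_per_second(44100, 44100) == 1.0
```

Each of those wakeups carries fixed housekeeping cost and blocks deeper CPU sleep states, which is why the same stream can cost dramatically more power at low latency.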

Denouncing vs. Advocating: In Defense of the Occasional Denouncement

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2009/10/11/denouncing-v-advocating.html

For the last decade, I’ve regularly seen complaints when we harder-core
software freedom advocates spend some time criticizing proprietary
software in addition to our normal work preserving, protecting and
promoting software freedom. While I think entire campaigns focused on
criticism are warranted in only extreme cases, I do believe that
denouncement of certain threatening proprietary technologies is a
necessary part of the software freedom movement, when done sparingly.

Denouncements are, of course, negative, and in general, negative
tactics are never as valuable as positive ones. Negative campaigns
alienate some people, and it’s always better to talk about the advantages
of software freedom than focus on the negative of proprietary
software.

The place where negative campaigns that denounce are simply necessary,
in my view, is when the practice either (a) will somehow completely
impede the creation of FLOSS or (b) has become, or is becoming,
widespread among people who are otherwise supportive of software
freedom.

I can think quickly of two historical examples of the first type: UCITA
and DRM. UCITA was a State/Commonwealth-level law in the USA that was
proposed to make local laws more consistent regarding software
distribution. Because the implications were so bad
for software freedom (details of which are beyond scope of this post but
can be learned at the link)
, and because it was so unlikely that we
could get the UCITA drafts changed, it was necessary to publicly denounce
the law and hope that it didn’t pass. (Fortunately, it only ever passed
in my home state of Maryland and in Virginia. I am still, probably
pointlessly, careful never to distribute software when I visit my
hometown. 🙂

DRM, for its part, posed an even greater threat to software freedom
because its widespread adoption would require proprietarization of all
software that touched any television, movie, music, or book media. There
was also a concerted widespread pro-DRM campaign from USA corporations.
Therefore, grassroots campaigns denouncing DRM are extremely necessary,
even though they are primarily negative in operation.

The second common need for denouncement arises when use of a proprietary
software package has become acceptable in the software freedom community.
The most common examples are usually specific proprietary software
programs that have become (or seem about to become) an “all but
standard” part of the toolset for Free Software developers and
advocates.

Historically, this category included Java, and that’s why there were
anti-Java campaigns in the Free Software community that ran concurrently
with Free Software Java development efforts. The need for the former is
now gone, of course, because the latter efforts were so successful and we
have a fully FaiF Java system. Similarly, denouncement of Bitkeeper was
historically necessary, but is also now moot because of the advent and
widespread popularity of Mercurial, Git, and Bazaar.

Today, there are still a few proprietary programs that quickly rose to
ranks of “must install on my GNU/Linux system” for all but the
hardest-core Free Software advocates. The key examples are Adobe Flash
and Skype. Indeed, much to my chagrin, nearly all of my co-workers at
SFLC insist on using Adobe Flash, and nearly every Free Software developer
I meet at conferences uses it too. And, despite excellent VoIP technology
available as Free Software, Skype has sadly become widely used in our
community as well.

When a proprietary system becomes as pervasive in our community as
these have (or looks like it might), it’s absolutely time for
denouncement. It’s often very easy to forget that we’re relying more and
more heavily on proprietary software. When a proprietary system
effectively becomes the “default” for use on software freedom
systems, it means fewer people will be inspired to write a
replacement. (BTW, contribute to Gnash!) It means that Free Software
advocates will, in direct contradiction of their primary mission, start to
advocate that users install that proprietary software, because it
seems to make the FaiF platform “more useful”.

Hopefully, by now, most of us in the software freedom community agree
that proprietary software is a long term trap that we want to avoid.
However, in the short term, there is always some new shiny thing.
Something that appeals to our prurient desire for software that
“does something cool”. Something that just seems so
convenient that we convince ourselves we cannot live without it, so we
install it. Over time, short term becomes the long term, and suddenly we
have gaping holes in the Free Software infrastructure that only the very
few notice because the rest just install the proprietary thing. For
example, how many of us bother to install Linux Libre,
even long enough to at least know which of our hardware
components needs proprietary software? Even I have to admit I don’t do
this, and probably should.

An old adage of software development is that software is always better
if the developers of it actually have to use the thing from day to day.
If we agree that our goal is ultimately convincing everyone to run only
Free Software (and for that Free Software to fit their needs), then we
have to trailblaze by avoiding running proprietary software ourselves. If
you do run proprietary software, I hope you won’t celebrate the fact or
encourage others to do so. Skype is particularly insidious here, because
it’s a community application. Encouraging people to call you on Skype is
the same as emailing someone a Microsoft Word document: it’s encouraging
someone to install a proprietary application just to work with you.

Finally, I think the only answer to the FLOSS community
celebrating the arrival of some new proprietary program for
GNU/Linux is to denounce it, as a counterbalance to the fervor that such
an announcement causes. My podcast co-host Karen
often calls me the canary in the software coalmine because I am
usually the first to notice something that is bad for the advancement of
software freedom before anyone else does. In playing this role, I often
end up denouncing a few things here and there, although I can still count
on my two hands the times I’ve done so. I agree that advocacy should be
the norm, but the occasional denouncement is also a necessary part of the
picture.

(Note: this blog is part of an ongoing public discussion of a software
program that is not too popular yet, but was heralded widely as a win for
Free Software in the USA. I didn’t mention it by name mainly because I
don’t want to give it more press than it’s already gotten, as it is one of
those programs that is becoming a standard GNU/Linux user
application (at least in the USA), but hasn’t yet risen to the level of
ubiquity of the other examples I give above. Here’s to hoping that it
doesn’t.)

PulseAudio FUD

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/projects/jeffrey-stedfast.html

Jeffrey Stedfast

Jeffrey Stedfast seems to have made it his new hobby
to
bash
PulseAudio.
In a series of very negative blog postings he flamed my software and hence me
in best NotZed-like fashion. Particularly interesting in this case is the
fact that he apologized to me privately on IRC for this behaviour shortly
after his first posting when he was criticized on #gnome-hackers —
only to continue flaming and bashing in more blog posts shortly after. Flaming
is very much part of the Free Software community I guess. A lot of people do
it from time to time (including me). But maybe there are better places for
this than Planet Gnome. And maybe doing it for days is not particularly nice.
And maybe flaming sucks in the first place anyway.

Regardless what I think about Jeffrey and his behaviour on Planet Gnome,
let’s have a look on his trophies, the five “bugs” he posted:

  1. Not directly related to PulseAudio itself. Also, finding errors in code that is related to esd is not exactly the most difficult thing in the world.
  2. The same theme.
  3. Fixed 3 months ago. It is certainly not my fault that this isn’t available in Jeffrey’s distro.
  4. A real, valid bug report. Fixed in git a while back, but not available in any released version. May only be triggered under heavy load or with a bad high-latency scheduler.
  5. A valid bug, but not really in PulseAudio. Mostly caused because the ALSA API and PA API don’t really match 100%.

OK, Jeffrey found a real bug, but I wouldn’t say this is really enough to make all the fuss about. Or is it?

Why PulseAudio?

Jeffrey wrote something about ‘solution looking for a problem’ when
speaking of PulseAudio. While that was certainly not a nice thing to say it
however tells me one thing: I apparently didn’t manage to communicate well
enough why I am doing PulseAudio in the first place. So, why am I doing it then?

  • There’s so much more a good audio system needs to provide than just the
    most basic mixing functionality. Per-application volumes, moving streams
    between devices during playback, positional event sounds (i.e. click on the
    left side of the screen, have the sound event come out through the left
    speakers), secure session-switching support, monitoring of sound playback
    levels, rescuing playback streams to other audio devices on hot unplug,
    automatic hotplug configuration, automatic up/downmixing stereo/surround,
    high-quality resampling, network transparency, sound effects, and simultaneous
    output to multiple sound devices are all features PA provides right now, and
    that you don’t get without it. It also provides the infrastructure for
    upcoming features like volume-follows-focus, automatic attenuation of music on
    signal on a VoIP stream, UPnP media renderer support, Apple RAOP support,
    mixing/volume adjustments with dynamic range compression, adaptive volume of
    event sounds based on the volume of music streams, jack sensing, switching
    between stereo/surround/spdif during runtime, …
  • And even for the most basic mixing functionality plain ALSA/dmix is not
    really everlasting happiness. Due to the way it works, all clients are forced
    to use the same buffering metrics all the time, which means all clients are
    limited in their wakeup/latency settings. You will burn more CPU than
    necessary this way, keep the risk of drop-outs unnecessarily high and still
    not be able to make clients with low-latency requirements happy. ‘Glitch-Free’
    PulseAudio fixes all this. Quite frankly, I believe that ‘glitch-free’
    PulseAudio is the single most important killer feature that should be enough
    to convince everyone why PulseAudio is the right thing to do. Maybe people
    actually don’t know that they want this. But they absolutely do, especially
    the embedded people — if used properly it is a must for power-saving during
    audio playback. It’s a pity that you cannot see directly from the user
    interface how awesome this feature is.[1]
  • PulseAudio provides compatibility with a lot of sound systems/APIs that bare ALSA
    or bare OSS don’t provide.
  • And last but not least, I love breaking Jeffrey’s audio. It’s just soo much fun, you really have to try it! 😉
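The core idea behind 'glitch-free' (timer-based scheduling instead of fixed period interrupts) can be sketched roughly like this. The function and numbers are illustrative only, not PulseAudio's actual implementation:

```python
def next_fill(hw_playback_pos, buffer_fill_pos, buffer_frames, rate_hz,
              safety_ms=5.0):
    """Timer-based scheduling in miniature: instead of waking on
    fixed-size period interrupts, compute how much room the ring buffer
    has right now and sleep until shortly before it would underrun."""
    # Frames queued between the hardware read pointer and our write pointer
    filled = (buffer_fill_pos - hw_playback_pos) % buffer_frames
    room = buffer_frames - filled                  # frames we may write now
    time_to_underrun_s = filled / rate_hz          # when the hw runs dry
    sleep_s = max(0.0, time_to_underrun_s - safety_ms / 1000.0)
    return room, sleep_s

# A half-full 1-second buffer at 48 kHz: write up to 24000 frames now,
# then sleep ~0.495 s before topping the buffer up again.
room, sleep_s = next_fill(0, 24000, 48000, 48000)
assert room == 24000
assert abs(sleep_s - 0.495) < 1e-9
```

Because the wakeup time is derived from the actual fill level rather than a fixed period size, each client can get the latency it asked for while the system as a whole sleeps as long as possible.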

If you want to know more about why I think that PulseAudio is an important part of the modern Linux desktop audio stack, please read my slides from FOSS.in 2007.

Misconceptions

Many people (like Jeffrey) wonder why we have software mixing at all if
there is hardware mixing? The thing is, hardware mixing is a thing of the past,
modern soundcards don’t do it anymore. Precisely for doing things like mixing
in software, SIMD CPU extensions like SSE have been invented. Modern sound
cards these days are kind of “dumbed” down, high-quality DACs. They don’t do
mixing anymore, many modern chips don’t even do volume control anymore.
Remember the days where having a Wavetable chip was a killer feature of a
sound card? Those days are gone, today wavetable synthesizing is done almost
exclusively in software — and that’s exactly what happened to hardware mixing
too. And it is good that way. In software it is much easier to do
fancier stuff like DRC, which will increase the quality of mixing. And modern CPUs provide
all the necessary SIMD command sets to implement this efficiently.
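At its core, software mixing is just per-stream gain, summation and clamping. A deliberately naive scalar sketch (real mixers vectorize exactly this inner loop with SSE and apply proper clipping or DRC):

```python
def mix(streams, gains):
    """Mix several float-sample streams into one: apply a per-stream
    gain, sum sample by sample, then clamp to the valid range."""
    n = min(len(s) for s in streams)
    out = [0.0] * n
    for samples, gain in zip(streams, gains):
        for i in range(n):
            out[i] += samples[i] * gain
    # Hard-clip to the valid float sample range [-1.0, 1.0]
    return [max(-1.0, min(1.0, v)) for v in out]

a = [0.5, -0.25, 0.9]
b = [0.5,  0.75, 0.9]
assert mix([a, b], [1.0, 1.0]) == [1.0, 0.5, 1.0]   # 1.8 hard-clipped to 1.0
```

The hard clip in the last line is where DRC would go in a fancier mixer: instead of flattening peaks abruptly, a compressor attenuates them smoothly, which is the quality argument made above for doing this in software.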

Other people believe that JACK would be a better solution for the problem.
This is nonsense. JACK has been designed for a very different purpose. It is
optimized for low latency inter-application communication. It requires
floating point samples, it knows nothing about channel mappings, it depends on
every client to behave correctly. And so on, and so on. It is a sound server
for audio production. For desktop applications it is however not well suited.
For a desktop, saving power is very important, and one application misbehaving
shouldn’t have an effect on other applications’ playback; converting from/to
FP all the time is not going to help battery life either. Please understand
that for the purpose of pro audio you can make completely different
compromises than you can do on the desktop. For example, while having
‘glitch-free’ is great for embedded and desktop use, it makes no sense at all
for pro audio, and would only have a drawback on performance. So, please stop
bringing up JACK again and again. It’s just not the right tool for desktop
audio, and this opinion is shared by the JACK developers themselves.

Jeffrey thinks that audio mixing is nothing for userspace. Which is
basically what OSS4 tries to do: mixing in kernel space. However, the future
of PCM audio is floating points. Mixing them in kernel space is problematic because (at least on Linux) FP in kernel space is a no-no.
Also, the kernel people made clear more than once that maths/decoding/encoding like this
should happen in userspace. Quite honestly, doing the mixing in kernel space
is probably one of the primary reasons why I think that OSS4 is a bad idea.
The fancier your mixing gets (i.e. including resampling, upmixing, downmixing,
DRC, …) the more difficulty you will have moving such complex,
time-intensive code into the kernel.

Not every time your audio breaks is it PulseAudio’s fault alone. For
example, the original flame of Jeffrey’s was about the low volume that he
experienced when running PA. This is mostly due to the suckish way we
initialize the default volumes of ALSA sound cards. Most distributions have
simple scripts that initialize ALSA sound card volumes to fixed values like
75% of the available range, without understanding what the range or the
controls actually mean. This is actually a very bad thing to do. Integrated
USB speakers for example tend to export the full amplification range via the
mixer controls. 75% for them is incredibly loud. For other hardware (like
apparently Jeffrey’s) it is too low in volume. How to fix this has been
discussed on the ALSA mailing list, but no final solution has been presented
yet. Nonetheless, the fact that the volume was too low, is completely
unrelated to PulseAudio.
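The problem with initializing mixer controls to "75% of the available range" is that the same raw percentage maps to wildly different gains depending on the control's dB range. An illustrative calculation (the dB ranges are made-up examples, not any specific driver's values):

```python
def raw_percent_in_db(percent, db_min, db_max):
    """What a raw-range percentage actually means in dB for a mixer
    control spanning [db_min, db_max] (assuming a linear-in-dB control)."""
    return db_min + (percent / 100.0) * (db_max - db_min)

# A control spanning -64..0 dB lands at a sane -16 dB, while a control
# spanning -30..+30 dB lands at +15 dB of amplification: far too loud.
assert raw_percent_in_db(75, -64.0, 0.0) == -16.0
assert raw_percent_in_db(75, -30.0, 30.0) == 15.0
```

This is exactly why the same "75%" init script is too quiet on one card and deafening on another: without interpreting the control's actual dB range, the raw percentage is meaningless.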

PulseAudio interfaces with lower-level technologies like ALSA on one hand,
and with high-level applications on the other hand. Those systems are not
perfect. Especially closed-source applications tend to do very evil things
with the audio APIs (Flash!) that are very hard to support on virtualized
sound systems such as PulseAudio [2]. However, things are getting better. My list of issues I found in
ALSA
is getting shorter. Many applications have already been fixed.

The reflex “my audio is broken it must be PulseAudio’s fault” is certainly
easy to come up with, but it certainly is not always right.

Also note that — like many areas in Free Software — development of the
desktop audio stack on Linux is a bit understaffed. AFAIK there are only two
people working on ALSA full-time and only me working on PulseAudio and other
userspace audio infrastructure, assisted by a few others who supply code and patches
from time to time, some more and some less.

More Breakage to Come

I now tried to explain why the audio experience on systems with PulseAudio
might not be as good as some people hoped, but what about the future? To be
frank: the next version of PulseAudio (0.9.11) will break even more things.
The ‘glitch-free’ stuff mentioned above uses quite a few features of the
underlying ALSA infrastructure that apparently noone has been using before —
and which just don’t work properly yet on all drivers. And there are quite a
few drivers around, and I only have a very limited set of hardware to test
with. Already I know that some of the most popular drivers (USB and HDA)
do not work entirely correctly with ‘glitch-free’.

So you ask why I plan to release this code knowing that it will break
things? Well, it works on some hardware/drivers properly, and for the others I
know work-arounds to get things to work. And 0.9.11 has been delayed for too
long already. Also I need testing from a bigger audience. And it is not so
much 0.9.11 that is buggy, it is the code it is based on. ‘Glitch-free’ PA
0.9.11 is going to be part of Fedora 10. Fedora has always been more bleeding
edge than other distributions. Picking 0.9.11 just like that for an
‘LTS’ release might however not be a good idea.

So, please bear with me when I release 0.9.11. Snapshots have already
been available in Rawhide for a while, and hell didn’t freeze over.

The Distributions’ Role in the Game

Some distributions did a better job adopting PulseAudio than others. On the
good side I certainly have to list Mandriva, Debian[3], and
Fedora[4]. OTOH Ubuntu didn’t exactly do a stellar job. They didn’t
do their homework. Adopting PA in a distribution is a fair amount of work,
given that it interfaces with so many different things at so many different
places. The integration with other systems is crucial. The information was all
out there, communicated on the wiki, the mailing lists and on the PA IRC
channel. But if you join and hang around on neither, then you won’t get the
memo. To my surprise when Ubuntu adopted PulseAudio they moved it into one of
their ‘LTS’ releases right away [5]. Which I guess can be called gutsy —
on the background that I work for Red Hat and PulseAudio is not part of RHEL
at this time. I get a lot of flak from Ubuntu users, and I am pretty sure the
vast amount of it is undeserved and not my fault.

Why Jeffrey’s distro of choice (SUSE?) didn’t package pavucontrol 0.9.6
although it has been released months ago I don’t know. But there’s certainly no reason to whine about
that to me
and bash me for it.

Having said all this — it’s easy to point to other software’s faults or
other people’s failures. So, admitting this, PulseAudio is certainly not
bug-free, far from that. It’s a relatively complex piece of software
(threading, real-time, lock-free, sensitive to timing, …), and every
software has its bugs. In some workloads they might be easier to find than in
others. And I am working on fixing those which are found. I won’t forget any
bug report, but the order and priority I work on them is still mostly up to me
I guess, right? There’s still a lot of work to do in desktop audio, it will
take some time to get things completely right and complete.

Calls for “audio should just work ™” are often heard. But if you don’t
want to stick with a sound system that was state of the art in the 90’s for
all times, then I fear things *will have* to break from time to time. And
Jeffrey, I have no idea what you are actually hacking on. Some people
mentioned something with Evolution. If that’s true, then quite honestly,
“email should just work”, too, shouldn’t it? Evolution is not exactly
famous for its legendary bug-freeness and stability, or did I miss something?
Maybe you should be the one to start with making things “just work”, especially since
Evolution has been around for much longer already.

Back to Work

Now that I responded to Jeffrey’s FUD I think we all can go back to work
and end this flamefest! I wish people would actually try to understand
things before writing an insulting rant — without the slightest clue — but
with words like “clusterfuck”. I’d like to thank all the people who commented
on Jeffrey’s blog and basically already said what I wrote here
now.

So, and now I am off hacking on PulseAudio a bit more — or should
I say in Jeffrey’s words: on my clusterfuck that is an epic fail and that no desktop user needs?

Footnotes

[1] BTW ‘glitch-free’ is nothing I invented, other OS have been doing something
like this for quite a while (Vista, Mac OS). On Linux however, PulseAudio is
the first and only implementation (at least to my knowledge).

[2] In fact, Flash 9 can not be made fully working on PulseAudio.
This is because the way Flash destructs its driver backends is racy.
Unfixably racy, from external code. Jeffrey complained about Flash instability
in his second post. This is unfair to PulseAudio, because I cannot fix this.
This is like complaining that X crashes when you use binary-only
fglrx.

[3] To Debian’s standards at least. Since development of Debian is
very distributed, the integration of a system such as PulseAudio is much more
difficult, since it touches so many different packages in the system that are
kind of the private property of a lot of different maintainers with different
views on things.

[4] I maintain the Fedora stuff myself, so I might be a bit biased on this one… 😉

[5] I guess Ubuntu sees that this was a bit too much too early, too.
At least that’s how I understood my invitation to UDS in Prague. Since that
summit I haven’t heard anything from them anymore, though.

PulseAudio FUD

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/projects/jeffrey-stedfast.html

Jeffrey Stedfast

Jeffrey Stedfast seems to have made it his new hobby
to
bash
PulseAudio.
In a series of very negative blog postings he flamed my software and hence me
in best NotZed-like fashion. Particularly interesting in this case is the
fact that he apologized to me privately on IRC for this behaviour shortly
after his first posting when he was critizised on #gnome-hackers
only to continue flaming and bashing in more blog posts shortly after. Flaming
is very much part of the Free Software community I guess. A lot of people do
it from time to time (including me). But maybe there are better places for
this than Planet Gnome. And maybe doing it for days is not particularly nice.
And maybe flaming sucks in the first place anyway.

Regardless what I think about Jeffrey and his behaviour on Planet Gnome,
let’s have a look on his trophies, the five “bugs” he posted:

  1. Not directly related to PulseAudio itself. Also, finding errors in code that is related to esd is not exactly the most difficult thing in the world.
  2. The same theme.
  3. Fixed 3 months ago. It is certainly not my fault that this isn’t available in Jeffrey’s distro.
  4. A real, valid bug report. Fixed in git a while back, but not available in any released version. May only be triggered under heavy load or with a bad high-latency scheduler.
  5. A valid bug, but not really in PulseAudio. Mostly caused because the ALSA API and PA API don’t really match 100%.

OK, Jeffrey found a real bug, but I wouldn’t say this is really enough to make all the fuss about. Or is it?

Why PulseAudio?

Jeffrey wrote something about ‘solution looking for a problem‘ when
speaking of PulseAudio. While that was certainly not a nice thing to say it
however tells me one thing: I apparently didn’t manage to communicate well
enough why I am doing PulseAudio in the first place. So, why am I doing it then?

  • There’s so much more a good audio system needs to provide than just the
    most basic mixing functionality. Per-application volumes, moving streams
    between devices during playback, positional event sounds (i.e. click on the
    left side of the screen, have the sound event come out through the left
    speakers), secure session-switching support, monitoring of sound playback
    levels, rescuing playback streams to other audio devices on hot unplug,
    automatic hotplug configuration, automatic up/downmixing stereo/surround,
    high-quality resampling, network transparency, sound effects, simultaneous
    output to multiple sound devices are all features PA provides right now, and
    what you don’t get without it. It also provides the infrastructure for
    upcoming features like volume-follows-focus, automatic attenuation of music on
    signal on VoIP stream, UPnP media renderer support, Apple RAOP support,
    mixing/volume adjustments with dynamic range compression, adaptive volume of
    event sounds based on the volume of music streams, jack sensing, switching
    between stereo/surround/spdif during runtime, …
  • And even for the most basic mixing functionality plain ALSA/dmix is not
    really everlasting happiness. Due to the way it works all clients are forced
    to use the same buffering metrics all the time, that means all clients are
    limited in their wakeup/latency settings. You will burn more CPU than
    necessary this way, keep the risk of drop-outs unnecessarily high and still
    not be able to make clients with low-latency requirements happy. ‘Glitch-Free’
    PulseAudio
    fixes all this. Quite frankly I believe that ‘glitch-free’
    PulseAudio is the single most important killer feature that should be enough
    to convince everyone why PulseAudio is the right thing to do. Maybe people
    actually don’t know that they want this. But they absolutely do, especially
    the embedded people — if used properly it is a must for power-saving during
audio playback. It’s a pity that you cannot see directly from the user
interface how awesome this feature is.[1]
  • PulseAudio provides compatibility with a lot of sound systems/APIs that bare ALSA
    or bare OSS don’t provide.
  • And last but not least, I love breaking Jeffrey’s audio. It’s just soo much fun, you really have to try it! 😉
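
To make the ‘glitch-free’ argument above a bit more concrete, here is some back-of-the-envelope arithmetic. The buffer sizes are illustrative numbers, not actual PulseAudio defaults:

```python
# Rough arithmetic behind the 'glitch-free' claim (numbers are illustrative,
# not measurements): with classic fragment-based scheduling the CPU must wake
# up once per fragment; with timer-based scheduling and a large hardware
# buffer it only needs to wake shortly before the buffer would run empty.

def wakeups_per_second(buffer_ms: float) -> float:
    """How often per second the CPU must service the sound card."""
    return 1000.0 / buffer_ms

# Traditional dmix-style setup: small fragments shared by all clients.
fragment_based = wakeups_per_second(25)    # 25 ms fragments -> 40 wakeups/s

# Timer-based scheduling: fill a 2 s hardware buffer, wake up rarely.
timer_based = wakeups_per_second(2000)     # 2 s buffer -> 0.5 wakeups/s

print(f"fragment-based: {fragment_based:.1f} wakeups/s")
print(f"timer-based:    {timer_based:.1f} wakeups/s")
```

Fewer wakeups means the CPU can stay in deep sleep states longer, which is exactly why this matters so much for the embedded/power-saving case.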

If you want to know more about why I think that PulseAudio is an important part of the modern Linux desktop audio stack, please read my slides from FOSS.in 2007.

Misconceptions

Many people (like Jeffrey) wonder why we should have software mixing at all if
we have hardware mixing. The thing is, hardware mixing is a thing of the past,
modern soundcards don’t do it anymore. SIMD CPU extensions like SSE have been
invented precisely for doing things like mixing in software. Modern sound
cards these days are essentially “dumbed-down”, high-quality DACs. They don’t do
mixing anymore; many modern chips don’t even do volume control anymore.
Remember the days when having a Wavetable chip was a killer feature of a
sound card? Those days are gone: today wavetable synthesis is done almost
exclusively in software, and that’s exactly what happened to hardware mixing
too. And it is good that way. In software it is much easier to do
fancier stuff like DRC, which increases the quality of the mixing. And modern CPUs provide
all the necessary SIMD command sets to implement this efficiently.
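
The principle of software mixing is simple enough to sketch in a few lines. This is a toy illustration only, not PulseAudio’s actual mixing code, which works on whole blocks and can use SIMD:

```python
# Toy software mixer: sum two PCM streams sample by sample, and saturate
# (clamp) instead of letting the integer wrap around, which would sound
# like harsh distortion.

def mix(a, b):
    """Mix two 16-bit PCM streams, clamping to the valid sample range."""
    out = []
    for x, y in zip(a, b):
        s = x + y
        out.append(max(-32768, min(32767, s)))
    return out

left  = [1000, 30000, -20000]
right = [2000,  5000, -20000]
print(mix(left, right))   # [3000, 32767, -32768] -> note the clamping
```

A real mixer would do the same arithmetic over large buffers at once, which is exactly the kind of data-parallel loop that SSE-style SIMD instructions accelerate.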

Other people believe that JACK would be a better solution for the problem.
This is nonsense. JACK has been designed for a very different purpose. It is
optimized for low latency inter-application communication. It requires
floating point samples, it knows nothing about channel mappings, it depends on
every client to behave correctly. And so on, and so on. It is a sound server
for audio production. For desktop applications it is however not well suited.
For a desktop saving power is very important, one application misbehaving
shouldn’t have an effect on other applications’ playback; converting from/to
FP all the time is not going to help battery life either. Please understand
that for the purpose of pro audio you can make completely different
compromises than you can do on the desktop. For example, while having
‘glitch-free’ is great for embedded and desktop use, it makes no sense at all
for pro audio, and would only have a drawback on performance. So, please stop
bringing up JACK again and again. It’s just not the right tool for desktop
audio, and this opinion is shared by the JACK developers themselves.

Jeffrey thinks that audio mixing is nothing for userspace. Which is
basically what OSS4 tries to do: mixing in kernel space. However, the future
of PCM audio is floating points. Mixing them in kernel space is problematic because (at least on Linux) FP in kernel space is a no-no.
Also, the kernel people made clear more than once that maths/decoding/encoding like this
should happen in userspace. Quite honestly, doing the mixing in kernel space
is probably one of the primary reasons why I think that OSS4 is a bad idea.
The fancier your mixing gets (i.e. including resampling, upmixing, downmixing,
DRC, …), the more difficult it becomes to move such complex,
time-intensive code into the kernel.

Not every time your audio breaks is it PulseAudio’s fault alone. For
example, the original flame of Jeffrey’s was about the low volume that he
experienced when running PA. This is mostly due to the suckish way we
initialize the default volumes of ALSA sound cards. Most distributions have
simple scripts that initialize ALSA sound card volumes to fixed values like
75% of the available range, without understanding what the range or the
controls actually mean. This is a very bad thing to do. Integrated
USB speakers, for example, tend to export the full amplification range via the
mixer controls. 75% for them is incredibly loud. For other hardware (like
apparently Jeffrey’s) it is too low in volume. How to fix this has been
discussed on the ALSA mailing list, but no final solution has been presented
yet. Nonetheless, the fact that the volume was too low is completely
unrelated to PulseAudio.

PulseAudio interfaces with lower-level technologies like ALSA on one hand,
and with high-level applications on the other hand. Those systems are not
perfect. Especially closed-source applications tend to do very evil things
with the audio APIs (Flash!) that are very hard to support on virtualized
sound systems such as PulseAudio [2]. However, things are getting better. My list of issues I found in
ALSA
is getting shorter. Many applications have already been fixed.

The reflex “my audio is broken, so it must be PulseAudio’s fault” is easy
to come up with, but it certainly is not always right.

Also note that — like many areas in Free Software — development of the
desktop audio stack on Linux is a bit understaffed. AFAIK there are only two
people working on ALSA full-time and only me working on PulseAudio and other
userspace audio infrastructure, assisted by a few others who supply code and patches
from time to time, some more and some less.

More Breakage to Come

I now tried to explain why the audio experience on systems with PulseAudio
might not be as good as some people hoped, but what about the future? To be
frank: the next version of PulseAudio (0.9.11) will break even more things.
The ‘glitch-free’ stuff mentioned above uses quite a few features of the
underlying ALSA infrastructure that apparently no one has been using before —
and which just don’t work properly yet on all drivers. And there are quite a
few drivers around, and I only have a very limited set of hardware to test
with. I already know that some of the most popular drivers (USB and HDA)
do not work entirely correctly with ‘glitch-free’.

So you may ask why I plan to release this code knowing that it will break
things. Well, it works properly on some hardware/drivers, and for the others I
know work-arounds to get things to work. And 0.9.11 has been delayed for too
long already. Also I need testing from a bigger audience. And it is not so
much 0.9.11 that is buggy, it is the code it is based on. ‘Glitch-free’ PA
0.9.11 is going to be part of Fedora 10. Fedora has always been more bleeding
edge than other distributions. Picking 0.9.11 just like that for an
‘LTS’ release, however, might not be a good idea.

So, please bear with me when I release 0.9.11. Snapshots have already
been available in Rawhide for a while, and hell didn’t freeze over.

The Distributions’ Role in the Game

Some distributions did a better job adopting PulseAudio than others. On the
good side I certainly have to list Mandriva, Debian[3], and
Fedora[4]. OTOH Ubuntu didn’t exactly do a stellar job. They didn’t
do their homework. Adopting PA in a distribution is a fair amount of work,
given that it interfaces with so many different things at so many different
places. The integration with other systems is crucial. The information was all
out there, communicated on the wiki, the mailing lists and on the PA IRC
channel. But if you join and hang around on neither, then you won’t get the
memo. To my surprise, when Ubuntu adopted PulseAudio they moved it into one of
their ‘LTS’ releases right away [5]. Which I guess can be called gutsy,
given that I work for Red Hat and PulseAudio is not part of RHEL
at this time. I get a lot of flak from Ubuntu users, and I am pretty sure the
vast amount of it is undeserved and not my fault.

Why Jeffrey’s distro of choice (SUSE?) didn’t package pavucontrol 0.9.6,
even though it was released months ago, I don’t know. But there’s certainly no reason to whine about
that to me
and bash me for it.

Having said all this — it’s easy to point to other software’s faults or
other people’s failures. So, admitting this, PulseAudio is certainly not
bug-free, far from that. It’s a relatively complex piece of software
(threading, real-time, lock-free, sensitive to timing, …), and every
software has its bugs. In some workloads they might be easier to find than in
others. And I am working on fixing those which are found. I won’t forget any
bug report, but the order and priority in which I work on them is still mostly up to me
I guess, right? There’s still a lot of work to do in desktop audio, it will
take some time to get things completely right and complete.

Calls for “audio should just work ™” are often heard. But if you don’t
want to stick for all time with a sound system that was state of the art in
the 90’s, then I fear things *will have* to break from time to time. And
Jeffrey, I have no idea what you are actually hacking on. Some people
mentioned something with Evolution. If that’s true, then quite honestly,
“email should just work”, too, shouldn’t it? Evolution is not exactly
famous for its legendary bug-freeness and stability, or did I miss something?
Maybe you should be the one to start with making things “just work”, especially since
Evolution has been around for much longer already.

Back to Work

Now that I responded to Jeffrey’s FUD I think we all can go back to work
and end this flamefest! I wish people would actually try to understand
things before writing an insulting rant — without the slightest clue — but
with words like “clusterfuck”. I’d like to thank all the people who commented
on Jeffrey’s blog and basically already said what I wrote here
now.

So, now I am off to hack on PulseAudio a bit more — or should
I say in Jeffrey’s words: on my clusterfuck that is an epic fail and that no desktop user needs?

Footnotes

[1] BTW ‘glitch-free’ is nothing I invented; other OSes have been doing something
like this for quite a while (Vista, Mac OS). On Linux however, PulseAudio is
the first and only implementation (at least to my knowledge).

[2] In fact, Flash 9 cannot be made to work fully on PulseAudio,
because the way Flash destructs its driver backends is racy.
Unfixably racy, from external code. Jeffrey complained about Flash instability
in his second post. This is unfair to PulseAudio, because I cannot fix this.
This is like complaining that X crashes when you use binary-only
fglrx.

[3] To Debian’s standards at least. Since development of Debian is
very distributed, integrating a system like PulseAudio is much more
difficult: it touches so many different packages in the system that are
kind of the private property of a lot of different maintainers with different
views on things.

[4] I maintain the Fedora stuff myself, so I might be a bit biased on this one… 😉

[5] I guess Ubuntu sees that this was a bit too much too early, too.
At least that’s how I understood my invitation to UDS in Prague. Since that
summit I haven’t heard anything from them anymore, though.

Stop Obsessing and Just Do It: VoIP Encryption Is Easier than You Think

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2008/06/20/voip-encryption-easy.html

Ian Sullivan showed me
an article
that he read about eavesdropping on Internet telephony calls. I’m baffled at
the obsession about this issue on two fronts. First, I am amazed that
people want to hand their phone calls over to yet another proprietary
vendor (aka Skype) using unpublished, undocumented non-standard
protocols and who respects your privacy even less than the traditional
PSTN vendors. Second, I don’t understand why cryptography experts
believe we need to develop complicated new technology to solve this
problem in the medium term.

At SFLC, I set up the telephony system as VoIP with encryption on
every possible leg. While SFLC sometimes uses Skype, I don’t, of course, because it is (a)
proprietary software and (b) based on an undocumented protocol, (c)
controlled by a company that has less respect for users’ privacy than
the PSTN companies themselves. Indeed, security was actually last on
our list of reasons to reject Skype, because we already had a simple
solution for encrypting our telephony traffic: All calls are made
through a VPN.

Specifically, at SFLC, I set up a system whereby all users have an OpenVPN connection back to the
home office. From there, they have access to register a SIP client to
an internal Asterisk server living inside the VPN network.
Using that SIP phone, they could call any SFLC employee, fully encrypted. That call
continues either on the internal secured network, or back out over the
same VPN to the other SIP client. Users can also dial out from there to any
PSTN DID.
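
For concreteness, a client-side OpenVPN configuration in this spirit might look like the sketch below. Every hostname, port, and file name here is invented for illustration; this is not SFLC’s actual setup:

```conf
# client.ovpn -- road-warrior config (all names/addresses are placeholders)
client
dev tun
proto udp
remote vpn.example.org 1194   # the office OpenVPN endpoint
nobind
persist-key
persist-tun
ca ca.crt
cert employee.crt
key employee.key
```

Once the tunnel is up, the SIP client simply registers to the Asterisk server’s internal address, and all SIP signalling and RTP media ride the encrypted tunnel.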

Of course, when calling the PSTN, the encryption ends at SFLC’s office, but that’s the PSTN’s fault, not ours. No technological solution — save using a modem to turn that traffic digital — can easily solve that. However,
with minimal effort, and using existing encryption subsystems, we have
end-to-end encryption for all employee-to-employee calls.

And it could go even further with a day’s effort of work! I have a
pretty simple idea on how to have an encrypted call to anyone
who happens to have a SIP client and an OpenVPN client. My plan is to
make a public OpenVPN server that accepts connections from any
host at all, that would then allow encrypted “phone the
office” calls to any SFLC phone with any SIP client anywhere on
the Internet. In this way, anyone wishing end-to-end phone encryption
to the SFLC need only connect to that publicly accessible OpenVPN and
dial our extensions with their SIP client over that line. This solution
even has the added bonus that it avoids the common firewall and NAT
related SIP problems, since all traffic gets tunneled through the
OpenVPN: if OpenVPN (which is, unlike SIP, a single-port UDP/IP protocol)
works, SIP automatically does!
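
A server for such a publicly accessible “phone the office” gateway could be sketched roughly as follows. Again, every address and file name is an invented placeholder, and a real deployment would still have to decide how (or whether) to authenticate anonymous callers:

```conf
# server.conf -- sketch of a public "phone the office" OpenVPN gateway
# (all addresses and file names are placeholders)
port 1194
proto udp
dev tun
server 10.99.0.0 255.255.255.0       # hand each caller a tunnel address
ca ca.crt
cert gateway.crt
key gateway.key
dh dh.pem
push "route 10.0.1.0 255.255.255.0"  # the internal net where Asterisk lives
keepalive 10 60
```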

The main criticism of this technique regards the silliness of two
employees at a conference in San Francisco bouncing all the way through
our NYC offices just to make a call to each other. While the Bandwidth
Wasting Police might show up at my door someday, I don’t actually find
this to be a serious problem. The last mile is always the problem in
Internet telephony, so a call that goes mostly across a single set of
last mile infrastructure in a particular municipality is neither worse nor
better than one that takes a long haul round trip. Very occasionally,
there is a half second of delay when you have a few VPN-based users on a
conference call together, but that has a nice social side effect of
stopping people from trying to interrupt each other.

Finally, the article linked above talks about the issue of variable bit
rate compression changing packet size such that even encrypted packets
yield possible speech information, since some sounds need larger packets
than others. This problem is solved simply for us with two systems: (a)
we use µ-law, a very old, constant-bit-rate codec, and (b) a tiny bit of entropy
is added to our packets by default, because the encryption is occurring
for all traffic across the VPN connection, not just the phone
call itself. Remember: all the traffic is going together across the one
OpenVPN UDP port, so an eavesdropper would need to detangle the VoIP
traffic from everything else. Indeed, I could easily make (b) even
stronger by simply having the SIP client open another connection back to
the Asterisk host and exchange payloads generated
from /dev/random back and forth while the phone call is
going on.
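
Both points can be illustrated in a few lines of code. Below is a simplified G.711 µ-law encoder (enough to show the size property, not production telephony code) plus a toy version of the chaff idea, with os.urandom standing in for /dev/random:

```python
import os

# Point (a): mu-law is constant bit rate -- every 16-bit sample maps to
# exactly one byte, so packet sizes leak nothing about the speech content.
BIAS, CLIP = 0x84, 32635

def ulaw_encode(samples):
    """Simplified G.711 mu-law encoder: one output byte per input sample."""
    out = bytearray()
    for s in samples:
        sign = 0x80 if s < 0 else 0
        s = min(abs(s), CLIP) + BIAS
        exponent, mask = 7, 0x4000
        while exponent > 0 and not (s & mask):
            exponent -= 1
            mask >>= 1
        mantissa = (s >> (exponent + 3)) & 0x0F
        out.append(~(sign | (exponent << 4) | mantissa) & 0xFF)
    return bytes(out)

silence = ulaw_encode([0] * 160)            # one 20 ms frame of silence
speech  = ulaw_encode([12345, -23456] * 80) # one frame of "loud" samples
assert len(silence) == len(speech) == 160   # same size, whatever the sound

# Point (b): pad the tunnel with random-sized junk so the VoIP packets
# cannot be singled out by size among the rest of the VPN traffic.
def chaff_packet(min_len=32, max_len=256):
    n = min_len + os.urandom(1)[0] % (max_len - min_len)
    return os.urandom(n)
```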

This is really one of those cases where the simpler the solution, the
more secure it is. Trying to focus on “encryption of VoIP and VoIP only” is
what leads us to the kinds of vulnerabilities described in that article.
VoIP isn’t like email, where you always need an encryption-unaware
delivery mechanism between Alice and Bob. I
believe I’ve described a simple mechanism that can allow anyone with an
Asterisk box, an OpenVPN server, and an Internet connection to publish to the world easy instructions for phoning them securely with merely a SIP client plus an OpenVPN client. Why don’t
we just take the easy and more secure route and do our VoIP this
way?