Understanding Conservancy Through the GSoC Lens

Post Syndicated from Bradley M. Kuhn original http://ebb.org/bkuhn/blog/2014/09/11/gsoc-conservancy.html

[ A version of this post originally appeared on the Google Open Source Blog,
and was cross-posted on Conservancy’s blog. ]

Software Freedom Conservancy, Inc. is a 501(c)(3) non-profit charity that
serves as a home to Open Source and Free Software projects. Such is easily
said, but in this post I’d like to discuss what that means in practice for an
Open Source and Free Software project and why such projects need a
non-profit home. In short, a non-profit home makes the lives of Free
Software developers easier, because they have less work to do outside of
their area of focus (i.e., software development and documentation).

As the summer of 2014 ends, Google Summer of Code (GSoC) coordination work exemplifies the value a non-profit home brings to its Free
Software projects. GSoC
is likely the largest philanthropic program in the Open Source and Free
Software community today. However, one of the most difficult things for
organizations that participate in such programs is the administrative
overhead necessary to take full advantage of them.
Google invests heavily in making it easy for organizations to participate
in the program — such as by handling the details of stipend payments
to students directly. However, to take full advantage of any philanthropic
program, the benefiting organization has some work to do. For its member
projects, Conservancy is the organization that gets that logistical work
done.

For example, Google kindly donates $500 to the mentoring organization for
every student it mentors. However, these funds need to go
“somewhere”. If the funds go to an individual, there are two
inherent problems. First, that individual is responsible for taxes on that
income. Second, funds that belong to an organization as a whole are now in
the bank account of a single project leader. Conservancy solves both those
problems: because Conservancy is a tax-exempt charity, the mentor payments
are available for organizational use under its tax exemption. Furthermore, Conservancy
maintains earmarked funds for each of its projects. Thus, Conservancy
keeps the mentor funds for the Free Software project, and the project
leaders can later vote to make use of the funds in a manner that helps the
project and Conservancy’s charitable mission. Often, projects in
Conservancy use their mentor funds to send developers to important
conferences to speak about the project and recruit new developers and
users.

Meanwhile, Google also offers to pay travel expenses for two mentors from
each mentoring organization to attend the annual GSoC Mentor Summit (and,
this year, it’s an even bigger Reunion conference!). Conservancy handles
this work on behalf of its member projects in two directions. First, for
developers who don’t have a credit card or otherwise are unable to pay for
their own flight and receive reimbursement later, Conservancy staff book
the flights on Conservancy’s credit card. For the other travelers,
Conservancy handles the reimbursement details. On the back end of all of
this, Conservancy handles all the overhead annoyances and issues in
requesting the purchase orders (POs) from Google, invoicing for the funds, and tracking to
ensure payment is made. While the Google staff is incredibly responsive
and helpful on these issues, the Googlers need someone on the project’s
side to take care of the details. That’s what Conservancy does.

GSoC coordination is just one of the many things that Conservancy does
every day for its member projects. If there’s anything other than software
development and documentation that you can imagine a project needs,
Conservancy does that job for its member projects. This includes not only
mundane items such as travel coordination, but also issues as complex as
trademark filings and defense, copyright licensing advice and enforcement,
governance coordination and mentoring, and fundraising for the projects.
Some of Conservancy’s member projects have been so successful in
Conservancy that they’ve been able to fund developer salaries — often
part-time but occasionally full-time — for years on end to allow them
to focus on improving the project’s software for the public benefit.

Finally, if your project seeks help with regard to handling its GSoC
funds and travel, or anything else mentioned on Conservancy’s list of
services to member projects, Conservancy is welcoming new applications
for membership. Your project could join Conservancy’s more than thirty
other member projects and receive these wonderful services to help your
community grow and focus on its core mission of building software for
the public good.

Sunday, 7 September 2014

Post Syndicated from georgi original http://georgi.unixsol.org/diary/archive.php/2014-09-07

Climbing to the Aleko hut is not particularly hard, it is mostly tedious. The
cobblestones are a nuisance; especially at the start of the road the big ones
outright keep you from riding. I barely made it to the turn-off for the
Dragalevtsi monastery, from where the riding was more or less normal. The
good part was that there was almost no traffic going either up or down, so I
could pick my line, because the edges of the road are in pretty bad shape.

From the upper turn-off for the monastery to the hut, with plenty of rest
stops and without pushing hard, the climb took me an hour and 35 minutes,
which is downright slow considering it is about 10.5 km; that slowness is
exactly where my remark about the tedium comes from. I don’t know the
elevation gain (I still have to work it out, but it should be at least
850-900 m).

On the way back there were moments when the bike was doing 60+ km/h, which is
a bit scary. Against more than an hour and a half of climbing, the descent
took ~17 minutes (down to the upper turn-off for the monastery).

CVE-2014-1564: Uninitialized memory with truncated images in Firefox

Post Syndicated from Unknown original https://lcamtuf.blogspot.com/2014/09/cve-2014-1564-uninitialized-memory-when.html

The recent release of Firefox 32 fixes another interesting image parsing issue found by american fuzzy lop: following a refactoring of memory management code, the past few versions of the browser ended up using uninitialized memory for certain types of truncated images, which is easily measurable with a simple <canvas> + toDataURL() harness that examines all the fuzzer-generated test cases.

In general, problems like that may leak secrets across web origins, or more prosaically, may help attackers bypass security measures such as ASLR. For a slightly more detailed discussion, check out this post.

Here’s a short proof-of-concept that should work if you haven’t updated to 32 yet:

This is tracked as CVE-2014-1564, Mozilla bug 1045977. Several more should be coming soon.

Monday, 1 September 2014

Post Syndicated from georgi original http://georgi.unixsol.org/diary/archive.php/2014-09-01

I set out planning to ride to Kladnitsa, but I had strength left for more
(actually, I ran out of time) and rode all the way to the far end of Bosnek
(the “lungs of Pernik”), where at the edge of the village something was
burning and you could barely breathe (some lungs those are). I was going to
continue, but it got late, and it is a good thing I decided to turn back,
because I rode half of the 30 kilometres to Sofia in the dark. I reached the
end of Bosnek in about two and a half hours, of which 20-25 minutes were rest
stops.

From the Water Canal (above Vladaya) to Kladnitsa is one of the coolest rides
I have ever done. It is about 15 kilometres along a narrow forest trail with
a gentle descent, too narrow for two people to pass each other, yet you can
comfortably hold around 25 km/h going down. With more cyclists on the trail
it would probably be dangerous, but since I was alone I enjoyed it to the fullest.

The climb from Vladaya up to the Water Canal is nasty, and there it is best
to push or carry the bike. In general, wherever the road is badly rutted with
big rocks and gets steeper, it seems best to go on foot. You don’t lose much
time, and riding it would burn more energy than it is worth.

70+ kilometres of riding: 40 out through forests and rock fields and 30 back
along the motorway and a “first-class” road, and naturally I punctured the
rear tyre five kilometres from home. To hell with Bulgaria’s first-class
roads. The forest is safer…

Revisiting How We Put Together Linux Systems

Post Syndicated from Lennart Poettering original https://0pointer.net/blog/revisiting-how-we-put-together-linux-systems.html

In a previous blog story I discussed
Factory Reset, Stateless Systems, Reproducible Systems & Verifiable Systems.
I now want to take the opportunity to explain a bit where we want to
take this with systemd in the longer run, and what we want to build out of
it. This is going to be a longer story, so better grab a cold bottle of
Club Mate before you start reading.

Traditional Linux distributions are built around packaging systems
like RPM or dpkg, and an organization model where upstream developers
and downstream packagers are relatively clearly separated: an upstream
developer writes code, and puts it somewhere online, in a tarball. A
packager then grabs it and turns it into RPMs/DEBs. The user then
grabs these RPMs/DEBs and installs them locally on the system. For a
variety of uses this is a fantastic scheme: users have a large
selection of readily packaged software available, in mostly uniform
packaging, from a single source they can trust. In this scheme the
distribution vets all software it packages, and as long as the user
trusts the distribution all should be good. The distribution takes the
responsibility of ensuring the software is not malicious, of fixing
security problems in a timely manner, and of helping the user if something is wrong.

Upstream Projects

However, this scheme also has a number of problems, and doesn’t fit
many use-cases of our software particularly well. Let’s have a look at
the problems of this scheme for many upstreams:

  • Upstream software vendors are fully dependent on downstream
    distributions to package their stuff. It’s the downstream
    distribution that decides on schedules, packaging details, and how
    to handle support. Often upstream vendors want much faster release
cycles than the downstream distributions follow.

  • Realistic testing is extremely unreliable and next to
    impossible. Since the end-user can run a variety of different
    package versions together, and expects the software he runs to just
    work on any combination, the test matrix explodes. If upstream tests
    its version on distribution X release Y, then there’s no guarantee
    that that’s the precise combination of packages that the end user
    will eventually run. In fact, it is very unlikely that the end user
    will, since most distributions probably updated a number of
    libraries the package relies on by the time the package ends up being
    made available to the user. The fact that each package can be
    individually updated by the user, and each user can combine library
    versions, plug-ins and executables relatively freely, results in a high
    risk of something going wrong.

  • Since there are so many different distributions in so many different
    versions around, if upstream tries to build and test software for
    them it needs to do so for a large number of distributions, which is
    a massive effort.

  • The distributions are actually quite different in many ways. In
    fact, they are different in a lot of the most basic
    functionality. For example, the path where x86-64 libraries are installed
    differs between Fedora and Debian derived systems.

  • Developing software for a number of distributions and versions is
    hard: if you want to do it, you need to actually install them, each
    one of them, manually, and then build your software for each.

  • Since most downstream distributions have strict licensing and
    trademark requirements (and rightly so), any kind of closed source
    software (or otherwise non-free) does not fit into this scheme at
    all.

All of this together makes it really hard for many upstreams to work
nicely with the way Linux currently works. Often they try to improve
the situation for themselves, for example by bundling libraries, to make
their test and build matrices smaller.

System Vendors

The toolbox approach of classic Linux distributions is fantastic for
people who want to put together their individual system, nicely
adjusted to exactly what they need. However, this is not really how
many of today’s Linux systems are built, installed or updated. If you
build any kind of embedded device, a server system, or even user
systems, you frequently do your work based on complete system images
that are linearly versioned. You build these images somewhere, and
then you replicate them atomically to a larger number of systems. On
these systems, you don’t install or remove packages; you get a defined
set of files, and besides installing or updating the system there is
no way to change the set of tools you get.

The current Linux distributions are not particularly good at providing
for this major use-case of Linux. Their strict focus on individual
packages as well as package managers as end-user install and update
tool is incompatible with what many system vendors want.

Users

The classic Linux distribution scheme is frequently not what end users
want, either. Many users are used to the app markets that Android, Windows
or iOS/Mac have. Markets are a platform that doesn’t package, build or
maintain software like distributions do, but simply allows users to
quickly find and download the software they need, with the app vendor
responsible for keeping the app updated, secured, and all that on the
vendor’s release cycle. Users tend to be impatient. They want their
software quickly, and the fine distinction between trusting a single
distribution or a myriad of app developers individually is usually not
important for them. The companies behind the marketplaces usually try
to address this trust problem by providing sand-boxing technologies: as
a replacement for the distribution that audits, vets, builds and
packages the software and thus allows users to trust it to a certain
level, these vendors try to find technical solutions to ensure that
the software they offer for download can’t be malicious.

Existing Approaches To Fix These Problems

Now, all the issues pointed out above are not new, and there are
sometimes quite successful attempts to do something about it. Ubuntu
Apps, Docker, Software Collections, ChromeOS, CoreOS all fix part of
this problem set, usually with a strict focus on one facet of Linux
systems. For example, Ubuntu Apps focus strictly on end user (desktop)
applications, and don’t care about how we build/update/install the OS
itself, or containers. Docker OTOH focuses on containers only, and
doesn’t care about end-user apps. Software Collections tries to focus
on the development environments. ChromeOS focuses on the OS itself,
but only for end-user devices. CoreOS also focuses on the OS, but
only for server systems.

The approaches they find are usually good at specific things, and use
a variety of different technologies, on different layers. However,
none of these projects tried to fix these problems in a generic way,
for all uses, right in the core components of the OS itself.

Linux has achieved tremendous success because its kernel is so
generic: you can build supercomputers and tiny embedded devices out of
it. It’s time we came up with a basic, reusable scheme for solving
the problem set described above that is equally generic.

What We Want

The systemd cabal (Kay Sievers, Harald Hoyer, Daniel Mack, Tom
Gundersen, David Herrmann, and yours truly) recently met in Berlin
about all these things, and tried to come up with a scheme that is
somewhat simple, but tries to solve the issues generically, for all
use-cases, as part of the systemd project. All that in a way that is
somewhat compatible with the current scheme of distributions, to allow
a slow, gradual adoption. Also, and that’s something one cannot stress
enough: the toolbox scheme of classic Linux distributions is
actually a good one, and for many cases the right one. However, we
need to make sure we make distributions relevant again for all
use-cases, not just those of highly individualized systems.

Anyway, so let’s summarize what we are trying to do:

  • We want an efficient way that allows vendors to package their
    software (regardless of whether it is just an app or the whole OS) directly for
    the end user, and know the precise combination of libraries and
    packages it will operate with.

  • We want to allow end users and administrators to install these
    packages on their systems, regardless of which distribution they have
    installed on it.

  • We want a unified solution that ultimately can cover updates for
    full systems, OS containers, end user apps, programming ABIs, and
    more. These updates shall be double-buffered, (at least). This is an
    absolute necessity if we want to prepare the ground for operating
    systems that manage themselves, that can update safely without
    administrator involvement.

  • We want our images to be trustable (i.e. signed). In fact we want a
    fully trustable OS, with images that can be verified by a full
    trust chain from the firmware (EFI SecureBoot!), through the boot loader, through the
    kernel, and initrd. Cryptographically secure verification of the
    code we execute is relevant on the desktop (like ChromeOS does), but
    also for apps, for embedded devices and even on servers (in a post-Snowden
    world, in particular).

What We Propose

So much about the set of problems, and what we are trying to do. So,
now, let’s discuss the technical bits we came up with:

The scheme we propose is built around a variety of concepts from btrfs
and Linux file system name-spacing. btrfs at this point already has a
large number of features that fit neatly in our concept, and the
maintainers are busy working on a couple of others we want to
eventually make use of.

As first part of our proposal we make heavy use of btrfs sub-volumes and
introduce a clear naming scheme for them. We name snapshots like this:

  • usr:<vendorid>:<architecture>:<version> — This refers to a full
    vendor operating system tree. It’s basically a /usr tree (and no
    other directories), in a specific version, with everything you need to boot
    it up inside it. The <vendorid> field is replaced by some vendor
    identifier, maybe a scheme like
    org.fedoraproject.FedoraWorkstation. The <architecture> field
    specifies a CPU architecture the OS is designed for, for example
    x86-64. The <version> field specifies a specific OS version, for
    example 23.4. An example sub-volume name could hence look like this:
    usr:org.fedoraproject.FedoraWorkstation:x86_64:23.4

  • root:<name>:<vendorid>:<architecture> — This refers to an
    instance of an operating system. It’s basically a root directory,
    containing primarily /etc and /var (but possibly more). Sub-volumes
    of this type do not contain a populated /usr tree though. The
    <name> field refers to some instance name (maybe the host name of
    the instance). The other fields are defined as above. An example
    sub-volume name is
    root:revolution:org.fedoraproject.FedoraWorkstation:x86_64.

  • runtime:<vendorid>:<architecture>:<version> — This refers to a
    vendor runtime. A runtime here is supposed to be a set of
    libraries and other resources that are needed to run apps (for the
    concept of apps see below), all in a /usr tree. In this regard this
    is very similar to the usr sub-volumes explained above, however,
    while a usr sub-volume is a full OS and contains everything
    necessary to boot, a runtime is really only a set of
    libraries. You cannot boot it, but you can run apps with it. An
    example sub-volume name is: runtime:org.gnome.GNOME3_20:x86_64:3.20.1

  • framework:<vendorid>:<architecture>:<version> — This is very
    similar to a vendor runtime, as described above, it contains just a
    /usr tree, but goes one step further: it additionally contains all
    development headers, compilers and build tools, that allow
    developing against a specific runtime. For each runtime there should
    be a framework. When you develop against a specific framework in a
    specific architecture, then the resulting app will be compatible
    with the runtime of the same vendor ID and architecture. Example:
    framework:org.gnome.GNOME3_20:x86_64:3.20.1

  • app:<vendorid>:<runtime>:<architecture>:<version> — This
    encapsulates an application bundle. It contains a tree that at
    runtime is mounted to /opt/<vendorid>, and contains all the
    application’s resources. The <vendorid> could be a string like
    org.libreoffice.LibreOffice, the <runtime> refers to the
    vendor id of one specific runtime the application is built for, for
    example org.gnome.GNOME3_20:3.20.1. The <architecture> and
    <version> refer to the architecture the application is built for,
    and of course its version. Example:
    app:org.libreoffice.LibreOffice:GNOME3_20:x86_64:133

  • home:<user>:<uid>:<gid> — This sub-volume shall refer to the home
    directory of the specific user. The <user> field contains the user
    name, and the <uid> and <gid> fields contain the numeric Unix UID and GID
    of the user. The idea here is that in the long run the list of
    sub-volumes is sufficient as a user database (but see
    below). Example: home:lennart:1000:1000.

btrfs partitions that adhere to this naming scheme should be clearly
identifiable. It is our intention to introduce a new GPT partition type
ID for this.
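
To make the naming scheme concrete, here is a minimal Python sketch of how
such sub-volume names could be parsed and generated. It is purely
illustrative: the helper names and the field-layout table are mine, not part
of the proposal, and a real implementation would live elsewhere.

    # Illustrative sketch: parse/format sub-volume names of the proposed scheme.
    from collections import namedtuple

    SubVolume = namedtuple("SubVolume", "kind fields")

    # Field layout per kind, following the list above.
    LAYOUT = {
        "usr":       ["vendorid", "architecture", "version"],
        "root":      ["name", "vendorid", "architecture"],
        "runtime":   ["vendorid", "architecture", "version"],
        "framework": ["vendorid", "architecture", "version"],
        "app":       ["vendorid", "runtime", "architecture", "version"],
        "home":      ["user", "uid", "gid"],
    }

    def parse_subvolume(name: str) -> SubVolume:
        kind, *rest = name.split(":")
        expected = LAYOUT[kind]                      # KeyError for unknown kinds
        if len(rest) != len(expected):
            raise ValueError(f"malformed {kind} sub-volume name: {name!r}")
        return SubVolume(kind, dict(zip(expected, rest)))

    def format_subvolume(kind: str, **fields) -> str:
        return ":".join([kind] + [str(fields[f]) for f in LAYOUT[kind]])

    sv = parse_subvolume("usr:org.fedoraproject.FedoraWorkstation:x86_64:23.4")
    assert sv.fields["version"] == "23.4"
    assert format_subvolume("home", user="lennart", uid=1000, gid=1000) == "home:lennart:1000:1000"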

How To Use It

After we introduced this naming scheme let’s see what we can build of
this:

  • When booting up a system we mount the root directory from one of the
    root sub-volumes, and then mount /usr from a matching usr
    sub-volume. Matching here means it carries the same <vendorid>
    and <architecture>. Of course, by default we should pick the
    matching usr sub-volume with the newest version.

  • When we boot up an OS container, we do exactly the same as when
    we boot up a regular system: we simply combine a usr sub-volume
    with a root sub-volume.

  • When we enumerate the system’s users we simply go through the
    list of home snapshots.

  • When a user authenticates and logs in we mount his home
    directory from his snapshot.

  • When an app is run, we set up a new file system name-space, mount the
    app sub-volume to /opt/<vendorid>/, and the appropriate runtime
    sub-volume the app picked to /usr, as well as the user’s
    /home/$USER to its place.

  • When a developer wants to develop against a specific runtime he
    installs the right framework, and then temporarily transitions into
    a name space where /usr is mounted from the framework sub-volume, and
    /home/$USER from his own home directory. In this name space he then
    runs his build commands. He can build in multiple name spaces at the
    same time, if he intends to build software for multiple runtimes or
    architectures at the same time.

Instantiating a new system or OS container (which is exactly the same
in this scheme) just consists of creating a new appropriately named
root sub-volume. Quite naturally, you can share one vendor OS
copy in one specific version with a multitude of container instances.

Everything is double-buffered (or actually, n-fold-buffered), because
usr, runtime, framework, app sub-volumes can exist in multiple
versions. Of course, by default the execution logic should always pick
the newest release of each sub-volume, but it is up to the user to keep
multiple versions around, and possibly execute older versions, if he
desires to do so. In fact, like on ChromeOS this could even be handled
automatically: if a system fails to boot with a newer snapshot, the
boot loader can automatically revert to an older version of the
OS.
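
To illustrate the “pick the newest by default, keep older versions around for
rollback” logic, here is a small Python sketch that selects the matching usr
sub-volume for a given root sub-volume. The version comparison is a deliberate
simplification made up for this example; real version ordering would need
vendor-defined rules.

    # Illustrative sketch: pick the newest usr sub-volume that matches a root
    # sub-volume (same vendor id and architecture). Version ordering is crude.
    import re

    def version_key(version: str):
        # Numeric chunks compare numerically, everything else lexically.
        return [(0, int(c)) if c.isdigit() else (1, c)
                for c in re.split(r"(\d+)", version) if c]

    def pick_usr(root_subvol: str, all_subvols: list) -> str:
        _, _, vendorid, arch = root_subvol.split(":")
        candidates = [s for s in all_subvols
                      if s.split(":")[:3] == ["usr", vendorid, arch]]
        if not candidates:
            raise LookupError("no matching usr sub-volume for " + root_subvol)
        # Newest version wins by default; older ones stay around for rollback.
        return max(candidates, key=lambda s: version_key(s.split(":")[3]))

    subvols = [
        "usr:org.fedoraproject.WorkStation:x86_64:24.7",
        "usr:org.fedoraproject.WorkStation:x86_64:24.9",
        "usr:org.mageia.Client:i386:39.6",
    ]
    print(pick_usr("root:revolution:org.fedoraproject.WorkStation:x86_64", subvols))
    # -> usr:org.fedoraproject.WorkStation:x86_64:24.9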

An Example

Note that as a result this allows installing not only multiple end-user
applications into the same btrfs volume, but also multiple operating
systems, multiple system instances, multiple runtimes, multiple
frameworks. Or to spell this out in an example:

Let’s say Fedora, Mageia and ArchLinux all implement this scheme,
and provide ready-made end-user images. Also, the GNOME, KDE, SDL
projects all define a runtime+framework to develop against. Finally,
both LibreOffice and Firefox provide their stuff according to this
scheme. You can now trivially install all of these into the same btrfs
volume:

  • usr:org.fedoraproject.WorkStation:x86_64:24.7
  • usr:org.fedoraproject.WorkStation:x86_64:24.8
  • usr:org.fedoraproject.WorkStation:x86_64:24.9
  • usr:org.fedoraproject.WorkStation:x86_64:25beta
  • usr:org.mageia.Client:i386:39.3
  • usr:org.mageia.Client:i386:39.4
  • usr:org.mageia.Client:i386:39.6
  • usr:org.archlinux.Desktop:x86_64:302.7.8
  • usr:org.archlinux.Desktop:x86_64:302.7.9
  • usr:org.archlinux.Desktop:x86_64:302.7.10
  • root:revolution:org.fedoraproject.WorkStation:x86_64
  • root:testmachine:org.fedoraproject.WorkStation:x86_64
  • root:foo:org.mageia.Client:i386
  • root:bar:org.archlinux.Desktop:x86_64
  • runtime:org.gnome.GNOME3_20:x86_64:3.20.1
  • runtime:org.gnome.GNOME3_20:x86_64:3.20.4
  • runtime:org.gnome.GNOME3_20:x86_64:3.20.5
  • runtime:org.gnome.GNOME3_22:x86_64:3.22.0
  • runtime:org.kde.KDE5_6:x86_64:5.6.0
  • framework:org.gnome.GNOME3_22:x86_64:3.22.0
  • framework:org.kde.KDE5_6:x86_64:5.6.0
  • app:org.libreoffice.LibreOffice:GNOME3_20:x86_64:133
  • app:org.libreoffice.LibreOffice:GNOME3_22:x86_64:166
  • app:org.mozilla.Firefox:GNOME3_20:x86_64:39
  • app:org.mozilla.Firefox:GNOME3_20:x86_64:40
  • home:lennart:1000:1000
  • home:hrundivbakshi:1001:1001

In the example above, we have three vendor operating systems
installed. All of them in three versions, and one even in a beta
version. We have four system instances around. Two of them are Fedora:
maybe one of them we usually boot from, and the other we run for very
specific purposes in an OS container. We also have the runtimes for
two GNOME releases in multiple versions, plus one for KDE. Then, we
have the development trees for one version of KDE and GNOME around, as
well as two apps that make use of two releases of the GNOME
runtime. Finally, we have the home directories of two users.

Now, with the name-spacing concepts we introduced above, we can
actually relatively freely mix and match apps and OSes, or develop
against specific frameworks in specific versions on any operating
system. It doesn’t matter if you booted your ArchLinux instance, or
your Fedora one, you can execute both LibreOffice and Firefox just
fine, because at execution time they get matched up with the right
runtime, and all of them are available from all the operating systems
you installed. You get the precise runtime that the upstream vendor of
Firefox/LibreOffice did their testing with. It doesn’t matter anymore
which distribution you run, and which distribution the vendor prefers.

Also, given that the user database is actually encoded in the
sub-volume list, it doesn’t matter which system you boot: the
distribution should be able to find your local users automatically,
without any configuration in /etc/passwd.
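
As a sketch of that idea, the following Python fragment treats
home:<user>:<uid>:<gid> sub-volume names as a minimal user database; the
record type and function name are hypothetical, chosen just for illustration.

    # Illustrative sketch: enumerate users from "home:<user>:<uid>:<gid>" names.
    from typing import Iterable, Iterator, NamedTuple

    class UserRecord(NamedTuple):
        name: str
        uid: int
        gid: int
        home: str                      # where the sub-volume would be mounted

    def enumerate_users(subvols: Iterable[str]) -> Iterator[UserRecord]:
        for sv in subvols:
            if sv.startswith("home:"):
                _, user, uid, gid = sv.split(":")
                yield UserRecord(user, int(uid), int(gid), "/home/" + user)

    for rec in enumerate_users(["home:lennart:1000:1000",
                                "usr:org.mageia.Client:i386:39.3"]):
        print(rec)   # UserRecord(name='lennart', uid=1000, gid=1000, home='/home/lennart')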

Building Blocks

With this naming scheme, plus the way we can combine sub-volumes at
execution time, we have already come quite far, but how do we actually get these
sub-volumes onto the final machines, and how do we update them? Well,
btrfs has a feature they call “send-and-receive”. It basically allows
you to “diff” two file system versions, and generate a binary
delta. You can generate these deltas on a developer’s machine and then
push them into the user’s system, and he’ll get the exact same
sub-volume too. This is how we envision installation and updating of
operating systems, applications, runtimes, frameworks. At installation
time, we simply deserialize an initial send-and-receive delta into
our btrfs volume, and later, when a new version is released we just
add in the few bits that are new, by dropping in another
send-and-receive delta under a new sub-volume name. And we do it
exactly the same for the OS itself, for a runtime, a framework or an
app. There’s no technical distinction anymore. The underlying
operation for installing apps, runtimes, frameworks, and vendor OSes, as well
as the operation for updating them, is done the exact same way for all.
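
The btrfs send and btrfs receive commands are real; the small Python wrapper
below is only a sketch of how installation and incremental updates could be
driven with them, with made-up function names and file names.

    # Illustrative sketch: drive installs/updates with btrfs send/receive.
    import subprocess
    from typing import Optional

    def publish_delta(new_subvol: str, parent: Optional[str], out_file: str) -> None:
        """On the build machine: serialize a (possibly incremental) delta."""
        cmd = ["btrfs", "send"]
        if parent:
            cmd += ["-p", parent]          # ship only what changed vs. the parent
        cmd.append(new_subvol)
        with open(out_file, "wb") as out:
            subprocess.run(cmd, stdout=out, check=True)

    def apply_delta(delta_file: str, target_dir: str) -> None:
        """On the user's machine: materialize the sub-volume from the delta."""
        with open(delta_file, "rb") as delta:
            subprocess.run(["btrfs", "receive", target_dir], stdin=delta, check=True)

    # Initial install, then an incremental update to a newer usr sub-volume:
    # publish_delta("/build/usr:org.fedoraproject.WorkStation:x86_64:24.9",
    #               "/build/usr:org.fedoraproject.WorkStation:x86_64:24.8",
    #               "24.8-to-24.9.delta")
    # apply_delta("24.8-to-24.9.delta", "/")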

Of course, keeping multiple full /usr trees around sounds like an
awful lot of waste, after all they will contain a lot of very similar
data, since a lot of resources are shared between distributions,
frameworks and runtimes. However, thankfully btrfs actually is able to
de-duplicate this for us. If we add in a new app snapshot, this simply
adds in the new files that changed. Moreover different runtimes and
operating systems might actually end up sharing the same tree.

Even though the example above focuses primarily on the end-user,
desktop side of things, the concept is also extremely powerful in
server scenarios. For example, it is easy to build your own usr
trees and deliver them to your hosts using this scheme. The usr
sub-volumes are supposed to be something that administrators can put
together. After deserializing them into a couple of hosts, you can
trivially instantiate them as OS containers there, simply by adding a
new root sub-volume for each instance, referencing the usr tree you
just put together. Instantiating OS containers hence becomes as easy
as creating a new btrfs sub-volume. And you can still update the images
nicely, get fully double-buffered updates and everything.

And of course, this scheme also applies just as well to embedded
use-cases. Regardless of whether you build a TV, an IVI system or a phone: you
can put together your OS versions as usr trees, and then use
btrfs-send-and-receive facilities to deliver them to the systems, and
update them there.

Many people when they hear the word “btrfs” instantly reply with “is
it ready yet?”. Thankfully, most of the functionality we really need
here is strictly read-only. With the exception of the home
sub-volumes (see below) all snapshots are strictly read-only, and are
delivered as immutable vendor trees onto the devices. They never are
changed. Even if btrfs might still be immature, for this kind of
read-only logic it should be more than good enough.

Note that this scheme also enables doing fat systems: for example,
an installer image could include a Fedora version compiled for x86-64,
one for i386, one for ARM, all in the same btrfs volume. Due to btrfs’
de-duplication they will share as much as possible, and when the image
is booted up the right sub-volume is automatically picked. Something
similar of course applies to the apps too!

This also allows us to implement something that we like to call
Operating-System-As-A-Virus. Installing a new system is little more
than:

  • Creating a new GPT partition table
  • Adding an EFI System Partition (FAT) to it
  • Adding a new btrfs volume to it
  • Deserializing a single usr sub-volume into the btrfs volume
  • Installing a boot loader into the EFI System Partition
  • Rebooting
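
Scripted naively, the steps above could look roughly like the sketch below.
sgdisk, mkfs.vfat, mkfs.btrfs, mount and btrfs receive are real tools, but the
device naming, partition sizes and the unspecified boot loader step are
placeholders of mine, not part of the proposal.

    # Illustrative sketch of the installation steps listed above.
    import subprocess

    def run(*cmd: str) -> None:
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    def install_os(device: str, usr_delta: str) -> None:
        # 1. New GPT partition table: an EFI System Partition plus a btrfs partition.
        run("sgdisk", "--zap-all", device)
        run("sgdisk", "-n", "1:0:+512M", "-t", "1:ef00", device)   # ESP
        run("sgdisk", "-n", "2:0:0", "-t", "2:8300", device)       # btrfs volume
        esp, btrfs_part = device + "1", device + "2"                # naive naming
        # 2. File systems.
        run("mkfs.vfat", "-F", "32", esp)
        run("mkfs.btrfs", btrfs_part)
        # 3. Deserialize a single usr sub-volume into the btrfs volume.
        run("mount", btrfs_part, "/mnt")
        with open(usr_delta, "rb") as delta:
            subprocess.run(["btrfs", "receive", "/mnt"], stdin=delta, check=True)
        # 4. Install a boot loader of choice into the ESP (left out here).
        # 5. Reboot.

    # install_os("/dev/sdX", "usr-org.fedoraproject.WorkStation-x86_64-24.9.delta")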

Now, since the only real vendor data you need is the usr sub-volume,
you can trivially duplicate this onto any block device you want. Let’s
say you are a happy Fedora user, and you want to provide a friend with
his own installation of this awesome system, all on a USB stick. All
you have to do for this is follow the steps above, using your installed
usr tree as the source to copy. And there you go! And you don’t have to
be afraid that any of your personal data is copied too, as the usr
sub-volume is the exact version your vendor provided you with. Or in
other words: there’s no distinction anymore between installer images
and installed systems. It’s all the same. Installation becomes
replication, not more. Live-CDs and installed systems can be fully
identical.

Note that in this design apps are actually developed against a single,
very specific runtime that contains all the libraries it can link against
(including a specific glibc version!). Any library that is not
included in the runtime the developer picked must be included in the
app itself. This is similar to how apps on Android declare one very
specific Android version they are developed against. This greatly
simplifies application installation, as there’s no dependency hell:
each app pulls in one runtime, and the app is actually free to pick
which one, as you can have multiple installed, though only one is used
by each app.

Also note that operating systems built this way will never see
“half-updated” systems, as is common when a system is updated using
RPM/dpkg. When updating the system the code will either run the old or
the new version, but it will never see part of the old files and part
of the new files. This is the same for apps, runtimes, and frameworks,
too.

Where We Are Now

We are currently working on a lot of the groundwork necessary for
this. This scheme relies on the ability to monopolize the
vendor OS resources in /usr, which is key to what I described in
Factory Reset, Stateless Systems, Reproducible Systems & Verifiable Systems
a few weeks back. Then, of course, for the full desktop app concept we
need a strong sandbox that does more than just hide files from the
file system view. After all, with an app concept like the above, the
primary interface between the executed desktop apps and the rest of the
system is via IPC (which is why we work on kdbus and teach it all
kinds of sand-boxing features), and the kernel itself. Harald Hoyer has
started working on generating the btrfs send-and-receive images based
on Fedora.

Getting to the full scheme will take a while. Currently we have many
of the building blocks ready, but some major items are missing. For
example, we push quite a few problems into btrfs that other solutions
try to solve in user space. One of them is actually
signing/verification of images. The btrfs maintainers are working on
adding this to the code base, but currently nothing exists. This
functionality is essential, though, if we are to arrive at a fully verified system
where a trust chain exists all the way from the firmware to the
apps. Also, to make the home sub-volume scheme fully workable we
actually need encrypted sub-volumes, so that the sub-volume’s
pass-phrase can be used for authenticating users in PAM. This doesn’t
exist either.

Working towards this scheme is a gradual process. Many of the steps we
require for this are useful outside of the grand scheme though, which
means we can slowly work towards the goal, and our users can already
benefit from what we are working on as we go.

Also, and most importantly, this is not really a departure from
traditional operating systems:

Each app and each OS sees a traditional Unix hierarchy with
/usr, /home, /opt, /var, /etc. It executes in an environment that is
pretty much identical to how it would be run on traditional systems.

There’s no need to fully move to a system that uses only btrfs and
follows strictly this sub-volume scheme. For example, we intend to
provide implicit support for systems that are installed on ext4 or
xfs, or that are put together with traditional packaging tools such as
RPM or dpkg: if the user tries to install a
runtime/app/framework/os image on a system that doesn’t use btrfs so
far, it can just create a loop-back btrfs image in /var, and push the
data into that. Even we developers will run our stuff like this for a
while, after all this new scheme is not particularly useful for highly
individualized systems, and we developers usually tend to run
systems like that.

Also note that this is in no way a departure from packaging systems like
RPM or DEB. Even if the new scheme we propose is used for installing
and updating a specific system, it is RPM/DEB that is used to put
together the vendor OS tree initially. Hence, even in this scheme
RPM/DEB are highly relevant, though not strictly as an end-user tool
anymore, but as a build tool.

So Let’s Summarize Again What We Propose

  • We want a unified scheme for how we can install and update OS images,
    user apps, runtimes and frameworks.

  • We want a unified scheme for how you can relatively freely mix OS
    images, apps, runtimes and frameworks on the same system.

  • We want a fully trusted system, where cryptographic verification of
    all executed code can be done, all the way to the firmware, as
    a standard feature of the system.

  • We want to allow app vendors to write their programs against very
    specific frameworks, knowing that they will end up being
    executed with the exact same set of libraries they chose.

  • We want to allow parallel installation of multiple OSes and versions
    of them, multiple runtimes in multiple versions, as well as multiple
    frameworks in multiple versions. And of course, multiple apps in
    multiple versions.

  • We want everything double buffered (or actually n-fold buffered), to
    ensure we can reliably update/rollback versions, in particular to
    safely do automatic updates.

  • We want a system where updating a runtime, OS, framework, or OS
    container is as simple as adding in a new snapshot and restarting
    the runtime/OS/framework/OS container.

  • We want a system where we can easily instantiate a number of OS
    instances from a single vendor tree, with zero difference whether we do
    this in order to boot it on bare metal, in a VM, or as a
    container.

  • We want to enable Linux to have an open scheme that people can use
    to build app markets and similar schemes, not restricted to a
    specific vendor.

Final Words

I’ll be talking about this at LinuxCon Europe in October. I originally
intended to discuss this at the Linux Plumbers Conference (which I
assumed was the right forum for this kind of major plumbing level
improvement), and at linux.conf.au, but there was no interest in my
session submissions there…

Of course this is all work in progress. These are our current ideas we
are working towards. As we progress we will likely change a number of
things. For example, the precise naming of the sub-volumes might look
very different in the end.

Of course, we are developers of the systemd project. Implementing this
scheme is not just a job for the systemd developers. This is a
reinvention of how distributions work, and hence needs great support from
the distributions. We really hope we can trigger some interest by
publishing this proposal now, to get the distributions on board. This
after all is explicitly not supposed to be a solution for one specific
project and one specific vendor product, we care about making this
open, and solving it for the generic case, without cutting corners.

If you have any questions about this, you know how you can reach us
(IRC, mail, G+, …).

The future is going to be awesome!

Thursday, 28 August 2014

Post Syndicated from georgi original http://georgi.unixsol.org/diary/archive.php/2014-08-28

Second official training ride: the climb to Kopitoto. The result: 15 km in 90
minutes, with the return from Kopitoto to the traffic light on Bulgaria Blvd
taking only 20 minutes. It is really cool when at some point you come out
above the trees and the neighbouring hills start to come into view.

Meanwhile, I have already ridden up to Tihiya Kat three times and descended
to Vladaya along the trail. They are working on it at the moment, and riding
it at 25-30 km/h while it is covered in three fingers of sand and the odd
patch of gravel is quite the adventure. Basically you grip the handlebars
firmly and try not to move around too much, otherwise a crash is guaranteed.

I hope the trail firms up, because right now if several of us ride it at the
same time it will be quite dangerous. Otherwise, the descent from Tihiya Kat
to Vladaya takes me about 6 minutes.

The next goal is Kladnitsa, which I need to ride to three or four times to
see how I will feel after the climb and the descent. Kilometres need to be
piled up.

Monday, 11 August 2014

Post Syndicated from georgi original http://georgi.unixsol.org/diary/archive.php/2014-08-11

I have started preparing for the Tour of Vitosha in 2015. I have been meaning
to for several years, but this time I intend to be serious about it and take
part (unless something runs me over in the meantime).

I was going to ride in the afternoon, but I was not in the mood, and inspiration only came towards evening.

I left home at 20:10 (even wearing a helmet!) and stopped only once, to rest
for half a minute and answer nature’s call. It turned out I needn’t have
stopped at all: I was only two hundred metres from Tihiya Kat.

For a first attempt at the route, the result was ~60 minutes from home in
Borovo to Belovodski Pat, where the descent towards Vladaya begins. Distance
covered: a bit over 10 kilometres, with an elevation gain of around 450 metres.

By then it was around 21:10 and quite dark; along the whole trail through the
forest you could see almost nothing, so I had to turn back.

The descent to the Billa on Bulgaria Blvd took me 12-13 minutes because of
the dark. In daylight I will probably be braver.

Report filed; the next ride will be in at least a week, because I simply
won’t be in Sofia. Some serious preparation this is turning out to be…

And for the car/motorbike tourists, here is a link to a recent trip by Dedo
Vladi around the Balkan countries by motorbike: Macedonia, Albania,
Montenegro and a bit of Croatia, 2014.

Binary fuzzing strategies: what works, what doesn’t

Post Syndicated from Unknown original https://lcamtuf.blogspot.com/2014/08/binary-fuzzing-strategies-what-works.html

Successful fuzzers live and die by their fuzzing strategies. If the changes made to the input file are too conservative, the fuzzer will achieve very limited coverage. If the tweaks are too aggressive, they will cause most inputs to fail parsing at a very early stage, wasting CPU cycles and spewing out messy test cases that are difficult to investigate and troubleshoot.

Designing the mutation engine for a new fuzzer has more to do with art than science. But one of the interesting side effects of the design of american fuzzy lop is that it provides a rare feedback loop: you can carefully measure what types of changes to the input file actually result in the discovery of new branches in the code, and which ones just waste your time or money.

This data is particularly easy to read because the fuzzer also approaches every new input file by going through a series of progressively more complex, but exhaustive and deterministic fuzzing strategies – say, sequential bit flips and simple arithmetics – before diving into purely random behaviors. The reason for this is the desire to generate the simplest and most elegant test cases first; but the design also provides a very good way to quantify how much value each new strategy brings to the table – and whether we need it at all.

The measurements of afl fuzzing efficiency for reasonably-sized test cases are remarkably consistent across a variety of real-world binary formats – anything ranging from image files (JPEG, PNG, GIF, WebP) to archives (gzip, xz, tar) – and because of this, I figured that sharing the data more broadly will be useful to folks who are working on fuzzers of their own. So, let’s dive in:

  • Walking bit flips: the first and most rudimentary strategy employed by afl involves performing sequential, ordered bit flips. The stepover is always one bit; the number of bits flipped in a row varies from one to four. Across a large and diverse corpus of input files, the observed yields are:

    • Flipping a single bit: ~70 new paths per one million generated inputs,
    • Flipping two bits in a row: ~20 additional paths per million generated inputs,
    • Flipping four bits in a row: ~10 additional paths per million inputs.

    (Note that the counts for every subsequent pass include only the paths that could not have been discovered by the preceding strategy.)

    Of course, the strategy is relatively expensive, with each pass requiring eight execve() calls per byte of the input file. With the returns diminishing rapidly, afl stops after these three passes and switches to a second, less expensive strategy past that point. (A rough sketch of all the deterministic passes appears after this list.)

  • Walking byte flips: a natural extension of the walking bit flip approach, this method relies on 8-, 16-, or 32-bit wide bitflips with a constant stepover of one byte. This strategy discovers around 30 additional paths per million inputs, on top of what could have been triggered with shorter bit flips.

    It should be fairly obvious that each pass takes approximately one execve() call per one byte of the input file, making it surprisingly cheap, but also limiting its potential yields in absolute terms.

  • Simple arithmetics: to trigger more complex conditions in a deterministic fashion, the third stage employed by afl attempts to subtly increment or decrement existing integer values in the input file; this is done with a stepover of one byte. The experimentally chosen range for the operation is -35 to +35; past these bounds, fuzzing yields drop dramatically. In particular, the popular option of sequentially trying every single value for each byte (equivalent to arithmetics in the range of -128 to +127) helps very little and is skipped by afl.

    When it comes to the implementation, the stage consists of three separate operations. First, the fuzzer attempts to perform subtraction and addition on individual bytes. With this out of the way, the second pass involves looking at 16-bit values, using both endians – but incrementing or decrementing them only if the operation would have also affected the most significant byte (otherwise, the operation would simply duplicate the results of the 8-bit pass). The final stage follows the same logic, but for 32-bit integers.

    The yields for this method vary depending on the format – ranging from ~2 additional paths per million in JPEG to ~8 per million in xz. The cost is relatively high, averaging around 20 execve() calls per one byte of the input file – but can be significantly improved with only a modest impact on path coverage by sticking to +/- 16.

  • Known integers: the last deterministic approach employed by afl relies on a hardcoded set of integers chosen for their demonstrably elevated likelihood of triggering edge conditions in typical code (e.g., -1, 256, 1024, MAX_INT-1, MAX_INT). The fuzzer uses a stepover of one byte to sequentially overwrite existing data in the input file with one of the approximately two dozen “interesting” values, using both endians (the writes are 8-, 16-, and 32-bit wide).

    The yields for this stage are between 2 and 5 additional paths per one million tries; the average cost is roughly 30 execve() calls per one byte of input file.

  • Stacked tweaks: with deterministic strategies exhausted for a particular input file, the fuzzer continues with a never-ending loop of randomized operations that consist of a stacked sequence of:

    • Single-bit flips,
    • Attempts to set “interesting” bytes, words, or dwords (both endians),
    • Addition or subtraction of small integers to bytes, words, or dwords (both endians),
    • Completely random single-byte sets,
    • Block deletion,
    • Block duplication via overwrite or insertion,
    • Block memset.

    Based on a fair amount of testing, the optimal execution path yields appear to be achieved when the probability of each operation is roughly the same; the number of stacked operations is chosen as a power-of-two between 1 and 64; and the block size for block operations is capped at around 1 kB.

    The absolute yield for this stage is typically comparable to or higher than the total number of execution paths discovered by all the deterministic stages earlier on.

  • Test case splicing: this is a last-resort strategy that involves taking two distinct input files from the queue that differ in at least two locations, splicing them at a random location in the middle, and then sending this transient input file through a short run of the “stacked tweaks” algorithm. This strategy usually discovers around 20% additional execution paths that are unlikely to be triggered by the previous operations alone.

    (Of course, this method requires a good, varied corpus of input files to begin with; afl generates one automatically, but for other tools, you may have to construct it manually.)
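
To make the deterministic passes above a bit more concrete, here is a rough, illustrative sketch of them in Python: bit flips, byte flips, small arithmetics, and known-integer overwrites expressed as generators over an input buffer. This is not afl's actual implementation (afl is written in C, uses longer tuned value lists, and skips mutations already covered by earlier passes); all names and the particular INTERESTING_16 values are made up for the example.

# Illustrative sketch only; not afl's code. Each generator yields mutated
# copies of the input, one deterministic step at a time.
import struct

ARITH_MAX = 35                      # the +/- 35 range discussed above
INTERESTING_16 = [-32768, -129, 128, 255, 256, 512, 1000, 1024, 4096, 32767]

def walking_bit_flips(data: bytes, run_len: int):
    """Flip run_len consecutive bits at every bit position (stepover: 1 bit)."""
    total_bits = len(data) * 8
    for pos in range(total_bits - run_len + 1):
        out = bytearray(data)
        for bit in range(pos, pos + run_len):
            out[bit // 8] ^= 0x80 >> (bit % 8)
        yield bytes(out)

def walking_byte_flips(data: bytes, width: int):
    """Flip width consecutive bytes (stepover: 1 byte)."""
    for pos in range(len(data) - width + 1):
        out = bytearray(data)
        for i in range(pos, pos + width):
            out[i] ^= 0xFF
        yield bytes(out)

def simple_arithmetics_8(data: bytes):
    """Add or subtract small deltas from each byte, wrapping around."""
    for pos in range(len(data)):
        for delta in range(1, ARITH_MAX + 1):
            for signed_delta in (delta, -delta):
                out = bytearray(data)
                out[pos] = (out[pos] + signed_delta) & 0xFF
                yield bytes(out)

def known_integers_16(data: bytes):
    """Overwrite each 16-bit window with 'interesting' values, both endians."""
    for pos in range(len(data) - 1):
        for value in INTERESTING_16:
            for fmt in ('<h', '>h'):
                out = bytearray(data)
                out[pos:pos + 2] = struct.pack(fmt, value)
                yield bytes(out)

Each yielded buffer would then be fed to one execution of the target, which is where the per-byte execve() costs quoted above come from.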

As you can see, deterministic block operations (duplication, splicing) are not attempted in an exhaustive fashion; this is because they generally require quadratic time (or worse) – so while their yields may be good for very short inputs, they degrade very quickly.
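
For completeness, a similarly rough sketch of the randomized “stacked tweaks” stage might look like this: again only an illustration in Python, implementing just a subset of the listed operations, with all names invented for the example.

# Sketch of the randomized "stacked tweaks" stage: pick a power-of-two
# number of operations (1 to 64) and apply them on top of each other.
import random

def stacked_tweaks(data: bytes, rng=random, max_block=1024):
    out = bytearray(data)
    for _ in range(2 ** rng.randint(0, 6)):     # 1 to 64 stacked operations
        if not out:
            break
        op = rng.randrange(4)
        pos = rng.randrange(len(out))
        if op == 0:                             # single-bit flip
            out[pos] ^= 0x80 >> rng.randrange(8)
        elif op == 1:                           # completely random byte set
            out[pos] = rng.randrange(256)
        elif op == 2:                           # block deletion
            length = rng.randint(1, min(max_block, len(out) - pos))
            del out[pos:pos + length]
        else:                                   # block duplication via insertion
            length = rng.randint(1, min(max_block, len(out) - pos))
            out[pos:pos] = out[pos:pos + length]
    return bytes(out)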

Well, that’s it! If you ever decide to try out afl, you can watch these and other cool stats on your screen in real time.

Logs and more logs, who has time to read them ?

Post Syndicated from Michael "Monty" Widenius original http://monty-says.blogspot.com/2014/08/logs-and-more-logs-who-has-time-to-read.html

While working on some new features in MariaDB 10.1, I noticed that a normal user couldn’t disable the slow query log, which I thought was a bit silly.

While implementing and documenting this feature, I noticed that the information about the different logs is quite spread around and it’s not that trivial to find out how to enable/disable the different logs.

To solve this, I created a new MariaDB kb entry, “Overview of the MariaDB logs”, which I hope MariaDB and MySQL users will find useful.

Here follows a copy of the kb entry.
If you have any comments or things that could be added, please do it in the kb entry so that it will benefit as many as possible!

Overview of MariaDB logs

There are many variables in MariaDB that you can use to define what to log and when to log.

This article will give you an overview of the different logs and how to enable/disable logging to these.

The error log

  • Always enabled
  • Usually a file in the data directory, but some distributions may move this to other locations.
  • All critical errors are logged here.
  • One can get warnings to be logged by setting log_warnings.
  • With the mysqld_safe --syslog option one can duplicate the messages to the system’s syslog.

General query log

  • Enabled with --general-log
  • Logs all queries to a file or table.
  • Useful for debugging or auditing queries.
  • The super user can disable logging to it for a connection by setting SQL_LOG_OFF to 1.

Slow Query log

The binary log

Examples

If you know that your next query will be slow and you don’t want to log it in the slow query log, do:

SET LOCAL SLOW_QUERY_LOG=0;

If you are a super user running a long batch job that you don’t want to have logged (for example mysqldump), do:

SET LOCAL SQL_LOG_OFF=1, LOCAL SLOW_QUERY_LOG=0;

mysqldump in MariaDB 10.1 will do this automatically if you run it with the --disable-log-querys option.

See also

A bit more about american fuzzy lop

Post Syndicated from Unknown original https://lcamtuf.blogspot.com/2014/08/a-bit-more-about-american-fuzzy-lop.html

Fuzzing is one of the most powerful strategies for identifying security issues in real-world software. Unfortunately, it also offers fairly shallow coverage: it is impractical to exhaustively cycle through all possible inputs, so even something as simple as setting three separate bytes to a specific value to reach a chunk of unsafe code can be an insurmountable obstacle to a typical fuzzer.
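
For instance, a check like the following (a contrived Python illustration, not taken from any real target) is nearly impossible for a blind mutator to satisfy by chance:

def parse_header(buf: bytes) -> bool:
    # Contrived example: three specific bytes gate the interesting code.
    # A blind mutator has about a 1-in-2**24 chance of producing them at
    # these positions, so the branch below is almost never explored.
    if buf[:3] == b'\x13\x37\x42':
        return True    # the "unsafe" code path a fuzzer would want to reach
    return False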

There have been numerous attempts to solve this problem by augmenting the process with additional information about the behavior of the tested code. These techniques can be divided into three broad groups:

  • Simple coverage maximization. This approach boils down to trying to isolate initial test cases that offer diverse code coverage in the targeted application – and then fuzzing them using conventional techniques.

  • Control flow analysis. A more sophisticated technique that leverages instrumented binaries to focus the fuzzing efforts on mutations that generate distinctive sequences of conditional branches within the instrumented binary.

  • Static analysis. An approach that attempts to reason about potentially interesting states within the tested program and then make educated guesses about the input values that could possibly trigger them.

The first technique is surprisingly powerful when used to pre-select initial test cases from a massive corpus of valid data – say, the result of a large-scale web crawl. Unfortunately, coverage measurements provide only a very simplistic view of the internal state of the program, making them less suited for creatively guiding the fuzzing process later on.

The latter two techniques are extremely promising in experimental settings. That said, in real-world applications, they are not only very slow, but frequently lead to irreducible complexity: most of the high-value targets will have a vast number of internal states and possible execution paths, and deciding which ones are interesting and substantially different from the rest is an extremely difficult challenge that, if not solved, usually causes the “smart” fuzzer to perform no better than a traditional one.

American fuzzy lop tries to find a reasonable middle ground between sophistication and practical utility. In essence, it’s a fuzzer that relies on a form of edge coverage measurements to detect subtle, local-scale changes to program control flow without having to perform complex global-scale comparisons between series of long and winding execution traces – a common failure point for similar tools.

In almost-plain English, the fuzzer does this by instrumenting every effective line of C or C++ code (or any other GCC-supported language) to record a tuple in the following format:

[ID of current code location], [ID of previously-executed code location]

The ordering information for tuples is discarded; the primary signal used by the fuzzer is the appearance of a previously-unseen tuple in the output dataset; this is also coupled with a coarse magnitude count for tuple hit rate. This method combines the self-limiting nature of simple coverage measurements with the sensitivity of control flow analysis. It detects both explicit conditional branches and indirect variations in the behavior of the tested app.
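
As a rough mental model of that signal, the tuple bookkeeping can be pictured as below. The real instrumentation is injected at compile time and writes to a shared memory region; the map size, the XOR-based index, and the shift applied to the previous location here are assumptions made for the sake of the sketch, not a description of afl internals.

# Toy model of the coverage signal described above; not the real
# compile-time instrumentation. Locations are just small integers here.
MAP_SIZE = 1 << 16          # assumed map size for this sketch

class CoverageMap:
    def __init__(self):
        self.hits = bytearray(MAP_SIZE)
        self.prev_location = 0

    def record(self, cur_location: int):
        """Fold the (previous, current) location pair into the map."""
        index = (cur_location ^ self.prev_location) % MAP_SIZE
        if self.hits[index] < 255:
            self.hits[index] += 1          # coarse hit-count magnitude
        # Shifting keeps A->B distinct from B->A, and A->A from B->B.
        self.prev_location = cur_location >> 1

def new_tuples(run: CoverageMap, seen_so_far: bytearray) -> bool:
    """True if this execution touched any tuple never seen in earlier runs."""
    found = False
    for i, count in enumerate(run.hits):
        if count and not seen_so_far[i]:
            seen_so_far[i] = 1
            found = True
    return found

Keeping only set membership plus a coarse count is what makes comparing two executions cheap enough to do millions of times.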

The output from this instrumentation is used as a part of a simple, vaguely “genetic” algorithm:

  1. Load user-supplied initial test cases into the queue,

  2. Take input file from the queue,

  3. Repeatedly mutate the file using a balanced variety of traditional fuzzing strategies (see later),

  4. If any of the generated mutations resulted in a new tuple being recorded by the instrumentation, add mutated output as a new entry in the queue.

  5. Go to 2.
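
A compact sketch of that loop follows. run_target() and mutate() are placeholders, not afl APIs: run_target() stands in for executing the instrumented binary and returning the set of coverage tuples it produced plus a crash flag, and mutate() stands in for the fuzzing strategies of step 3.

# A schematic version of steps 1-5 above, with invented helper names.
def fuzz_loop(initial_cases, run_target, mutate,
              rounds_per_input=1000, max_cycles=10):
    queue = list(initial_cases)                 # step 1
    seen_tuples = set()                         # tuples seen across all runs
    crashes = []
    for _ in range(max_cycles):                 # afl itself never stops
        for case in list(queue):                # step 2: walk the queue
            for _ in range(rounds_per_input):   # step 3: mutate repeatedly
                candidate = mutate(case)
                tuples, crashed = run_target(candidate)
                if crashed:
                    crashes.append(candidate)
                if tuples - seen_tuples:        # step 4: new tuple recorded
                    seen_tuples |= tuples
                    queue.append(candidate)
            # step 5: continue with the next queue entry
    return crashes

The periodic culling of the queue mentioned in the next paragraph would be an extra step between cycles; it is left out here to keep the sketch short.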

The discovered test cases are also periodically culled to eliminate ones that have been made obsolete by more inclusive finds discovered later in the fuzzing process. Because of this, the fuzzer is useful not only for identifying crashes, but is exceptionally effective at turning a single valid input file into a reasonably-sized corpus of interesting test cases that can be manually investigated for non-crashing problems, handed over to valgrind, or used to stress-test applications that are harder to instrument or too slow to fuzz efficiently. In particular, it can be extremely useful for generating small test sets that may be programmatically or manually examined for anomalies in a browser environment.

(For a quick partial demo, click here.)

Of course, there are countless “smart” fuzzer designs that look good on paper, but fail in real-world applications. I tried to make sure that this is not the case here: for example, afl can easily tackle security-relevant and tough targets such as gzip, xz, lzo, libjpeg, libpng, giflib, libtiff, or webp – all with absolutely no fine-tuning and while running at blazing speeds. The control flow information is also extremely useful for accurately de-duping crashes, so the tool does that for you.

In fact, I spent some time running it on a single machine against libjpeg, giflib, and libpng – some of the most robust, best-tested image parsing libraries out there. So far, the tool found:

  • CVE-2013-6629: JPEG SOS component uninitialized memory disclosure in jpeg6b and libjpeg-turbo,

  • CVE-2013-6630: JPEG DHT uninitialized memory disclosure in libjpeg-turbo,

  • MSRC 0380191: A separate JPEG DHT uninitialized memory disclosure in Internet Explorer,

  • CVE-2014-1564: Uninitialized memory disclosure via GIF images in Firefox,

  • CVE-2014-1580: Uninitialized memory disclosure via <canvas> in Firefox,

  • Chromium bug #398235, Mozilla bug #1050342: Probable library-related JPEG security issues in Chrome and Firefox (pending),

  • PNG zlib API misuse bug in MSIE (DoS-only),

  • Several browser-crashing images in WebKit browsers (DoS-only).

More is probably to come. In other words, you should probably try it out. The most significant limitation today is that the current fuzzing strategies are optimized for binary files; the fuzzer does:

  • Walking bitflips – 1, 2, and 4 bits,

  • Walking byte flips – 1, 2, and 4 bytes,

  • Walking addition and subtraction of small integers – byte, word, dword (both endians),

  • Walking insertion of interesting integers (-1, MAX_INT, etc) – byte, word, dword (both endians),

  • Random stacked flips, arithmetics, block cloning, insertion, deletion, etc,

  • Random splicing of synthesized test cases – pretty unique!

All these strategies have been specifically selected for an optimal balance between fuzzing cost and yields measured in terms of the number of discovered execution paths with binary formats; for highly-redundant text-based formats such as HTML or XML, syntax-aware strategies (template- or ABNF-based) will obviously yield better results. Plugging them into AFL would not be hard, but requires work.
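
For a flavor of what such a syntax-aware strategy could look like (this is not something afl does, just a hypothetical token-level mutator with an invented token list):

# Not part of afl: a rough idea of a syntax-aware mutator for a text
# format, mutating at token level instead of at byte level.
import random

HTML_TOKENS = ["<div>", "</div>", "<b>", "</b>", 'id="a"', "&amp;", "x"]

def mutate_tokens(tokens, rng=random):
    out = list(tokens)
    op = rng.choice(["insert", "delete", "replace", "duplicate"])
    pos = rng.randrange(len(out)) if out else 0
    if op == "insert" or not out:
        out.insert(pos, rng.choice(HTML_TOKENS))    # add a new token
    elif op == "delete":
        del out[pos]                                # drop a token
    elif op == "replace":
        out[pos] = rng.choice(HTML_TOKENS)          # swap a token
    else:
        out.insert(pos, out[pos])                   # duplicate a token
    return out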
