All posts by corbet

[$] Documenting page flags by committee

Post Syndicated from corbet original https://lwn.net/Articles/974515/

For every page of memory in the system, the kernel maintains a set of page
flags describing how the page is used and various aspects of its current
state. Space for page flags has been in chronic short supply, leading to a desire to
eliminate or consolidate them whenever possible. That objective, though,
is hampered by the fact that the purpose of many page flags is not well
understood. In a memory-management-track session at the 2024 Linux Storage,
Filesystem, Memory-Management and BPF Summit, Matthew Wilcox set out to
cooperatively update the page-flag documentation to improve that situation.
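
As a rough illustration of what "a set of page flags" means, here is a toy model in Python of per-page flag bits packed into a single integer. The flag names echo real kernel flags, but the bit positions, the `Page` class, and its methods are assumptions made for this sketch; the kernel's actual layout and helpers differ.

```python
# Toy model of per-page flag bits; bit positions here are illustrative only
# and do not match the kernel's real layout.
PG_LOCKED = 1 << 0
PG_DIRTY = 1 << 1
PG_UPTODATE = 1 << 2
PG_WRITEBACK = 1 << 3


class Page:
    """Hypothetical stand-in for a page descriptor holding one flags word."""

    def __init__(self):
        self.flags = 0

    def set_flag(self, flag):
        # Turn the given bit on, leaving the others untouched.
        self.flags |= flag

    def clear_flag(self, flag):
        # Turn the given bit off, leaving the others untouched.
        self.flags &= ~flag

    def test_flag(self, flag):
        # True if the given bit is currently set.
        return bool(self.flags & flag)


page = Page()
page.set_flag(PG_DIRTY)
page.set_flag(PG_UPTODATE)
page.clear_flag(PG_DIRTY)
```

Every distinct piece of state needs its own bit in the word, which is why a fixed-size flags field runs short as more states are added.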

[$] Merging msharefs

Post Syndicated from corbet original https://lwn.net/Articles/974512/

The problem of sharing page tables across processes has been discussed
numerous times over the years, Khalid Aziz said at the beginning of his
2024 Linux Storage, Filesystem, Memory-Management and BPF Summit session on
the topic. He was there to, once again, talk about the proposed mshare()
system call (which, in its
current form, is no longer actually a system call but the feature still
goes by that name) and to see what can be done to finally get it into the
mainline.

[$] Toward the unification of hugetlbfs

Post Syndicated from corbet original https://lwn.net/Articles/974491/

The kernel’s hugetlbfs subsystem
was the first mechanism by which the kernel made huge pages
available to user space; it was added to the 2.5.46 development kernel in
2002. While hugetlbfs remains useful, it is also viewed as a sort of
second memory-management subsystem that would be best unified with the rest
of the kernel. At the 2024 Linux Storage, Filesystem, Memory-Management
and BPF Summit, Peter Xu raised the
question of what that unification would involve and what the first steps
might be.

[$] The interaction between memory reclaim and RCU

Post Syndicated from corbet original https://lwn.net/Articles/974487/

The 2024 Linux Storage, Filesystem, Memory-Management and BPF Summit was a
development conference, where discussion was prioritized and presentations
with a lot
of slides were discouraged. Paul McKenney seemingly flouted this
convention in a joint session of the storage, filesystem, and
memory-management tracks where he presented about 50 slides — in five
minutes, twice. The subject was the use of the read-copy-update (RCU)
mechanism in the memory-reclaim process, and whether changes to RCU would
be needed for that purpose.

[$] Faster page faults with RCU-protected VMA walks

Post Syndicated from corbet original https://lwn.net/Articles/974392/

Looking up a virtual memory area (VMA) in a process’s address space, for
the handling of page faults or any of a number of other tasks, in
multi-threaded processes has long been bedeviled by lock contention in the
kernel. As a result, developer gatherings have been subjected to many
sessions on how to improve the situation. At the 2024 Linux Storage,
Filesystem, Memory-Management and BPF Summit, developers in the
memory-management track met, in a session led by Liam Howlett, to talk
about a situation that has improved considerably in recent times, but which
still offers opportunities for optimization.

[$] Another try for address-space isolation

Post Syndicated from corbet original https://lwn.net/Articles/974390/

Brendan Jackman started his memory-management-track session at the 2024
Linux Storage, Filesystem, Memory-Management and BPF Summit by saying that,
for some
years now, the kernel community has been stuck in a reactive posture with
regard to hardware vulnerabilities. Each problem shows up with its own
scary name, and kernel developers find a way to mitigate it, usually losing
performance in the process. Jackman said that it is time to take back the
initiative against these vulnerabilities by reconsidering the more
general use of address-space isolation.

[$] Memory-allocation profiling for the kernel

Post Syndicated from corbet original https://lwn.net/Articles/974380/

Optimizing the kernel’s memory use is made much easier if developers have
an accurate idea of how memory is being used, but the kernel’s
instrumentation is not as good as it could be. When Suren Baghdasaryan and
Kent Overstreet presented their
memory-allocation profiling work, which is meant to address this
shortcoming, at the 2023 Linux Storage, Filesystem, Memory Management, and
BPF Summit, their objective was uncontroversial but the proposed solution
ran into opposition that played out at length on the mailing lists
over the last year. So it may be a bit surprising that, when the two
returned to the memory-management track in the 2024 gathering, the
controversy was gone and the discussion focused on improving details of the
implementation.

[$] Dynamically sizing the kernel stack

Post Syndicated from corbet original https://lwn.net/Articles/974367/

The kernel stack is a scarce and tightly constrained resource; kernel
developers often have to go far out of their way to avoid using too much
stack space. The size of the stack is also fixed, leading to situations
where it is too small for some code paths, while wastefully large for
others. At the 2024 Linux Storage, Filesystem, Memory Management, and BPF
Summit, Pasha Tatashin proposed
making the kernel stack size dynamic, making more space available when
needed while saving memory overall. This change is not as easy to
implement as it might seem, though.

[$] Facing down mapcount madness

Post Syndicated from corbet original https://lwn.net/Articles/974223/

The page structure is a complicated beast, but some parts of it are more
intimidating than others. The mapcount field is one of the
scarier parts. It allegedly records the number of references to the page
in page tables, but, as David Hildenbrand described during the
memory-management track at the 2024 Linux Storage, Filesystem, Memory
Management, and BPF Summit, things are more
complicated than that. Few people truly understand the semantics of this
field, but the situation will hopefully get better over time.

Security updates for Tuesday

Post Syndicated from corbet original https://lwn.net/Articles/974450/

Security updates have been issued by AlmaLinux (firefox, nodejs, and thunderbird), Fedora (uriparser), Oracle (firefox and thunderbird), Slackware (mariadb), SUSE (cairo, gdk-pixbuf, krb5, libosinfo, postgresql14, and python310), and Ubuntu (firefox, linux-aws, linux-aws-5.15, and linux-azure).

[$] What’s next for the SLUB allocator

Post Syndicated from corbet original https://lwn.net/Articles/974138/

There are two fundamental levels of memory allocator in the Linux kernel:
the page allocator, which allocates memory in units of pages, and the slab
allocator, which allocates arbitrarily-sized chunks that are usually (but
not necessarily) smaller than a page. The slab allocator is the one that
stands behind commonly used kernel functions like kmalloc(). At
the 2024 Linux Storage, Filesystem, Memory Management, and BPF Summit, slab
maintainer
Vlastimil Babka provided an update on recent changes at the slab level and
discussed the changes that are yet to come.

[$] Better support for locally-attached-memory tiering

Post Syndicated from corbet original https://lwn.net/Articles/974126/

The term “memory tiering” refers to the management of memory placement on
systems with multiple types of memory, each of which has its own
performance characteristics. On such systems, poor placement can lead to
significantly worse performance. A memory-management-track discussion at
the 2024 Linux Storage, Filesystem, Memory Management, and BPF Summit took
yet another look at
tiering challenges with a focus on upcoming technologies that may simplify
(or complicate) the picture.

Axboe: What’s new with io_uring in 6.10

Post Syndicated from corbet original https://lwn.net/Articles/974341/

Jens Axboe describes the new io_uring features that will be a part of the
6.10 kernel release.

Bundles are multiple buffers used in a single operation. On the
receive side, this means a single receive may utilize multiple
buffers, reducing the roundtrip through the networking stack from N
per N buffers to just a single one. On the send side, this also
enables better handling of how an application deals with sends from
a socket, eliminating the need to serialize sends on a single
socket. Bundles work with provided buffers, hence this feature also
adds support for provided buffers for send operations.
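
The "N per N buffers to just a single one" arithmetic in that quote can be sketched with a toy count in Python. This is not io_uring code: the function names and the simplified model (each receive is one trip through the networking stack, all buffers the same size) are assumptions made purely for illustration.

```python
import math


def trips_without_bundles(pending_bytes, buffer_size):
    """Each receive fills at most one buffer, so every buffer costs one trip."""
    return math.ceil(pending_bytes / buffer_size)


def trips_with_bundles(pending_bytes, buffer_size, buffers_per_bundle):
    """A bundled receive may fill up to buffers_per_bundle buffers per trip."""
    buffers_needed = math.ceil(pending_bytes / buffer_size)
    return math.ceil(buffers_needed / buffers_per_bundle)


# 16 KiB of pending data with 4 KiB buffers: four buffers are needed.
separate = trips_without_bundles(16384, 4096)
bundled = trips_with_bundles(16384, 4096, buffers_per_bundle=8)
```

In this model the unbundled case makes one trip per buffer, while a bundle large enough to hold all of the pending data needs only one.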

Security updates for Monday

Post Syndicated from corbet original https://lwn.net/Articles/974339/

Security updates have been issued by Debian (bind9, chromium, and thunderbird), Fedora (buildah, chromium, firefox, mingw-python-werkzeug, and suricata), Mageia (golang), Oracle (firefox and nodejs:20), Red Hat (firefox, httpd:2.4, nodejs, and thunderbird), and SUSE (firefox, git-cliff, and ucode-intel).

[$] Extending the mempolicy interface for heterogeneous systems

Post Syndicated from corbet original https://lwn.net/Articles/973964/

Non-uniform memory access (NUMA) systems are organized with their CPUs
grouped into nodes, each of which has memory attached to it. All memory in
the system is accessible from all CPUs, but memory attached to the local
node is faster. The kernel’s memory-policy (“mempolicy”) interface
allows threads to inform the kernel about how
they would like their memory placed to get the best performance. In recent
years, the NUMA concept has been extended to support the management of
different types of memory in a system, pushing the limits of the mempolicy
subsystem. In a remotely presented session at the 2024 Linux Storage,
Filesystem, Memory Management, and BPF Summit, Gregory Price discussed
the ways in which the kernel’s memory-policy support should evolve to
handle today’s more-complex systems.

[$] An update and future plans for DAMON

Post Syndicated from corbet original https://lwn.net/Articles/973702/

The DAMON subsystem was the subject of the first session in the
memory-management track at the Linux Storage, Filesystem, Memory
Management, and BPF Summit. DAMON
maintainer SeongJae Park introduced the data-access monitoring
framework, which can generate snapshots of how memory is accessed, enabling
the detection of hot and cold regions of memory in both the virtual and
physical address spaces. The session covered recent changes and future
plans for this tool.

White paper: Vendor Kernels, Bugs and Stability

Post Syndicated from corbet original https://lwn.net/Articles/973996/

Ronnie Sahlberg, Jonathan Maple, and Jeremy Allison of CiQ have published
a white paper looking at the security-relevant bug fixes applied (or not
applied) to the RHEL 8.x kernel over time.

This means that over time, the security of the RHEL kernels get
worse and worse as more issues are discovered in the upstream code
and are potentially exploitable but fewer and fewer of the fixes
for these known bugs are back-ported into RHEL kernels.


After reaching RHEL 8.7, the theory is that the kernel has been
stabilized, with a corresponding improvement in security. However
we still have an influx of newly discovered bugs in the upstream
kernel affecting RHEL 8.7 that are not addressed. Each minor
version of upstream is released on an approximately quarterly basis
and we can see that the influx of new bugs that are unaddressed in
RHEL is growing. The number of known issues in these kernels
increases by approximately 250 new bugs per quarter or more.

[$] The first half of the 6.10 merge window

Post Syndicated from corbet original https://lwn.net/Articles/973687/

The merge window for the 6.10 kernel release opened on May 12; between
then and the time of this writing, 6,819 non-merge commits were pulled into
the mainline kernel for that release. Your editor has taken some time out
from LSFMM+BPF in an attempt to keep
up with the commit flood. Read on for an overview of the most significant
changes that were pulled in the early part of the 6.10 merge window.

Mozilla Foundation Welcomes Nabiha Syed as Executive Director

Post Syndicated from corbet original https://lwn.net/Articles/973820/

The Mozilla Foundation has announced
that its new executive director will be Nabiha Syed.

Syed is known for her mission-driven leadership, focused on
increasing transparency into the most powerful institutions in
society. She comes to Mozilla after leading The Markup, an
award-winning publication that challenges technology to serve the
public good, from its launch through its successful acquisition in
2024.