All posts by corbet

[$] Two approaches to better kernel samepage merging

Post Syndicated from corbet original https://lwn.net/Articles/1016426/

The kernel samepage merging (KSM) subsystem works by finding pages in memory with
the same contents, then replacing the duplicated copies with a single,
shared copy. KSM can improve memory utilization in a system, but has some
problems as well. In two memory-management-track sessions at the 2025
Linux Storage, Filesystem, Memory-Management, and BPF Summit, Mathieu
Desnoyers and Sourav Panda proposed improvements to KSM to
make it work better for specific use cases.
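
As a rough illustration of the merging idea described above, here is a toy user-space model in Python. It is only a sketch and not the kernel's algorithm: the function name, the 4096-byte page size, and the use of SHA-256 as a lookup key are all assumptions made for the example, and it ignores what must happen when a shared page is later written to.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size for this toy model


def merge_identical_pages(pages):
    """Toy model of same-page merging (hypothetical helper, not kernel code).

    Returns (store, mapping): store keeps one copy of each distinct page,
    and mapping[i] is the index in store that backs original page i.
    """
    store = []
    index_by_digest = {}
    mapping = []
    for page in pages:
        digest = hashlib.sha256(page).digest()
        slot = index_by_digest.get(digest)
        # Confirm with a full byte comparison, so that a digest collision
        # can never cause two different pages to be merged.
        if slot is None or store[slot] != page:
            slot = len(store)
            store.append(page)
            index_by_digest[digest] = slot
        mapping.append(slot)
    return store, mapping


# Three "pages", two of which have identical (all-zero) contents.
pages = [bytes(PAGE_SIZE), b"\x01" * PAGE_SIZE, bytes(PAGE_SIZE)]
store, mapping = merge_identical_pages(pages)
print(len(pages), "pages backed by", len(store), "copies; mapping:", mapping)
```

In this model the first and third pages end up backed by the same stored copy, which is where the memory saving comes from.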

[$] Using large folios for text areas

Post Syndicated from corbet original https://lwn.net/Articles/1016416/

Quite a bit of work has been done in recent years to allow the kernel to
make more use of large folios. That progress has not yet reached the
handling of text (executable code) areas, though. During the
memory-management track of the 2025 Linux Storage, Filesystem,
Memory-Management, and BPF Summit, Ryan Roberts ran a session on how that
situation might be improved. The change would be relatively small and
contained, but it could yield a measurable performance improvement.

[$] Per-CPU memory for user space

Post Syndicated from corbet original https://lwn.net/Articles/1016408/

The kernel makes extensive use of per-CPU data as a way to avoid contention
between processors and improve scalability. Using the same technique in
user space is harder, though, since there is little control over which CPU
a process may be running on at any given time. That hasn’t stopped Mathieu
Desnoyers from trying; in the memory-management track of the 2025
Linux Storage, Filesystem, Memory-Management, and BPF Summit, he presented
a proposal for how user-space per-CPU memory could work.
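
The general per-CPU pattern mentioned above can be sketched in a few lines of Python. This is a toy model only; the class name and its methods are hypothetical, and it says nothing about the proposal that was presented. Each CPU gets its own slot, updates touch only that slot, and a reader sums all of the slots.

```python
class PerCpuCounter:
    """Toy per-CPU counter (hypothetical; for illustration only)."""

    def __init__(self, nr_cpus):
        # One independent slot per CPU, so updates from different CPUs
        # never touch the same location.
        self.slots = [0] * nr_cpus

    def add(self, cpu, amount=1):
        # The caller passes the CPU number explicitly. In real user-space
        # code the thread could be migrated between looking up its CPU and
        # performing the update, which is the difficulty noted above.
        self.slots[cpu] += amount

    def total(self):
        # Readers pay the cost of summing every slot.
        return sum(self.slots)


counter = PerCpuCounter(nr_cpus=4)
counter.add(cpu=0)
counter.add(cpu=3, amount=2)
print(counter.total())
```

The trade-off is the usual one for this pattern: cheap, uncontended updates in exchange for more memory and a slower read.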

Security updates for Tuesday

Post Syndicated from corbet original https://lwn.net/Articles/1016774/

Security updates have been issued by AlmaLinux (gimp, libxslt, python3.11, python3.12, and tomcat), Debian (ghostscript and libnet-easytcp-perl), Fedora (openvpn, perl-Data-Entropy, and webkitgtk), Red Hat (python-jinja2), SUSE (giflib, pam, and xen), and Ubuntu (apache2, binutils, expat, fis-gtm, linux-azure, linux-azure-6.8, linux-nvidia-lowlatency, linux-azure, linux-azure-fde, linux-azure-5.15, linux-azure-fde-5.15, linux-azure-fips, linux-gcp-fips, linux-hwe-5.4, linux-nvidia, linux-nvidia-tegra-igx, ruby2.7, ruby3.0, ruby3.2, ruby3.3, and vim).

Fifty Years of Open Source Software Supply Chain Security (Queue)

Post Syndicated from corbet original https://lwn.net/Articles/1016715/

ACM Queue looks at the security problem in the light of a report on Multics
security that was published in 1974.

We are all struggling with a massive shift that has happened in the
past 10 or 20 years in the software industry. For decades, software
reuse was only a lofty goal. Now it’s very real. Modern
programming environments such as Go, Node, and Rust have made it
trivial to reuse work by others, but our instincts about
responsible behaviors have not yet adapted to this new reality.

The fact that the 1974 Multics review anticipated many of the
problems we face today is evidence that these problems are
fundamental and have no easy answers. We must work to make
continuous improvements to open source software supply chain
security, making attacks more and more difficult and expensive.

[$] Three ways to rework the swap subsystem

Post Syndicated from corbet original https://lwn.net/Articles/1016136/

The kernel’s swap subsystem is complex and highly optimized — though not
always optimized for today’s workloads. In three adjacent sessions during
the memory-management track of the 2025 Linux Storage, Filesystem,
Memory-Management, and BPF Summit, Kairui Song, Nhat Pham, and Usama Arif
all talked about some of the problems that they are trying to solve in the
Linux swap subsystem. In the first two cases, the solutions take the form of
an additional layer of indirection in the kernel’s swap map; the third,
which enables swap-in of large folios, may or may not be worthwhile in the
end.

Kernel prepatch 6.15-rc1

Post Syndicated from corbet original https://lwn.net/Articles/1016577/

Linus has released 6.15-rc1 and closed the
merge window for this release. “As expected, this was one of the bigger
merge windows, almost certainly just because we had some pent-up
development due to the previous releases being impacted by the holiday
season. That said, while it’s bigger than normal, it’s not some kind of
record-breaking thing.” In the end, 12,633 non-merge changesets were
pulled into the mainline during this merge window.

[$] The state of guest_memfd

Post Syndicated from corbet original https://lwn.net/Articles/1016133/

A typical cloud-computing host will share some of its memory with each
guest that it runs. The host retains its access to that memory, though,
meaning that it can readily dig through that memory in search of data that
the guest would prefer to keep private. The guest_memfd subsystem removes (most of) the
host’s access to guest memory, making the guest’s data more secure. In the
memory-management track of the 2025 Linux Storage, Filesystem,
Memory-Management, and BPF Summit, David Hildenbrand ran a discussion on
the state and future of this feature.

[$] The future of ZONE_DEVICE

Post Syndicated from corbet original https://lwn.net/Articles/1016124/

Alistair Popple started his session at the 2025 Linux Storage, Filesystem,
Memory-Management, and BPF Summit by proclaiming that ZONE_DEVICE
is “the ugly stepchild” of the kernel’s memory-management subsystem.
Ugly or not, the ability to manage memory that is attached to a peripheral
device rather than a CPU is increasingly important on current hardware.
Popple hoped to cover some of the challenges with ZONE_DEVICE and
find ways to make the stepchild a bit more attractive, if not bring it into
the family entirely.

[$] Page allocation for address-space isolation

Post Syndicated from corbet original https://lwn.net/Articles/1016013/

Address-space isolation may well be, as Brendan Jackman said at the
beginning of his memory-management-track session at the 2025 Linux Storage,
Filesystem, Memory-Management, and BPF Summit, “some security bullshit”.
But it also holds the potential to protect the kernel from
a wide range of vulnerabilities, both known and unknown, while reducing the
impact of existing mitigations. Implementing address-space isolation with
reasonable performance, though, is going to require some significant
changes. Jackman was there to get feedback from the memory-management
community on how those changes should be implemented.

[$] Better hugetlb page-table walking

Post Syndicated from corbet original https://lwn.net/Articles/1016011/

The kernel must often step through the page tables of one or more processes
to carry out various operations. This “page-table walking” tends to be
performed by ad-hoc (duplicated) code all over the kernel. Oscar Salvador
used a memory-management-track session at the 2025 Linux Storage,
Filesystem, Memory-Management, and BPF Summit to talk about strategies to
unify the kernel’s page-table walking code just a little bit by making
hugetlb pages look more like ordinary pages.

[$] Approaches to reducing TLB pressure

Post Syndicated from corbet original https://lwn.net/Articles/1016009/

The CPU’s translation lookaside buffer (TLB) caches the results of
virtual-address translations, significantly speeding memory accesses. TLB
misses are expensive, so a lot of thought goes into using the TLB as
efficiently as possible. Reducing pressure on the TLB was the topic of Rik
van Riel’s memory-management-track session at the 2025 Linux Storage,
Filesystem, Memory-Management, and BPF Summit. Some approaches were
considered, but the session was short on firm conclusions.
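
To make the cost of TLB misses concrete, here is a toy Python model of a translation cache. It is a sketch under stated assumptions: the class name is hypothetical, pages are assumed to be 4KB, and least-recently-used replacement is chosen only for simplicity, since real replacement policies are hardware-specific.

```python
from collections import OrderedDict

PAGE_SHIFT = 12  # assumed 4KB pages


class ToyTLB:
    """Toy translation cache with LRU replacement (illustrative only)."""

    def __init__(self, entries, page_table):
        self.entries = entries          # capacity; must be at least 1
        self.page_table = page_table    # virtual page number -> frame number
        self.cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        if vpn in self.cache:
            self.hits += 1
            self.cache.move_to_end(vpn)      # mark as most recently used
            pfn = self.cache[vpn]
        else:
            self.misses += 1
            pfn = self.page_table[vpn]       # stands in for the slow walk
            self.cache[vpn] = pfn
            if len(self.cache) > self.entries:
                self.cache.popitem(last=False)  # evict least recently used
        return (pfn << PAGE_SHIFT) | offset


tlb = ToyTLB(entries=2, page_table={0: 7, 1: 8, 2: 9})
for addr in (0x0004, 0x0008, 0x1000, 0x2000, 0x0000):
    tlb.translate(addr)
print("hits:", tlb.hits, "misses:", tlb.misses)
```

Touching three pages with only two entries forces an eviction, so the final access to the first page misses again; that is the kind of pressure the session was concerned with.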

[$] Slab allocator: sheaves and any-context allocations

Post Syndicated from corbet original https://lwn.net/Articles/1016001/

The kernel’s slab allocator is charged with providing small objects on
demand; its performance and reliability are crucial for the functioning of
the system as a whole. At the 2025 Linux Storage, Filesystem,
Memory-Management, and BPF Summit, two adjacent sessions in the
memory-management track dug into current work on the slab allocator. The
first focused on the new sheaves feature, while the second discussed a set
of allocation functions that are safe to call in any context.

Dave Täht RIP

Post Syndicated from corbet original https://lwn.net/Articles/1016109/

From the LibreQoS site comes the sad news that Dave Täht has passed away.
Among many other things, he bears
a lot of credit for our networks functioning as well as they do. “We’re
incredibly grateful to have Dave as our friend, mentor, and as someone who
continuously inspired us – showing us that we could do better for each
other in the world, and leverage technology to make that happen. He will be
dearly missed”.

Searching through LWN’s archives will turn up many references to his work
fixing WiFi, improving queue management, tackling bufferbloat, and more. Farewell,
Dave, we hope the music is good wherever you are.

(Thanks to Jon Masters for the heads-up).

[$] Memory persistence over kexec

Post Syndicated from corbet original https://lwn.net/Articles/1015997/

The kernel’s kexec mechanism allows one kernel to directly boot a new one; it can be
thought of as a sort of kernel equivalent to the execve()
system call. Kexec has a number of uses, including booting a special kernel
to perform dumps after a crash. Normally, one does not expect user-space
processes to survive booting into a new kernel, but that has not stopped
developers from trying to implement that ability. Mike Rapoport ran a
memory-management-track session at the 2025 Linux Storage, Filesystem,
Memory-Management, and BPF Summit to discuss one piece of that problem:
enabling the contents of memory to persist across a kexec handover so that
the new kernel can pick up where the old one left off.

Security updates for Tuesday

Post Syndicated from corbet original https://lwn.net/Articles/1016076/

Security updates have been issued by AlmaLinux (freetype, grub2, kernel, kernel-rt, and python-jinja2), Debian (freetype, linux-6.1, suricata, tzdata, and varnish), Fedora (mingw-libxslt and qgis), Mageia (elfutils, mercurial, and zvbi), Oracle (grafana, kernel, libxslt, nginx:1.22, and postgresql:12), Red Hat (opentelemetry-collector), SUSE (corosync, opera, and restic), and Ubuntu (aom, libtar, mariadb, ovn, php7.4, php8.1, php8.3, rabbitmq-server, and webkit2gtk).