Post Syndicated from original https://lwn.net/Articles/893565/
“Hardware poisoning” is a mechanism for detecting and handling memory
errors in a running system. When a particular range of memory ceases to
remember correctly, it is “poisoned” and further accesses to it will
generate errors. The kernel has had support for
hardware poisoning for over a decade, but that doesn’t mean it can’t be
improved. At the 2022 Linux Storage,
Filesystem, Memory-management and BPF Summit, Yang Shi discussed the
challenges of dealing with hardware poisoning when it affects memory used
for the page cache.