Post Syndicated from Robert Graham original https://blog.erratasec.com/2018/12/notes-on-build-hardening.html
I thought I’d comment on a paper about “build safety” in consumer products, describing how software is built to harden it against hackers trying to exploit bugs.
What is build safety?
However, C/C++ is “unsafe”, and is the most popular language for building stuff that interacts with the network. In other cases, while the language itself may be safe, it’ll use underlying infrastructure (“libraries“) written in C/C++. When we are talking about hardening builds, making them safe or security, we are talking about C/C++.
How software is built
The way stack guards work is to stick a carefully constructed value in between each stack frame, known as a canary. Right before the function exits, it’ll check this canary in order to validate it hasn’t been corrupted. If corruption is detected, the program exits, or crashes, to prevent worse things from happening.
Since this feature was added, many vulnerabilities have been found that evade the default settings. Recently, -fstack-protector-strong has been added to gcc that significantly increases the number of protected functions. The setting -fstack-protector-all is still avoided due to performance cost, as even trivial functions which can’t possibly overflow are still instrumented.
The other major dynamic memory structure is known as the heap (or the malloc region). When a function returns, everything in its scratchpad memory on the stack will lost. If something needs to stay around longer than this, then it must be allocated from the heap rather than the stck.
Whereas stack guards change the code generated by the compiler, heap guards don’t. Instead, the heap exists in library functions.
The most common library added by a linker is known as glibc, the standard GNU C library. However, this library is about 1.8-megabytes in size. Many of the home devices in the paper above may only have 4-megabytes total flash drive space, so this is too large. Instead, most of these home devices use an alternate library, something like uClibc or musl, which is only 0.6-megabytes in size. In addition, regardless of the standard library used for other features, a program my still replace the heap implementation with a custom one, such as jemalloc.
Even if using a library that does heap guards, it may not be enabled in the software. If using glibc, a program can still turn off checking internally (using mallopt), or it can be disabled externally, before running a program, by setting the environment variable MALLOC_CHECK_.
The above paper didn’t evaluate heap guards. I assume this is because it can be so hard to check.
Obviously a useful mitigation step would be to randomize the layout of memory, so nothing is in a predictable location. This is known as address space layout randomization or ASLR.
The word layout comes from the fact that the when a program runs, it’ll consist of several segments of memory. The basic list of segments are:
- the executable code
- static values (like strings)
- global variables
- heap (growable)
- stack (growable)
- mmap()/VirtualAlloc() (random location)
Historically, the first few segments are laid out sequentially, starting from address zero. Remember that user-mode programs have virtual memory, so what’s located starting at 0 for one program is different from another.
As mentioned above, the heap and the stack need to be able to grow as functions are called data allocated from the heap. The way this is done is to place the heap after all the fixed-sized segments, so that it can grow upwards. Then, the stack is placed at the top of memory, and grows downward (as functions are called, the stack frames are added at the bottom).
Sometimes a program may request memory chunks outside the heap/stack directly from the operating system, such as using the mmap() system call on Linux, or the VirtualAlloc() system call on Windows. This will usually be placed somewhere in the middle between the heap and stack.
With ASLR, all these locations are randomized, and can appear anywhere in memory. Instead of growing contiguously, the heap has to sometimes jump around things already allocated in its way, which is a fairly easy problem to solve, since the heap isn’t really contiguous anyway (as chunks are allocated and freed from the middle). However, the stack has a problem. It must grow contiguously, and if there is something in its way, the program has little choice but to exit (i.e. crash). Usually, that’s not a problem, because the stack rarely grows very large. If it does grow too big, it’s usually because of a bug that requires the program to crash anyway.
ASLR for code
The problem for executable code is that for ASLR to work, it must be made position independent. Historically, when code would call a function, it would jump to the fixed location in memory where that function was know to be located, thus it was dependent on the position in memory.
To fix this, code can be changed to jump to relative positions instead, where the code jumps at an offset from wherever it was jumping from.
To enable this on the compiler, the flag -fPIE (position independent executable) is used. Or, if building just a library and not a full executable program, the flag -fPIC (position independent code) is used.
Then, when linking a program composed of compiled files and libraries, the flag -pie is used. In other words, use -pie -fPIE when compiling executables, and -fPIC when compiling for libraries.
When compiled this way, exploits will no longer be able to jump directly into known locations for code.
ASLR for libraries
The above paper glossed over details about ASLR, probably just looking at whether an executable program was compiled to be position independent. However, code links to shared libraries that may or may not likewise be position independent, regardless of the settings for the main executable.
I’m not sure it matters for the current paper, as most programs had position independence disabled, but in the future, a comprehensive study will need to look at libraries as a separate case.
ASLR for other segments
The above paper equated ASLR with randomized location for code, but ASLR also applies to the heap and stack. The randomization status of these programs is independent of whatever was configured for the main executable.
As far as I can tell, modern Linux systems will randomize these locations, regardless of build settings. Thus, for build settings, it just code randomization that needs to be worried about. But when running the software, care must be taken that the operating system will behave correctly. A lot of devices, especially old ones, use older versions of Linux that may not have this randomization enabled, or be using custom kernels where it has been turned off.
- figure out where the stack is located (mitigated by ASLR)
- overwrite the stack frame control structure (mitigated by stack guards)
- execute code in the buffer
This open can be set with -Wl,-z,noexecstack, when compiling both the executable and the libraries. This is the default, so you shouldn’t need to do anything special. However, as the paper points out, there are things that get in the way of this if you aren’t careful. The setting is more what you’d call “guidelines” than actual “rules”. Despite setting this flag, building software may result in an executable stack.
So, you may want to verify it after building software, such as using the “readelf -l [programname]”. This will tell you what the stack has been configured to be.
Format string bugs
-Wformat -Wformat-security -Werror=format-security
Warnings and static analysis
What about sanitizers?
-Wall -Wformat -Wformat-security -Werror=format-security -fstack-protector -pie -fPIE -D_FORTIFY_SOURCE=2 -O2 -Wl,-z,relro -Wl,-z,now -Wl,-z,noexecstack
If you are more paranoid, these options would be:
-Wall -Wformat -Wformat-security -Wstack-protector -Werror -pedantic -fstack-protector-all –param ssp-buffer-size=1 -pie -fPIE -D_FORTIFY_SOURCE=2 -O1 -Wl,-z,relro -Wl,-z,now -Wl,-z,noexecstack