[RFC] Hardening in libc++

We would like to improve security in libc++ by providing optional hardening modes that, when enabled, turn certain cases of undefined behavior into guaranteed program termination (in other words, turn undefined behavior into implementation-defined behavior). See the C++ Buffer Hardening RFC for more context about the overarching effort.

(Note: this RFC presents our vision for hardening in libc++. While certain parts of the RFC have been implemented in the main branch, many haven’t, and we don’t yet officially support hardening in the LLVM 17 release or on main.)

Different modes provide different trade-offs between security and performance. Our current design has four modes; in order of increasing security, they are:

- unchecked: the default mode, which doesn’t compromise any runtime performance to check for undefined behavior;
- hardened: contains a minimal set of low-overhead checks deemed security-critical;
- debug-lite: extends the hardened mode with additional low-overhead checks that are not security-critical;
- debug: extends the debug-lite mode with checks that might impose significant overhead (for example, might change the complexity of algorithms).

(If you’re familiar with the safe mode that was added to libc++ in the LLVM 15 release, hardening modes can be seen as an extension of that work.)

Each mode on the list is a superset of the previous one (though this might not hold for new modes added in the future). Of these, the hardened and debug-lite modes are intended to be usable in production; for the debug mode, being usable in production is a non-goal (it is intended for testing). The hardened mode aims to be minimalistic and heavily prioritizes performance; we intend to set the bar high for any check to be enabled in the hardened mode, only enabling those checks that prevent memory safety bugs. The debug-lite mode additionally aims to catch common programming errors that aren’t directly exploitable; here, the criterion for a check to be enabled is roughly that the performance overhead is relatively low and the error is caused by user input (purely internal assertions are not enabled in the debug-lite mode). Different projects would make different trade-offs here, which is why we aim to provide two different modes.
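As a rough illustration of how the four modes nest, the sketch below models them as a predicate over check categories. The Mode and CheckKind enums and the check_enabled function are hypothetical names made up for this example; libc++ does not expose anything like them, and the three-way categorization is a simplification of the criteria described above.

```cpp
// Hypothetical model of the proposed modes; none of these names exist in libc++.
enum class Mode { unchecked, hardened, debug_lite, debug };

enum class CheckKind {
    security_critical,  // low-overhead check that prevents a memory safety bug
    cheap_user_error,   // low-overhead check for a user error that is not directly exploitable
    debug_only          // potentially expensive check, or a purely internal assertion
};

// Returns true if a check of the given kind is active in the given mode.
// Each mode enables everything the previous mode enables, plus more.
constexpr bool check_enabled(Mode mode, CheckKind kind) {
    switch (mode) {
    case Mode::unchecked:  return false;
    case Mode::hardened:   return kind == CheckKind::security_critical;
    case Mode::debug_lite: return kind != CheckKind::debug_only;
    case Mode::debug:      return true;
    }
    return false;
}

// The nesting can be verified at compile time.
static_assert(check_enabled(Mode::hardened, CheckKind::security_critical), "");
static_assert(!check_enabled(Mode::hardened, CheckKind::cheap_user_error), "");
static_assert(check_enabled(Mode::debug_lite, CheckKind::cheap_user_error), "");
static_assert(!check_enabled(Mode::debug_lite, CheckKind::debug_only), "");
```

A predicate like this is only a mental model; the RFC deliberately does not let users pick individual categories.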

(A note on terminology: a hardening mode is a name for any of the modes described above; e.g. the unchecked mode is one of the hardening modes. On the other hand, the hardened mode is the specific mode that sits between the unchecked mode and the debug-lite mode.)

Termination

When an unsuccessful check is triggered, the program is terminated via a call to __builtin_trap; the intent is to turn undefined behavior into a guaranteed program termination and make it terminate as fast as possible (faster than a call to std::abort, which has other security problems as well). In the future, we will explore ways to provide an additional error message and potentially to allow the behavior to be customized.
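A minimal sketch of what such a check could look like, assuming a GCC- or Clang-compatible compiler (which provide __builtin_trap). The SKETCH_HARDENING_CHECK macro and the checked_at function are hypothetical names for illustration only and are not libc++ internals.

```cpp
#include <cstddef>

// Hypothetical macro: trap immediately when the condition is false.
// No message is printed and no handler is called.
#define SKETCH_HARDENING_CHECK(cond) ((cond) ? (void)0 : __builtin_trap())

// Bounds-checked element access over a raw buffer (hypothetical helper).
// It is constexpr so that valid accesses also work in constant expressions.
constexpr int checked_at(const int* data, std::size_t size, std::size_t index) {
    SKETCH_HARDENING_CHECK(index < size);  // an out-of-bounds index traps here
    return data[index];
}
```

With a three-element buffer, checked_at(buffer, 3, 1) returns the second element, whereas checked_at(buffer, 3, 5) would execute the trap instead of reading out of bounds; on typical platforms the process is then killed by a signal.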

Note that we will not be using the existing __libcpp_verbose_abort mechanism because its semantics are essentially to call std::abort. __libcpp_verbose_abort will still be supported and used for cases where we terminate for reasons other than encountering undefined behavior (e.g. when an exception is thrown under -fno-exceptions, and in the future from libc++abi when various runtime operations fail).

ABI considerations

Some checks require storing additional information in standard library classes — for example, to be able to check whether an iterator dereference is valid, the iterator object needs to somehow store a reference to the corresponding container. This requires an ABI break.

In the proposed design, breaking the ABI is orthogonal to setting a hardening mode. The rationale for this design stems from the observation that the ABI configuration is a property of the platform and is set by the vendor whereas the hardening mode is a property of an application and is set by the user (even though vendors can set the default hardening mode). The ABI is a property of the platform because in general every component built on the platform has to be ABI-compatible. If we were to provide e.g. a “hardened-abi-breaking” mode, it would give users an easy way to unintentionally build their application with an ABI that’s incompatible with the rest of the platform, which in almost all circumstances should be avoided. Moreover, since there will be several independent ABI-breaking settings, this would either create a combinatorial explosion of ABI modes or disallow mixing-and-matching different ABI settings (for example, it might make sense to enable bounded iterators for constant-sized containers such as std::array but not for variable-sized containers such as std::vector, but that would be impossible if the only available modes were “hardened-abi-stable” and “hardened-abi-breaking”).

ABI-breaking changes, such as enabling container-aware iterators, are controlled by a separate set of macros that are grouped together with other ABI macros (which are unrelated to hardening). Enabling a hardening mode doesn’t affect the ABI; rather, the hardening mode will enable whichever checks are possible within the current ABI configuration. For example, enabling the hardened mode will always enable the “valid-element-access” checks in std::span::operator[] (because those don’t depend on the ABI configuration), but will only enable “valid-element-access” checks in std::span::iterator::operator* if container-aware iterators for std::span are enabled in the ABI configuration (in this case, the relevant macro is _LIBCPP_ABI_BOUNDED_ITERATORS).
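To make the ABI dependency concrete, here is a simplified model of a container-aware iterator. The sketch_bounded_iter type, its members, and first_via_iterator are hypothetical and written for this example only; the actual libc++ type may be structured differently. The sketch assumes a GCC- or Clang-compatible compiler for __builtin_trap.

```cpp
#include <cstddef>

// Hypothetical check macro: trap when the condition is false.
#define SKETCH_HARDENING_CHECK(cond) ((cond) ? (void)0 : __builtin_trap())

// Simplified model of a bounded iterator over int. In addition to the current
// position it carries the valid range, which is what makes the check possible.
struct sketch_bounded_iter {
    const int* current;
    const int* begin;
    const int* end;

    constexpr const int& operator*() const {
        SKETCH_HARDENING_CHECK(begin <= current && current < end);
        return *current;
    }
    constexpr sketch_bounded_iter& operator++() {
        ++current;
        return *this;
    }
};

// The extra members change the object layout compared to a plain pointer,
// so switching between the two representations is an ABI decision.
static_assert(sizeof(sketch_bounded_iter) > sizeof(const int*),
              "a bounded iterator is larger than a raw pointer");

// Reads the first element through the checked iterator (hypothetical helper).
constexpr int first_via_iterator(const int* data, std::size_t size) {
    return *sketch_bounded_iter{data, data, data + size};
}
```

By contrast, a subscript operator already has access to the container's size through the container itself, so checking it needs no layout change.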

Enabling hardening

At the platform level, vendors can control the default hardening mode via a CMake variable. At the application level, the hardening mode can be overridden by users via either a compiler flag or a macro.

- The default hardening mode can be set by vendors via the CMake variable LIBCXX_HARDENING_MODE with possible values of unchecked, hardened, debug_lite and debug.
- The preferred way to set the hardening mode at the application level is via the compiler flag -flibc++-hardening=<mode> with possible values of unchecked, hardened, debug_lite and debug (same values as the CMake variable).
- In addition to the compiler flag, the hardening mode can be configured using the macro _LIBCPP_HARDENING_MODE with possible values:
  * _LIBCPP_HARDENING_MODE_UNCHECKED
  * _LIBCPP_HARDENING_MODE_HARDENED
  * _LIBCPP_HARDENING_MODE_DEBUG_LITE
  * _LIBCPP_HARDENING_MODE_DEBUG

The exact numeric values of these macros are unspecified and deliberately not ordered to prevent users from relying on implementation details.

-flibc++-hardening and _LIBCPP_HARDENING_MODE are mutually exclusive: when compiling with -flibc++-hardening, attempting to define _LIBCPP_HARDENING_MODE will result in an error.
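Because the macro values are unordered, code that dispatches on the mode can only compare for equality. The sketch below shows that pattern using stand-in macros: every SKETCH_* name and every numeric value is hypothetical and chosen only for this example, and the default of "hardened" is picked here purely so the example has something to show.

```cpp
// Hypothetical stand-ins for the mode macros. The values are arbitrary and
// intentionally not in order of strictness.
#define SKETCH_MODE_UNCHECKED  4
#define SKETCH_MODE_HARDENED   1
#define SKETCH_MODE_DEBUG_LITE 8
#define SKETCH_MODE_DEBUG      2

// A user or build system would normally define this; fall back to a default.
#ifndef SKETCH_HARDENING_MODE
#  define SKETCH_HARDENING_MODE SKETCH_MODE_HARDENED
#endif

// Validate by equality only; comparisons with < or > would be meaningless.
#if SKETCH_HARDENING_MODE != SKETCH_MODE_UNCHECKED &&  \
    SKETCH_HARDENING_MODE != SKETCH_MODE_HARDENED &&   \
    SKETCH_HARDENING_MODE != SKETCH_MODE_DEBUG_LITE && \
    SKETCH_HARDENING_MODE != SKETCH_MODE_DEBUG
#  error "SKETCH_HARDENING_MODE must be one of the SKETCH_MODE_* values"
#endif

// Every mode except unchecked enables the security-critical bounds checks.
#if SKETCH_HARDENING_MODE == SKETCH_MODE_UNCHECKED
constexpr bool sketch_bounds_checks_enabled = false;
#else
constexpr bool sketch_bounds_checks_enabled = true;
#endif
```

Code written this way keeps working if the numeric values are reshuffled, which is the point of leaving them unspecified.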

GCC compatibility

The _LIBCPP_HARDENING_MODE macro allows enabling hardening in libc++ when compiling with the GCC compiler where the proposed -flibc++-hardening Clang flag will not be available.

Additionally, GCC has recently introduced the -fhardened flag that enables hardening in libstdc++. We plan to explore making libc++ honor that flag when compiling under GCC (it will likely enable the hardened mode) as well as adding the -fhardened flag to Clang. While the exact semantics of the -fhardened flag will necessarily differ between libc++ and libstdc++, we believe that having some broad compatibility will still be beneficial.

Configuring hardening on a per-TU basis

The hardening mode can be overridden on a per-TU basis by compiling the TU with the -flibc++-hardening flag or the _LIBCPP_HARDENING_MODE macro defined to a different value from the rest of the application. This would allow, for example, disabling checks for performance-critical parts of the code.

Note that the ability to select the hardening mode on a per-TU basis has ODR implications. However, we can use ABI tags to ensure that inline functions have a different mangling based on the hardening mode, thus avoiding ODR violations. This mechanism only covers functions defined inline — the functions compiled inside the dylib will still use the hardening mode that the library was configured with by the vendor, and the value of _LIBCPP_HARDENING_MODE set by the user won’t be respected. However, the vast majority of functions in the standard library are defined inline, so that should not be seen as a significant limitation.
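The ABI-tag idea can be sketched as follows, assuming a GCC- or Clang-compatible compiler that supports the abi_tag attribute and __builtin_trap. SKETCH_HARDENED, SKETCH_ODR_TAG, SKETCH_CHECK and element_at are hypothetical names for this example and are not how libc++ spells its internal macros.

```cpp
#include <cstddef>

// Pick a tag and a check implementation based on the (hypothetical) per-TU setting.
#if defined(SKETCH_HARDENED)
#  define SKETCH_ODR_TAG __attribute__((abi_tag("hardened")))
#  define SKETCH_CHECK(cond) ((cond) ? (void)0 : __builtin_trap())
#else
#  define SKETCH_ODR_TAG __attribute__((abi_tag("unchecked")))
#  define SKETCH_CHECK(cond) ((void)sizeof(cond))  // condition is not evaluated
#endif

// The tag becomes part of the mangled name, so a TU built with SKETCH_HARDENED
// and one built without it emit differently named symbols for this function.
// The two bodies differ, but they never count as the same entity for the linker.
SKETCH_ODR_TAG inline constexpr int element_at(const int* data, std::size_t size,
                                               std::size_t index) {
    SKETCH_CHECK(index < size);
    return data[index];
}
```

Compiling one TU with -DSKETCH_HARDENED and another without it and linking them together would then leave both variants in the final binary, each called from the TU that asked for it.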

Rollout

We aim to first make hardening modes available in the LLVM 18 release, with no breaking changes. LLVM 19 and 20 will contain breaking changes. Proposed timeline:

- LLVM 18: first release that supports hardening modes and ways to enable them as described in the RFC.
  * The safe mode (available since the LLVM 15 release) is still supported; the release notes will mention that projects using the safe mode have to transition to use the hardened mode or the debug-lite mode instead (debug-lite is the rough equivalent of the old safe mode).
  * A few checks that used to be in the safe mode might become excluded (internally, safe will be mapped to debug-lite). In LLVM 17, the safe mode contains every check that isn’t explicitly marked as debug-only, but finer-grained categorization might allow trimming it down further.
  * The safe mode will no longer use __libcpp_verbose_abort when a check fails (__builtin_trap will be used instead). Overriding __libcpp_verbose_abort will no longer have an effect on the behavior of the safe mode.
  * The meaning of the debug mode will change. The legacy debug mode has been removed in LLVM 17. The new debug mode that is part of hardening will be enabled using the mechanisms explained in the RFC and will function differently (e.g. it won’t require a global database).
- LLVM 19: the safe mode will be deprecated. The LIBCXX_ENABLE_ASSERTIONS CMake variable and the _LIBCPP_ENABLE_ASSERTIONS macro will be deprecated (with a warning) and users will be given a message to migrate to the hardened mode or the debug-lite mode instead.
- LLVM 20: the safe mode will be removed along with the associated macros and the CMake variable.

Future work

We would like to explore the possibility of shipping multiple ABI configurations of the library to be enabled via a compiler flag.
- An existing RFC for shipping an AddressSanitizer-instrumented version of libc++ might also require different ABI configurations; ideally any solution we come up with would cover both cases.

FAQ

Why is the safe mode being replaced with the debug-lite mode?
- There are several issues with the safe mode:
  * The set of assertions enabled in this mode is not well-curated: it essentially consists of everything except the most heavyweight debug assertions. This could prevent many projects from adopting it. In fact, the safe mode was always meant to be a stepping stone toward finer-grained modes like the ones in this RFC.
  * While it makes the application safer, the safe mode does not attempt to prevent all potentially unsafe uses of the standard library, making the name problematic. “Safe” is a very tempting name; using it would both fail to deliver on its promise and make the mode more tempting to use than the hardened mode, which is not what we want to recommend.
Why is the mode named debug-lite if it’s intended to be usable in production?
- It is arguably somewhat counter-intuitive; however, we see the debug-lite mode as a trimmed-down, more performant debug mode rather than an extended hardened mode. The key distinction here is that the hardened mode focuses on security whereas the debug mode focuses on validity. The two debug modes (debug and debug-lite) focus on finding logic issues (of which security issues are a subset) with different tradeoffs between coverage and performance. The hardened mode, on the other hand, focuses on security-critical issues while heavily prioritizing performance.
Both the hardened mode and the debug-lite mode are intended to be usable in production. Which one would you recommend by default?
- We would recommend that almost every project use the hardened mode in production (perhaps with opt-outs for performance-critical parts of the code). It is designed to keep the performance penalty minimal and only contains checks which prevent critical security vulnerabilities. The debug-lite mode is intended for projects that are actively seeking to prevent as many general logic issues in production as possible and are willing to trade off some additional performance for that goal.
Why doesn’t the RFC provide a way for projects to select individual categories of assertions to enable?
- This would severely limit our ability to change or extend the categories, as well as make the whole model lower-level and harder to understand — we are going for simplicity over unbounded configurability.
- We see each mode as representing some fundamental, generally useful concept, not just a collection of largely unrelated checks. We are open to adding new modes in the future as long as they represent some well-defined abstraction, are generally useful and sufficiently different from the existing modes.
Why aren’t checks for null pointers a part of the hardened mode?
- Most platforms have a guard virtual memory region starting at address 0, so a stray memory access close to 0 is guaranteed to trap and doesn’t compromise the memory safety. The hardened mode aims to only enable security-critical checks. We might explore adding null pointer checks to the hardened mode on platforms where trapping is not guaranteed if we can determine that at compile time.
Making ABI configuration separate from enabling a hardening mode means that, for example, accessing a span element through operator[] is always checked if hardening is enabled whereas the same access through an iterator might or might not be checked depending on the ABI configuration chosen by the vendor. Wouldn’t that create confusion for the users?
- There is some potential for confusion there, but we believe the alternatives are worse. ABI is a property of the platform and should not be changed by the user; moreover, lumping together all possible combinations of different ABI settings (which are independent from each other) and hardening modes would result in a combinatorial explosion. Also, users have the option of using their own copy of libc++ and thus essentially becoming their own vendor. That said, we are open to exploring the possibility of shipping multiple ABI configurations of the library in the future.