[RFC] Introduce sentinel pointer value to DataLayout
Regarding the interaction with `nonnull`, `dereferenceable_or_null`, and `null_pointer_is_valid`, it depends on how much we want to preserve the current semantics of `null`, which currently represents zero in LLVM.
I think @arichardson made a great point. Rather than introducing a new concept called “sentinel pointer”, using `nullptr` and deprecating `null` would make things clearer and less confusing.
Proposed Changes
- A new IR representation `ptr addrspace(n) nullptr` will be introduced to represent the actual or canonical `nullptr`.
- A new class `ConstantNullPointer` will be introduced. `ConstantPointerNull` will likely be kept for now, based on feedback from previous discussions.
  - It will correspond to `ptr addrspace(n) zeroinitializer`.
  - All existing `ptr addrspace(n) null` will be auto-upgraded to `ptr addrspace(n) zeroinitializer`.
  - However, we will update its usage wherever the semantics allow.
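
To make the auto-upgrade rule concrete, here is a toy, text-level sketch. It is only an illustration: a real upgrade would presumably live in the IR parser and bitcode reader rather than in a regex, and `upgradeLegacyNull` is a hypothetical helper name, not an LLVM function. It also only handles the type-prefixed spelling (`ptr null` or `ptr addrspace(n) null`), not a bare `null` operand.

```cpp
#include <regex>
#include <string>

// Toy model of the proposed auto-upgrade (hypothetical helper, not LLVM API).
// Rewrites `ptr null` / `ptr addrspace(N) null` to the `zeroinitializer`
// spelling. The trailing `\b` keeps `nullptr` untouched, because the text
// after `null` is then a word character.
std::string upgradeLegacyNull(const std::string &IR) {
  static const std::regex LegacyNull(
      R"((\bptr(?: addrspace\(\d+\))?) null\b)");
  // `$1` re-emits the captured pointer type, including any address space.
  return std::regex_replace(IR, LegacyNull, "$1 zeroinitializer");
}
```

Under this sketch, a line that already uses the new `ptr addrspace(n) nullptr` spelling passes through unchanged, so old and new spellings can coexist in the same input.
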
Interaction with metadata
`nonnull` and `dereferenceable_or_null` will be replaced with `nonnullptr` and `dereferenceable_or_nullptr`, respectively.
My experience and knowledge in this area are fairly limited, but my understanding is that the key concern behind these attributes is probably not whether the pointer literally holds a zero value but whether it implies the actual `nullptr`. If `nullptr` in address space N is not zero, does it really matter whether a pointer is literally zero? We probably care more about whether it is actually a `nullptr` in that context.
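
The difference between the two readings can be sketched in a few lines. This is a minimal model under stated assumptions: `PtrBits`, `isNonZero`, and `isNonNullPtr` are hypothetical names made up for illustration, and the all-ones `nullptr` pattern mentioned below is just an example value, not a claim about any specific target.

```cpp
#include <cstdint>

// Hypothetical model, not LLVM API: a pointer's raw bits together with the
// bit pattern that its address space uses for nullptr.
struct PtrBits {
  uint64_t Value;     // raw address bits of the pointer
  uint64_t NullValue; // nullptr bit pattern of the pointer's address space
};

// Literal-zero reading: the pointer is not the all-zero bit pattern.
constexpr bool isNonZero(PtrBits P) { return P.Value != 0; }

// Proposed `nonnullptr` reading: the pointer is not the nullptr of its
// address space, whatever bit pattern that happens to be.
constexpr bool isNonNullPtr(PtrBits P) { return P.Value != P.NullValue; }
```

When `NullValue` is 0 the two predicates agree for every pointer. They only diverge in an address space whose `nullptr` is non-zero: with an all-ones `nullptr`, for example, a pointer holding 0 fails `isNonZero` but satisfies `isNonNullPtr`.
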
Interaction with attributes
`null_pointer_is_valid` is a bit trickier. Based on my understanding, it only applies to address space 0 and is specifically used for the `null` address. I think we should keep the name but adjust its semantics to refer to `nullptr` instead of `null`.
I expect this change will not have any actual effect on existing code because:
- We always initialize the pointer specification for address space 0, and the default sentinel value is 0 unless it is overridden by the data layout string.
- All existing upstream LLVM targets currently use `0` for `nullptr`.
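
The default-to-zero behavior described above could be modeled roughly as follows. This is a sketch, not the actual `DataLayout` implementation: `NullPtrTable`, `setNullPtrValue`, and `getNullPtrValue` are hypothetical names, and the data layout string syntax for an override is deliberately left out because it is not specified here.

```cpp
#include <cstdint>
#include <map>

// Hypothetical stand-in for the nullptr part of a DataLayout; not the LLVM
// class and not its real interface.
class NullPtrTable {
  std::map<unsigned, uint64_t> Overrides; // address space -> nullptr bits

public:
  // Models an override that would come from the data layout string.
  void setNullPtrValue(unsigned AddrSpace, uint64_t Bits) {
    Overrides[AddrSpace] = Bits;
  }

  // Address spaces without an explicit override fall back to 0, which is
  // why existing targets would see no change in behavior.
  uint64_t getNullPtrValue(unsigned AddrSpace) const {
    auto It = Overrides.find(AddrSpace);
    return It == Overrides.end() ? 0 : It->second;
  }
};
```

A freshly constructed table answers 0 for every address space, including address space 0, so code that only ever sees default layouts cannot observe the new field.
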
Handling Legacy Attributes
If I remember correctly, we recently removed `nocapture` and replaced it with `captures(...)`. How do we currently handle cases where we encounter the old `nocapture` attribute? @nikic
An “Easier” Alternative
The proposal above aims to avoid confusion by replacing “deceiving” terminology with clearer alternatives. However, as @nikic and @arichardson pointed out, we could also modify the semantics of existing terms instead of introducing new ones.
A more straightforward approach would be to redefine the meaning of the `null` pointer across LLVM to represent the actual `nullptr` in its corresponding address space, while still keeping the `null` spelling.
We will still introduce the new `nullptr` representation in `DataLayout`, ensuring each address space has a well-defined `nullptr` value, but we modify the semantics of `null` to match the `nullptr` value defined for each address space.
This is an enhancement to the existing approach and doesn’t make things worse, even if we redefine `null` to mean `nullptr`. In most places, LLVM already avoids assuming `null` is always 0, though there are exceptions, such as this bug.
For handling `ConstantPointerNull`, we first replace all existing uses of `ConstantPointerNull` with `Constant::getNull(PtrTy)`, and then put back `ConstantPointerNull` only in contexts where the pointer is not intended to represent a literal zero.
After making this change, we can safely assume that `null` represents `nullptr` in all contexts, and use it for further development.
What do you think? @arsenm @arichardson @nikic