Symbolic/named address spaces
November 19, 2025, 7:22pm 1
Hi all,
For targets that use more than one address space (GPUs and accelerators), would it make sense to support symbolic or named address spaces to improve IR readability? For the NVPTX target, for example, this means that ptr addrspace(1) %x could also be printed as ptr addrspace(global) %x. If we follow what we did for ImmArg pretty printing, another possibility is ptr addrspace(/* global */1) %x, so that the symbolic name is a pure comment.
I have not looked into how this can be done, but to a first order this will be purely a pretty-printing (and maybe parsing) feature, and LLVM IR will still use numbers. Additionally, this feature will be available only for modules that have a triple specified. Based on the triple, the implementation will query the number → name mappings for that triple and use them for printing and parsing. I am hoping that just having the triple will be sufficient to give meaningful names to address spaces, but folks can chime in if that’s not the case.
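To make the two printing options concrete, here is a minimal standalone C++ sketch (not LLVM code). The names printPointerType, NameStyle, and exampleAddrSpaceName are hypothetical, and the only mapping taken from the post is 1 → global for NVPTX.

```cpp
#include <optional>
#include <string>

// Two candidate spellings discussed in the post.
enum class NameStyle { Keyword, Comment };

// Hypothetical stand-in for a triple-based lookup. Only 1 -> "global"
// (the NVPTX example from the post) is filled in.
std::optional<std::string> exampleAddrSpaceName(unsigned AS) {
  if (AS == 1)
    return std::string("global");
  return std::nullopt;
}

// Render a pointer type. The number stays the source of truth; the name is
// only used for display, and unnamed address spaces print numerically.
std::string printPointerType(unsigned AS, NameStyle Style) {
  if (AS == 0)
    return "ptr"; // address space 0 is printed without addrspace(...)
  const std::string Num = std::to_string(AS);
  const std::optional<std::string> Name = exampleAddrSpaceName(AS);
  if (!Name)
    return "ptr addrspace(" + Num + ")";
  if (Style == NameStyle::Keyword)
    return "ptr addrspace(" + *Name + ")";
  return "ptr addrspace(/* " + *Name + " */" + Num + ")";
}
```

With the Comment style the output stays parseable by today's parser, since the name is only a comment; the Keyword style needs matching parser support.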
I wanted to check if this would be useful generally before digging into details.
Thanks
Rahul
I am very much +1 on this, it always bugged me that we use an integer for something that should be a keyword: looks like obfuscation…
I thought about it in the past and was considering maybe module metadata to encode the mapping of string->integer to handle the printing/parsing.
Triple may work, but wouldn’t this then require the target to be available in order to parse the IR?
jurahul November 19, 2025, 7:58pm 3
Right, that’s the impl detail I don’t know (that is, whether the triple is sufficient and can be extended to vend out this information). With the metadata option, what you are suggesting is essentially encoding this as a first-class thing in the IR, right? As an example, similar to target triple and datalayout, we can have number->name mappings represented directly in the IR (as a new target address-space-names="0:-1:global-2:shared" entry at the top of the module). Here, 0 has no name and 1 and 2 have names. Maybe it should be just 1:global-2:shared.
One concern is that since this is not semantically load-bearing (yet), having a new first-class thing in the IR may be too heavyweight for what it’s trying to achieve.
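As a rough illustration of how such a string could be decoded, here is a standalone C++ sketch. The function name parseAddressSpaceNames is hypothetical, and the grammar (entries separated by '-', each written number:name, with an empty name meaning "unnamed") is only inferred from the example string above; nothing like this exists in LLVM today.

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <string>

// Parse a hypothetical "0:-1:global-2:shared" style mapping.
// Returns std::nullopt on malformed input.
std::optional<std::map<unsigned, std::string>>
parseAddressSpaceNames(const std::string &Spec) {
  std::map<unsigned, std::string> Names;
  if (Spec.empty())
    return Names;
  std::size_t Pos = 0;
  while (Pos <= Spec.size()) {
    std::size_t End = Spec.find('-', Pos);
    if (End == std::string::npos)
      End = Spec.size();
    const std::string Entry = Spec.substr(Pos, End - Pos);
    const std::size_t Colon = Entry.find(':');
    if (Colon == std::string::npos || Colon == 0)
      return std::nullopt; // need at least one digit before ':'
    unsigned AS = 0;
    for (std::size_t I = 0; I < Colon; ++I) {
      if (Entry[I] < '0' || Entry[I] > '9')
        return std::nullopt;
      AS = AS * 10 + static_cast<unsigned>(Entry[I] - '0');
      if (AS >= (1u << 24))
        return std::nullopt; // address spaces are 24-bit values
    }
    const std::string Name = Entry.substr(Colon + 1);
    if (!Name.empty())
      Names[AS] = Name; // an empty name means "no symbolic name"
    Pos = End + 1;
  }
  return Names;
}
```

One consequence of using '-' as the separator is that names themselves cannot contain '-'.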
Nothing new: I was thinking of module metadata
Metadata attached to a module using named metadata may not be dropped,
jurahul November 19, 2025, 9:02pm 5
That might work, but it means targets that want to use this have to explicitly add this metadata to the IR during one of their initial passes. With the triple-based option, I was thinking that it would work on any existing IR as well, without needing to create a target. So we can add something like Triple::getAddressSpaceNames(), similar to Triple::computeDataLayout, and then use that.
arsenm November 19, 2025, 9:29pm 6
This is not a dynamic program property and should not be encoded in the module. It will introduce new edge cases for the compiler to deal with
No. The target would only be required if this information was put into CodeGen, where it shouldn’t be. This is more IR+ABI type information, which should not depend on the target, similar to the other information in TargetParser.
jurahul November 19, 2025, 9:32pm 7
Thanks @arsenm. Does that mean adding a Triple::getAddressSpaceNames() that returns this mapping which is then used by the IR printer (and parser) is a feasible path forward?
arsenm November 19, 2025, 9:39pm 8
This probably shouldn’t be implemented as returning a direct 1:1 mapping, as the address space is a 24-bit value. You could implement it as a function in the triple that gives the pretty name string for a given address space number.
jurahul November 19, 2025, 9:51pm 9
Right, so something like StringRef Triple::getAddressSpaceName(unsigned AS). If we want parsing support, additionally std::optional<unsigned> Triple::getAddressSpaceNumber(StringRef Name). The parser/printer can cache these to avoid repeated calls if necessary. The name is required to be a single token if we want parsing support; otherwise it can be a free-form string that is printed in a comment.
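A standalone sketch of that pair of functions might look like the following. It uses std::string_view in place of StringRef and free functions in place of Triple members so that it compiles without LLVM; the table contents (1 → global, 2 → shared) are just the illustrative names used earlier in the thread, and isSingleToken is a hypothetical helper with an assumed identifier-like token rule.

```cpp
#include <array>
#include <cctype>
#include <optional>
#include <string_view>

struct NamedAddrSpace {
  unsigned Number;
  std::string_view Name;
};

// Illustrative entries only; a real table would be chosen per triple.
constexpr std::array<NamedAddrSpace, 2> ExampleTable = {
    {{1, "global"}, {2, "shared"}}};

// Number -> name as a function, so nothing is materialized for the
// full 24-bit range. An empty result means "no symbolic name".
std::string_view getAddressSpaceName(unsigned AS) {
  for (const NamedAddrSpace &E : ExampleTable)
    if (E.Number == AS)
      return E.Name;
  return {};
}

// Assumed rule for "single token": identifier-like, not starting with a digit.
bool isSingleToken(std::string_view Name) {
  if (Name.empty() || std::isdigit(static_cast<unsigned char>(Name[0])))
    return false;
  for (char C : Name)
    if (!std::isalnum(static_cast<unsigned char>(C)) && C != '_')
      return false;
  return true;
}

// Name -> number for the parser; rejects names that are not single tokens.
std::optional<unsigned> getAddressSpaceNumber(std::string_view Name) {
  if (!isSingleToken(Name))
    return std::nullopt;
  for (const NamedAddrSpace &E : ExampleTable)
    if (E.Name == Name)
      return E.Number;
  return std::nullopt;
}
```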
resistor November 24, 2025, 9:35pm 10
I’ll just +1 that this would be very convenient for CHERI as well.
I added some limited support for something similar to LLParser in December 2022: [LLParser] Support symbolic address space numbers · llvm/llvm-project@f850035 · GitHub. It is possible to refer to the alloca, program, and globals address spaces as ptr addrspace("A"/"P"/"G"), which are symbolic versions of the data layout components.
This does not address all the requested use cases, but I imagine it should be quite easy to extend as long as we define them in the data layout.
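For readers unfamiliar with those components: the data layout string can carry A<n>, P<n>, and G<n> specifications for the alloca, program, and globals address spaces, each defaulting to 0 when absent. The standalone sketch below shows how a symbolic "A"/"P"/"G" could be resolved from such a string. The function name resolveSymbolicAddrSpace is hypothetical, and this is not the LLParser implementation.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Resolve 'A', 'P', or 'G' against a data layout string such as
// "e-p:64:64-A5-G1". Returns std::nullopt for any other symbol.
std::optional<unsigned> resolveSymbolicAddrSpace(std::string_view DL,
                                                 char Symbol) {
  if (Symbol != 'A' && Symbol != 'P' && Symbol != 'G')
    return std::nullopt;
  std::size_t Pos = 0;
  while (Pos <= DL.size()) {
    std::size_t End = DL.find('-', Pos);
    if (End == std::string_view::npos)
      End = DL.size();
    const std::string_view Spec = DL.substr(Pos, End - Pos);
    Pos = End + 1;
    // Case matters: lowercase 'a' and 'p' are different specifications.
    if (Spec.size() < 2 || Spec[0] != Symbol)
      continue;
    unsigned AS = 0;
    bool AllDigits = true;
    for (char C : Spec.substr(1)) {
      if (C < '0' || C > '9') {
        AllDigits = false;
        break;
      }
      AS = AS * 10 + static_cast<unsigned>(C - '0');
    }
    if (AllDigits)
      return AS;
  }
  return 0u; // component not specified: defaults to address space 0
}
```

Extending this scheme to arbitrary names would mean adding a place in the data layout where each name is bound to a number.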
nikic December 1, 2025, 1:46pm 12
I generally like the idea of supporting symbolic address spaces.
I’m a bit uneasy about binding this to the triple, because outside the backend, we generally want to avoid triples in tests, to show that the behavior is target-independent and avoid risk of REQUIRES failures.
I wonder whether it would make sense to include this in the DataLayout? Like if the DL currently has p1:64:64:64 or something, it could become p1(global):64:64:64.
jurahul December 1, 2025, 8:49pm 13
Encoding the name of the address space in the datalayout string seems reasonable. I had earlier created this POC PR based on Triple: [LLVM] Add support for printing and parsing symbolic address spaces by jurahul · Pull Request #169422 · llvm/llvm-project, but I assume the same can be based on DataLayout instead. So we can have new functions:
StringRef DataLayout::getAddressSpaceName(unsigned AS);
std::optional<unsigned> DataLayout::getAddressSpaceNumber(StringRef Name);
If that seems ok, I can work on a change to implement this.
jurahul December 4, 2025, 5:50am 14
Here’s a PR that implements support for specifying address space names in data layout: [LLVM][IR] Add support for address space names in DataLayout by jurahul · Pull Request #170559 · llvm/llvm-project. Sample IR dump looks like:
target datalayout = "p2(global):32:8-p8(stack):8:8-p11(code):8:8"
@str = private addrspace("global") constant [4 x i8] c"str\00"
define void @foo() {
  %alloca = alloca i32, addrspace("stack")
  ret void
}
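To show how the names in that datalayout string could drive parsing of addrspace("global") and friends, here is a standalone C++ sketch. It is not the code from the PR: parseNamedPointerSpecs is a hypothetical name, and the grammar handled is only the p<number>(<name>):... form visible in the sample above.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <string_view>

// Collect name -> address space number from pointer specifications of the
// form p<N>(<name>):... in a '-' separated data layout string.
std::map<std::string, unsigned> parseNamedPointerSpecs(std::string_view DL) {
  std::map<std::string, unsigned> Names;
  std::size_t Pos = 0;
  while (Pos <= DL.size()) {
    std::size_t End = DL.find('-', Pos);
    if (End == std::string_view::npos)
      End = DL.size();
    const std::string_view Spec = DL.substr(Pos, End - Pos);
    Pos = End + 1;
    if (Spec.empty() || Spec[0] != 'p')
      continue; // not a pointer specification
    std::size_t I = 1;
    unsigned AS = 0; // a bare "p" refers to address space 0
    while (I < Spec.size() && Spec[I] >= '0' && Spec[I] <= '9') {
      AS = AS * 10 + static_cast<unsigned>(Spec[I] - '0');
      ++I;
    }
    if (I >= Spec.size() || Spec[I] != '(')
      continue; // pointer specification without a name
    const std::size_t Close = Spec.find(')', I);
    if (Close == std::string_view::npos || Close == I + 1)
      continue; // unterminated or empty name: ignored in this sketch
    Names.emplace(std::string(Spec.substr(I + 1, Close - I - 1)), AS);
  }
  return Names;
}
```

For the sample string above this maps global to 2, stack to 8, and code to 11, so addrspace("stack") on the alloca would resolve to address space 8. Because the sketch splits on '-', it assumes names never contain that character.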