Support clobber_abi and vector/access registers (clobber-only) in s390x inline assembly by taiki-e · Pull Request #130630 · rust-lang/rust

This adds support for clobber_abi, which is one of the requirements for stabilization mentioned in #93335.

Thanks for working on SystemZ support for this!

ELF Application Binary Interface s390x Supplement says that cc (condition code, bits 18-19 of PSW) is "Volatile".
However, we do not have a register class for cc and instead mark cc as clobbered unless preserves_flags is specified (Mark s390x condition code register as clobbered in inline assembly #111331).
Therefore, in the current implementation, if both preserves_flags and clobber_abi are specified, cc is not marked as clobbered. Is this okay? Or even if preserves_flags is used, should cc be marked as clobbered if clobber_abi is used?

As I read the Rust inline-asm documentation (https://doc.rust-lang.org/stable/reference/inline-assembly.html), this looks OK to me. We have the statement:

Any registers not specified as outputs must have the same value upon exiting the asm block as they had on entry, otherwise behavior is undefined.
- This only applies to registers which can be specified as an input or output. Other registers follow target-specific rules.

Since the flags register cannot be specified as an input or output, the target-specific rules exception applies. For the flags register specifically, we then have the following rules:

These flags registers must be restored upon exiting the asm block if the preserves_flags option is set: 
- [list per architecture]

We can define this for SystemZ, and the definition that makes sense is to preserve the full CC value. (And I guess also some of the flags in the floating-point control register, specifically the exception masks and rounding modes. But since we don't really model the FPC, we may have to just ignore that part for now.)

Finally, looking at the list of clobbered registers on the other supported targets shows that for none of them, clobber_abi affects the flags register(s). So it seems intentional that clobber_abi and preserves_flags are somewhat orthogonal; it would seem to make sense to follow that precedent on SystemZ as well.
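
To make the combination discussed above concrete, here is a minimal sketch of an asm block that specifies both clobber_abi("C") and preserves_flags. It assumes a toolchain that includes this PR and on which s390x inline asm is available (older nightlies would additionally need the asm_experimental_arch feature). The function name copy_via_asm and the choice of r2/r3 are hypothetical, picked only for illustration. LGR is used because it does not change the condition code, so promising preserves_flags is honest here. On other targets a plain Rust fallback is compiled instead.

```rust
/// Copies `input` to the result through a single LGR instruction.
/// Hypothetical helper, for illustration only.
#[cfg(target_arch = "s390x")]
pub fn copy_via_asm(input: u64) -> u64 {
    use std::arch::asm;
    let output: u64;
    unsafe {
        asm!(
            // LGR copies a 64-bit register and leaves the condition code unchanged.
            "lgr %r2, %r3",
            in("r3") input,
            // With clobber_abi, outputs must name explicit registers.
            lateout("r2") output,
            // Marks the caller-saved registers of the C ABI as clobbered.
            // Under the behavior discussed above, this does not cover cc.
            clobber_abi("C"),
            // Promises that cc has the same value on exit as on entry.
            options(nomem, nostack, preserves_flags),
        );
    }
    output
}

/// Fallback for targets other than s390x so the sketch builds everywhere.
#[cfg(not(target_arch = "s390x"))]
pub fn copy_via_asm(input: u64) -> u64 {
    input
}
```

If the asm body did modify the condition code (for example with an arithmetic instruction such as AGR), preserves_flags would have to be dropped; clobber_abi alone would not be the mechanism that covers cc.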

ELF Application Binary Interface s390x Supplement says that pm (program mask, bits 20-23 of PSW) is "Cleared".
There do not appear to be any registers associated with this in either LLVM or GCC, so at this point I don't see any way other than to just ignore it. Is this okay as-is?

Yes, that should be fine to ignore.

Is "areg" a good name for register class name for access registers? It may be a bit confusing between that and reg_addr, which uses the “a” constraint (Support reg_addr register class in s390x inline assembly #119431)...

Not sure exactly what the Rust convention here is. I'd be fine with "areg", but maybe "acreg" or something might be clearer?

GCC seems to recognize only a0 and a1, and using a[2-15] causes errors.
Given that cg_gcc has a similar problem on another architecture (x86_64 asm: Build error when using r[8-15]b rustc_codegen_gcc#485), I don't feel this is a blocker for this PR, but it is worth mentioning here.

Interesting, thanks for pointing this out. This should be fixed in GCC at some point, but it's probably not particularly important. I agree it should not block this PR.

vreg should be able to accept #[repr(simd)] types as input if the vector target feature added in rustc_target: add known safe s390x target features #127506 is enabled, but core_arch has no s390x vector type and both #[repr(simd)] and core::simd are unstable, so I have not implemented it in this PR. EDIT: And supporting it is probably more complex than doing the equivalent on other architectures... S390x inline asm #88245 (comment)

Actually, the situation has improved: the problem I pointed out in that last comment has been fixed in LLVM 16, which now always uses the same data_layout string, no matter whether the vector feature is enabled or not:
https://releases.llvm.org/16.0.0/docs/ReleaseNotes.html#changes-to-the-systemz-backend

Assuming rustc is now always built with LLVM 16 or later, we can drop the code that forces the vector feature to be always off, which would enable making general use of vector facilities if enabled. This would include fully supporting the vreg class for inline asm as well as enabling auto-SIMD and possibly even defining a set of platform-specific vector intrinsics.