gcc - GNU Compiler Collection
author | Richard Sandiford <richard.sandiford@arm.com> | 2025-06-12 12:10:39 +0100 |
---|---|---|
committer | Richard Sandiford <richard.sandiford@arm.com> | 2025-06-12 12:10:39 +0100 |
commit | 8546265e2ee386ea8a4b2f9150ddfed32c9d15ea (patch) | |
tree | 028e4a93b2817286e20317bd29600ae3cffcc933 | |
parent | Daily bump. (diff) |
aarch64: Incorrect removal of ZA restore [PR120624]
The PCS defines a lazy save scheme for managing ZA across normal "private-ZA" functions. GCC currently uses this scheme for calls to all private-ZA functions (rather than using caller-save).

Therefore, before a sequence of calls to private-ZA functions, GCC emits code to set up a lazy save. After the sequence of calls, GCC emits code to check whether the lazy save was committed and restore the ZA contents if so.

These sequences are emitted by the mode-switching pass, in an attempt to reduce the number of redundant saves and restores.

The lazy save scheme also means that, before a function can use ZA, it must first conditionally store the old contents of ZA to the caller's lazy save buffer, if any.

This all creates some relatively complex dependencies between setup code, save/restore code, and normal reads from and writes to ZA. These dependencies are modelled using special fake hard registers:

    ;; Sometimes we use placeholder instructions to mark where later
    ;; ABI-related lowering is needed.  These placeholders read and
    ;; write this register.  Instructions that depend on the lowering
    ;; read the register.
    (LOWERING_REGNUM 87)

    ;; Represents the contents of the current function's TPIDR2 block,
    ;; in abstract form.
    (TPIDR2_BLOCK_REGNUM 88)

    ;; Holds the value that the current function wants PSTATE.ZA to be.
    ;; The actual value can sometimes vary, because it does not track
    ;; changes to PSTATE.ZA that happen during a lazy save and restore.
    ;; Those effects are instead tracked by ZA_SAVED_REGNUM.
    (SME_STATE_REGNUM 89)

    ;; Instructions write to this register if they set TPIDR2_EL0 to a
    ;; well-defined value.  Instructions read from the register if they
    ;; depend on the result of such writes.
    ;;
    ;; The register does not model the architected TPIDR2_EL0, just the
    ;; current function's management of it.
    (TPIDR2_SETUP_REGNUM 90)

    ;; Represents the property "has an incoming lazy save been committed?".
    (ZA_FREE_REGNUM 91)

    ;; Represents the property "are the current function's ZA contents
    ;; stored in the lazy save buffer, rather than in ZA itself?".
    (ZA_SAVED_REGNUM 92)

    ;; Represents the contents of the current function's ZA state in
    ;; abstract form.  At various times in the function, these contents
    ;; might be stored in ZA itself, or in the function's lazy save buffer.
    ;;
    ;; The contents persist even when the architected ZA is off.  Private-ZA
    ;; functions have no effect on its contents.
    (ZA_REGNUM 93)

Every normal read from ZA and write to ZA depends on SME_STATE_REGNUM, in order to sequence the code with the initial setup of ZA and with the lazy save scheme.

The code to restore ZA after a call involves several instructions, including conditional control flow. It is initially represented as a single define_insn and is split late, after shrink-wrapping and prologue/epilogue insertion.

The split form of the restore instruction includes a conditional call to __arm_tpidr2_restore:

    (define_insn "aarch64_tpidr2_restore"
      [(set (reg:DI ZA_SAVED_REGNUM)
            (unspec:DI [(reg:DI R0_REGNUM)] UNSPEC_TPIDR2_RESTORE))
       (set (reg:DI SME_STATE_REGNUM)
            (unspec:DI [(reg:DI SME_STATE_REGNUM)] UNSPEC_TPIDR2_RESTORE))
       ...
      )

The write to SME_STATE_REGNUM indicates the end of the region where ZA_REGNUM might differ from the real contents of ZA. In other words, it is the point at which normal reads from ZA and writes to ZA can safely take place.

To finally get to the point, the problem in this PR was that the unsplit aarch64_restore_za pattern was missing this change to SME_STATE_REGNUM. It could therefore be deleted as dead before it had a chance to be split. The split form had the correct dataflow, but the unsplit form didn't.

Unfortunately, the tests for this code tended to use calls and asms to model regions of ZA usage, and those don't seem to be affected in the same way.

gcc/
    PR target/120624
    * config/aarch64/aarch64.md (SME_STATE_REGNUM): Expand on comments.
    * config/aarch64/aarch64-sme.md (aarch64_restore_za): Also set
    SME_STATE_REGNUM.

gcc/testsuite/
    PR target/120624
    * gcc.target/aarch64/sme/za_state_7.c: New test.
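
For readers unfamiliar with the lazy save scheme described in the commit message, the following is a rough, portable C simulation of the protocol. It is only a sketch under stated assumptions: `za`, `za_on`, `tpidr2`, `struct tpidr2_block`, `save_if_pending` and the two callees are hypothetical stand-ins for the architectural state and support routines, the buffer size is arbitrary, and nothing here is GCC's or the PCS's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { ZA_BYTES = 16 };  /* toy size, chosen only for illustration */

/* Hypothetical stand-ins for architectural state. */
static unsigned char za[ZA_BYTES];           /* models the contents of ZA */
static bool za_on;                           /* models PSTATE.ZA */
struct tpidr2_block { unsigned char *save_buffer; };
static struct tpidr2_block *tpidr2;          /* models TPIDR2_EL0 */

/* Models the role of __arm_tpidr2_save: store ZA to the pending
   lazy save buffer, if there is one. */
static void save_if_pending(void)
{
    if (tpidr2 != NULL)
        memcpy(tpidr2->save_buffer, za, ZA_BYTES);
}

/* A private-ZA callee that never touches ZA: the save stays uncommitted. */
static void callee_ignores_za(void)
{
}

/* A callee that wants ZA for itself: it commits the caller's lazy save,
   clears the pointer so the caller can tell, and then overwrites ZA. */
static void callee_uses_za(void)
{
    save_if_pending();
    tpidr2 = NULL;
    za_on = true;
    memset(za, 0xff, ZA_BYTES);
    za_on = false;
}

/* Caller side: set up a lazy save, make the call, then restore ZA only
   if the save was committed.  Returns true if a restore was needed. */
static bool call_with_lazy_save(void (*callee)(void))
{
    unsigned char buffer[ZA_BYTES];
    struct tpidr2_block block = { buffer };

    assert(tpidr2 == NULL);          /* no save of ours is already pending */
    tpidr2 = &block;                 /* set up the lazy save */
    callee();
    za_on = true;                    /* models "smstart za" */
    bool committed = (tpidr2 == NULL);
    if (committed)
        memcpy(za, buffer, ZA_BYTES); /* models the role of __arm_tpidr2_restore */
    tpidr2 = NULL;                   /* cancel the lazy save */
    return committed;
}
```

In this model, calling `call_with_lazy_save(callee_ignores_za)` leaves ZA untouched and performs no restore, while `call_with_lazy_save(callee_uses_za)` takes the restore path. The restore path is the part that the unsplit aarch64_restore_za pattern represents, which is why deleting it loses the caller's ZA contents.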
-rw-r--r-- | gcc/config/aarch64/aarch64-sme.md | 2 |
---|---|---|
-rw-r--r-- | gcc/config/aarch64/aarch64.md | 8 |
-rw-r--r-- | gcc/testsuite/gcc.target/aarch64/sme/za_state_7.c | 21 |
3 files changed, 31 insertions, 0 deletions
diff --git a/gcc/config/aarch64/aarch64-sme.md b/gcc/config/aarch64/aarch64-sme.md
index c49affd0dd39..f7958c90eae4 100644
--- a/gcc/config/aarch64/aarch64-sme.md
+++ b/gcc/config/aarch64/aarch64-sme.md
@@ -373,6 +373,8 @@
                     (reg:DI SME_STATE_REGNUM)
                     (reg:DI TPIDR2_SETUP_REGNUM)
                     (reg:DI ZA_SAVED_REGNUM)] UNSPEC_RESTORE_ZA))
+   (set (reg:DI SME_STATE_REGNUM)
+        (unspec:DI [(reg:DI SME_STATE_REGNUM)] UNSPEC_TPIDR2_RESTORE))
    (clobber (reg:DI R0_REGNUM))
    (clobber (reg:DI R14_REGNUM))
    (clobber (reg:DI R15_REGNUM))
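
The effect of the extra set can be illustrated with a toy dead-code-elimination pass. This is a minimal sketch, not GCC's actual DCE: the register bit values, `struct insn`, `toy_dce` and `restore_survives` are all hypothetical, the sequence is straight-line, and the call and the clobbers are left out. It shows only the dataflow argument: an instruction whose definitions are never read later is deleted, and a later ZA read depends on SME_STATE_REGNUM and ZA_REGNUM rather than on ZA_SAVED_REGNUM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Fake registers as bits.  The values are arbitrary, not GCC's REGNUMs. */
enum {
    SME_STATE = 1u << 0,
    ZA_SAVED  = 1u << 1,
    ZA        = 1u << 2,
    RESULT    = 1u << 3   /* hypothetical stand-in for the return value */
};

struct insn {
    const char *name;
    unsigned uses;   /* registers read */
    unsigned defs;   /* registers written */
    bool deleted;
};

/* Backward pass over a straight-line sequence: delete every insn none of
   whose definitions is live after it.  Returns the number deleted. */
static size_t toy_dce(struct insn *insns, size_t n, unsigned live_out)
{
    assert(insns != NULL || n == 0);
    unsigned live = live_out;
    size_t deleted = 0;
    for (size_t i = n; i-- > 0; ) {
        struct insn *in = &insns[i];
        if ((in->defs & live) == 0) {
            in->deleted = true;   /* dead: contributes no uses either */
            deleted++;
            continue;
        }
        live = (live & ~in->defs) | in->uses;
    }
    return deleted;
}

/* Models "write ZA; restore ZA after a call; read ZA" and reports
   whether the restore insn survives the toy pass. */
static bool restore_survives(bool restore_sets_sme_state)
{
    unsigned restore_defs = ZA_SAVED;
    if (restore_sets_sme_state)
        restore_defs |= SME_STATE;

    struct insn seq[] = {
        { "write_za",   SME_STATE,            ZA,           false },
        { "restore_za", SME_STATE | ZA_SAVED, restore_defs, false },
        { "read_za",    SME_STATE | ZA,       RESULT,       false },
    };
    toy_dce(seq, sizeof seq / sizeof seq[0], RESULT);
    return !seq[1].deleted;
}
```

With `restore_survives(false)` the restore defines only ZA_SAVED, which nothing later reads, so the toy pass deletes it. With `restore_survives(true)` the restore also defines SME_STATE, which the following ZA read uses, so it is kept.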
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 6dbc9faf7130..e11e13033d2e 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -136,6 +136,14 @@
     ;; The actual value can sometimes vary, because it does not track
     ;; changes to PSTATE.ZA that happen during a lazy save and restore.
     ;; Those effects are instead tracked by ZA_SAVED_REGNUM.
+    ;;
+    ;; Sequences also write to this register if they synchronize the
+    ;; actual contents of ZA and PSTATE.ZA with the current function's
+    ;; ZA_REGNUM and SME_STATE_REGNUM.  Conceptually, these extra writes
+    ;; do not change the value of SME_STATE_REGNUM.  They simply act as
+    ;; sequencing points.  They mean that all direct accesses to ZA can
+    ;; depend only on ZA_REGNUM and SME_STATE_REGNUM, rather than also
+    ;; depending on ZA_SAVED_REGNUM etc.
     (SME_STATE_REGNUM 89)
 
     ;; Instructions write to this register if they set TPIDR2_EL0 to a
diff --git a/gcc/testsuite/gcc.target/aarch64/sme/za_state_7.c b/gcc/testsuite/gcc.target/aarch64/sme/za_state_7.c
new file mode 100644
index 000000000000..38bc13471591
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sme/za_state_7.c
@@ -0,0 +1,21 @@
+// { dg-options "-O -fno-optimize-sibling-calls -fomit-frame-pointer" }
+
+#include <arm_sme.h>
+
+void callee();
+
+__arm_new("za") __arm_locally_streaming int test()
+{
+  svbool_t all = svptrue_b8();
+  svint8_t expected = svindex_s8(1, 1);
+  svwrite_hor_za8_m(0, 0, all, expected);
+
+  callee();
+
+  svint8_t actual = svread_hor_za8_m(svdup_s8(0), all, 0, 0);
+  return svptest_any(all, svcmpne(all, expected, actual));
+}
+
+// { dg-final { scan-assembler {\tbl\t__arm_tpidr2_save\n} } }
+// { dg-final { scan-assembler {\tbl\t__arm_tpidr2_restore\n} } }
+// { dg-final { scan-assembler-times {\tsmstart\tza\n} 2 } }