[llvm-dev] [RFC] Adding thread group semantics to LangRef (motivated by GPUs)

Nicolai Hähnle via llvm-dev llvm-dev at lists.llvm.org
Sat Dec 29 08:32:06 PST 2018


On 20.12.18 18:03, Connor Abbott wrote:

We already have the notion of "convergent" functions like syncthreads(), to which we cannot add control-flow dependencies. That is, it's legal to hoist syncthreads out of an "if", but it's not legal to sink it into an "if".  It's not clear to me why we can't have "anticonvergent" (terrible name) functions which cannot have control-flow dependencies removed from them?  ballot() would be both convergent and anticonvergent.

Would that solve your problem?

I think it's important to note that we already have such an attribute, although with the opposite sense - it's impossible to remove control flow dependencies from a call unless you mark it as "speculatable".

This isn't actually true. If both sides of an if/else have the same non-speculative function call, it can still be moved out of control flow.

That's because doing so doesn't change anything at all from a single-threaded perspective. Hence why I think we should model the communication between threads honestly.
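To make the point about hoisting concrete, here is a toy sketch in Python, not anything taken from LLVM or a real GPU API. The helper names (`ballot`, `ballot_in_branches`, `ballot_hoisted`) and the model of a wave as a list of lane indices are assumptions made purely for illustration. Each lane calls ballot exactly once in both versions, so nothing changes from a single thread's point of view, yet the values the lanes observe differ.

```python
def ballot(active, pred):
    """Toy ballot: bitmask of the active lanes whose predicate is true."""
    mask = 0
    for lane in active:
        if pred[lane]:
            mask |= 1 << lane
    return mask


def ballot_in_branches(cond, pred):
    """ballot() is called separately on each side of a divergent if/else."""
    lanes = range(len(cond))
    then_mask = ballot([l for l in lanes if cond[l]], pred)
    else_mask = ballot([l for l in lanes if not cond[l]], pred)
    # The phi: every lane keeps the value produced on the side it took.
    return [then_mask if cond[l] else else_mask for l in lanes]


def ballot_hoisted(cond, pred):
    """The same call after being moved out of the if/else."""
    lanes = list(range(len(cond)))
    mask = ballot(lanes, pred)  # all lanes participate together
    return [mask for _ in lanes]


cond = [True, False, True, False]  # non-uniform branch condition
pred = [True, True, True, True]    # every lane votes "true"

in_branches = ballot_in_branches(cond, pred)
hoisted = ballot_hoisted(cond, pred)
print(in_branches)  # [5, 10, 5, 10]
print(hoisted)      # [15, 15, 15, 15]
```

In this model the two forms are indistinguishable to a single-threaded analysis and only differ once the set of participating lanes is taken into account.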

However, this doesn't prevent

    if (...) {
    } else {
    }
    foo = ballot();

from being turned into

    if (...) {
        foo1 = ballot();
    } else {
        foo2 = ballot();
    }
    foo = phi(foo1, foo2)

and vice versa. We have a "noduplicate" attribute which prevents transforming the first into the second, but not the other way around. Of course we could keep going this way and add a "nocombine" attribute to complement noduplicate. But even then, there are still problematic transforms. For example, take this program, which is simplified from a real game that doesn't work with the AMDGPU backend:

    while (cond1 /* uniform */) {
        ballot();
        ...
        if (cond2 /* non-uniform */) continue;
        ...
    }

In SPIR-V, when using structured control flow, the semantics of this are pretty clearly defined. In particular, there's a continue block after the body of the loop where control flow re-converges, and the only back edge is from the continue block, so the ballot is in uniform control flow. But LLVM will get rid of the continue block since it's empty, and re-analyze the loop as two nested loops, splitting the loop header in two, producing a CFG which corresponds to this:

    while (cond1 /* uniform */) {
        do {
            ballot();
            ...
        } while (cond2 /* non-uniform */);
        ...
    }

Now, in an implementation where control flow re-converges at the immediate post-dominator, this won't do the right thing anymore. In order to handle it correctly, you'd effectively need to always flatten nested loops, which will probably be really bad for performance if the programmer actually wanted the second thing. It also makes it impossible when translating a high-level language to LLVM to get the "natural" behavior which game developers actually expect. This is exactly the sort of "spooky action at a distance" which makes me think that everything we've done so far is really insufficient, and we need to add an explicit notion of control-flow divergence and reconvergence to the IR. We need a way to say that control flow re-converges at the continue block, so that LLVM won't eliminate it, and we can vectorize it correctly without penalizing cases where it's better for control flow not to re-converge.

Well said!
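The loop example quoted above can be illustrated with a small toy simulation in Python. This is only a sketch under stated assumptions, not how any real backend works: `run_structured`, `run_nested` and `mask_of` are hypothetical names, cond1 is modeled as "the lane's iteration counter is below `n_iters`", cond2 is a per-lane, per-iteration table, and the loop tail is not modeled at all. The simulation records the set of lanes that take part in each ballot.

```python
def mask_of(lanes):
    """Bitmask with one bit set per participating lane."""
    return sum(1 << lane for lane in lanes)


def run_structured(cond2, n_iters):
    """Re-converge at the continue block: all lanes share every ballot."""
    n_lanes = len(cond2)
    masks = []
    for _ in range(n_iters):
        # Lanes that hit 'continue' and lanes that run the tail meet again
        # at the continue block, so the next ballot sees the whole wave.
        masks.append(mask_of(range(n_lanes)))
    return masks


def run_nested(cond2, n_iters):
    """Nested-loop CFG, re-converging only at the inner loop's exit."""
    n_lanes = len(cond2)
    it = [0] * n_lanes  # per-lane logical iteration counter
    masks = []
    while any(i < n_iters for i in it):          # outer loop (cond1)
        active = [l for l in range(n_lanes) if it[l] < n_iters]
        while active:                            # inner do-while (cond2)
            masks.append(mask_of(active))
            stay = []
            for lane in active:
                skip = cond2[lane][it[lane]]
                it[lane] += 1
                if skip and it[lane] < n_iters:
                    stay.append(lane)            # takes the inner back edge
            active = stay                        # the rest wait at the exit
    return masks


# Four lanes, two iterations; only lane 0 takes 'continue' in iteration 0.
cond2 = [[True, False], [False, False], [False, False], [False, False]]

print(run_structured(cond2, 2))  # [15, 15]
print(run_nested(cond2, 2))      # [15, 1, 14]
```

In this model the second logical ballot is executed by lane 0 alone and later by lanes 1 to 3, instead of by all four lanes together, which is the behavior change the quoted paragraph describes.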

Cheers, Nicolai

Learn how the world really is, but never forget how it should be.


