
On Dec 21, 2018, at 6:51 PM, Matt Arsenault <arsenm2@gmail.com> wrote:

On Dec 21, 2018, at 11:15 AM, Quentin Colombet <quentin.colombet@gmail.com> wrote:

Hi Matt,

Your use case definitely falls within what RegBankSelect is meant to solve.
That said, the support you need is not implemented because we didn't
have use cases to test the code against.

Regarding the cost, if the mapping produces more than 1 partial value,
right now RegBankSelect::getRepairCost will say this is too expensive.
This is actually where you need to patch the pass to add a target
hook that computes the cost of using instructions to decompose the
value.
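
To make the described behavior concrete, here is a small standalone model. It is a sketch, not the LLVM source: `ValueMapping`, `TargetCostHook`, `ImpossibleCost`, and `repairCost` are hypothetical stand-ins, and the cost values are assumed purely for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <limits>

// Hypothetical stand-in for a value mapping: only the number of partial
// values (break downs) matters for this sketch.
struct ValueMapping {
  unsigned NumBreakDowns;
};

// Assumed sentinel meaning "too expensive to consider".
constexpr uint64_t ImpossibleCost = std::numeric_limits<uint64_t>::max();

// Hypothetical target hook: prices the decomposition of a value into
// several pieces using target instructions.
using TargetCostHook = std::function<uint64_t(const ValueMapping &)>;

// Model of the repair cost for a single operand.
uint64_t repairCost(const ValueMapping &VM, uint64_t CopyCost,
                    TargetCostHook Hook = nullptr) {
  assert(VM.NumBreakDowns > 0 && "nothing to map");
  if (VM.NumBreakDowns == 1)
    return CopyCost;      // a single value is repaired with a copy
  if (Hook)
    return Hook(VM);      // let the target price the decomposition
  return ImpossibleCost;  // no hook: the mapping is rejected
}
```

Without a hook, any mapping with more than one partial value is rejected, which matches the "too expensive" behavior described above. With a hook, the target decides what the decomposition costs.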

Yes, this is what happens with greedy. With fast I get a little further.

So that covers the cost of the copy part. The cost of rewriting the
instruction completely is captured by
InstructionMapping::getCost.
The idea of InstructionMapping::getCost is to reflect the cost for
transforming the current instruction into the instruction after we
apply this mapping. Then the RepairCost is here to account for the
cost of "bringing" every operand to the right place for this mapping
using copy or some target specific sequence.
Like the cost computation, the target specific sequences are not
implemented, but should happen in RegBankSelect::repairReg.
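
The split between the two costs can be sketched as a simplified model: the total cost of a candidate mapping is the cost of the rewritten instruction plus the repair cost of each operand, and a greedy-style selection keeps the cheapest candidate. `CandidateMapping`, `totalCost`, `pickCheapest`, and `ImpossibleCost` are hypothetical names for this illustration, not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Assumed sentinel meaning "too expensive to consider".
constexpr uint64_t ImpossibleCost = std::numeric_limits<uint64_t>::max();

// Hypothetical candidate mapping for one instruction.
struct CandidateMapping {
  uint64_t InstrCost;                 // cost of the rewritten instruction(s)
  std::vector<uint64_t> RepairCosts;  // one entry per operand to repair
};

// Total cost: instruction cost plus the repair cost of every operand.
// Returns ImpossibleCost if any part is impossible. Overflow of large
// finite costs is not handled in this sketch.
uint64_t totalCost(const CandidateMapping &M) {
  uint64_t Total = M.InstrCost;
  for (uint64_t C : M.RepairCosts) {
    if (Total == ImpossibleCost || C == ImpossibleCost)
      return ImpossibleCost;
    Total += C;
  }
  return Total;
}

// Index of the cheapest candidate, or -1 if every candidate is impossible.
int pickCheapest(const std::vector<CandidateMapping> &Candidates) {
  int Best = -1;
  uint64_t BestCost = ImpossibleCost;
  for (std::size_t I = 0; I < Candidates.size(); ++I) {
    uint64_t Cost = totalCost(Candidates[I]);
    if (Cost < BestCost) {
      BestCost = Cost;
      Best = static_cast<int>(I);
    }
  }
  return Best;
}
```

In this model, a candidate whose operands cannot be repaired is never selected, however cheap its instruction cost is.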

This seems to contradict the comment on repairReg?
/// \note The caller is supposed to do the rewriting of op if need be.
/// I.e., Reg = op ... => <NewRegs> = NewOp …

Right now, repairReg asserts that the number of break downs is == 1,
but the code to decompose the operand should go there.
Finally, the rewriting of the current instruction is supposed to
happen in RegisterBankInfo::applyMapping.

If you have an example (.mir) that you can share, we can work together
to make this happen.

Cheers,
-Quentin

The simplest case is this, where there’s only one register bank involved. The cost of the unmerge and merge should be 0; the only real cost comes from the fact that it is now 2 operations.

---
name: and_i64_vv
legalized: true

body: |
  bb.0:
    ; Should turn into something like this, although the merge_values and unmerge_values can be optimized out
    ; %0:vgpr(s64) = COPY $vgpr0_vgpr1
    ; %1:vgpr(s64) = COPY $vgpr2_vgpr3
    ; %2:vgpr(s32), %3:vgpr(s32) = G_UNMERGE_VALUES %0
    ; %4:vgpr(s32), %5:vgpr(s32) = G_UNMERGE_VALUES %1
    ; %6:vgpr(s32) = G_AND %2, %4
    ; %7:vgpr(s32) = G_AND %3, %5
    ; %8:vgpr(s64) = G_MERGE_VALUES %6, %7

    liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
    %0:_(s64) = COPY $vgpr0_vgpr1
    %1:_(s64) = COPY $vgpr2_vgpr3
    %2:_(s64) = G_AND %0, %1
…
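
The split relies on a bitwise AND acting independently on each half: the low half of the result depends only on the low halves of the inputs, and likewise for the high half. A minimal standalone check of that identity follows. The helpers `unmerge`, `merge`, and `splitAnd64` are hypothetical models of the generic opcodes, not LLVM code.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Model of G_UNMERGE_VALUES on an s64: {low 32 bits, high 32 bits}.
std::pair<uint32_t, uint32_t> unmerge(uint64_t V) {
  return {static_cast<uint32_t>(V), static_cast<uint32_t>(V >> 32)};
}

// Model of G_MERGE_VALUES: rebuilds an s64 from its low and high halves.
uint64_t merge(uint32_t Lo, uint32_t Hi) {
  return (static_cast<uint64_t>(Hi) << 32) | Lo;
}

// 64-bit AND performed as two 32-bit ANDs on matching halves.
uint64_t splitAnd64(uint64_t A, uint64_t B) {
  std::pair<uint32_t, uint32_t> PA = unmerge(A);
  std::pair<uint32_t, uint32_t> PB = unmerge(B);
  return merge(PA.first & PB.first, PA.second & PB.second);
}
```

For any pair of inputs, `splitAnd64(A, B)` should equal `A & B`, which is why the unmerge and merge steps carry no semantic cost and can be optimized out when the halves already live in separate registers.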

Part of my confusion about the operand focus is the use of RepairPts. In this case the inputs %0 and %1 have been trivially assigned already, but I kind of expected those to be present as something to handle here if that makes sense.

-Matt

I posted a rough first step here: https://reviews.llvm.org/D55988