[llvm-dev] LSR

Hal Finkel via llvm-dev llvm-dev at lists.llvm.org
Wed Apr 19 09:25:15 PDT 2017


On 04/18/2017 12:28 AM, Jonas Paulsson wrote:

Hi Hal,

No, LSR won't add new PHIs. This is a long-standing deficiency (and also, in part, prevents it from properly handling pre/post-inc addressing modes, which motivates some target-specific passes such as lib/Target/PowerPC/PPCLoopPreIncPrep.cpp).

I do find in this regression that gcc manages to use four address registers, while llvm uses just one, with a lot of extra address-building instructions as a result, where the gcc loop looks very clean. I suspect that this could be a needed feature. Would you recommend doing this (splitting PHIs) as a "LSRPrep" pass, perhaps in the target, or would you try to extend LSR itself?

I recall thinking that the "right" way to do this is to extend LSR itself. This way the cost of adding extra PHIs could be weighed against addressing modes, register pressure, and other factors. I have not, however, looked enough into the details to say exactly how this would work - if I had, I probably would have done it myself ;) -- cc'ing Andy in case he'd like to chime in.
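To make the trade-off concrete, here is a source-level C++ sketch. It is an illustration only, not LSR output: the function names and the four-array loop shape are assumptions and are not taken from the regression discussed above. The first function keeps a single induction variable and rebuilds every address from it; the second carries one pointer per stream, which is roughly what the loop would look like with one extra PHI per address register.

```cpp
#include <cstddef>

// Hypothetical loop shape: a single induction variable `i`.
// Every access has to form its address as (array base + i).
long sum_single_iv(const long *a, const long *b, const long *c,
                   const long *d, std::size_t n) {
  long s = 0;
  for (std::size_t i = 0; i < n; ++i)
    s += a[i] + b[i] + c[i] + d[i];
  return s;
}

// Same computation with four pointer induction variables.
// Each pointer is live across the back edge, so each one corresponds
// to its own PHI (and its own address register) in the loop.
long sum_four_ptr_ivs(const long *a, const long *b, const long *c,
                      const long *d, std::size_t n) {
  long s = 0;
  for (const long *end = a + n; a != end; ++a, ++b, ++c, ++d)
    s += *a + *b + *c + *d;
  return s;
}
```

Both functions return the same value; they differ only in how many values are carried around the loop, which is the cost a PHI-aware LSR would have to weigh against register pressure. An optimizing compiler may of course rewrite either form, so the source shape does not determine the generated code.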

I also see that LSR is thinking in terms of increments between the memory accesses. In the loop I am working with, it's disappointing to see that before each memory access, the base address is loaded into a register, then the offset is added, and then the access is made, which is 3 instructions. Per LSR's intentions, it should have been just an add/sub after the previous access and before the next memory access. I wonder where this is supposed to be handled: in some sort of target pre-isel pass that chains the GEPs? Or is this just folded more often on other targets?

As I recall, it does not do this now (although this is also needed for handling pre/post-inc addressing modes properly).

Same question: it might be simpler to do a separate post-LSR GEP handling (in CodeGenPrepare, perhaps?), but I suspect it would also be possible to extend LSR to do this instead?

Splitting is what, in practice, we have now. Targets have "fixup/prep" passes to account for the fact that LSR won't add new PHIs. This seems simpler, but it is probably also suboptimal (i.e. it works reasonably for targets with simpler addressing modes, like PPC, although there are still issues, but I don't see that it would work well for targets with complicated ones).
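The "increments between memory accesses" idea discussed above can be sketched at the source level in C++. This is a hypothetical illustration: the function names, the three-access pattern, and the offset parameters are assumptions, not code from the loop in question. The first form rebuilds each address from the base; the second chains the addresses, so each access is reached from the previous one with a single add/sub.

```cpp
#include <cstddef>

// Form 1: every access starts again from `base` and adds its own offset
// (materialize base, add offset, access).
long load3_rebuild(const long *base, std::ptrdiff_t o0, std::ptrdiff_t o1,
                   std::ptrdiff_t o2) {
  long s = 0;
  s += *(base + o0);
  s += *(base + o1);
  s += *(base + o2);
  return s;
}

// Form 2: chained addressing. One running pointer is bumped by the
// difference between consecutive offsets, so only one add/sub separates
// one access from the next.
long load3_chained(const long *base, std::ptrdiff_t o0, std::ptrdiff_t o1,
                   std::ptrdiff_t o2) {
  const long *p = base + o0;
  long s = *p;
  p += o1 - o0;  // step from the first access to the second
  s += *p;
  p += o2 - o1;  // step from the second access to the third
  s += *p;
  return s;
}
```

When the offsets are compile-time constants, the steps in the chained form are constants too, which is the shape that pre/post-inc addressing modes can absorb. Whether a given backend actually emits it depends on the passes discussed in this thread.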

-Hal

/Jonas

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


