[LLVMdev] [RFC] AArch64: Should we disable GlobalMerge?

Eric Christopher echristo at gmail.com
Fri Feb 27 14:21:34 PST 2015


On Fri, Feb 27, 2015 at 2:13 PM Ahmed Bougacha <ahmed.bougacha at gmail.com> wrote:

On Fri, Feb 27, 2015 at 1:42 PM, Eric Christopher <echristo at gmail.com> wrote:
>
>
> On Fri, Feb 27, 2015 at 1:38 PM Ahmed Bougacha <ahmed.bougacha at gmail.com>
> wrote:
>>
>> On Thu, Feb 26, 2015 at 2:33 AM, Kristof Beyls <kristof.beyls at arm.com>
>> wrote:
>> >
>> > Hi Ahmed,
>> >
>> > Did you run these experiments on a platform with a linker that makes
>> > use of the AArch64CollectLOH-pass-produced information?
>>
>> As Jim says, I'm on iOS, so yes. However, I'm mostly running tests
>> with the pass disabled.
>>
>> >
>> > I'm guessing that the AArch64CollectLOH-pass information and a linker
>> > that makes use of that information could affect the profitability of
>> > the GlobalMerge pass?
>>
>> It could, and does, from what I've seen (beware anecdata):
>> - reusing the adrp base prevents optimizing it (the various
>>   Adrp*{ldr,str} LOHs).
>> - reusing the adrp+add MergedGlobal pointer, with indexed addressing,
>>   doesn't prevent the AdrpAdd optimization.
>>
>> All in all, whether GlobalMerge is profitable or not (by increasing
>> register pressure, or adding another indirection), whenever the LOH
>> optimizations fire, they reduce its usefulness.
>>
>> AFAICT, the only case where LOHs help GlobalMerge is when the
>> MergedGlobal base is closer to the adrp sequence than the actual
>> global. Given that we only merge 4k of globals, on a 1MB range this
>> doesn't happen very often.
>>
>>
>>
>> Which brings us to my fallback proposal: what about disabling the
>> pass on darwin only? Various darwin-enabled features (e.g., LOHs)
>> help mitigate the adrp problem, and global usage is usually frowned
>> upon in those circles (except for singletons, class-/function-statics
>> and whatnot, which I'm trying to address in an upcoming patch).
>>
>
> Before making the disabling darwin only I'd like to see some analysis of the
> regressions/improvements. Has anyone looked at the code for those yet?

Yep, I put a quick analysis in my other reply.

The LOH/ADRP bit?

>
>>
>> As for other targets, as a first step, making the pass run under -O3
>> rather than -O1 is hopefully agreeable to everyone? After all, it is
>> "aggressive", and isn't always profitable. That's pretty much the
>> description of -O3.
>> We can still run into problematic cases under LTO, though.
>>
>
> Seems reasonable to me, but probably want to see what happens with the above
> questions first.

Fair enough. Bottom line is:
- disabling it without LTO is a slight win on the test-suite, a solid
  win everywhere else I've looked.
- disabling it with LTO regresses quite a few SPEC benchmarks, and is
  overall a slight regression on the test-suite.

Ah, I meant an analysis of the code, not just the numbers. I think the ADRP/LOH commentary really helps. It might only be a decent LTOish optimization, but I'm still curious how it's helping there over other optimizations.

Anyhow, FWIW I'm in favor of pulling it out of the non-LTO pipeline universally.

-eric

-Ahmed

> -eric
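
For readers following the thread, here is a rough source-level picture of the transformation being debated. This is only a sketch: the real GlobalMerge pass operates on LLVM IR, not on C source, and every name below (counter_a, counter_b, MergedGlobals, sum_separate, sum_merged) is hypothetical and does not come from the thread. It shows two separately addressed globals next to a merged aggregate whose members are reached by constant offsets from a single base, which is the "MergedGlobal pointer, with indexed addressing" pattern mentioned above.

```c
/* Illustrative sketch only: a source-level analogy for GlobalMerge.
 * The actual pass rewrites LLVM IR; all names here are hypothetical. */
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* offsetof */

/* Unmerged form: two independent globals. On AArch64, each access
 * typically materializes its own page address with adrp and then
 * applies a page offset in the load, store, or add that follows. */
static int counter_a = 1;
static int counter_b = 2;

int sum_separate(void) {
    return counter_a + counter_b;
}

/* Merged form (conceptually): one aggregate with a single base address.
 * Members are reached at fixed offsets from that base, so one computed
 * base can be reused, at the cost of keeping it live in a register. */
struct MergedGlobals {
    int a;
    int b;
};

static struct MergedGlobals merged_globals = { 1, 2 };

/* The second member sits at a compile-time-constant offset from the base. */
static_assert(offsetof(struct MergedGlobals, b) == sizeof(int),
              "b is addressed as base + constant offset");

int sum_merged(void) {
    const struct MergedGlobals *base = &merged_globals; /* one shared base */
    return base->a + base->b;                           /* indexed accesses */
}
```

Both functions compute the same value; what differs is how the addresses are formed. That difference is the trade-off discussed in the thread: a shared base can save address computations, but it may raise register pressure and, per the observations quoted above, can keep the linker from applying some of the Adrp*{ldr,str} LOH optimizations.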


