Roger Sayle - Re: [PATCH] Fix PR29789, Missed invariant out of the loop with conditio (original) (raw)

This is the mail archive of the gcc-patches@gcc.gnu.orgmailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix PR29789, Missed invariant out of the loop with conditionals and shifts


Hi Richard,

On Thu, 15 Feb 2007, Richard Guenther wrote:

2007-01-25 Richard Guenther rguenther@suse.de

PR middle-end/29789
* fold-const.c (fold_binary): Do not fold (1 << a) & b to
(b >> a) & 1.

I will apply this later.

Noooo! Or at least not so quickly, without further benchmarking.

So I verified that we create exactly the same assembler on two-operand insn machines with and without the patch.

I didn't believe you :-), so I tested this out myself.

For the following test case:

int foo(int a, int b) { return ((1 << a) & b) != 0; }

which currently on i686-pc-linux-gnu with -O2 -fomit-frame-pointer is

_foo: movl 8(%esp), %eax movl 4(%esp), %ecx sarl %cl, %eax andl $1, %eax ret

but with your proposed patch becomes:

_foo: movl 4(%esp), %ecx movl $1, %eax sall %cl, %eax testl %eax, 8(%esp) setne %al movzbl %al, %eax ret

Which is significantly larger and slower.

I'm suprised that pinskia hadn't mentioned the various PRs concerning shifts and single bit tests, including those that would have to be reopened if you eliminated this code.

This is just working around the real problem of invariant identification by reassociations. Given an expression such as iv + (invar + 1), we currently canonicalize with the constant outermost, (iv + invar) + 1 which may cause a simplistic invariant subexpression pass to miss the fact that (invar + 1) is loop invariant. It's not just shifts, but all kinds of expressions that may need to be restructured to separate the loop variant from the loop invariant terms.

I'll see what can be done from within fold, or perhaps cleaned up within the RTL optimizers, but these transformations are part of a complex push-pull transformation chain, with the canonical forms being identified and appropriately lowered during RTL expansion.

[This is why you didn't see a change in an if(...) test. We canonicalize to "(a >> b) & 1" at the tree-level and then at RTL expansion time, we determine whether the integer value is needed or not (in expand_expr vs. do_jump) and emit/expand the appropriate form.]

Even if we decided on the alternate canonical form, which doesn't satisfy truth_value_p, your patch would also then need to transform "(a >> b) & 1" into "(1 << b) & a", so we could spot their equivalence, and lower it intelligently in expr.c/jump.c.

At a minimum you should run SPEC (for performance) and CSiBE (for code size) to see, if your change wins more often than it loses.

Roger


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]