[LLVMdev] X86TargetLowering::LowerToBT
Chris Sears chris.sears at gmail.com
Sun Jan 18 22:57:33 PST 2015
Sure. Attached is the file, but here are the functions. The first uses a fixed bit offset; the second uses an indexed bit offset. Compiling with llc -O3, LLVM version 3.7.0svn, it compiles the IR from IsBitSetB() using btq %rsi, %rdi. Good. But then it compiles IsBitSetA() with shrq/andq, which is pretty much what Clang had generated as IR:

    shrq $25, %rdi
    andq $1, %rdi

LLVM should be able to replace these two with a single x86-64 instruction:

    btq $25, %rdi

The generated code is correct in both cases. It just isn't optimized in the immediate operand case.

unsigned long long IsBitSetA(unsigned long long val)
{
    return (val & (1ULL<<25)) != 0ULL;
}

unsigned long long IsBitSetB(unsigned long long val, int index)
{
    return (val & (1ULL<<index)) != 0ULL;
}

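For readers following along, here is a minimal sketch of why the shrq/andq output is a faithful translation of the C source: the mask form and the shift form give the same result for any index from 0 to 63. The names mask_form and shift_form are hypothetical and do not appear in tst.c.

```c
/* Mask form, as the test is written in the C source:
 * isolate the bit with a left-shifted mask and compare against zero. */
static unsigned long long mask_form(unsigned long long val, int index)
{
    return (val & (1ULL << index)) != 0ULL;
}

/* Shift form, mirroring the lshr/and IR and the shrq/andq output:
 * move the bit down to position 0 and mask off everything else.
 * Valid only for 0 <= index <= 63. */
static unsigned long long shift_form(unsigned long long val, int index)
{
    return (val >> index) & 1ULL;
}
```

Both return 1 when the selected bit is set and 0 otherwise, so either form is a candidate for a single bit-test instruction.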
On Sun, Jan 18, 2015 at 10:02 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
Hi,
Can you provide a reproducible example? I feel especially your first IR sample is incomplete.
If you can also make more explicit how is the generated code wrong?
You can give a C file if you are sure that it is reproducible with the current clang.

Thanks,
Mehdi

On Jan 18, 2015, at 5:13 PM, Chris Sears <chris.sears at gmail.com> wrote:

I'm tracking down an X86 code generation malfeasance regarding BT (bit test) and I have some questions.

This IR matches and then X86TargetLowering::LowerToBT is called:

    %and = and i64 %shl, %val     ; (val & (1 << index)) != 0 ; bit test with a register index

This IR does not match and so X86TargetLowering::LowerToBT is not called:

    %and = lshr i64 %val, 25      ; (val & (1 << 25)) != 0 ; bit test with an immediate index
    %conv = and i64 %and, 1

Let's back that up a bit. Clang emits this IR. These expressions start out life in C as and with a left shifted masking bit, and are then converted into IR as right shifted values anded with a masking bit.

This IR then remains untouched until Expand ISel Pseudo-instructions in llc (-O3). At that point, LowerToBT is called on the REGISTER version and substitutes in a BT reg,reg instruction:

    btq %rsi, %rdi ## <MCInst #312 BT64rr

The IMMEDIATE version doesn't match the pattern and so LowerToBT is not called.

Question: This is during pseudo instruction expansion. How could LowerToBT's caller have enough context to match the immediate IR version? In fact, lli isn't calling LowerToBT so it isn't matching. But isn't this really a peephole optimization issue?

LLVM has a generic peephole optimizer, CodeGen/PeepholeOptimizer.cpp, which has exactly one subclass in NVPTXTargetMachine.cpp.

But isn't it better to deal with X86 LowerToBT in a PeepholeOptimizer subclass where you have a small window of instructions rather than during pseudo instruction expansion where you have really one instruction? PeepholeOptimizer doesn't seem to be getting much attention and certainly no attention at the subclass level.

Bluntly, expansion is about expansion. Peephole optimization is the opposite.

Question: Regardless, why is LowerToBT not being called for the IMMEDIATE version? I suppose you could look at the preceding instruction in the DAG. That seems a bit hacky.

Another approach using LowerToBT would be to match lshr reg/imm first and then if the following instruction was an and reg,1 replace both with a BT. It doesn't look like LowerToBT as is can do that right now since it is matching the and instruction.

    SDValue X86TargetLowering::LowerToBT(SDValue And, ISD::CondCode CC, SDLoc dl, SelectionDAG &DAG) const {
        ...
    }

But I think this is better done in a subclass of CodeGen/PeepholeOptimizer.cpp.

thanks.
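To make the "small window of instructions" idea concrete, here is a toy sketch of a two-instruction peephole that folds a shift-right by an immediate followed by an and with 1 on the same register into one bit test. Everything in it (toy_op, toy_insn, toy_fold_bt) is hypothetical and is not LLVM API; it only illustrates the matching step. It also assumes the consumer only needs the tested bit: the real BT instruction writes its result to the carry flag and not to a register, a condition this toy does not check.

```c
#include <stddef.h>

/* Hypothetical toy instruction set; none of this is LLVM API. */
enum toy_op { TOY_SHR_IMM, TOY_AND_IMM, TOY_BT_IMM, TOY_OTHER };

struct toy_insn {
    enum toy_op op;
    int reg;                 /* register the instruction operates on */
    unsigned long long imm;  /* immediate operand */
};

/* Scan with a two-instruction window. Wherever "shr reg, k" is directly
 * followed by "and reg, 1" on the same register, emit "bt reg, k" instead.
 * Rewrites the array in place and returns the new instruction count. */
static size_t toy_fold_bt(struct toy_insn *code, size_t n)
{
    size_t out = 0;
    for (size_t i = 0; i < n; ++i) {
        if (i + 1 < n &&
            code[i].op == TOY_SHR_IMM && code[i].imm < 64 &&
            code[i + 1].op == TOY_AND_IMM && code[i + 1].imm == 1 &&
            code[i].reg == code[i + 1].reg) {
            struct toy_insn bt = { TOY_BT_IMM, code[i].reg, code[i].imm };
            code[out++] = bt;
            ++i;                     /* skip the and that was folded away */
        } else {
            code[out++] = code[i];   /* copy through unchanged */
        }
    }
    return out;
}
```

The point of the sketch is that the match needs to see both instructions at once, which is the context a single-instruction hook lacks.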

LLVM Developers mailing list
LLVMdev at cs.uiuc.edu
http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

--
Ite Ursi
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150118/9927a6c4/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: tst.c
Type: text/x-csrc
Size: 207 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150118/9927a6c4/attachment.c>