[LLVMdev] Memcpy / Memset for address spaces >= 256

Manuel Jacob me at manueljacob.de
Wed Mar 12 13:34:47 PDT 2014


Hi David,

Sorry for sending you this mail twice; I forgot to send it to the list the first time.

On 2014-03-12 09:48, David Chisnall wrote:

> I have some patches that automatically expand all memcpy and similar if the operands are not in AS 0. I think this is probably not quite the right approach though, and we should be asking the back end for the function that does a memcpy / memset / whatever in a non-0 address space, and expand automatically if it doesn't provide one.
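
[Editor's sketch, not part of the original mail: the patches mentioned above are not shown in this thread, so the following only illustrates what "expanding" a memcpy / memset means in general. The function names are invented for illustration, and plain pointers are used so the code runs anywhere; in the case under discussion the pointers would be in a non-0 address space.]

```c
#include <stddef.h>

/* Hypothetical helper (name invented for illustration): the byte-wise
 * loop that an automatic memcpy expansion amounts to. No libc call is
 * involved, so nothing assumes the pointers live in address space 0. */
void expanded_memcpy(unsigned char *dst, const unsigned char *src, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i];    /* one load and one store per byte */
}

/* Hypothetical helper: the same idea for memset. */
void expanded_memset(unsigned char *dst, unsigned char value, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = value;     /* one store per byte */
}
```

[A real expansion would likely copy in wider units where alignment allows; the byte loop is only the simplest correct form.]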

Can you share these patches? This would be a tentative solution for the reporter of the bug I linked in the original post.

> In an ideal world, I'd rather have the memcpy / memset lowering moved entirely out of SelectionDAG and into a FunctionPass, where it would be much easier to debug. I'd also want to do the same for lowering of unaligned loads / stores, so by the time you get to the back end every load and store is something that can map trivially to a single instruction (assuming an adequate addressing mode exists).

While I agree that the memcpy lowering pass could be done as an IR pass because it involves loops, I don't think you should do that for lowering of unaligned loads / stores. But that's mostly unrelated to this thread and should be discussed separately.

There are still some advantages of lowering the memcpy / memset in SelectionDAGBuilder. The infrastructure (e.g. target hooks for determining the right register class for memory operations) is already there. I don't know how hard it is to generate loops in SelectionDAGBuilder, though.

-Manuel

> David
>
> On 11 Mar 2014, at 22:23, Manuel Jacob <me at manueljacob.de> wrote:
>
>> Hi,

>> SelectionDAGBuilder doesn't know how to lower a memcpy or memset if one of the pointer operands has an address space >= 256. This is understandable, since libc's memcpy / memset don't work for these address spaces. However, both Clang (when copying a struct) and some optimization passes (LoopIdiomRecognize, MemCpyOpt) can emit memcpy / memset for these address spaces. This triggers an assert in SelectionDAGBuilder.
>>
>> The optimization passes could be modified to give up when they encounter an address space >= 256, but I think Clang would need some new code that emits a struct copy member-by-member. I think it's better to extend the code generator to be able to emit code for that. What do you think?
>>
>> The problem is also described here: http://llvm.org/bugs/show_bug.cgi?id=18549
>>
>> -Manuel
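
[Editor's sketch, not part of the original mail: a minimal C illustration of the two struct-copy forms contrasted above. The struct and function names are invented for illustration. Plain pointers are used so the code runs anywhere; the situation described in the mail would arise if the pointers were declared with Clang's `address_space` attribute, e.g. `__attribute__((address_space(256)))`, which is assumed here rather than exercised.]

```c
#include <stddef.h>

/* Hypothetical aggregate used only for this illustration. */
struct Packet {
    int   id;
    short flags;
    char  payload[8];
};

/* Whole-struct assignment. Clang typically lowers an aggregate copy like
 * this to a memcpy, which is the case that reaches SelectionDAGBuilder. */
void copy_whole(struct Packet *dst, const struct Packet *src)
{
    *dst = *src;
}

/* Member-by-member copy: the alternative form the mail says Clang would
 * need new code to emit. Only scalar loads and stores appear here. */
void copy_members(struct Packet *dst, const struct Packet *src)
{
    dst->id = src->id;
    dst->flags = src->flags;
    for (size_t i = 0; i < sizeof src->payload; ++i)
        dst->payload[i] = src->payload[i];
}
```

[Both functions leave the destination fields equal to the source fields; they differ only in the kind of operations the front end hands to the code generator. Note that an optimizer may still turn the array loop in the second form back into a memcpy, which is the LoopIdiomRecognize case mentioned above.]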


