[cfe-dev] compiling CUDA w/ -fdebug-default-version=5 generates invalid PTX
Alexey.Bataev via cfe-dev cfe-dev at lists.llvm.org
Thu Dec 3 11:57:32 PST 2020
Hi Artem, here is what I found about this.
These labels are emitted only if DWARF 4 or 5 is used. They are required for emission of the DW_AT_call_site attribute. The call site info is also emitted when only line info is generated for profiling, as in your example (-O1 -gmlt).
The Call/CallUni instruction is treated as a separate instruction by the debug info generator, which sees it as a separate call and emits a label for it. You can try to mark the CallUni instructions with hasDelaySlot = 1, or stop treating them as call instructions in NVPTXInstrInfo.td. I can't say which one is better or correct. It looks like the representation of call/call.uni is not quite compatible with the debug info.
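To check whether a given PTX file is affected, something like the following Python sketch may help. It is only an illustration, not part of clang or ptxas: the function name find_labels_inside_calls is made up here, and it assumes that any token ending in ":" between call/call.uni and the terminating ";" is a misplaced label.

```python
import re

# A call statement: "call" or "call.uni", then everything up to the next ';'.
CALL_STMT = re.compile(r"\bcall(?:\.uni)?\b([^;]*);")
# Assumption: a label is any identifier-like token followed by ':'.
LABEL = re.compile(r"[$\w]+:")


def find_labels_inside_calls(ptx):
    """Return the names of labels found inside call statements of PTX text."""
    # Drop '//' comments so that text such as "// callseq 1, 0" is ignored.
    text = re.sub(r"//[^\n]*", "", ptx)
    found = []
    for match in CALL_STMT.finditer(text):
        for label in LABEL.findall(match.group(1)):
            found.append(label.rstrip(":"))
    return found
```

Calling it on the PTX text, for example the contents of a file produced by compiling with -S instead of -c, returns the offending label names, or an empty list if no call statement is split by a label.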
Best regards, Alexey Bataev
On 12/2/2020 6:00 PM, Artem Belevich wrote:
Hi, Alexey!
I've run into an odd case with debug info generation in NVPTX.

Reproduction:
------------------------
__device__ __attribute__((noinline)) void bar() { printf("Hi!"); }
__global__ void foo() { bar(); }
int main(){}
------------------------
$ clang++ -v --cuda-gpu-arch=sm_70 --cuda-device-only -fdebug-default-version=5 a.cu -gmlt -O1 -c

Compilation fails due to a syntax error reported by ptxas. The reason for the error is that clang generates a label in the middle of a call.uni instruction. E.g.:

    { // callseq 1, 0
    .reg .b32 temp_param_reg;
    call.uni
    Ltmp14:
    _Z3barv,
    (
    );
    } // callseq 1

The odd part is that we're only generating line info and there is no DWARF in the generated PTX. It appears that this behavior is triggered by -dwarf-version=5 passed to cc1. Looks like another case where PTX syntax breaks DWARF generator assumptions.

It's possible to work around it with an additional -Xarch_device -fdebug-default-version=2, but I'd appreciate it if you could take a look and see if that could be fixed.

--
--Artem Belevich