Cuda-GDB bug when assertion fails

October 14, 2024, 9:59am 1

Hello, I’m reporting this bug because cuda-gdb’s own output asked me to.

Briefly, I put an assert in a device function, and it failed. After the failure, cuda-gdb printed:

Thread 1 "game" received signal CUDA_EXCEPTION_12, Warp Assert.
[Switching focus to CUDA kernel 1, grid 5, block (0,0,0), thread (160,0,0), device 0, sm 0, warp 4, lane 0]
0x00007fffcce4a1a0 in __assert_fail ()
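
The code that triggered this was not posted, so the following is only a minimal sketch of a program that should reach the same Warp Assert state. The names `check_index` and `assert_kernel`, the 256-thread launch, and the choice to fail in thread 160 are all assumptions made for illustration. The thread index is picked to match the focus line above.

```cuda
// Hypothetical reproducer: names and launch shape are assumptions,
// not the code from the original report.
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

// Device function containing the assertion. It fails for exactly one thread.
__device__ void check_index(int idx)
{
    // Thread 160 is chosen to mirror the thread shown in the focus line.
    assert(idx != 160);
}

__global__ void assert_kernel()
{
    check_index(static_cast<int>(threadIdx.x));
}

int main()
{
    // One block of 256 threads, so thread (160,0,0) exists in block (0,0,0).
    assert_kernel<<<1, 256>>>();

    // A device-side assertion failure surfaces as an error at the next sync.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

To try it, build with device debug information (`nvcc -g -G`, without defining `NDEBUG`, since that compiles the assert out), start the binary under cuda-gdb, and issue `run`. Whether `step` then hits the same internal error may depend on the CUDA Toolkit version.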

I then tried the “step” command, and the debugger crashed with an internal error.
Here’s the output:

**(cuda-gdb) step**
Single stepping until exit from function __assert_fail,
which has no line number information.
cuda-gdb/13/gdb/infrun.c:2703: internal-error: resume_1: Assertion `pc_in_thread_step_range (pc, tp)' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
----- Backtrace -----
0x4f8b67 ???
0x8dc864 ???
0x8dcbd8 ???
0xa77171 ???
0x726fdb ???
0x7272b5 ???
0x72759d ???
0x72867f ???
0x710818 ???
0x528073 ???
0x8be387 ???
0x69830b ???
0x69955d ???
0x698c78 ???
0x91f850 ???
0x698a8d ???
0x698b5c ???
0x697d17 ???
0xa77bfc ???
0xa77dd0 ???
0x77b98e ???
0x77d214 ???
0x40fc84 ???
0x71a915634e07 ???
0x71a915634ecb ???
0x417034 ???
0xffffffffffffffff ???
---------------------
cuda-gdb/13/gdb/infrun.c:2703: internal-error: resume_1: Assertion `pc_in_thread_step_range (pc, tp)' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) y

This is a bug, please report it.  For instructions, see:
<https://forums.developer.nvidia.com/c/developer-tools/cuda-developer-tools/cuda-gdb>.

cuda-gdb/13/gdb/infrun.c:2703: internal-error: resume_1: Assertion `pc_in_thread_step_range (pc, tp)' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Create a core file of GDB? (y or n) n

cuda-gdb has received a SIGSEGV and will attempt to get its own backtrace.

...-gdb-minimal| segv_handler() +0x4a
...ib/libc.so.6| ?????
...-gdb-minimal| std::_Function_handler<void (objfile*), void (*)(objfile*)>::_M_invoke() +0x6
...-gdb-minimal| objfile::~objfile() +0xd9
...-gdb-minimal| program_space::remove_objfile() +0x79
...-gdb-minimal| cuda_elf_image_unload() +0x72
...-gdb-minimal| module_delete() +0x1b1
...-gdb-minimal| modules_delete() +0x2c
...-gdb-minimal| context_delete() +0x2d
...-gdb-minimal| contexts_delete() +0x96
...-gdb-minimal| cuda_device::cleanup_contexts() +0x23
...-gdb-minimal| cuda_device::~cuda_device() +0x1b
...-gdb-minimal| cuda_state::~cuda_state() +0x30
...ib/libc.so.6| ?????
...ib/libc.so.6| ?????
...-gdb-minimal| internal_vproblem() +0x2da
...-gdb-minimal| internal_verror() +0x19
...-gdb-minimal| internal_error_loc() +0x82
...-gdb-minimal| resume_1() +0xe7c
...-gdb-minimal| resume() +0x6
...-gdb-minimal| keep_going_pass_signal() +0x27e
...-gdb-minimal| proceed() +0x730
...-gdb-minimal| step_1() +0x219
...-gdb-minimal| cmd_func() +0x54
...-gdb-minimal| execute_command() +0x708
...-gdb-minimal| command_handler() +0x6c
...-gdb-minimal| command_line_handler() +0x3e
...-gdb-minimal| gdb_rl_callback_handler() +0x79
...-gdb-minimal| rl_callback_read_char() +0x1d1
...-gdb-minimal| gdb_rl_callback_read_char_wrapper_noexcept() +0x4e
...-gdb-minimal| gdb_rl_callback_read_char_wrapper() +0xd
...-gdb-minimal| stdin_event_handler() +0x68
...-gdb-minimal| gdb_wait_for_event() +0x53d
...-gdb-minimal| gdb_do_one_event() +0x121
...-gdb-minimal| captured_command_loop() +0x3f
...-gdb-minimal| gdb_main() +0x15
...-gdb-minimal| main() +0x25
...ib/libc.so.6| ?????
...ib/libc.so.6| __libc_start_main() +0x8c
...-gdb-minimal| _start() +0x29
Recursive internal problem.


Fatal signal: Aborted
----- Backtrace -----
0x4f8b67 ???
0x697bff ???
0x71a91564c1cf ???
0x71a9156a53f4 ???
0x71a91564c11f ???
0x71a9156334c2 ???
0x8d8223 ???
0x8dc5e3 ???
0x8dcbd8 ???
0xa77171 ???
0x7161ae ???
0x80579d ???
0x80587f ???
0x8b2d2f ???
0x8b2ed1 ???
0x8b3134 ???
0x4d467e ???
0x4e3ebf ???
0x4ed0f6 ???
0x5eba85 ???
0x601b86 ???
0x71a91564c1cf ???
0x44e576 ???
0x7c83e8 ???
0x7e9cf8 ???
0x5bec81 ???
0x5d5890 ???
0x5d5bbb ???
0x5aeb1c ???
0x5aef05 ???
0x5dedf2 ???
0x5e166a ???
0x5e680f ???
0x71a91564e890 ???
0x71a91564e95d ???
0x8dc859 ???
0x8dcbd8 ???
0xa77171 ???
0x726fdb ???
0x7272b5 ???
0x72759d ???
0x72867f ???
0x710818 ???
0x528073 ???
0x8be387 ???
0x69830b ???
0x69955d ???
0x698c78 ???
0x91f850 ???
0x698a8d ???
0x698b5c ???
0x697d17 ???
0xa77bfc ???
0xa77dd0 ???
0x77b98e ???
0x77d214 ???
0x40fc84 ???
0x71a915634e07 ???
0x71a915634ecb ???
0x417034 ???
0xffffffffffffffff ???
---------------------
A fatal error internal to GDB has been detected, further
debugging is not possible.  GDB will now terminate.

This is a bug, please report it.  For instructions, see:
<https://forums.developer.nvidia.com/c/developer-tools/cuda-developer-tools/cuda-gdb>.

Aborted (core dumped)

My system info:
Operating System: Arch Linux, GPU: NVIDIA GeForce RTX 2060

Thanks for the bug report! We are able to reproduce this in-house, and a fix will come in a future CUDA Toolkit release.
